Survival Analysis When No One Dies: A Value-Based Approach
A generalized version of Kaplan-Meier makes it possible to model a continuous
value (like money) instead of a binary signal (like survival)
Survival analysis is a statistical approach used to answer the question: “How long
will something last?” That “something” could range from a patient’s lifespan to
the durability of a machine component or the duration of a user’s subscription.
One of the most widely used tools in this area is the Kaplan-Meier estimator.
Born in the world of biology, Kaplan-Meier made its debut tracking life and death.
But like any true celebrity algorithm, it didn’t stay in its lane. These days,
it’s showing up in business dashboards, marketing teams, and churn analyses
everywhere.
But here’s the catch: business isn’t biology. It’s messy, unpredictable, and
full of plot twists. Two issues in particular make our lives more difficult
when we try to use survival analysis in the business world.
First of all, we are typically not just interested in whether a customer has
“survived” (whatever survival could mean in this context), but rather in how
much of that individual’s economic value has survived.
Secondly, unlike in biology, customers can “die” and “resuscitate” multiple
times (think of when you unsubscribe from an online service and later
resubscribe).
In this article, we will see how to extend the classical Kaplan-Meier approach so
that it better suits our needs: modeling a continuous (economic) value
instead of a binary one (life/death) and allowing “resurrections”.
A refresher on the Kaplan-Meier estimator
Let’s pause and rewind for a second. Before we start customizing Kaplan-Meier to fit
our business needs, we need a quick refresher on how the classic version works.
Suppose you have 3 subjects (let’s say lab mice) and you give them a medicine you need
to test. The medicine is given at different moments in time: subject a receives
it in January, subject b in April, and subject c in May.
Then, you measure how long they survive. Subject a died after 6
months, subject c after 4 months, and subject b is
still alive at the time of the analysis (November).
Now, even if we wanted to measure a simple metric, like average survival, we would face a
problem: we don’t know how long subject b will survive, as it is still alive
today.
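To make the problem concrete, here is a minimal sketch of the two naive averages one might be tempted to compute. It assumes whole months, so subject b has been observed for 7 months (April to November); the variable names are illustrative and not part of any library.

```python
# Months each subject has been observed since treatment.
# Assumption: whole months, so b (treated in April, still alive in November) counts as 7.
observed_months = {"a": 6, "b": 7, "c": 4}
still_alive = {"a": False, "b": True, "c": False}

# Naive option 1: drop the subject that is still alive.
deaths_only = [m for s, m in observed_months.items() if not still_alive[s]]
mean_dropping_b = sum(deaths_only) / len(deaths_only)

# Naive option 2: pretend that b died today.
mean_pretending_b_died = sum(observed_months.values()) / len(observed_months)

print(f"dropping b:        {mean_dropping_b:.2f} months")
print(f"pretending b died: {mean_pretending_b_died:.2f} months")
```

Neither number can be trusted. Subject b will survive at least 7 months, so the true average is at least the second figure, and the first figure is lower still.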
This is a classical problem in statistics, and it’s called “right censoring”.
Right censoring is stats-speak for “we don’t know what happened after a certain
point” and it’s a big deal in survival analysis. So big that it led to the
development of one of the most iconic estimators in statistical history: the
Kaplan-Meier estimator, named after the duo who introduced it back in the
1950s.
So, how does Kaplan-Meier handle our problem?
First, we align the clocks. Even if our mice were treated at different times, what
matters is time since treatment. So we reset the x-axis to zero for everyone:
time zero is the moment each subject got the drug.
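As a rough illustration of aligning the clocks, the sketch below turns the calendar months from the example into a duration since treatment plus a flag saying whether the death was actually observed. It assumes all dates fall in the same year and counts whole months; the names `treated_month`, `months_until_death` and `align` are made up for this example.

```python
# Calendar month of treatment (assumption: all within the same year).
treated_month = {"a": 1, "b": 4, "c": 5}           # January, April, May
# Months survived after treatment; None means still alive at analysis time.
months_until_death = {"a": 6, "b": None, "c": 4}
ANALYSIS_MONTH = 11                                 # November

def align(subject):
    """Return (months since treatment, death_observed) for one subject."""
    survived = months_until_death[subject]
    if survived is None:
        # Right-censored: we only know the subject lasted until the analysis date.
        return ANALYSIS_MONTH - treated_month[subject], False
    return survived, True

aligned = {s: align(s) for s in treated_month}
for subject, (duration, death_observed) in aligned.items():
    status = "died" if death_observed else "censored"
    print(f"subject {subject}: {duration} months since treatment ({status})")
```

After this step every subject starts at time zero, and the censored subject b is kept in the data with its flag instead of being dropped.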