It is no secret that customer retention is a top priority for many companies; acquiring new customers can be several times more expensive than retaining existing ones. Furthermore, gaining an understanding of the reasons customers churn and estimating the risk associated with individual customers are both powerful components of designing a data-driven retention strategy. A churn model can be the tool that brings these elements together and provides insights and outputs that drive decision making across an organization.
In its simplest form, churn rate is calculated by dividing the number of customer cancellations within a time period by the number of active customers at the start of that period. Very valuable insights can be gathered from this simple analysis — for example, the overall churn rate can provide a benchmark against which to measure the impact of a model. And knowing how churn rate varies by time of the week or month, product line, or customer cohort can help inform simple customer segments for targeting as well.
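As a minimal sketch of the calculation described above, the snippet below computes a churn rate for one period from a list of customer records. The record layout (`signup`, `cancelled`) and the function name `churn_rate` are hypothetical, chosen only for illustration. It also assumes that only cancellations by customers who were active at the start of the period count toward the numerator.

```python
from datetime import date

def churn_rate(customers, period_start, period_end):
    """Cancellations in [period_start, period_end) divided by the
    number of customers active at period_start."""
    # Active at start: signed up on or before the start date and not yet cancelled.
    active_at_start = [
        c for c in customers
        if c["signup"] <= period_start
        and (c["cancelled"] is None or c["cancelled"] >= period_start)
    ]
    if not active_at_start:
        return 0.0
    # Assumption: only customers active at the start can count as churned.
    churned = [
        c for c in active_at_start
        if c["cancelled"] is not None
        and period_start <= c["cancelled"] < period_end
    ]
    return len(churned) / len(active_at_start)

# Made-up records, for illustration only.
customers = [
    {"id": 1, "signup": date(2023, 11, 5), "cancelled": None},
    {"id": 2, "signup": date(2023, 12, 1), "cancelled": date(2024, 1, 15)},
    {"id": 3, "signup": date(2023, 10, 20), "cancelled": date(2023, 12, 28)},  # left before the period
    {"id": 4, "signup": date(2024, 1, 10), "cancelled": date(2024, 1, 25)},    # joined mid-period
    {"id": 5, "signup": date(2023, 9, 1), "cancelled": None},
    {"id": 6, "signup": date(2023, 8, 14), "cancelled": date(2024, 2, 3)},     # left after the period
]

january_rate = churn_rate(customers, date(2024, 1, 1), date(2024, 2, 1))
print(f"January churn rate: {january_rate:.0%}")  # 1 of 4 active customers cancelled
```

Running the same function over subsets of customers (by product line or cohort, for example) would give the segment-level rates mentioned above.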
However, churn often needs to be understood at a more granular, customer level. Customers vary in their behaviors and preferences, which in turn influence their satisfaction or desire to cancel service. Therefore, a cohort-based churn rate may not be enough for precise targeting or real-time risk prediction. This is where churn modeling is usually most useful.
The output of a predictive churn model is a measure of the immediate or future risk of a customer cancellation. This is what the term "churn modeling" most often refers to, and is the definition we will adhere to in this post.
In a business setting, churn can be broadly characterized as either contractual or non-contractual. It can also be characterized as voluntary or involuntary depending on the cancellation mechanism.
Note that the rows in the above matrix are not mutually exclusive: Involuntary churn can be present in either contractual or non-contractual settings.
Churn is especially relevant in contractual circumstances, which are often referred to as a "subscription setting," as cancellations are explicitly observed. However, non-contractual businesses also benefit from modeling churn. The challenge, in those cases, lies in defining a clear churn event timestamp. This is often done by choosing a threshold for a period of inactivity and treating any customer who exceeds it as churned.
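The inactivity-threshold idea for non-contractual settings can be sketched as follows. This is an illustration under stated assumptions, not a recommended definition: the 90-day threshold is arbitrary, the event layout and the name `label_inactivity_churn` are hypothetical, and the churn timestamp is taken to be the day the threshold is crossed (using the last activity date instead would be an equally valid choice).

```python
from datetime import date, timedelta

def label_inactivity_churn(events, as_of, threshold_days=90):
    """Map each customer to a churn date, or None if still considered active.

    events: iterable of (customer_id, activity_date) pairs.
    A customer counts as churned once threshold_days have passed since
    their most recent activity, measured as of the as_of date.
    """
    # Find each customer's most recent activity date.
    last_seen = {}
    for customer_id, activity_date in events:
        if customer_id not in last_seen or activity_date > last_seen[customer_id]:
            last_seen[customer_id] = activity_date

    threshold = timedelta(days=threshold_days)
    labels = {}
    for customer_id, seen in last_seen.items():
        # Assumption: the churn event is dated at the moment the threshold is crossed.
        churn_date = seen + threshold
        labels[customer_id] = churn_date if churn_date <= as_of else None
    return labels

# Made-up purchase events, for illustration only.
events = [
    ("a", date(2023, 12, 1)), ("a", date(2024, 1, 5)),
    ("b", date(2024, 5, 20)),
    ("c", date(2024, 3, 3)), ("c", date(2024, 2, 10)),
]
labels = label_inactivity_churn(events, as_of=date(2024, 6, 30))
for customer_id, churn_date in sorted(labels.items()):
    print(customer_id, churn_date or "active")
```

In practice the threshold would be chosen from the data, for instance by looking at how long customers typically go between purchases.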
Voluntary and involuntary churn, meanwhile, might be caused by different underlying factors. Voluntary churn is often more prevalent than involuntary churn, which results from events such as payment failures. It is also more difficult to determine the root cause of voluntary customer cancellations, which is why most churn literature focuses on voluntary churn events. While both voluntary and involuntary cancellations have a clear revenue impact, it is best to focus a churn model on only one type of churn.
The probability of churn can be predicted using various statistical or machine learning techniques. These methods process historical purchase and behavior data in order to predict the probability of cancellation per customer.
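To make the idea of a per-customer churn probability concrete, here is a small sketch using logistic regression fitted by plain gradient descent. Logistic regression is just one of many possible techniques and is chosen here for brevity; the post does not prescribe a method. The two features (support tickets and weekly logins), the toy data, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, xi)))
            err = p - yi  # gradient of the log loss with respect to the logit
            for j, v in enumerate(xi):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def churn_probability(weights, bias, x):
    """Predicted probability of cancellation for one customer."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

# Toy history: [support tickets in last 90 days, logins per week]; 1 = churned.
X = [[0, 5], [1, 4], [0, 6], [4, 1], [5, 0], [3, 1], [1, 5], [4, 2]]
y = [0, 0, 0, 1, 1, 1, 0, 1]

weights, bias = fit_logistic(X, y)
p_risky = churn_probability(weights, bias, [5, 0])  # many tickets, no logins
p_safe = churn_probability(weights, bias, [0, 6])   # no tickets, frequent logins
print(f"risky customer: {p_risky:.2f}, engaged customer: {p_safe:.2f}")
```

A real project would more likely rely on an established library and a held-out evaluation set; the point here is only that historical behavior goes in and a probability per customer comes out.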
A well-constructed model can inform a wide range of decisions and flow into numerous internal tools or applications. For example, some common use cases for a churn model are:
Measuring feature impacts on the likelihood of churn in order to understand why customers choose to leave, which can inform long-term retention initiatives
Creating churn risk scores that can indicate who is likely to leave, and using that information to drive retention campaigns
Predicting the probability of churn and using it to flag customers for upcoming email campaigns
Integrating outputs with internal apps, such as a customer call center, to provide relevant real-time churn risk information
Discounting strategically with promotion campaigns to customers with a high cancellation risk
And many more…
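Several of the use cases above come down to turning risk scores into a list of customers to act on. A minimal sketch, assuming scores are already available as a dictionary of probabilities; the function name `flag_for_campaign`, the 0.6 threshold, and the optional cap on campaign size are all illustrative choices:

```python
def flag_for_campaign(risk_scores, threshold=0.6, max_customers=None):
    """Return customer ids at or above the risk threshold, highest risk first."""
    flagged = sorted(
        (cid for cid, score in risk_scores.items() if score >= threshold),
        key=lambda cid: risk_scores[cid],
        reverse=True,
    )
    # Optionally cap the list, e.g. when the campaign budget is limited.
    return flagged if max_customers is None else flagged[:max_customers]

# Made-up scores, for illustration only.
risk_scores = {"c1": 0.82, "c2": 0.15, "c3": 0.64, "c4": 0.91, "c5": 0.40}

print(flag_for_campaign(risk_scores))                   # ['c4', 'c1', 'c3']
print(flag_for_campaign(risk_scores, max_customers=2))  # ['c4', 'c1']
```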
So, where do we begin when creating and using a churn model? Building a successful model happens in several broad stages, from concept to deployment:
Understand your use case
Establishing a clear use case for a model is always the first and most important step. This process will not only determine who will use the model output and how, but it also dictates the data scientists’ choice of modeling method.
Identify users and stakeholders from each team
Identify stakeholders within your organization who will touch the churn model output. Consider this simple example: A customer service representative would like to see whether it is reasonable to offer a promotional price to a customer currently on a call. One way to do this is to have your data scientists train a churn model and give it to the engineering team to deploy. Once the outputs — in this case, churn risk scores — are integrated into the call center software, the customer call center representatives can use this information to make informed decisions about discounts. Keep in mind: This process will be a lot easier if you gather feedback from the involved parties early on to inform the model-building process.
Identify key metrics for optimization
Think about the scope and the metric being optimized. For instance, if the costs associated with your retention campaigns are high, then your model should be focused on reducing the number of false-positive hits (i.e., minimizing the number of customers who are unlikely to churn but are enrolled in your campaign anyway). Identifying the right metric will help to measure the model’s impact and corresponding return on investment.
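The trade-off around false positives can be illustrated with a short calculation. The sketch below assumes you have predicted scores and known outcomes for a past period, and that every targeted customer costs a fixed amount; the scores, outcomes, cost figure, and the name `campaign_metrics` are all made up for illustration.

```python
def campaign_metrics(scores, churned, threshold, cost_per_offer):
    """Summarize who would be targeted at a given score threshold.

    scores: predicted churn probabilities; churned: 1 if the customer
    actually churned, else 0 (same order as scores).
    """
    targeted = [c for s, c in zip(scores, churned) if s >= threshold]
    true_pos = sum(targeted)
    false_pos = len(targeted) - true_pos  # targeted, but would not have churned
    precision = true_pos / len(targeted) if targeted else 0.0
    return {
        "targeted": len(targeted),
        "false_positives": false_pos,
        "precision": precision,
        "wasted_spend": false_pos * cost_per_offer,
    }

# Made-up validation data, for illustration only.
scores = [0.95, 0.85, 0.70, 0.65, 0.55, 0.45, 0.30, 0.10]
churned = [1, 1, 0, 1, 0, 0, 1, 0]

loose = campaign_metrics(scores, churned, threshold=0.5, cost_per_offer=20)
strict = campaign_metrics(scores, churned, threshold=0.8, cost_per_offer=20)
print(loose)
print(strict)
```

Raising the threshold removes the false positives in this toy data, but it also misses more of the customers who actually churned, so the right setting depends on how campaign cost compares with the value of a retained customer.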
Finally, take action! Execute on the initial goal and start using your model output.
In our upcoming posts, we will dive deeper into the world of churn modeling, including the difficulties most often encountered by a modeler, an overview of common churn modeling techniques, and more.