Businesses are increasingly interested in how big data, artificial intelligence, machine learning, and predictive analytics can be used to increase revenue, lower costs, and improve their business processes. In this blog post, we describe how we’ve developed a data-driven machine learning method to optimize the collections process for a debt collection agency.

According to the Federal Reserve Bank of New York, over $600 billion of household debt in the U.S. was delinquent as of June 30, 2017. Of this amount, $400 billion had been delinquent for more than 90 days. This poses a significant problem not only for consumers, as the debt often accumulates interest, but also for the companies that own the debt, as it cuts significantly into their revenue. For these companies, collecting as much of the debt as possible will significantly improve their bottom line.

### Debt Collection Process

The collection process usually follows a predefined schedule of letters, emails, and phone calls that communicate the need to repay the debt with increasing urgency over time. Ultimately, if the debtor refuses to repay the debt, then legal action can be taken by the collection agency to force repayment. Legal action is expensive and often outside of the collection agency’s control, so it is viewed only as a last resort and avoided as much as possible. Contrary to popular belief, debt collectors generally prefer to work with debtors toward repayment by offering interest-free extensions, repayment plans, or in some cases waiving parts of the debt if the debtor is truly unable to repay. However, this is only possible if the debtor is cooperative and responds to the collectors’ communication attempts (e.g., answers the phone or replies to email).

Letters and emails are mostly automated, but phone calls still require human collectors to physically dial a number and have a conversation with the debtor. This is integral to the collection process because debt collection is highly emotional, and an experienced collector is able to decipher the needs and problems of the debtor and determine the best course of action to maximize the likelihood of repayment.

However, debt collection agencies generally have a large number of open cases, and the number of phone calls that they can make is limited by human resources. Under these constraints, it becomes infeasible to call every debtor, and a method for selecting which debtors to call becomes necessary. Not calling a debtor who needs human persuasion results in further delinquency and greater risk of non-repayment, but calling a debtor who doesn’t need additional persuasion results in wasted effort. Our goal is to identify the conditions under which phone calls are most effective in eliciting eventual repayment, and to create an optimal schedule of calls to each debtor while abiding by the capacity constraints faced by the collector.

### Optimizing Calling Schedules: A Well-Known but Infeasible Solution

A common approach to such a problem is to build a mathematical model of the collection process and then calibrate its parameters using historical collection data. Such a model often takes the form of a Markov decision process (MDP), which takes all presently available information encapsulated within a state space for the complete debtor portfolio, and computes the long-run value of calling each debtor for each realization of the state space. It works by starting from the end state for each debtor (e.g. complete repayment or writing off the debt) and then traversing backwards through all actions and events that occurred to attribute the value of this result to everything that had happened.
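The backward traversal described above can be illustrated on a deliberately tiny example. The sketch below is a toy, single-debtor, finite-horizon model, not the model discussed in this post: it assumes just one "debt still open" state, two actions, and made-up repayment probabilities and call cost. It only shows how the value of the end state is propagated back to earlier days.

```python
def backward_induction(horizon, debt, p_repay, call_cost):
    """Toy finite-horizon MDP for one debtor whose debt is still open.

    p_repay maps each action ("wait" or "call") to a hypothetical
    probability that the debtor repays in full on that day.
    Returns a list of (day, best_action, value) from day 0 onwards.
    """
    value_next = 0.0  # end state: unpaid debt is written off, worth nothing
    plan = []
    for day in reversed(range(horizon)):
        best_action, best_value = None, float("-inf")
        for action, p in p_repay.items():
            cost = call_cost if action == "call" else 0.0
            # Either the debtor repays today, or we carry on to tomorrow.
            value = -cost + p * debt + (1 - p) * value_next
            if value > best_value:
                best_action, best_value = action, value
        plan.append((day, best_action, best_value))
        value_next = best_value
    return list(reversed(plan))


# All numbers below are illustrative assumptions.
plan = backward_induction(
    horizon=3, debt=100.0, p_repay={"wait": 0.1, "call": 0.2}, call_cost=5.0
)
for day, action, value in plan:
    print(day, action, round(value, 2))
```

Even this toy has to evaluate every action in every state on every day, which hints at why the full problem, with many debtors and many state variables, becomes hard.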

Although this approach is theoretically sound, two major problems prevent it from being feasible in practice. First, the collector doesn’t know how many future cases to expect and what kind of information will be provided with those cases. This means that calling schedules optimized for the current portfolio of debtors today could be disrupted by a large group of newly arriving debtors a week from now. Therefore, what’s necessary is an adaptive solution that is always flexible to fluctuations in the debtor portfolio and calling capacity. The second and more challenging problem is that solving an MDP suffers from the curse of dimensionality.

The number of states, and with it the computation time of solving an MDP, grows exponentially with the number of variables that we use to encapsulate the presently available information, and there is a relatively large number and variety of possible events that can happen throughout the collection process. We compiled a set of 25 variables that could potentially affect the likelihood of debt repayment (e.g., amount of debt outstanding, whether the debtor has promised to repay, and the number of days since the last phone call to this debtor). The combination of unknown future debtor arrivals and intractable computation suggests that a different solution should be considered.
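To get a feel for the scale, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that each of the 25 variables is discretized into a small number of levels; the real variables may well be continuous or have many more values, which only makes things worse.

```python
def portfolio_state_count(levels_per_variable, n_variables, n_debtors=1):
    """Number of distinct states if every variable takes the same number of levels."""
    per_debtor = levels_per_variable ** n_variables
    # The portfolio state is the joint state of all debtors.
    return per_debtor ** n_debtors


# Even with only two levels per variable, one debtor already has 2**25 states.
print(portfolio_state_count(levels_per_variable=2, n_variables=25))  # 33554432
# The joint state of just three such debtors is the cube of that number.
print(portfolio_state_count(levels_per_variable=2, n_variables=25, n_debtors=3))
```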

### Predicting Call Values: A Simplified but Feasible Solution

We decided to simplify our approach. Instead of deriving optimal calling schedules for each debtor, we focused on ensuring that all phone calls made on a given day add the most value in terms of collected debt. This means that all possible phone calls should be ranked by value, and the collectors should call debtors from highest to lowest value until capacity for the day is exhausted. In order to compute the value of phone calls, it is still necessary to consider the actions and events that lead to eventual repayment. The curse of dimensionality therefore still exists, since the state space remains unchanged. This is where machine learning comes into play!
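The ranking rule itself is simple once call values are available. A minimal sketch, assuming the values have already been estimated and are stored in a dictionary keyed by a hypothetical debtor id:

```python
def select_calls(call_values, capacity):
    """Return the debtor ids to call today, highest estimated value first."""
    ranked = sorted(call_values, key=call_values.get, reverse=True)
    return ranked[:capacity]


# Illustrative values only; the ids and numbers are made up.
todays_values = {"A": 12.0, "B": 40.0, "C": -3.0, "D": 25.0}
print(select_calls(todays_values, capacity=2))  # ['B', 'D']
```

Because the list is recomputed every day, newly arrived debtors and changes in calling capacity are picked up automatically.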

Large-scale MDPs are common in practice and many techniques have been developed to find approximate solutions rather than forcing an intractable problem. In our modified approach, instead of considering all possible future actions and events leading to eventual repayment, we use machine learning to directly predict the eventual collection outcome from any point in time. This then allows us to estimate the value of calling a debtor at any time by calculating the difference in expected eventual repayment with and without calling the debtor.

### Machine Learning Framework

To build the machine learning model, we started with a dataset of 80,000 debtors of a single insurance company between 2014 and 2016. The data initially consisted of some basic information regarding the case and a log of interaction history between the debtor and the collector. We processed the raw data into a tabular format where each debtor is a row containing the 25 variables that define the state of a debtor, and then labeled the outcome as yes if the debtor repaid in full and no otherwise, so it’s a binary classification problem.
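As an illustration of this preprocessing step, the sketch below turns an interaction log into one row per debtor. The log format, the event names, and the four features are assumptions made for this example; they mirror a few of the variables mentioned in this post but are not the actual 25-variable schema.

```python
from collections import defaultdict


def build_state_rows(events, as_of_day):
    """Summarise each debtor's log up to as_of_day into one feature row.

    events: iterable of (debtor_id, day, kind) tuples, where kind is one of
    "letter", "call_answered", "call_unanswered" or "promise" (hypothetical).
    """
    rows = defaultdict(lambda: {
        "n_interactions": 0,
        "days_since_last_call": None,  # None means never called so far
        "has_answered_call": False,
        "has_promised": False,
    })
    # Process events chronologically so the latest call wins.
    for debtor_id, day, kind in sorted(events, key=lambda e: e[1]):
        if day > as_of_day:
            continue  # the state must only use information known at as_of_day
        row = rows[debtor_id]
        row["n_interactions"] += 1
        if kind in ("call_answered", "call_unanswered"):
            row["days_since_last_call"] = as_of_day - day
        if kind == "call_answered":
            row["has_answered_call"] = True
        if kind == "promise":
            row["has_promised"] = True
    return dict(rows)


log = [
    ("d1", 1, "letter"), ("d1", 5, "call_unanswered"),
    ("d1", 9, "call_answered"), ("d1", 12, "promise"),
    ("d2", 3, "letter"), ("d2", 30, "call_answered"),
]
print(build_state_rows(log, as_of_day=14))
```

Filtering on `as_of_day` matters: a row describing day 14 must not leak events that happened later, otherwise the model would be trained on information it cannot have at prediction time.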

The machine learning algorithm that we used is LightGBM, which works extremely well in practice and is often considered, together with XGBoost, among the best algorithms for the predictive analytics competitions hosted on Kaggle. Furthermore, it is easy to use and doesn’t require complex feature engineering to achieve good performance. By splitting the dataset into training and validation sets, we’re able to see that, although difficult, it is possible to predict debtors’ repayment likelihoods. The figure below plots the Receiver Operating Characteristic (ROC) curve of the eventual repayment predictions for debtors who are 25 days into the collection process. Since the area under the ROC curve (AUC) is only 0.6385, prediction performance isn’t great. However, it still shows that eventual repayment is predictable to some degree, and hence there is still potential in this approach.
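For readers less familiar with the metric, the area under the ROC curve equals the probability that a randomly chosen debtor who repaid receives a higher score than a randomly chosen debtor who did not. The sketch below computes it directly from that definition in plain Python. It is meant for intuition only; it is quadratic in the number of cases and is not the evaluation code used for the results above.

```python
def roc_auc(labels, scores):
    """AUC as the share of (repaid, not repaid) pairs ranked correctly.

    labels: 1 if the debtor repaid in full, 0 otherwise.
    scores: predicted repayment likelihoods. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and scores for four debtors.
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.3]))  # 0.75
```

Read this way, a value of about 0.64 means the model ranks a repaying debtor above a non-repaying one roughly 64% of the time, compared with 50% for random guessing.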

We can then estimate the value of calling the debtor by predicting the change in repayment likelihood with and without making an additional phone call from their current state. The optimal decision is then to call the debtors where the value of calling is the highest for as much as capacity allows.
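A minimal sketch of this with-and-without comparison follows. It assumes a hypothetical `predict_repay_prob` function standing in for the trained classifier, and it assumes that "making an additional call" can be represented by incrementing a call counter and resetting the days since the last call; how the post-call state is actually constructed is not specified here.

```python
def call_value(state, predict_repay_prob):
    """Estimated change in repayment likelihood from one extra call today."""
    # Hypothetical post-call state: one more call, made today.
    with_call = dict(state, n_calls=state["n_calls"] + 1, days_since_last_call=0)
    return predict_repay_prob(with_call) - predict_repay_prob(state)


def toy_predictor(state):
    """Stand-in for a trained model; the coefficients are invented."""
    p = (0.5
         + 0.05 * min(state["n_calls"], 3)
         - 0.01 * min(state["days_since_last_call"], 10))
    return max(0.0, min(1.0, p))


recent = {"n_calls": 1, "days_since_last_call": 1}
stale = {"n_calls": 1, "days_since_last_call": 6}
print(round(call_value(recent, toy_predictor), 2))  # 0.06
print(round(call_value(stale, toy_predictor), 2))   # 0.11
```

Multiplying this difference by the outstanding amount would express the value in terms of expected collected debt rather than likelihood.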

Contrary to a common belief, machine learning is not a black box, and it’s always possible to analyze the predictions with respect to the feature values used to make them. We performed an analysis of estimated phone call effects and found that a number of features can be linked to better calling efficiency. First, the value of calling increases as time passes since the previous interaction with the debtor. This suggests that it’s better to wait a few days before calling the debtor again. Second, calls perform better when there have been more previous interactions between the debtor and collector. Because the collection process remained largely unchanged within our historical data sample, we believe that the number of debtor-collector interactions is actually a proxy for the length of time the debtor has been in the collection process. This suggests that there is a good chance that new debtors are going to repay their debt early on in the process without needing to be called, so it’s better to wait and see before calling them. Finally, calls tend to perform better when the debtor has previously answered a call. Note that these insights overlap in some aspects and contradict each other in others. Our analysis is meant to help us better understand the model outcomes rather than to draw definitive conclusions about the optimal debt collection process.
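One simple way to carry out this kind of analysis is to average the estimated call values within each level of a feature. The sketch below assumes each case is a dictionary holding its feature values and an already computed `call_value`; the field names and numbers are hypothetical, and this is only one of several ways such an analysis could be done.

```python
from collections import defaultdict


def mean_value_by_feature(rows, feature, value_key="call_value"):
    """Average estimated call value for each distinct value of a feature."""
    totals = defaultdict(lambda: [0.0, 0])  # feature value -> [sum, count]
    for row in rows:
        bucket = totals[row[feature]]
        bucket[0] += row[value_key]
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sorted(totals.items())}


# Made-up cases for illustration.
cases = [
    {"has_answered_call": False, "call_value": 0.01},
    {"has_answered_call": False, "call_value": 0.03},
    {"has_answered_call": True, "call_value": 0.06},
    {"has_answered_call": True, "call_value": 0.08},
]
print(mean_value_by_feature(cases, "has_answered_call"))
```

Averages like these describe associations in the model's estimates, not causal effects, which is one reason the insights above should be read with care.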

### Implementation and Controlled Field Experiment

The best way to validate any decision policy is to test it in practice. We implemented our solution with the industry partner that initially provided us with the data, a Dutch collection agency that handles over 250,000 collection cases annually, totaling €120 million of principal. Our implementation takes the form of a simple Python script (run time of ~15 mins) that reads in the raw data in CSV format and outputs a ranked list of debtors to call that day.

To test our policy, we conducted a live experiment that randomly assigned newly arriving debtors to two groups. The first group (466 cases), which we’ll call IP for Incumbent Policy, was the control group and followed the existing policy used by the collection agency. The second group (455 cases), which we’ll call GOCP for GBDT Optimized Collection Policy, was treated by the data-driven policy proposed by our algorithm. At the start of each day, all of the cases meeting the rules specified by IP and the top 20% of the cases as ranked by GOCP were put into a central pool of outstanding cases for collectors to call that day. The collectors were unaware of the experiment and performed their duties without knowing that the cases came from two different groups. Finally, the cases were tracked for a minimum of 60 days and a number of performance indicators were calculated at the end of the experiment (see table below).
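The daily mechanics of the experiment can be sketched as follows. This is an illustration under stated assumptions, not the script used in the field: the 50/50 assignment probability, the rounding up of the 20% cut-off, and the case ids are all choices made for this example.

```python
import math
import random


def assign_group(rng):
    """Randomly assign a newly arriving debtor to IP or GOCP (assumed 50/50)."""
    return "GOCP" if rng.random() < 0.5 else "IP"


def daily_call_pool(ip_cases_due, gocp_values, top_percent=20):
    """Combine IP cases due today with the top-ranked share of GOCP cases.

    ip_cases_due: ids of IP cases that meet the incumbent rules today.
    gocp_values:  dict of GOCP case id -> estimated value of calling today.
    """
    # Rounding up is an assumption; the source only says "top 20%".
    n_top = math.ceil(len(gocp_values) * top_percent / 100)
    gocp_top = sorted(gocp_values, key=gocp_values.get, reverse=True)[:n_top]
    # One central pool, so collectors cannot tell the groups apart.
    return list(ip_cases_due) + gocp_top


rng = random.Random(0)  # seeded only to make the example reproducible
print([assign_group(rng) for _ in range(5)])
print(daily_call_pool(
    ["ip-1", "ip-2"],
    {"g-1": 0.02, "g-2": 0.11, "g-3": 0.05, "g-4": 0.08, "g-5": 0.01},
))
```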

We can see that GOCP was able to improve collection performance. It fully collected a greater percentage of the cases (62.6% vs 59.0%), collected a greater percentage of the total debt value (65.2% vs 57.2%), and achieved earlier repayment for the cases that repaid fully (20.3 days vs 22.2 days). More importantly, GOCP greatly reduced calling effort: the number of calls decreased by 21.5% (1,064 vs 1,355).
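The reduction in calls can be checked with a few lines of arithmetic. The per-case figure at the end is our own derived normalization using the group sizes stated above (466 IP cases and 455 GOCP cases); it is not one of the reported performance indicators.

```python
def relative_reduction(baseline, treated):
    """Fractional decrease of treated relative to baseline."""
    return (baseline - treated) / baseline


ip_calls, gocp_calls = 1355, 1064
ip_cases, gocp_cases = 466, 455

# Reduction in the total number of calls, as reported.
print(round(100 * relative_reduction(ip_calls, gocp_calls), 1))  # 21.5

# Reduction in calls per case, which accounts for the slightly smaller GOCP group.
per_case = relative_reduction(ip_calls / ip_cases, gocp_calls / gocp_cases)
print(round(100 * per_case, 1))  # 19.6
```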

When we analyze the calling behavior of the two policies, it is clear that GOCP waits longer than IP to call debtors, and also allocates a greater percentage of calls to debtors that end up not repaying the debt. From this, we can infer three rules that seem to improve collection performance: 1) there’s lots of room to reduce calling efforts, 2) give debtors more time before calling them, 3) keep reaching out to difficult debtors to hopefully work out a solution.