Consumers who are likely to switch mobile phones in the near term are a desirable advertising audience in the competitive telecommunications and mobile device market. Telecoms and mobile device manufacturers know their own customers well but often have little visibility into their competitors’ customers. As a result, their churn predictions cover only existing customers and cannot identify prospects for advertising campaigns. To meet this market need, the data science team at Oracle Data Cloud developed a solution that identifies people who are likely to switch phones in the near future.

The Setup

On the surface, the machine learning setup seems simple. Gather raw data (demographic and purchase data) from the past year for a feature set, define a dependent variable of people who recently switched devices, fit a model, and use that model to identify people most likely to switch phones. Easy enough, right?

If we take a step back and look at the business problem, however, predicting who is likely to switch phones requires a forward-looking model. The setup above does not solve our business issue. In fact, it does the opposite: it describes people who have already switched rather than people who are about to. We therefore revised the setup to the following:

  • A feature set containing demographic and retail purchase data within a year prior to a given date
  • A dependent variable containing people who switched phones within three months after a given date

Now our setup aligns previous behavior with a future switching event and meets the business need. 
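
The time alignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's actual pipeline: the function name `build_example`, the purchase and switch records, and the 90-day approximation of "three months" are all hypothetical.

```python
from datetime import date, timedelta

# Assumed window lengths: one year of history, roughly three months of outcome.
FEATURE_WINDOW = timedelta(days=365)
LABEL_WINDOW = timedelta(days=90)


def build_example(purchases, switch_dates, as_of):
    """Return (features, label) for one person as of a reference date.

    purchases: list of (date, amount) tuples (hypothetical record format)
    switch_dates: list of dates on which the person switched phones
    """
    # Features only see events in [as_of - 1 year, as_of).
    in_window = [amt for d, amt in purchases
                 if as_of - FEATURE_WINDOW <= d < as_of]
    features = {
        "purchase_count": len(in_window),
        "purchase_total": sum(in_window),
    }
    # The label only sees events in [as_of, as_of + 90 days).
    label = int(any(as_of <= d < as_of + LABEL_WINDOW for d in switch_dates))
    return features, label


# Made-up sample records for one person.
as_of = date(2017, 1, 1)
purchases = [
    (date(2016, 3, 15), 40.0),
    (date(2016, 11, 2), 25.0),
    (date(2017, 2, 1), 99.0),   # after as_of, so it must not leak into features
]
switches = [date(2017, 2, 10)]
features, label = build_example(purchases, switches, as_of)
```

Keeping the feature window strictly before the reference date and the label window strictly on or after it is what prevents future information from leaking into the features.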

Additional Data

The demographic and retail purchase data described above help identify people likely to switch phones, but are there more relevant data that we can include to better predict the outcome? Theory suggests that the longer a person has had a phone, the more likely they are to get a new one in the near future. What about device type? Does owning an Android or an Apple device influence when someone switches phones?

To find out, we created features for the following:

  • Length of current device ownership as of a given date
  • Manufacturer of current device as of a given date
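
As a rough sketch, both features could be derived from a per-person device history. The record format, the function name `device_features`, and the sample data below are assumptions made for illustration; the original data schema is not described here.

```python
from datetime import date


def device_features(device_history, as_of):
    """Compute ownership length and manufacturer of the current device.

    device_history: list of (activation_date, manufacturer) tuples
    (a hypothetical format chosen for this sketch).
    """
    # Ignore devices activated after the reference date to avoid leakage.
    known = [(d, m) for d, m in device_history if d <= as_of]
    if not known:
        return {"tenure_days": None, "manufacturer": None}
    # The current device is the most recently activated one as of the date.
    activated, manufacturer = max(known, key=lambda rec: rec[0])
    return {"tenure_days": (as_of - activated).days,
            "manufacturer": manufacturer}


# Made-up history: the 2017 activation happens after the reference date.
history = [
    (date(2014, 9, 20), "Apple"),
    (date(2016, 10, 1), "Samsung"),
    (date(2017, 3, 5), "Apple"),
]
feats = device_features(history, date(2017, 1, 1))
```

Computing both values relative to the same reference date keeps these features aligned with the demographic and purchase features.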

Now our features are more relevant in addition to being time aligned. Compared with the model that included only demographic and retail data, adding mobile device ownership yielded a 16% increase in predictive power, as measured by the area under the receiver operating characteristic curve (AUC).
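
For readers unfamiliar with the metric, the following sketch computes AUC from its pairwise definition and expresses the change between two models as a relative lift. The labels and scores are toy values, not the project's data, and reading the reported increase as a relative change is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random
    negative, with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def relative_lift(baseline_auc, new_auc):
    """Relative change in AUC, e.g. 0.16 for a 16% increase."""
    return (new_auc - baseline_auc) / baseline_auc


# Toy labels (1 = switched) and scores from two hypothetical models.
labels = [1, 0, 1, 0, 1, 0]
baseline_scores = [0.6, 0.5, 0.4, 0.7, 0.8, 0.3]
enhanced_scores = [0.9, 0.2, 0.7, 0.6, 0.8, 0.1]

baseline_auc = roc_auc(labels, baseline_scores)
enhanced_auc = roc_auc(labels, enhanced_scores)
lift = relative_lift(baseline_auc, enhanced_auc)
```

The pairwise loop is quadratic in the number of rows, which is fine for a demonstration; a library implementation would be preferable on real data.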

Feature Importance

After performing our standard quality control checks, we wanted to understand the most important features in our model. There is significant business value in understanding feature importance. Insights derived from feature importance enable better communication with the client and relate the solution to their problem. Delivering insights is especially important when promoting data science products. Unlike a new car that can be taken for a test drive, data science products are not tangible and cannot be evaluated in the same way. We looked at the feature importance from our model and observed the following:

  • The longer a person has a device, the more likely the person is to switch in the next three months.
  • Device type influences the type of device the person switches to. For example, Apple owners are more likely to stick with Apple than switch to other brands.
  • Households with higher income levels purchase new phones more often.
  • Gen Xers are more likely to switch phones than Baby Boomers.
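
The article does not say how feature importance was computed. One common, model-agnostic option is permutation importance: shuffle one feature and measure how much the AUC drops. The sketch below uses that swapped-in technique with a hypothetical stand-in model (`toy_score`) and made-up rows, purely for illustration.

```python
import random


def roc_auc(labels, scores):
    """Pairwise AUC; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(score_fn, rows, labels, feature, repeats=20, seed=0):
    """Mean drop in AUC after shuffling one feature across rows."""
    rng = random.Random(seed)
    base = roc_auc(labels, [score_fn(r) for r in rows])
    drops = []
    for _ in range(repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        # Copy each row with only the chosen feature replaced.
        permuted = [dict(r, **{feature: v}) for r, v in zip(rows, shuffled)]
        drops.append(base - roc_auc(labels, [score_fn(r) for r in permuted]))
    return sum(drops) / repeats


def toy_score(row):
    """Hypothetical stand-in model: longer ownership gives a higher score."""
    return row["tenure_days"] / 1000.0


# Made-up rows; "noise" is a feature the stand-in model ignores.
rows = [{"tenure_days": t, "noise": n}
        for t, n in [(900, 3), (100, 7), (700, 1), (200, 9), (800, 4), (300, 2)]]
labels = [1, 0, 1, 0, 1, 0]

tenure_importance = permutation_importance(toy_score, rows, labels, "tenure_days")
noise_importance = permutation_importance(toy_score, rows, labels, "noise")
```

A feature the model relies on shows a positive drop, while a feature it ignores shows none. Note that importance scores of this kind rank features but do not give the direction of an effect, so statements such as "longer ownership means a higher likelihood of switching" require additional analysis.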

By making the enhancements described above, we have expanded our understanding of the model, provided transparency, and created insights for our sales teams and clients. Data scientists who arm client-facing teams with such insights enable a more effective sales pitch and build trust with clients. These insights supplement technical conversations and provide assurance that the product solves the client’s business issue.

Conclusion

Data science products provide powerful solutions for our clients, but we can’t allow the lack of a tangible product to hinder the delivery of those solutions. When using a technique such as machine learning, it’s important to provide transparency and relate the solution to the problem at hand. Once we as data scientists start to peel back the curtain surrounding our products, we will deliver even more impact.

 

Author: Nate Klyn

Nate Klyn is a Senior Data Scientist at Oracle Data Cloud. Over the past four years, he has worked in client-facing and technical roles in the digital marketing ecosystem. Currently, he is focused on creating data science products that provide value to Oracle Data Cloud's customers. Prior to his working career, Nate earned an MA in Economics at Miami University in Ohio and an Economics degree from the University of Northern Iowa.