There's a lot of chatter and excitement around deep learning, neural networks, A.I., and all the tools that promise to shed light on deeply hidden insights. You might think that your organization is ready to scale data science and build a culture where data drives all business decisions. But scaling data science is a complex, constantly evolving goal that requires a multi-pronged approach. It necessitates conversations with business stakeholders, IT teams, and data scientists around the current and future state of data, business needs and goals, and costs and benefits.
Here are four important questions to answer before you start the difficult task of scaling data science.
1. What does data science mean to you?
This is an obvious but important question to answer. Most organizations are still in the infancy of their analytic growth and far from building machine learning algorithms. Perhaps you're in a similar situation and simply looking to create new reports and automate existing ones. Or, perhaps you want to reconcile different data sources and unify the current KPIs and report structure. You may also want more custom analytics like customer segmentation, churn and loyalty models, or attribution reports. Or, maybe you're in the market for a deep learning model that predicts customers' propensity to buy your product.
Wherever you are on the spectrum, there's an immense gap between streamlining reports and building neural networks. Each requires different resources and has vastly different implications for your organization, so be realistic and practical in how you define data science.
2. How will data science benefit your organization?
This question naturally follows the first, and the answer can determine the path your organization takes to scale data science. If your data science team is small, you will need to know the trade-off between time-intensive custom projects and more routine analytics. Can the team easily scale, modify, and apply one-time projects to other business scenarios? Do the long-term benefits of time-intensive custom analyses justify the upfront investment? What is the risk of not scaling data science? Can you achieve more with what you already have? Answering these questions gives you a better idea of how data science should be incorporated into your business.
3. Do you have the right infrastructure?
A critical third step is to assess the state of your data. In all likelihood, your organization's databases have evolved to meet changing business needs and there are multiple legacy systems. Documentation may be scarce, inconsistent, or inaccurate, and what exists may be concentrated within certain user groups. In many organizations, the analysts also act as data engineers and spend a lot of time wrangling and reconciling multiple data sources.
The first thing that you need to do is understand the nuances of your data: where the data is housed, the rules that govern the extract-transform-load (ETL) process, compatibility between various internal systems and newer software, automation capabilities, and documentation around processes and databases. Then, ask yourself what’s needed to create an infrastructure that fosters data science. It can be anything from a data environment adaptable to changing business needs to protocols for integrating new data streams and software. Or, it could be the creation of separate production and test environments that allow the data scientists to build and test models before moving them to production and iterating in the future. Your organization’s data science needs will dictate the database infrastructure as well.
4. Do you have the right people?
It’s common to think you just need to hire more data scientists if your current ones aren’t delivering reports and projects quickly enough. But the shortfall is rarely with the analytics team. A lot of skilled statisticians and data scientists find themselves building dashboards and cleaning data. They rarely build complex models or algorithms because they're too busy with database maintenance or dashboard creation.
As you work through the first three questions, you might realize that you already have the right data scientists; you just need to find ways to support them. The goal is for data scientists to focus their energies on projects that add the most value to the business rather than on regular day-to-day tasks. Instead of more data scientists, you may actually need to hire more business analysts, IT support, and data engineers.
Scaling data science is a complex, evolving concept unique to your organization. You can succeed in creating a data-driven decision-making culture only when you build from the ground up and clearly define your data goals from the start. In the end, you may find that scaling data science is less about doing more, and more about doing things right.