I've had several conversations recently with people I know in the data science space that start out about business and then drift to the state of data science as a whole. One theme constantly comes up in these conversations: There are a lot of people currently running data science teams at large organizations, and the vast majority of them (I believe we are talking about 80-90%) want to leave their jobs. Why is that? Within smaller organizations, the share isn’t as high. So what is going on in larger organizations that is causing such a mass exodus? Having worked in and with many large organizations in a data science leadership capacity, I have a few theories.
Academia Can't Do For-Profit
Many large companies have fallen into the trap of believing that you need a PhD to do data science. You don’t. Of the top five data scientists I have ever worked with, only one had a PhD, and it wasn’t even in stats or data science; it was in biophysics. I call this the "academia trap." There are some smart people who know a lot about a very narrow field, but data science is a very broad discipline. When these PhDs are put in charge, they often discover they are out of their depth. They were never taught how to run a P&L, manage a team, deal with people, gather competitive intel, assess a market, or build a business case.
Add to this the world they came from: Many really good papers submitted for peer review in the academic world never see the light of day. Why? The reviewers may have a competing theory and don’t want their ideas to get superseded. It is shocking how often this happens in the data science space. I have always found the academic world to be more political than the corporate one. When your drive is profits and customer satisfaction, the academic mindset can be more of a liability than an asset. I have yet to see an academic data science program I would personally endorse. In my opinion, most are run by people who have never done the job of data science outside of a lab. That’s not what you want for your company.
If you’re not willing to give data science managers the training and support they need, or to create an organizational structure that brings people with business backgrounds into data science strategy, your budget will grow and your results will drop. I’ve seen leadership teams spend more than $10 million a year with very little return. I have never spent that kind of money, yet I have built billion-dollar data products. I’m also not a PhD. The difference is, I know business. PhDs do great work, but the type of work that they excel in isn’t what most companies are asking them to do in this new age of enterprise data science. Don’t make this mistake.
Doing data science and managing data science are not the same, just as being an engineer and being a product manager are not the same. There is a lot of overlap, but overlap does not equal sameness. Sometimes, I envy data scientists: much of their time is spent cleaning data sets, testing algorithms, and researching new methods. Compared to the job of someone who needs to run the practice, a data scientist has it pretty good.
The leader of a data science practice needs to focus on data governance, master data management (MDM), compliance, legal issues around the use of algorithms, and documentation in case someone sues for wrongful use. There are hiring and staffing problems to deal with, budgets and funding to secure, a P&L to run, business cases to build, market research to conduct, vendor meetings to hold, technology life cycles to manage, and projects to evangelize (both internally and externally). All of that work has to be turned into data products that sell, while trying to ensure profit for the company. Big difference.
Most data scientists are just not ready to lead teams. This is why the failure rate of data science teams is so high right now. Often, companies put a strong technical person in charge when they really need a strong business person in charge. I call this person a data strategist. Right under that data strategist is the strong technical person; there needs to be a very solid and strong interpersonal relationship between the two. If they are competing, it will cause friction and less-than-desired results.
Agile has taken the tech world by storm. It works fairly well for software development and, as a result, many companies impose it on data science. But data science is not software development. It’s really a field of discovery, whereas software development is about assembly. I have worked with companies that demand agile and scrum for data science and then see half their team walk in less than a year. You can’t tell a team to solve a problem in two sprints. If they don’t have the data or tools, it won’t happen.
Data science is a discipline that requires its own methods. In addition, most companies still treat data products like physical products, but the economics are not the same. When I build a recommendation engine, my cost per additional unit is pretty much zero, unlike a physical product, which certainly has a per-unit cost. I can make a million product recommendations with that engine or just one and, other than the electricity, my total cost would be about the same. For a physical product, going from one unit to a million would cost a lot more and involve many different variables.
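To make the recommendation-engine comparison concrete, here is a minimal sketch of the arithmetic in Python. It assumes a simple linear cost model (a one-time fixed cost plus a constant cost per extra unit). The `CostModel` class and every dollar figure are hypothetical, chosen only for illustration, and the data product's marginal cost is set to exactly zero, which ignores the electricity mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Hypothetical linear cost model: total = fixed + marginal * units."""
    name: str
    fixed_cost: float     # one-time cost to build the product (made-up figure)
    marginal_cost: float  # cost of producing one more unit (made-up figure)

    def total_cost(self, units: int) -> float:
        return self.fixed_cost + self.marginal_cost * units

    def average_cost(self, units: int) -> float:
        if units <= 0:
            raise ValueError("units must be positive")
        return self.total_cost(units) / units


# Illustrative numbers only; they are not taken from any real product.
# The marginal cost of 0.0 is a simplification that ignores electricity.
recommender = CostModel("recommendation engine", fixed_cost=500_000.0, marginal_cost=0.0)
widget = CostModel("physical widget", fixed_cost=500_000.0, marginal_cost=12.0)

for model in (recommender, widget):
    for units in (1, 1_000_000):
        print(
            f"{model.name}: {units:>9,} units -> "
            f"total ${model.total_cost(units):,.2f}, "
            f"average ${model.average_cost(units):,.4f} per unit"
        )
```

Under these assumptions the data product's total cost does not move as volume grows, so its average cost per recommendation shrinks toward zero, while the physical product's total cost scales with every unit made. Real products have more cost variables than this, but the asymmetry is the point.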
We have to understand that the economics of data products are different. A lot of large companies don’t even have this conversation, which can cause a lot of frustration for those in charge of data products. In essence, they have one hand tied behind their back.