In 2017, machine learning engineers and data scientists snagged the top two spots in LinkedIn’s fastest-growing jobs list. This was due, in part, to the steadily increasing use of complex technology like artificial intelligence in business applications, a trend that isn’t confined to tech and IT companies. According to another recent study, industrial firms are now responsible for 37% of data scientist hires, with financial and healthcare companies following close behind.
It’s clear that most companies are far past the “will they or won’t they” stage of data science adoption. In 2018, teams will be laying the groundwork for complex, collaborative projects that make use of deep learning or machine learning to streamline operations and improve customer experiences. Below are five big changes that are likely to accompany this shift.
The world’s most successful companies will have chief data officers who report to the CEO.
The role of the chief data officer (CDO) is now essential to building a company-wide strategy for managing, leveraging, and securing the 2.5 quintillion bytes of data being produced worldwide each day. In fact, Gartner estimates that 90% of large companies will have a CDO in a year’s time — with most of them learning on the job, according to the research firm.
Driving the adoption of this relatively new role is an unprecedented number of requests for data access from business stakeholders. Previously, department heads relied on aggregate data in static reports; now, they want to identify new opportunities in data that may have gone unnoticed in the past. According to Experian Data Quality, 53% of CDOs believe lack of data access is the biggest barrier to success for their companies. In 2018, this challenge will be addressed with increasing speed and deftness thanks to the addition of this specialized role to the C-suite.
Enterprise data science will coalesce around a broad — but well-defined — set of open source tools.
Proprietary tools were once the mainstay of enterprise data analysis, but with 66% of analytics professionals in 2017 preferring open source languages Python and R to SAS — a 4% uptick from the year prior — more companies are seeing the value in letting data scientists work how they want. Doing so eliminates the need for data scientists to learn new tools and for companies to pay hefty fees for software upgrades. Additionally, open source solutions are often at the forefront of innovation thanks to their community-driven nature, while proprietary offerings regularly lag behind.
This shift is also being supported by the rise of the data science platform, a central software hub that encompasses the entire data science project lifecycle from ideation to model deployment. Data science platforms are projected to be a $183.7 billion market by 2023, and most offerings are designed to integrate with open source tools like Spark, TensorFlow, and other popular packages, languages, and frameworks.
Crowd-sourced data will be at the center of innovation in artificial intelligence.
Smartphone applications like Waze have already demonstrated the power of crowd-sourced data: 65 million monthly users have access to real-time navigation that is driven by input from other users, like accident or road construction reports. Last year, Mozilla, the non-profit behind the open source web browser Firefox, asked its users to donate audio samples to build an open source voice recognition system to rival Alexa and Siri. Also in 2017, UK researchers used artificial intelligence (AI) to mine Twitter and mobile phone applications for information that can provide early flood warnings.
AI and crowd-sourced data pair well together because AI performs better as it learns, and teaching an AI system to recognize images or sounds requires huge amounts of data. In fact, in order for AI to beat a human player in Go for the first time last year, Google had to provide 30 million moves to the system and then pit it against itself. In 2018, crowd-sourced data will help companies build better AI systems, and those systems will change how they do business for the better: According to a study conducted by Capgemini Consulting, 74% of companies that embraced AI saw at least a 10% increase in sales.
While companies strive to embrace data science, they will also need to more closely consider data privacy.
With the European Union (EU)’s General Data Protection Regulation going into effect in 2018, companies across the world will have to prove they have state-of-the-art data security. That’s because if the data a company is collecting belongs to an EU citizen, it must comply with rules on privacy rights and data control and governance — regardless of the business’s location. For those still trying to level up their data management skills, compliance could be a big challenge.
These new requirements would appear to be at odds with the growing mentality among business stakeholders that more data access is key to success. However, as mentioned above, the role of the CDO will be integral to navigating this increasingly complex regulatory landscape. In fact, data security topped the list of business challenges in 2018, with 63% of CDOs citing it as a concern.
Applying distributed artificial intelligence to open genomics datasets will produce healthcare breakthroughs.
AI is becoming increasingly valuable in healthcare applications, with researchers now able to pinpoint genetic variants that can be linked to certain diseases, treatment outcomes, and more. Six years ago, a report from the McKinsey Global Institute estimated that AI could save pharmaceutical and other healthcare-related companies a whopping $100 billion annually. In 2017, the industry began to see some very real AI-related successes: chatbots became adept at monitoring mental health, a neural network matched 21 board-certified dermatologists in identifying cancerous skin lesions, and hospitals started using algorithms to detect diabetes-related eye problems.
In December, Google released an open source version of DeepVariant, an AI tool designed to perform highly accurate genomic sequencing, opening up the door for more healthcare breakthroughs in 2018. Often, identifying specific mutations or sequences across genomes is a “needle in a haystack”-type task — the Google Brain Team hopes its tool, the most accurate of its kind according to the Food and Drug Administration’s precisionFDA challenge, will “solve real world problems” and “satisfy the needs of the largest genomics datasets,” which are increasingly open source.
Looking forward: Companies will make better use of data science in 2018
By 2020, Forrester predicts businesses that use data effectively will be collectively worth $1.2 trillion, up from $333 billion in 2015. But getting there will require better management and governance, as well as a good deal of creativity on the part of data science teams. This year, it will be imperative for existing and up-and-coming industry leaders to make better use of the data they already collect — as well as identify new data that presents new opportunities. For that reason, 2018 is certain to yield more data science innovation in the enterprise than has ever been seen before.