As enterprise companies strive to become more agile, many are turning to DevOps as a way to deploy code quickly and efficiently. Unlike traditional IT, DevOps takes a holistic approach to fixing, updating, and deploying systems. It can also be used to successfully deploy data science models into production. But this approach isn’t right for every business.
To understand why larger companies are embracing a DevOps mentality, we sat down with Pam McCaslin, data scientist and principal DevOps lead at Amgen, to talk about how DevOps and data science fit together — and how they’re changing the biotech and healthcare space.
What is the relationship between DevOps and traditional IT, and why are larger companies starting to embrace a DevOps approach?
Both of these approaches involve work that applies to a company’s information technology structure. However, traditional IT was set up to have disparate groups doing that work. What I mean is, in traditional IT, you have a group that handles all the development work and a second group that does operational work. The development team fixes broken systems and pushes those fixes into an environment where the operational team can put them into production. The operational team is also the group that identifies broken systems and notifies the development team. They co-exist, but they don't usually work together — except for moving code back and forth.
In many cases, this type of setup can result in slower deployment of software releases, whether they include new functionality or fixes. Larger companies are moving to DevOps as it provides a path to fast implementations.
What are the benefits that come from being more agile in this space, in your opinion?
First of all, with your traditional software like SAP, or any of those large-scale systems, DevOps won’t make sense. Those systems don’t change much. However, when you are talking about the work you do with a website, where you’re constantly trying to learn and improve, you could do 30 implementations in a day. Those 30 implementations can cause problems. It makes sense for the development and operations teams to be the same team because they're constantly changing, fixing, re-deploying, etc. That's where DevOps works. But it all depends on your use case; there's no easy answer.
In my group, Digital Health at Amgen, we create innovative technologies to improve health. This type of work lends itself nicely to a DevOps model. We just have to deal with the Food and Drug Administration (FDA) component of that, which makes it a little bit harder.
How does a company decide whether DevOps is the right approach for the work it’s doing?
Everybody has a software development lifecycle; some just complete it faster than others. Companies that use systems of record generally tend toward traditional IT and a waterfall approach to project management. Project timelines can take anywhere from six months to a year. In comparison, DevOps leans on agile or scrum project management with fast iterations of requirements, design, development, and then deployment. Projects that rely on a DevOps methodology tend to have release cycles measured in days, weeks, or months.
If you’re picking between the two, the deciding factor is the structure of how you're rolling things out. You shouldn’t just adopt DevOps because it’s the new thing. Startups generally are running a DevOps environment because they can — what they're doing is fast ideation and deployments. They want to learn and deploy as quickly as possible.
You recently started working as a data scientist in addition to your role as principal DevOps lead at Amgen. How do those two groups work together?
Data science and DevOps are two separate parts of a bigger process. When a data scientist builds a model, he or she goes through many iterations to determine the best model for the job. That process takes time, and at the end of it, you have a working model that can be deployed as an API. That entire process is independent of DevOps.
Of course, once you have that API, you can wrap it up in some piece of software. For example, when you shop on Amazon there’s a model in the background making recommendations to you about other products you would enjoy. The model running in the background is a small piece of the software running on the site. DevOps allows a company like Amazon to quickly deploy better, faster, and smarter models to improve the customer experience. One of the biggest advantages is the ability to deploy software (which might contain data science models) in minutes rather than days.
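To illustrate the handoff described above, here is a minimal Python sketch of a model exposed through an API-style function. Everything in it is hypothetical: the `MODEL` coefficients, the feature names `age` and `visits`, and the `handle_request` function are made up for illustration and are not drawn from Amgen's or Amazon's systems.

```python
import json

# Hypothetical "trained" model: coefficients of a simple linear regression.
# In practice these would come out of the data scientist's iteration process.
MODEL = {"intercept": 1.0, "weights": {"age": 0.5, "visits": 2.0}}


def predict(features):
    """Score one record with the linear model."""
    score = MODEL["intercept"]
    for name, weight in MODEL["weights"].items():
        score += weight * features[name]
    return score


def handle_request(body):
    """API-style entry point: JSON string in, JSON string out."""
    try:
        features = json.loads(body)
        return json.dumps({"prediction": predict(features)})
    except (KeyError, TypeError, ValueError) as exc:
        # Missing features, wrong types, or malformed JSON all
        # come back as an error payload instead of a crash.
        return json.dumps({"error": str(exc)})
```

In a setup like this, the data scientist owns `MODEL` and `predict`, while the surrounding software only calls `handle_request`, so a new model version could be shipped without changing the caller.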
What made you decide to move into data science, and how are you bringing your IT/DevOps expertise and data science education together in your role?
Early in my career, I had the opportunity to turn a regression model into software. I worked closely with a data scientist (back then he was called a statistician) to roll his model into a production CRM system. That piqued my interest, and I ended up doing the same thing for my next company. I didn’t take it seriously until about five years ago when I attended an IBM conference; they were talking about the “job of the future,” which was a data scientist. I had some of the skills and loved the concept of being able to predict outcomes.
Now, I can be both the IT person and the data scientist. I can acquire the data, load it, clean it, prep it, and then work on it as a data scientist. I can wear many hats, especially in the IT space. I can take a model from development to deployment and build a steady-state framework. I get the best of both worlds that I enjoy: data discovery and leveraging the best technology to get the job done.
Where do you think data science is going to make the biggest impact in biotech, pharmaceuticals, or healthcare?
I’m really passionate about this area, mainly because I see the strength of modeling and how it can improve patients’ lives; if we can predict that a patient will have a life-threatening event before it happens, that’s huge.
One project we’re working on right now is aimed at improving the prediction tools for common health issues. There are so many other areas where data science can be used all through the drug development lifecycle to benefit both our company and patients in general.
There are a ton of challenges, but if we don’t start now, we’ll never get there. Thanks to the Affordable Care Act (ACA), we now have health data en masse and in an electronic format, which makes predicting outcomes much more accurate. We also have no limits on infrastructure with services like Amazon Web Services and Google Cloud. This allows us to work with huge amounts of data in a fast and cost-effective manner. We are primed to make big changes in healthcare.
What change could DevOps, IT, or data science teams make today that would improve overall collaboration and the process of getting the results of data science work to decision makers?
Data science is all about building models that predict outcomes. DevOps only comes in once the model is finalized and needs to be incorporated into some form of software. IT, on the other hand, is needed to assist both of these groups.
IT helps data scientists get data to a point where they can work with it — by handling data curation, for example — and provides the toolsets or environments data scientists need. IT also either works with the DevOps team — or is the DevOps team — that gets models into production. For that reason, it’s important for IT and DevOps to be either partnered together or sitting in the same space so they can collaborate and communicate.
More than anything, it’s vital that IT and DevOps understand the entire lifecycle of a model, from creation to testing to validation and deployment. The misstep I see most frequently is that IT and DevOps focus on project initiation without looking at what they are trying to accomplish. A high-level plan that does not commit to deliverables, timelines, and costs will be challenging to execute.
About Pam McCaslin
Pam McCaslin is a data scientist and principal DevOps lead at Amgen. She has 28 years of IT and product management experience across multiple industries, including biotech and pharma, travel and entertainment, and direct marketing and analytics. Her expertise lies in leveraging DevOps, data science technologies, and platforms to deliver results in a timely, cost-effective manner.