Recently, Saturn Cloud ran a Twitter poll asking data scientists which hat they’d least like to wear. The reality is that they wear a lot of hats but aren’t necessarily fans of all of them.

 Data Scientists, Ready To Stop Wearing Too Many Hats?

The demand for data scientists is very high and is only expected to increase. According to the U.S. Bureau of Labor Statistics, from 2018 to 2028, the number of jobs is expected to increase by 16%. For reference, the average growth for other professions is 5%.

Further, the current demand for data scientists greatly surpasses the supply, and that gap is only expected to widen. This all points to one conclusion: data scientists’ time is extremely valuable. But data scientists often find themselves taking on roles outside their scope to achieve their ultimate goal: driving business value through insights. To learn more about the sentiment of data scientists, Saturn Cloud created a poll asking data scientists to vote on which hats they least like to wear.

The original tweet poses the question, “As a #DataScientist are you wearing too many hats? If so, what’s your least favorite hat that you have to wear?”

Of the 70 respondents, 37% said security was their least favorite hat to wear, followed by DevOps (24%), data engineering (21%), and business analyst (17%).
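As a back-of-the-envelope check, the reported shares can be converted into approximate vote counts. This is only a sketch: the counts below are inferred from the rounded percentages and are not figures published with the poll, and the variable names are illustrative.

```python
# Reported poll results: 70 respondents, shares as whole-number percentages.
respondents = 70
shares = {
    "Security": 37,
    "DevOps": 24,
    "Data Engineering": 21,
    "Business Analyst": 17,
}

# Approximate votes per option, inferred from the rounded percentages (an assumption).
approx_votes = {hat: round(respondents * pct / 100) for hat, pct in shares.items()}

# The published percentages add up to 99 rather than 100 because each was rounded.
total_pct = sum(shares.values())

for hat, votes in approx_votes.items():
    print(f"{hat}: about {votes} votes")
```

Under this reading, the inferred counts add back up to the 70 respondents, so the one-point gap in the percentages is consistent with ordinary rounding.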

This article intends to address how and why data scientists support these other functions. Let’s start by defining what data scientists think they should be doing.


What Hat or Hats Should a Data Scientist Wear?

In terms of their objectives, data scientists use analysis, statistics, machine learning, and visualization to identify insights and produce findings.

Data science lives at the intersection of three core disciplines: computer science, math/statistics, and business/domain knowledge. As with other cross-functional disciplines, the line between what is their role and what is someone else’s is murky. How data scientists work also differs across organizations, which can make it challenging for an organization to understand what a data scientist needs to excel.

In an ideal state, they have these three things:

1. Enough domain knowledge to understand the questions they’re hoping to answer

2. Engineering support needed to source data and productize their results

3. Communication tools that allow them to easily share results

With these three things in place, data scientists can divide their time between building models, analyzing the results, and communicating their findings, which is ideally where they should spend most of their time.

Now that we’ve discussed what data scientists should do in their roles, let’s cover what data scientists don’t like doing in their current roles.

If you’re a data scientist, it may be affirming to hear that other data scientists share your struggles, and hopefully you’ll come away with some talking points to help address these challenges. If you’re not a data scientist and you’re wondering why you should care about what data scientists don’t like, it might be worth reading Brooks-Bartlett’s article, which hashes out why data scientists leave their jobs. A lot of it comes down to organizations not understanding what the work of a data scientist is, and the resulting frustration leads data scientists to spend 1-2 hours a week looking for a new job (according to the Financial Times). That is not an inspiring statistic when you are trying to retain talent that is among the hardest to acquire.

Let’s start with the least favorite:


Security

What is it? And why do companies care about it? 

Since this is the function data scientists like the least, we’ll give it the most attention.

Before addressing where data scientists fit into this conversation, let’s consider why data security matters.

  1. To be compliant – Even though the US doesn’t have a unified data security act, there are state-by-state regulations and industry-specific regulations (such as HIPAA). And as states like California lead the charge with new regulations like the California Consumer Privacy Act, companies need to strengthen their data privacy.
  2. To maintain a good reputation – Showing a commitment to customer protection improves a company’s image.
  3. To sustain competitive advantage – In data-driven organizations that pride themselves on making analytically informed decisions, data is an asset. A massive one.
  4. To avoid massive fines – For example, Facebook received a $5 billion penalty from the FTC, and Equifax received a $275 million penalty from the CFPB and various states.

 Why do data scientists get pulled into this?

As companies become more data-centric, they become increasingly exposed to security risks, so all teams that work closely with data are likely spending more time on security than they would like (at least at companies that take data security seriously).

Why do data scientists not like it? 

Building in security safeguards (encryption, masking, erasure, resilience) takes time and is often seen more as a requirement than a game-changing opportunity. There is an inherent tension here. Whether or not the perception is correct, security, compliance, and governance have historically been seen as stifling innovation, so it’s understandable that over a third of the data scientists who participated in this survey found it to be the least enjoyable work they do.

Many data scientists got into the field because they wanted to drive decisions in the organization with their insights, not because they wanted to encrypt personally identifiable information (PII) in their models.
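To give a sense of what this kind of safeguard work can look like, here is a minimal sketch of two common techniques, pseudonymizing and masking an email address before it reaches a model or a report. It uses only the Python standard library; the function names, the sample records, and the placeholder key are hypothetical, and a keyed hash is a stand-in for pseudonymization rather than full encryption.

```python
import hashlib
import hmac

# Hypothetical placeholder; a real key would come from a secrets manager, not source code.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace a PII value with a stable keyed hash (HMAC-SHA256)."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}{'*' * max(len(local) - 1, 0)}@{domain}"


# Illustrative records only.
records = [
    {"email": "Ada@example.com", "purchases": 3},
    {"email": "ada@example.com ", "purchases": 5},
]

# The modeling data set keeps a join key but no longer carries the raw email.
safe_records = [
    {"user_key": pseudonymize(r["email"]), "purchases": r["purchases"]}
    for r in records
]

print(mask_email("ada@example.com"))
```

Because the hash is keyed and the input is normalized first, the same person maps to the same `user_key` across rows, so joins and aggregations still work without exposing the address itself.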

What can be done about it?

Two organizational shifts need to take place. First, leadership needs to place more emphasis on data security as a core principle, instilling in data scientists and the rest of the company that security isn’t just a task but a business-critical activity. If you search for jobs in data science, you likely won’t find any mention of data security in the job description. If protecting the data they work with is a requirement of the role, then it should be explicitly stated, encouraged, and rewarded. Second, if data scientists are in an environment where they spend a lot of their time on data security, organizations need to identify solutions to support them.


DevOps

What is the function of DevOps?

DevOps teams concern themselves with continuous delivery. They help propel engineering teams forward, functioning as plug-and-play, multidisciplinary team members.

Why do data scientists get pulled into this?

DevOps can support data scientists in several ways, but we will focus on getting the models into production as a prime example.

It’s worth noting that data science began in academia. Academic researchers focused on understanding a particular question, preparing their data, creating their models, and producing their findings in the form of a clean report. In academic environments, it is more common for research to be owned by a single person, who identifies a particular topic and makes sense of it alone.

In companies, people collaborate. People share knowledge across teams and use it to inform their work. In a company, data scientists need to understand a broader set of considerations: what data the company has at its disposal, what senior leaders want to learn from that data, what broader business challenges affect the company, and how their insights can drive business value. They can’t own the entire process and manage this wide range of expectations. This is where DevOps should enter. DevOps is used to bridging the gap between the business owners and the developers. But few organizations have data scientists and DevOps working closely together.

What can be done about it?

Data science teams need to adapt to more agile work environments, and DevOps needs to learn more about what data scientists are working on. There will be short-term costs, but over time, DevOps will become more aware of the challenges data scientists face, and the products of data scientists will improve. Alternatively, dependencies must be eliminated so data scientists can more easily share their insights across the organization. Both approaches rest on the same idea: the value of a data scientist’s work lies in the action their insights propel.


Data Engineering

 What is the function of data engineering?

Data engineers design, build and maintain the data infrastructure of a company.

Why do data scientists get pulled into this?

If the data infrastructure in the organization isn’t mature enough, or if the questions data scientists want to answer rely on data sets that should be in the data warehouse but aren’t yet, the data scientist will likely need to be heavily involved in structuring the data they’re working with.

One of the overlapping skills that data scientists and engineers have is manipulating and structuring data. While it isn’t ideal for a data scientist to spend time engineering the data, they won’t be able to do their job until someone does it.

What can be done about it?

The most obvious solution is to hire enough data engineers, but they are also in high demand in the market. The tougher but more practical solution is to align data engineering and data science on company-wide objectives so the work of the data engineers feeds into the work of the data scientists.


Business Analyst

 What is the function of a business analyst? 

Business analysts seek to improve efficiencies. They straddle the line between tech and operations and focus largely on how they can make gains across Key Performance Indicators (KPIs).

Why do data scientists get pulled into this?

Of all the hats data scientists wear, this might be the least logical one. The overlap in skills is marginal, and the data scientist’s time is poorly used if this is how they are spending their day. Unlike the other work that data scientists get pulled into, this type of work doesn’t address a dependency that must be completed for them to do their job, so more than any of the other hats, it’s a distraction from their function entirely. There are two reasons they can get pulled into a business analyst role. First, leaders in the organization misunderstand the difference between the two roles. Second, the organization just needs more business problem-solving support.

What can be done about it?

This is a recurring theme, but it is worth restating: it is critical that the organization understands what data science is. As for the second reason, the good news is that business analysts are cheaper and easier to find in the market, so a big first step in getting that kind of help is knowing that you need it.


Summary

To maximize the impact of data scientists, keep them happy, and reduce churn, organizations need to invest in supporting data scientists so they can focus on the work they were hired to do. If your data scientists are wearing too many hats, it’s worth considering why they’re wearing these hats in your organization and whether it’s the most effective use of their time. Data scientists might be capable of owning work outside their scope, but if it’s negatively impacting their key objectives and overall satisfaction, it might be time to reevaluate the extra work they’re taking on.

Learn how you can efficiently save time when it comes to DevOps, engineering, and more with Saturn Cloud.


Written By: Megan Moore