Roles and Skills in the Data Team | Nikhil Goyal


same but different (Source: Image via Shutterstock under license to Nikhil Goyal)

Psst: Your “Data Person” is not a thing

Building a Data Team can be tricky business. There are dozens of loosely used titles for the data-related tasks (recognized and, often, unrecognized) that need to be performed in an organization. Data Scientist and Business Analyst are perhaps the best-known data roles on the market, with Data Engineers selling like hot cakes over the last 12–24 months. Many companies that hired droves of Data Scientists in search of data nirvana were ultimately disillusioned and at a loss to understand why analytics had not delivered the promised impact. Research from Gartner (my previous organization) reveals that more than half of marketing leaders are disappointed by the impact, or lack thereof, of their analytics investments (Gartner Survey of Senior Marketing Leaders).

I do not endorse this infographic. (Source: Image by Marketing Distillery,

It’s not wrong per se to describe a data scientist as possessing those 4 dimensions of skills. But it gave at least some non-technical business leaders, especially in the marketing world, a false sense of completeness: the belief that hiring unicorn Data Scientists could substitute for the processes, infrastructure and skills required to build a data team. It also set people up for unrealistic hiring expectations and, ultimately, buyer’s regret.

At some point, smart folks corrected this and started hiring Data Engineers. If you haven’t heard yet, hiring a seasoned Data Engineer these days is infinitely harder than trying to find the snitch in a quidditch game (Data Engineering Jobs grow 50% YoY in 2020). When I was asked, while interviewing for LeafLink, how many Data Scientists I would hire in 2020, my response was “probably 0”. Instead, what I tried to explain is the chart below:

Foundational work like building a data warehouse and standardizing the reporting of key metrics reduces ad-hoc reporting tasks, freeing up the Data Team’s time to solve higher-order business questions. (Source: Image by Nikhil Goyal)

More recently there has been enlightened thinking on the subject by Fishtown Analytics as part of the dbt project. By recognizing the need for analysts to borrow from the software development playbook, and by carefully considering the use cases and problems that data analysts run into (which are very different from those of a backend engineer), dbt has given definition to a whole new data role in the market: the Analytics Engineer (What is an analytics engineer?). It is brilliant because it systematizes the role that most organizations find they need early in their journey of building a data stack but don’t quite know what to call. In my view, the Analytics Engineer is the ultimate data generalist and the evolution of the BI Analyst role: someone who can understand business logic, represent it in dimensional models, script the SQL to compute metrics and leverage dbt’s awesome capabilities to abstract, express, template and version manage all of it.

Even with all these role definitions, data work often doesn’t materialize into action because there is a communication gap between those who perform the analysis and those who act on it. On paper, Data Scientists were supposed to have outstanding communication skills, but market dynamics of demand and supply lead to an equilibrium at a lower than optimal expectation (read: the elusive unicorn Data Scientist). McKinsey suggests that to bridge this gap, data teams require an Analytics Translator (Analytics Translator: the new must-have role). If I’ve ever heard of a niche role, it’s this one: someone who can go and speak to someone else’s work. Frankly, this is the one role the data analyst in me loathes the most. At the risk of inflaming the passions of many, I’ll add that I would rather train analysts to write recommendations and deliver a story than hire an analytics translator to interpret the work of a data scientist or a BI Analyst.

To go back to where we began, between BI Analysts, Data Scientists, Data Engineers, Analytics Engineers, Analytics Translators and myriad more (Database Administrator, ML Engineer, BI Developer, Data Visualizer), where is one supposed to begin hiring? All of these roles and nuances lead to further questions for hiring managers in the data profession, especially when explaining them to non-technical stakeholders like Finance, HR and other Business Partners who sponsor or support the hiring of Data roles. Questions like:

  • What’s the difference between a BI Analyst, Data Scientist, Data Engineer and Analytics Engineer? I see so much overlap of skills on all their resumes. They all know SQL, Python, R and Tableau!
  • How soon will we realize the impact of all these roles?

The truth is that as the data ecosystem evolves with a business, you’ll need all of them. Not recognizing the breadth of areas and skill sets that fall under the umbrella of Data Operations leads to uncoordinated and incomplete efforts to harness the power of data (please excuse the cliche) and to bad hiring decisions. What muddies the picture further is that data professionals often have a tapestry of experience touching various parts of data operations, making everyone look like a jack of all trades. As a result, everyone on the data team is a . . . “Data Person”. The easy analogy from the Engineering world here is that of the “Full Stack Engineer”. In an ideal world, every hire would be an expert at the full stack. An entertaining reductio ad absurdum of this analogy would be to hire only full stack employees: one role that can manage the product, do the sales, run customer experience, perform collections, bookkeeping etc. and ... hire and fire themselves. Imagine a company of full stackers. I’m being deliberately ridiculous. But let’s say, for the sake of argument, that someone is an expert at a multitude of skills: will they have the time to perform all of those roles?

A helpful way to think about which Data role you need is to go back to the often-repeated question: “what do you need to hire for?” Ask which skill sets are most critical for the role and which part of the data value chain the role will need to work on the most.

For myself and the benefit of my recruiting partners, I made a simple skills matrix to understand and explain the different roles:

Column A: Broad skillset area, Column B: The tool or technology leveraged to perform the work, Columns C-G: Key Data Roles mapped to area of focus and level of expertise. (Source: Image by Nikhil Goyal)
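For readers who prefer code to spreadsheets, here is a minimal sketch of how a matrix like this could be expressed and queried. Everything in it is an assumption made for illustration: the skill areas, the three-step proficiency scale, the `rank_roles` helper and, above all, the values assigned to each role are hypothetical placeholders, not the contents of the matrix pictured above.

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class Level(IntEnum):
    """Hypothetical three-step proficiency scale."""
    NONE = 0
    WORKING = 1
    EXPERT = 2


# Placeholder values for illustration only; fill in your own matrix.
SKILLS_MATRIX: Dict[str, Dict[str, Level]] = {
    "Data Engineer": {
        "pipelines": Level.EXPERT, "modeling": Level.WORKING,
        "reporting": Level.NONE, "statistics": Level.NONE,
    },
    "Analytics Engineer": {
        "pipelines": Level.WORKING, "modeling": Level.EXPERT,
        "reporting": Level.WORKING, "statistics": Level.NONE,
    },
    "BI Analyst": {
        "pipelines": Level.NONE, "modeling": Level.WORKING,
        "reporting": Level.EXPERT, "statistics": Level.WORKING,
    },
    "Data Scientist": {
        "pipelines": Level.NONE, "modeling": Level.WORKING,
        "reporting": Level.WORKING, "statistics": Level.EXPERT,
    },
}


def rank_roles(needs: Dict[str, Level]) -> List[Tuple[str, int]]:
    """Rank roles by total shortfall against the needed levels (smallest first)."""
    ranked = []
    for role, skills in SKILLS_MATRIX.items():
        # Only count missing proficiency; surplus skill does not offset a gap.
        gap = sum(
            max(0, need - skills.get(area, Level.NONE))
            for area, need in needs.items()
        )
        ranked.append((role, gap))
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(ranked, key=lambda pair: (pair[1], pair[0]))


if __name__ == "__main__":
    # "What do you need to hire for?" expressed as required skill levels.
    needs = {"modeling": Level.EXPERT, "reporting": Level.WORKING}
    for role, gap in rank_roles(needs):
        print(f"{role}: shortfall {gap}")
```

The point of the exercise is the same as the spreadsheet's: start from the skills the work requires, then see which role covers them with the smallest gap, rather than starting from a title.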

To be clear, I’m not saying that a Data Scientist can’t have the skills of a Data Engineer or that an Analytics Engineer cannot be an expert BI Developer. Skills in the data domain are highly fluid, so nothing stops a data engineer from also being an expert at data analysis and visualization, or a data scientist from having expert-level data workflow orchestration skills. Picking up minimum proficiency in all broad areas of data operations is pretty much the norm of the trade. Some brilliant folks learn to become experts in multiple areas. Yes, unicorns exist. I know a few. But unicorn hunting leads to long delays most of the time and massive disappointment the rest of the time.

For aspiring data professionals, think of your own areas of focus and expertise as a rainforest. What does the canopy of skill sets look like? And what do the deep roots look like?

This is an opinion. I would love to hear and learn from the views and experiences of professionals in the domain and evolve my thinking.

P.S.: LeafLink is hiring Data Engineers, Analytics Engineers, Data Scientists and BI Developers. Check out our careers page and apply on our site or simply drop me a note!
