Guide To Building A Data Science Team
A data scientist needs a varied set of skills that is hard to find in a single individual. Even if you do find a person with all the skills of an ideal data scientist, hiring them would cost a considerable sum.
Now assume you are not in a large market like North America, so you do not have many candidates to choose from. Your search grows even more complicated. This is the challenge Vodafone NZ faced when it decided to build a data science team. The country has fewer than 5 million people, so it is not surprising that the national telecom provider has its largest Hadoop cluster and most ambitious Big Data Analytics program. According to David Bloch, the Analytics and Data strategy manager of Vodafone NZ, the traditional approach of using years of experience as the benchmark for finding the right candidates does not work in a small market, because it can exclude many people you would want to talk to.
Instead, he encouraged a startup mentality by holding events such as hackathons and meetups for people who were enthusiastic about data science. Replacing traditional interviews with less formal processes made these people easier to find. Bloch worked with various startups before joining Vodafone, and that experience shows in his approach.
To form a reliable data science team, he clearly defined a set of roles: hackers, analysts, engineers, change agents, and storytellers. Well-defined roles did not necessarily mean that each one was mapped to a separate individual. In some cases, the same person would play two roles.
The engineer is the automation expert of the team. Engineers come from an ETL or DBA background and collaborate with the hacker to build data flows and ensure that everything runs without delay. In many teams, this role is played by the data engineer. The hacker is the R or Python developer who builds rough models, even without understanding the science behind them. Understanding that science is not the hacker's job but the statistician's. The statistician is a deep thinker with the knowledge of scientific methods needed to identify and validate models.
You also need a subject matter expert or data explorer on your team, and this role is played by the analyst. The analyst's job resembles that of a business analyst, and they have to be well versed in writing SQL. Finally, there should be someone with proven creative skills (for example, working with visualization tools like Tableau) to play the roles of storyteller and change agent. The change agent has to be the influencer who networks with executives, builds business models, and ensures that the developed models have a significant impact on business processes.
A consistent process is the key to making this work, and for that, different members have to collaborate and contribute at different stages. The first stage is identifying the business challenge, a task that requires the analyst and the change agent. The next stage is exploration and ideation, where hackers and analysts play a significant part. It is followed by prospecting for data, in which the analyst, the engineer, and the statistician all play a role. The next stage is the testing and development of the model, where the change agent, analyst, and statistician collaborate. The home stretch involves the storyteller and the analyst. The final stage is making the results actionable, and it is the hacker and the change agent who work together during this phase.
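The stage-to-role assignments above can be written down as a small lookup table, which makes it easy to see where each role is needed. The following is only an illustrative sketch in Python: the stage and role names are taken from the description above, while the names STAGES and stages_for are hypothetical and not part of Bloch's process.

```python
from collections import Counter

# Stages in order, each with the roles described as taking part in it.
# STAGES and stages_for are illustrative names, not from the original talk.
STAGES = [
    ("identify the business challenge", ["analyst", "change agent"]),
    ("exploration and ideation", ["hacker", "analyst"]),
    ("prospecting for data", ["analyst", "engineer", "statistician"]),
    ("testing and development of the model",
     ["change agent", "analyst", "statistician"]),
    ("home stretch", ["storyteller", "analyst"]),
    ("making the results actionable", ["hacker", "change agent"]),
]


def stages_for(role):
    """Return the names of the stages, in order, that involve the given role."""
    return [stage for stage, roles in STAGES if role in roles]


# How many stages each role takes part in.
involvement = Counter(role for _, roles in STAGES for role in roles)

if __name__ == "__main__":
    for role, count in involvement.most_common():
        print(f"{role}: {count} stage(s) -> {stages_for(role)}")
```

Laid out this way, the analyst appears in five of the six stages, which may help explain why one person sometimes has to cover more than one role on a small team.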