Approaching Data Science with a Product Mindset
Carly Gundy, Data Scientist
Data science is arguably one of the hottest job fields of this century. Though still in its infancy as a discipline, it has already evolved a great deal. While a data scientist’s skillset used to center on modeling and statistics, there is now a greater demand for “full stack” professionals who possess a more technical, computer science skillset, as the need to scale processes and deliver results quickly has risen. This shift in skillset also requires a shift in mindset – from a project- to a product-based one – to focus less on answering individual questions and more on solving long-term problems.
A project mindset, in this context, treats the work as a one-time analysis with a set deadline that usually results in business insights, rather than a product or tool. It suits cases where a set of questions is focused on a specific purpose whose requirements and business needs are clearly defined up front and not subject to change. A typical data science project lifecycle follows the steps below:
However, this project-based approach is subject to numerous pitfalls – specifically when questions are not clearly defined or needs change by the time the project is delivered. This can breed ambiguous approaches within the project lifecycle and can require more follow-up work or ad-hoc requests. It can also lead to silos and rework: completed projects are usually closed and not regularly maintained, so they are sometimes forgotten by the time similar questions arise later on.
A product mindset, on the other hand, takes a broader approach: instead of answering a specific question for one client, it seeks to generalize its solutions so that they scale to many different use cases. This is done by implementing design thinking. __Design thinking__ (among other aspects) involves 1) empathizing with the users and identifying pain points that the product can solve; 2) engaging with stakeholders early and frequently to ensure that the product is driving business value; and 3) testing and prototyping early in order to be able to pivot, should issues arise.
This approach requires more up-front work and can thus appear to be slower at first, but produces more value later on, when the product is in production and can handle numerous requests and developers can continuously build in new features. A product-based workflow is thus structured in a cycle, rather than a straight line, as seen below:
The product-based approach not only keeps products relevant to current business needs but also automates the individual project steps to produce quicker results. In this way, the process is no longer limited by the speed and number of people available to execute the analysis, but instead scales with technological capabilities. This shift empowers data scientists to further enhance processes as opposed to only executing them.
There are of course some difficulties involved in adopting a product mindset within data science, specifically when it comes to custom analytics. In many cases, the development of a custom model is needed or specific insights must be extracted from the data that aren’t easily found in standard aggregations and visualizations. In other cases, an exploratory analysis is needed, for example to assess a new data source or to further define the scope of the project. Nonetheless, many steps within an exploratory analysis can still be automated or facilitated with products and code packages to speed up the analysis and enable data scientists to focus on what truly needs to be customized.
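To make the idea of automating exploratory steps concrete, here is a minimal sketch of a reusable profiling helper. The function name `profile_columns`, the list-of-dictionaries data layout, and the sample records are all assumptions made for illustration; a real team would more likely build on an existing data-frame library.

```python
from statistics import mean


def profile_columns(rows):
    """Summarize each column of a dataset given as a list of dicts.

    Hypothetical helper: reports missing and distinct counts for every
    column, plus min/max/mean when all present values are numeric.
    Assumes the values are hashable (strings, numbers, and so on).
    """
    columns = sorted({key for row in rows for key in row})
    profile = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        summary = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        # Add numeric statistics only when every present value is a number.
        if present and len(numeric) == len(present):
            summary.update(min=min(numeric), max=max(numeric), mean=mean(numeric))
        profile[col] = summary
    return profile


# Made-up sample data, used only to show the call.
sample = [
    {"region": "north", "revenue": 120.0},
    {"region": "south", "revenue": None},
    {"region": "north", "revenue": 80.0},
]
report = profile_columns(sample)
```

Once a helper like this lives in a shared package, the first pass over any new data source becomes a single call, and the data scientist's time goes to the parts that are specific to the question at hand.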
As data science continues to evolve and expand, it’s important to evaluate which aspects of the product mindset are useful and which could be challenging to adopt, and why. The field can benefit from many computer science concepts and methods, such as unit testing, code reusability, and functional programming. Other CS concepts, such as breaking processes down into parallelizable tasks and reproducing consistent output, may not be as straightforward to apply as they are in software engineering. Whether or not a given data science project is conducive to being “productized”, a product mindset can still offer benefits: it requires that current processes be re-evaluated, and it results in a better understanding of the purpose of each one. This will enable businesses to better assess their gaps and understand where products can help them scale faster by removing repetitive tasks and sparking creativity and innovation in other areas, giving new potential to truly custom analytics projects.
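As a rough illustration of how unit testing, code reusability, and functional programming can fit together in analysis code, the sketch below composes small pure functions into a pipeline and checks that the output is reproducible. All names (`drop_missing`, `add_share`, `pipeline`, `prepare`) and the sample records are hypothetical.

```python
from functools import reduce


def drop_missing(records, field):
    """Return a new list without records whose `field` is None or absent."""
    return [r for r in records if r.get(field) is not None]


def add_share(records, field):
    """Return new records with each record's share of the field total.

    Assumes the total is non-zero; a real implementation would guard this.
    """
    total = sum(r[field] for r in records)
    return [{**r, "share": r[field] / total} for r in records]


def pipeline(*steps):
    """Compose steps left to right into a single reusable function."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)


# A reusable preparation step built from the pure functions above.
prepare = pipeline(
    lambda rs: drop_missing(rs, "revenue"),
    lambda rs: add_share(rs, "revenue"),
)


def test_prepare_is_repeatable():
    raw = [{"revenue": 30}, {"revenue": None}, {"revenue": 10}]
    first = prepare(raw)
    second = prepare(raw)
    # Same input, same output: the steps keep no hidden state.
    assert first == second
    assert [r["share"] for r in first] == [0.75, 0.25]
    # The input records are left untouched.
    assert raw[0] == {"revenue": 30}


test_prepare_is_repeatable()
```

Because each step returns new data instead of modifying its input, the steps can be tested in isolation and reused across analyses, which is one way to work toward the consistent output mentioned above.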