Przemysław Pala
Marcin Dobosz
Why 80% of data science projects are doomed to failure
Let’s start with the most common opinion in the general public: only big companies have Big Data, and Data Science instantly brings enormous benefits by generating magical data insights.
Typical data-rich businesses include online stores and processing centers, popular media (blogs, social network accounts, web magazines, etc.), and other brands that work closely with many SMM channels and active users (subscribers).
In this case, the data is a “byproduct” of the core business of such companies rather than a significant value in itself. As a rule, monetizing Big Data is seen as an exciting idea that can be tried “at leisure” without too much distraction from current tasks. It is precisely this framing of the problem by the business (“We have a lot of varied data, and we need to do something with it, preferably profitably and quickly”) that is one of the main reasons for the “failure” of various Data Science projects.
Thus, a company makes the fundamental mistake of starting to implement Data Science tools without clearly identifying the business need they are meant to meet. Even if there is no significant investment in IT infrastructure, such as purchasing and configuring Big Data clusters with Apache Hadoop and other tools for collecting, storing, and analyzing big data, the company still has to pay for expensive analysts and machine learning specialists.
However, despite their deep technical competencies, most of these data professionals focus only on the data and lose sight of the main thing, the business problem statement, behind a mass of specific details. Of course, identifying needs and developing solution requirements is the business analyst’s responsibility. Still, knowing the intended result and understanding its impact on corporate operations should be the responsibility of every team member.
In addition, digitalization, digital transformation, and the data-driven organization are, above all, about the organizational maturity of operational processes and IT infrastructure, as well as the data they contain. Therefore, before hiring a Data Engineer and a Data Scientist to predict the future with complex Machine Learning models or to hunt for unknown insights in piles of raw data, try to make the most of off-the-shelf Business Intelligence (BI) systems and DaaS (Data as a Service) offerings, that is, cloud platforms that provide services for collecting, processing, and analyzing Big Data.
In the following sections, we consider why these big data analytics tools are more beneficial for many businesses than their own Data Science projects: they cover most needs without significant investments of time, money, and human resources.
5 reasons why Big Data BI analytics is more profitable than Data Science
Reason#1: Cheaper
There are many off-the-shelf BI systems, the most popular of which today are MS Power BI, Qlik, and Tableau, along with their various free and commercial analogs, including complex DaaS services such as Talend Data Cloud, Azure Open Datasets, Google Data Studio, etc. As a rule, they can be used either locally or in the cloud under a subscription model, with a fee charged per time period or by the amount of resources used. In most cases, this will be more cost-effective than deploying your own Big Data infrastructure and developing unique Machine Learning algorithms to analyze logs, JSON, XML, and other raw data files.
Reason#2: Faster
Since off-the-shelf solutions are designed for mass use, they already contain the sets of algorithms for processing and analyzing big data that are most in demand in practice, for example, clustering customers into segments and visualizing the result with clear graphs, charts, and tables. Thus, the time to market (TTM) of an idea will be significantly shorter than with the complete CRISP-DM cycle of a Data Science project, where you first need to develop a business hypothesis, implement its prototype with Machine Learning algorithms, train and test the ML model, and then deploy it all at stable production quality.
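To give a feel for what “clustering of customer segments” means, here is a minimal k-means sketch in plain Python. It is only an illustration under stated assumptions: the customer figures, the `kmeans` function name, and the naive choice of the first `k` points as starting centroids are all hypothetical, and this is not a description of how any particular BI product implements segmentation.

```python
def kmeans(points, k, iterations=20):
    """Tiny k-means sketch: returns (labels, centroids) for the given points."""
    # Naive initialization (an assumption for this sketch): the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers: (orders per month, average order value).
customers = [(2, 25.0), (1, 20.0), (3, 30.0), (12, 210.0), (10, 190.0), (11, 200.0)]
labels, centroids = kmeans(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> segment", label)
```

In real data the features would normally be scaled first, since here the order value dominates the distance. An off-the-shelf BI tool hides these steps behind a ready-made function, which is where the time saving comes from.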
Reason#3: Clearer
The built-in visualization modules of ready-to-use BI and DaaS solutions display the indicators most relevant to the business, such as the number of visits, conversions, sales levels, total costs, and revenues, both in aggregate and broken down by individual items. The interface “speaks the language of the business,” remaining clear not only to the IT specialist but also to executives, marketers, functional managers, and other specialists.
Reason#4: Practical
BI systems and DaaS solutions for big data analytics not only visualize the most critical business indicators but also automate many of the “back-office” processes needed to create a coherent picture, for example, cleaning up “raw” data (handling missing values, outliers, and incorrect values), generating reports according to corporate standards, preparing data in the correct format for sending to other systems, etc.
At the same time, most ready-made solutions are extensible, providing APIs or visual editors for creating your own functions, scripts, and other custom settings. By contrast, in dedicated Data Science projects that build a fundamentally new solution “from scratch,” most of the resources are usually spent on research work: the search for new models and interesting algorithms, and “games” with the optimization of neural network parameters.
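The raw data cleanup mentioned in this section (missing values, incorrect values, outliers) can be sketched in a few lines. The following is a minimal illustration in plain Python, assuming a list of order records with a hypothetical `amount` field. The `clean_records` name, the rule that negative amounts are invalid, and the median-based outlier threshold are assumptions made for this example, not the method of any specific BI or DaaS product.

```python
import math
from statistics import median


def clean_records(records, field, max_deviations=3.0):
    """Drop rows whose `field` is missing, invalid, or an outlier."""
    valid = []
    for row in records:
        value = row.get(field)
        if value is None:
            continue  # omission: the value is missing
        try:
            number = float(value)
        except (TypeError, ValueError):
            continue  # incorrect value: not a number at all
        if not math.isfinite(number) or number < 0:
            continue  # incorrect value: NaN, infinity, or negative (assumed invalid)
        valid.append({**row, field: number})
    if not valid:
        return []

    # Outlier filter: distance from the median, measured in units of the
    # median absolute deviation (MAD), which extreme values barely affect.
    values = [row[field] for row in valid]
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return valid  # no spread to measure against; keep everything
    return [row for row in valid if abs(row[field] - center) / mad <= max_deviations]


# Hypothetical raw export with typical defects.
raw_orders = [
    {"order_id": 1, "amount": "120.50"},
    {"order_id": 2, "amount": None},    # missing
    {"order_id": 3, "amount": "n/a"},   # not a number
    {"order_id": 4, "amount": 95},
    {"order_id": 5, "amount": -10},     # negative
    {"order_id": 6, "amount": 110},
    {"order_id": 7, "amount": 99000},   # outlier
    {"order_id": 8, "amount": 105},
]
cleaned = clean_records(raw_orders, "amount")
print([row["order_id"] for row in cleaned])
```

Ready-made platforms typically expose this kind of rule as a configurable step rather than code, which is part of what makes them practical for teams without dedicated data engineers.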
Reason#5: More accessible
In addition to not requiring significant investments, as mentioned above, ready-made BI and DaaS solutions do not require an established team of data professionals (engineer, analyst, architect, developer, Data Scientist) or a full-fledged Big Data infrastructure. Moreover, since BI and DaaS platforms are primarily used to monitor essential business indicators, they are more tolerant of low managerial maturity in current business processes than a data-driven approach is.
Thus, BI/DaaS implementation can be viewed as the initial stage of digitalization without deep reengineering of corporate activities, which implements the so-called “evolutionary strategy” of gradual improvement.
Conclusion
Thus, BI systems can be called a universal foundation on which deep Data Science and applied big data analytics stand, and one that is suitable for practical use in almost any business. In the next article, we will look at the main challenges that every digitalization director faces in the digital transformation of private companies and public enterprises, and discuss possible ways to address them. We will also analyze an applied case of a big data analyst in a small business using Big Data technologies.