Data is the fuel of all analytics, whether you want to deliver accurate, timely information to decision-makers on the front lines, give your executives a comprehensive customer dashboard, build a predictive demand forecast, or simply complete your annual budget.
If you don't have the right data for your goal, you won't get it done. By the RIGHT data I mean … well, the right data for the job.
It sounds obvious, doesn't it? Yet in our experience companies rarely have the right data ready to go, and about 75% of the time they are surprised to find that out!
There are three primary reasons for this confusion:
The situation is exacerbated by over-eager (and sometimes sincere) software vendors who either claim that their solution will magically address this, or take the most optimistic possible view of data requirements.
Here is our advice: if it sounds too good to be true, it probably is.
It is difficult, though. There is SO MUCH hype about the wonders of technology, and you cannot possibly research everything yourself, so it becomes natural to doubt your own judgement and be ready to take a leap of faith. After all, why would so many venture capitalists invest in this technology if it weren't any "good"?
This is where it is important to listen to your inner voice of reason. The technology may be excellent in a narrow sense, but to be successful in your context it will need to operate within your unique environment, integrating with data residing in other systems as well as with human business processes.
Ultimately, it’s all about data preparation. Here is an analogy:
I love to EAT delicious food, which is why I cook. Cooking is usually easy once all the ingredients have been prepared. Ingredient preparation, like cleaning and chopping, is the part of the process I like the least. If you are making an Asian-style stir-fry, you can easily spend 40 minutes prepping, 5 minutes cooking … and about 1 minute gobbling it all down (if you are my kid).
Data preparation is like food preparation. If you don’t prep well, there will be a price to pay. According to the internet, Abraham Lincoln once said: “Give me six hours to chop down a tree and I will spend the first four sharpening the axe.”
Data scientists (the people who build the fancy models) sometimes joke that 95% of an analytics project lies in data preparation. Data preparation is not just a data engineering exercise of building pipelines that transport and transform data; it also includes iterative cycles of data exploration and hypothesis testing, which inform the scope and goals of the project.
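To make that concrete, here is a minimal sketch of one such cycle in Python with pandas. The file and column names are invented for illustration: a transform step is followed by a quick exploratory check whose result may send you straight back to rework the pipeline.

```python
import pandas as pd

# Load a raw sales extract (hypothetical file and column names).
raw = pd.read_csv("monthly_sales_extract.csv")

# --- Pipeline step: transport and transform ---
# Standardize dates and drop rows that are obviously bad.
raw["order_date"] = pd.to_datetime(raw["order_date"], errors="coerce")
clean = raw.dropna(subset=["order_date", "revenue"])
clean = clean[clean["revenue"] >= 0]  # negative revenue is likely a data-entry error

# --- Exploration step: test a hypothesis about the data itself ---
# Hypothesis: every subsidiary reports sales every month.
coverage = (
    clean.groupby([clean["order_date"].dt.to_period("M"), "subsidiary"])
    .size()
    .unstack(fill_value=0)
)
missing = int((coverage == 0).sum().sum())
print(f"Missing subsidiary-months: {missing}")
# A non-zero count sends you back to the pipeline (or to the
# subsidiaries themselves) before any modeling begins.
```

The point is the loop: the exploratory check is part of data preparation, not an afterthought bolted on once the pipeline is "done".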
In one example (described in this case study), the client was looking for a way to create a financial forecast while the COVID-19 pandemic was raging. Traditional predictive demand forecasting techniques were not applicable because historical patterns had been rendered irrelevant. Instead, the client wanted to understand the causal factors driving their business. When the project began, they had gut feelings about what those causal factors would be; a series of iterative explorations and statistical analyses of the data ultimately proved those hunches wrong. Once they understood what the true drivers were, they were able to explore scenarios by tweaking those drivers.
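The statistical side of such an iteration can start out very simple: regress the outcome on the candidate drivers and see which of the gut-feel factors actually carry signal. Here is a hedged sketch in Python with statsmodels; the driver and column names are invented for illustration.

```python
import pandas as pd
import statsmodels.api as sm

# Weekly business metrics (hypothetical file and column names):
# candidate causal drivers plus the outcome we want to forecast.
df = pd.read_csv("weekly_metrics.csv")
candidate_drivers = ["foot_traffic", "promo_spend", "local_case_counts"]

# Ordinary least squares: revenue regressed on the candidate drivers.
X = sm.add_constant(df[candidate_drivers])
model = sm.OLS(df["revenue"], X).fit()
print(model.summary())  # coefficients and p-values show which hunches hold up

# Scenario exploration: tweak a surviving driver and see what the
# fitted model implies.
scenario = X.mean().to_frame().T  # start from average conditions
scenario["promo_spend"] *= 1.2   # e.g., 20% more promotional spend
print("Projected revenue:", float(model.predict(scenario)[0]))
```

Once a driver survives this kind of scrutiny, scenario exploration is just a matter of feeding tweaked inputs back into the fitted model, as the last lines show.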
In another example, a global financial consolidation client assumed that the different accounting (ERP) systems used by subsidiaries across the globe (including Oracle Financials and JD Edwards) had consistent definitions of their charts of accounts. It turned out, after some iterative data exploration, that there were significant inconsistencies, and that there was a need to: (1) agree on a single, consistent chart of accounts across all subsidiaries; and (2) map each subsidiary's existing local accounts onto that common standard.
Note that neither (1) nor (2) in this example is a problem that can be solved by technology alone, but you can solve it more quickly if you have the right technology tools at your disposal. Conversely, denying the existence of the problem can be costly. This example also shows the interplay between data prep and human business processes.
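To illustrate what "the right technology tools" might look like in this situation, here is a small sketch in Python with pandas. All account codes, names, and amounts are invented; the idea is that a join against the agreed mapping table instantly surfaces the accounts that still lack a human mapping decision.

```python
import pandas as pd

# Agreed mapping from each subsidiary's local account codes to the
# common chart of accounts (hypothetical codes for illustration).
mapping = pd.DataFrame({
    "subsidiary":    ["US",   "US",   "DE",   "DE"],
    "local_account": ["4000", "4100", "8400", "8410"],
    "group_account": ["REV-PRODUCT", "REV-SERVICE", "REV-PRODUCT", "REV-SERVICE"],
})

# One subsidiary's trial balance extract (again, invented numbers).
trial_balance = pd.DataFrame({
    "subsidiary":    ["DE",   "DE",   "DE"],
    "local_account": ["8400", "8410", "8999"],
    "amount":        [120000, 45000, 3100],
})

# Apply the mapping; a left join keeps unmapped accounts visible.
merged = trial_balance.merge(mapping, on=["subsidiary", "local_account"], how="left")

# The technology makes the gap obvious; closing it is a business decision.
unmapped = merged[merged["group_account"].isna()]
if not unmapped.empty:
    print("Accounts still needing a human mapping decision:")
    print(unmapped[["subsidiary", "local_account", "amount"]])

# Consolidated view on the common chart of accounts.
consolidated = merged.dropna(subset=["group_account"]).groupby("group_account")["amount"].sum()
print(consolidated)
```

The join does the tedious work in milliseconds, but deciding where account 8999 belongs is still a conversation between people.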
In summary:
The sad truth about data problems is that most of the horror stories about cost overruns and delayed or failed projects could have been avoided by taking a realistic and common-sense approach to data preparation from the start.