(9 min read – Published on 4th August 2019)
Whether or not you’ve seen the 1967 movie The Graduate, you have surely been ecstatic about something new at least once in your life, only to realise shortly afterwards that there was much more to it, and that you were in a more complex situation than you thought. That’s exactly what happens to Ben and Elaine at the end of The Graduate: they run away from Elaine’s wedding and jump onto a municipal bus together, literally beaming, only to realise a few seconds later the consequences of their actions, leaving the viewer baffled as the movie ends on The Sound of Silence. On a similar note, the data analytics field is now at a crossroads after reaching its peak of inflated expectations in the past few years.
From “Data is the new oil” to “Data scientist is the sexiest job of the 21st century”, you must have seen countless over-enthusiastic headlines like these. But do you really believe that data analytics will solve all our problems? That you just need to hire a bunch of data scientists and embrace a data-driven strategy to succeed? Here are six inconvenient truths about data analytics that you should know to form your own opinion.
1. Data is not the new oil
“Data is the new oil! […] Data is just like crude. It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity; so must data be broken down, analysed for it to have value.” – Clive Humby, ANA Senior marketer’s summit, Kellogg School, 2006.
You must have heard “Data is the new oil” countless times in the last decade, but the analogy does not hold up beyond the comparison with the refinement process. Yes, data needs to be transformed to be usable, just like oil. But unlike oil, which is a finite fossil resource, data is practically unlimited. Unlike oil, data doesn’t have a global standard price. And finally, data doesn’t have any viable substitute, unless you want to run your business on gut instinct (hint: you don’t). Nevertheless, what’s certain is that data is at the core of its own industrial revolution, just as oil and electricity were before it. In the same way that we once had Chief Electricity Officers to “figure out what this electricity stuff was about”, Chief Data Officers are now all the rage. What exactly their role involves will likely remain unclear until we finally master this new resource that seems so hard to grasp, as we did with oil and electricity over the last century.
2. Only a happy few are doing it right
“Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…” – Dan Ariely, Professor of Behavioral Economics, 2013.
Despite being six years old, Dan Ariely’s comment couldn’t be more on point, since it’s very difficult to separate the wheat from the chaff when it comes to analytics. Lots of companies praise their own great analytical capabilities and products, but no one really knows what you would find if you took a closer look under the hood. In fact, Gartner released a striking prediction earlier this year: 80% of analytics insights will not deliver business outcomes through 2022. What’s more surprising is that the gap between understanding the importance of analytics and effectively applying it has never been wider, with a new Harvard Business Review survey finding that “86% of organizations find that the ability to extract new value and insights from existing data/analytics applications is very important, but with only 30% being very effective at doing so”. The moral: if you want to implement a successful analytics strategy, focus on what you’re doing, step by step, without paying too much attention to what others say, because chances are they are not as successful as they claim to be.
3. Hiring data scientists is not the solution
“The worst mistake a company can make is to hire a cadre of smart data scientists, provide them with access to the data, and turn them loose, expecting them to come up with something brilliant.” – Are You Setting Your Data Scientists Up to Fail?, HBR, 2018.
A common misconception is that hiring data scientists will solve all your problems and ensure your company a bright future. But what matters more than having data scientists is having analytics translators within your company.
“Success with analytics requires not just data scientists but entire cross-functional, agile teams that include data engineers, data architects, data-visualization experts, and — perhaps most important — translators.” – You Don’t Have to Be a Data Scientist to Fill in this Must-Have Data Science role, HBR, 2018.
Translators are people who understand analytics without having the deep technical expertise in programming or modeling that data scientists have. They are very close to the core business, and they understand the impact of analytics and can translate it for key decision-makers in the company.
In short, translators are involved in almost every step of the analytics value chain (3 out of 4): identifying and prioritizing the business problems that can be solved with analytics (1), formulating concrete data questions to model these exact business problems (2), and finally translating the data answers into actionable insights and recommendations to formulate a business solution (4).
4. Chasing unicorns is not worth it
If hiring data scientists isn’t enough, why not look for unicorns? These are the rare individuals who excel in all the areas that data science encompasses: people who are analytics translators as well as data scientists, data engineers, and even data-visualization experts. Unfortunately, unicorns don’t exist, and even if they did, they would not be worth chasing, because the best strategy is to “build data science teams with complementary talents”.
Data science skills are very broad, ranging from the hard skills of programming, modeling and statistics to the soft skills of problem solving and communication. It’s nearly impossible to find someone who masters all of these, and even if you do, one person won’t be enough to scale analytics within your company. You can achieve greater results by building a complementary team of curious people who in aggregate master all these skills. Don’t waste your time and money chasing that unicorn.
5. Being data-driven is not always the best option
The difference between being data-driven and being data-informed might be subtle, but it can help you avoid catastrophes. With the former, data is at the center of the decision-making process, while with the latter, data is treated as one piece of information among others. Being data-driven can be a losing strategy in a few cases: when the data quality is questionable, when your data is not representative of what you’re trying to forecast, or simply because of human error, which might be the most dangerous case of all.
One key example is the story of Charles Reep, the father of soccer analytics, who made “one big mistake that changed the course of English soccer for the worse”. Reep was one of the first people to collect data on soccer matches in England back in the 1950s, and his analysis led him to the accurate observation that “most goals in soccer come off of plays that were preceded by three passes or fewer”. Reep’s recommendation was therefore simple: the best way to win a football game was to play long balls, following his “no more than 3 passes” strategy.
“If a team tries to play football and keeps it down to not more than three passes, it will have a much higher chance of winning matches. Passing for the sake of passing can be disastrous.” – Charles Reep interview with the BBC, 1993.
The only problem was that Charles Reep’s strategy, even though it was based on some sound math, was wrong. What Reep failed to take into account is that in soccer, most plays are short possessions with a small number of passes, whether they end in a goal or not. If you compare the plays that do end in a goal with those that don’t, you realise that in fact “a team’s probability of scoring goes up as it strings together more successful passes”, making Reep’s strategy counter-productive. In this case, Reep’s data-driven approach ended up backfiring. What would you think if it were your company’s strategy at risk?
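To see how both statements can be true at once, here is a minimal Python sketch. The counts are invented purely for illustration (they are not Reep’s data), and the names `possessions`, `share_of_goals` and `scoring_rate` are hypothetical. It compares the share of all goals that come from short plays with the chance that a given play ends in a goal.

```python
# Hypothetical counts, invented for illustration only (not Reep's data).
# Each entry: type of possession -> (number of possessions, goals scored)
possessions = {
    "0-3 passes": (9000, 90),
    "4+ passes": (1000, 20),
}


def share_of_goals(data):
    """Fraction of all goals that came from each type of possession."""
    total_goals = sum(goals for _, goals in data.values())
    return {kind: goals / total_goals for kind, (_, goals) in data.items()}


def scoring_rate(data):
    """Probability that one possession of each type ends in a goal."""
    return {kind: goals / count for kind, (count, goals) in data.items()}


for kind in possessions:
    print(
        kind,
        f"share of goals: {share_of_goals(possessions)[kind]:.0%},",
        f"scoring rate: {scoring_rate(possessions)[kind]:.1%}",
    )
```

With these made-up numbers, short plays account for most of the goals simply because there are so many of them, while each longer play is twice as likely to end in a goal. Looking only at the first figure is the kind of mistake that can mislead a purely data-driven decision.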
6. There are lots of analytics charlatans
“It is the mark of a charlatan to explain a simple concept in a complex way.” – Naval Ravikant on Twitter, 2016.
If you haven’t already, you will encounter many charlatans trying to sell you their next-level machine learning, artificial intelligence or big data products. More often than not, they will explain how complex and amazing their product is to convince you to buy it. But the reality is that data science and its related terms can be explained in plain language. Here is my attempt.
Algorithms
- Definition: a set of instructions to complete a predefined task
- Examples: cooking recipes, or following your GPS to go from point A to point B
- How it is used: algorithms are used to tell computers what to do; every step and rule has to be explicit and coded within the system
Machine Learning
- Definition: method of teaching computers to identify data patterns, and build models to explain and/or predict things
- Examples: Netflix content recommendation engine, Google search rankings, or Facebook “people you might know”
- How it is used: it is mainly used in three different ways:
  - ‘Supervised learning’: when the programmer gives a specific objective to the computer. For instance, analyzing your browsing cookies and serving you the most relevant ad on Facebook.
  - ‘Unsupervised learning’: when the programmer doesn’t know where to look or what to find, so the computer looks for patterns within the data. For instance, Airbnb grouping its housing offers into groups of similar properties so that users can navigate listings more easily.
  - ‘Reinforcement learning’: when the programmer teaches a computer how to learn from data based on a reward-punishment methodology. It’s like teaching your dog how to sit: you give him a treat each time he completes the task correctly, repeating the process until he fully understands.
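To make the first two flavours concrete, here is a minimal Python sketch on toy data. Everything in it is an assumption made for illustration: the listings, the labels, and the helper names `predict` and `two_groups` are invented, and real systems use far richer models. The supervised part is given labelled examples and copies the label of the closest one; the unsupervised part gets no labels and simply splits the points into two groups. Reinforcement learning is left out to keep the sketch short.

```python
# Toy data, invented for illustration: ((size in m2, price per night), label)
labelled = [
    ((20, 40), "studio"),
    ((25, 50), "studio"),
    ((90, 200), "villa"),
    ((110, 260), "villa"),
]


def distance(a, b):
    """Straight-line distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def predict(point, examples):
    """Supervised: copy the label of the closest labelled example."""
    _, label = min(examples, key=lambda example: distance(point, example[0]))
    return label


def two_groups(points, rounds=10):
    """Unsupervised: split unlabelled points into two groups (bare-bones 2-means)."""
    centres = [min(points), max(points)]
    groups = [[], []]
    for _ in range(rounds):
        groups = [[], []]
        for p in points:
            nearest = min((0, 1), key=lambda i: distance(p, centres[i]))
            groups[nearest].append(p)
        # Move each centre to the average of its group (keep it if the group is empty).
        centres = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centres[i]
            for i, g in enumerate(groups)
        ]
    return groups


print(predict((30, 55), labelled))
print(two_groups([point for point, _ in labelled]))
```

The difference to notice is what each function is given: `predict` needs examples that someone has already labelled, while `two_groups` only sees the raw points and has to find the structure on its own.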
Artificial Intelligence
- Definition: teaching computers how to mimic human cognitive functions by using machine learning algorithms
- Examples: Siri speech recognition, Gmail spam filtering, language translation, playing games such as chess, Go, Dota 2, or more recently multiplayer poker.
Big Data
- Definition: range of new and massive datasets
- Examples: any dataset which fulfills each of the “3V” criteria:
  - ‘Volume’: needs to be big, usually over 1 terabyte of data
  - ‘Velocity’: speed at which the data is generated and processed, usually real-time
  - ‘Variety’: type and nature of data, usually a mix of text, images, audio and video
Analytics
- Definition: science of analyzing data
- Examples: analytics covers Artificial Intelligence, Machine Learning, and Algorithms altogether, but also includes any type of analysis that you might do in your Excel spreadsheets
- How it is used: analytics is used in three distinct ways:
  - ‘Descriptive’: provide insights into the past and answer the question “what has happened?”
  - ‘Predictive’: forecast the future and answer the question “what could happen?”
  - ‘Prescriptive’: advise on possible outcomes and answer the question “what should we do?”
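The three uses can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the monthly sales figures are invented, the straight-line trend is a deliberately naive forecasting choice, and the function names `describe`, `forecast_next` and `recommend_order` are hypothetical.

```python
# Hypothetical units sold per month, invented for illustration.
sales = [100, 110, 120, 130]


def describe(history):
    """Descriptive: what has happened? Here, the average monthly sales."""
    return sum(history) / len(history)


def forecast_next(history):
    """Predictive: what could happen? Extend a least-squares straight line by one month."""
    n = len(history)
    months = range(n)
    mean_x = sum(months) / n
    mean_y = sum(history) / n
    slope = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(months, history)
    ) / sum((x - mean_x) ** 2 for x in months)
    intercept = mean_y - slope * mean_x
    return intercept + slope * n


def recommend_order(history, stock):
    """Prescriptive: what should we do? Order enough units to cover the forecast."""
    return max(0, round(forecast_next(history)) - stock)


print("Average so far:", describe(sales))
print("Forecast for next month:", forecast_next(sales))
print("Units to order with 100 in stock:", recommend_order(sales, stock=100))
```

Each step builds on the previous one: the description summarises the past, the forecast extrapolates it, and the recommendation turns the forecast into an action.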
Data Science
- Definition: art of extracting value out of data
- Examples: encompasses all the terms defined above, from Analytics to Big Data, Artificial Intelligence, Machine Learning, and Algorithms.
As Ben and Elaine realized at the end of The Graduate, there is no reason to believe in the fairy tale anymore. But for you, as for them, this doesn’t mean that you cannot make it work. With the right translators, the right data analytics team, and especially the right approach, there is every reason to believe that you can turn your analytics narrative into a successful outcome. Better late than never!
Author’s note: This is a personal article. Any views or opinions represented in this article are personal and do not represent those of people, institutions or organizations that the author may or may not be associated with in a professional or personal capacity, unless explicitly stated.