You don’t need ‘perfect’ data for analytics
Deloitte Consulting Chief Data Scientist James Guszcza debunks several persistent misconceptions about analytics and suggests simple ways to encourage trust in predictive models.
As chief data scientist for Deloitte Consulting LLP, James Guszcza helps clients use analytics to address their most vexing problems, applying advanced statistical learning methods as well as principles from behavioral economics. Below, Guszcza addresses several misconceptions about analytics, including the myth of “perfect” data and growing concerns about technology-induced job loss—issues that may be top-of-mind for CMOs and their C-suite colleagues as they implement data-driven initiatives.
What misconceptions do executives have about analytics?
Many organizations think they need “perfect” data to proceed with analytics. In fact, many of our clients tell us at the outset of an engagement, “We’re not like your other clients. Our data is terrible. You’ll meet your match here.” We’ve yet to meet our match, and people are often surprised by how much you can do with imperfect data.
What do you mean by ‘imperfect’ data? When it comes to data quality, what’s acceptable or unacceptable?
There’s no absolute standard for what data is sufficient versus insufficient. When I talk about “imperfect” data, I mean that people’s notions about the type of data required to produce value from analytics are often far too restrictive. They’re frequently surprised to find that even a limited data set can help make valuable classifications or predictions.
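The point above can be made concrete with a small sketch (not from the interview; all data and parameters are hypothetical): a classifier trained on a modest, deliberately messy data set, with roughly 20 percent of the entries missing and imputed, can still predict well above the chance baseline.

```python
# Illustrative sketch: "imperfect" data can still support useful predictions.
# Hypothetical synthetic data; assumes scikit-learn is installed.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n = 200  # a deliberately small sample

# Generate three noisy features and a binary outcome driven by two of them.
X = rng.normal(size=(n, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

# Knock out ~20% of the entries at random to simulate dirty, incomplete data.
X[rng.random(X.shape) < 0.2] = np.nan

# Fill the gaps with column means, then fit a plain logistic regression.
model = make_pipeline(SimpleImputer(strategy="mean"), LogisticRegression())
scores = cross_val_score(model, X, y, cv=5)  # default scoring: accuracy
print(round(scores.mean(), 2))  # comfortably above the 0.5 chance baseline
```

The specifics (the imputation strategy, the model, the missingness rate) are arbitrary; the sketch simply shows that imperfect inputs need not block a valuable model.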
What other misconceptions do executives have?
I frequently observe executives conflating analytics with a specific technology. To do analytics, they think they need big data, a specific software package, or a machine learning algorithm. So they begin their analytics journey by procuring and implementing technology, which can lead to very disappointing results. It’s rarely the case that you can implement a piece of software and be done. Moreover, different problems and use cases require very different types of data, tools, and methods. A more effective approach involves identifying up front the decisions or business processes an organization wants to improve, then asking: What kind of model or data product can we build to improve those decisions? What data do we need? How do we convert the raw data into actionable insights or predictive model indications? And how do we get people to appropriately act on those insights?
Within an organization, is there one function that should be responsible for analytics?
Because analytics has a heavy-duty data and technology component, some CIOs and CTOs think the analytics function should sit in IT. While IT has a critical role to play in terms of supplying data and turning a predictive model into a functioning piece of software, analytics is first and foremost a strategic capability, and one that frequently cuts across multiple functions in an organization. It’s a multidisciplinary process that requires a blend of domain knowledge; versatility with math and statistics; fluency with data science methodologies; and experience with tools and techniques for cleaning, processing, and programming with large data sets. Technology is necessary but not sufficient to do analytics.
Predictive models are powerful decision-making aids, and yet, the subject matter specialists for whom these models are developed often reject them. Why is that?
In many cases, it’s because those subject matter specialists weren’t involved in designing, building, or implementing the model. As a result, they don’t understand how it was made or how it works, and they don’t trust it. I think their skepticism is often understandable. Skepticism, after all, is the essence of scientific thinking. In my experience, data science works best when domain experts, data scientists, and change management experts achieve a common language and collaborate throughout the project.
Do people sometimes resist employing the recommendations generated by predictive models because they fear the model will eventually replace them?
A lot of people are worried about computers replacing humans in increasing numbers of jobs, and certainly, there are cases where predictive models have replaced people—loan officers in banking, for instance. But there are many cases where models make humans more effective. Think of predictive models as eyeglasses for the mind. They’re an aid to making better decisions in the same way that glasses are an aid to seeing more clearly.
I’ve noticed that working with models, in an ironic way, can make us more human. Letting the model do what computers are good at—processing hundreds of pieces of information, consistently, in a computationally efficient way, at any time of day without getting tired—frees up humans to tap into their creativity and empathy, and to spend more time understanding context and nuance. Steve Jobs once said that computers are like bicycles for the mind. I have a similar notion about data science. As data scientists, our goal shouldn’t be to build models that replace humans, but rather to build data products that expand our frames of reference while making us smarter, more creative, more empathetic, and more effective in our work.