Big Data’s not-so-hidden cost
We recently spent several days in Chicago at the Technology Business Management Conference, an event sponsored by Apptio that draws close to a thousand C-level executives focused on the business aspects of IT management. Visibility was the key theme, and Blazent demonstrated advanced analytic solutions in the areas of Governance, Audit and Compliance, Cost Reduction, Risk Management, and Agility and Transformational capabilities. The operational context for this was the rise of inaccurate (or dirty) data and its effect on the IT ecosystem.
The media context for this has been a steadily increasing amount of press coverage on the rise of the Internet of Things and its associated effect on Big Data. IP addresses can now be attached to nearly anything, and the net result is that we are moving from a world where information was created by humans (e.g. Facebook) to one where information is being created by machines (e.g. sensor networks). The scope of this shift in information technology is breathtaking, as we are rapidly transitioning from a reference framework of billions to one of trillions.
Every little data nit that comes from every single connected device is traversing a network with multiple hops, leaving a digital paper trail that needs to be tracked, correlated, contextualized and acted on. The tricky part is that this data morass is a combination of both human and machine data. Making mistakes is one of the core elements of being human; not making mistakes is a core element of being a machine, unless a human was involved in some capacity, in which case errors (or dirty data, the current popular industry term) are endemic to the system.
The implications of this are huge:
- Dirty data is estimated to cost the US economy over 3 trillion (there’s that word again) dollars per year[1].
- At a company level, the cost of dirty data averages out to around $13 million per year.
- Dirty data costs the health care industry north of $300 billion per year.
- Experian estimates that businesses lose 12% of their revenues due to dirty data[2].
The list goes on and on, and it all revolves around deliverables that are normally associated with IT Services. This is not just an inconvenience; it is a genuine stranglehold on corporations globally, the people who work for them, and the customers they serve. It affects any company’s financial performance, customer satisfaction levels, and internal performance metrics. When this doesn’t work, everyone suffers; when it does work, everyone wins.