1880 would seem an unusual place to start the big data revolution, yet here we find ourselves for the birth of what seems like a very modern problem.
The year is 1880; the place, the USA. Following the controversy of the 1870 census, the US Census Bureau takes its tenth census on June 1. The data is simple: a person's name, their occupation and some basic health and geographical information (other parts of the census asked about social, agricultural and manufacturing data). The results would turn out to be far from simple.
The 1880 census recorded data from a growing population of approximately 50 million people. With at least 4 pieces of data per person, this resulted in more than 200 million pieces of data. This took the US Census Bureau 8 years to tabulate, and predictions for the 1890 census were that it would take more than 13 years using the methods of the time.
The US Census Bureau and US Government had found themselves with what seems like a very 21st century problem. As with data collection today, the Bureau had to find a new way to tabulate its data (the result being Herman Hollerith's tabulating machine). Unlike today, however, there were no existing tools to turn to: the census data was difficult to collate and difficult to analyze, and the field of automatic data processing had to be invented in preparation for the 1890 census.
Today’s companies have access not only to growing volumes of data on their customers and their business, but also to a plethora of tools that analyze and visualize this data. But what does this mean for companies, and is it really helping?
Enter data visualization
As big data is talked about more and more, so too is data visualization. Data visualization is positioned as a panacea for big data, the solution to an ever-growing and ever more complex store of data. But is it delivering on this promise?
The purpose of data visualization is to explain and to entertain, but the ultimate goal will always be to promote action.
John Snow’s original visualization of the Broad Street cholera outbreak offered a new and innovative way to view simple data (image below). Plotting the cases on a map gave readers (including Snow himself) the ability to consider the effect geography had on the outbreak, a previously unconsidered element. This visualization ultimately led to an understanding of the cause of the outbreak and to the action that remedied it (disabling the Broad Street water pump).
Some four years later, Florence Nightingale went on to create her ‘Diagram of the causes of mortality in the army in the East’ (image below). This diagram shows no more information than could be displayed in a simple table, but its value comes from its ability to engage the reader first, and then educate them on the causes of mortality and how these changed over time.
While both of these examples are now over 150 years old, they show that - no matter how new or old the technology - the purpose of data visualization remains unchanged. Its purpose, as it has always been, is to promote action to help an individual or a company.
As the field of data visualization grows, the method of evaluating visualizations should remain unchanged. Companies should judge data visualizations not on how they look or how impressive they appear, but on three core questions: how does this help me, how does this help my company and how does this help my customers?
If the answer to any of these questions is not clear, you could be buying an expensive case of style over substance.
The US census was by no means the first, with some of the earliest records indicating the Babylonians undertook censuses some 6000 years ago. What made the 1880 census unique was that the eight years it took to tabulate the data meant that the results were obsolete by the time they were available. The resulting fallout led to the development of tabulating machines that were able to process the 1890 census in under a year.
Through a series of mergers, Herman Hollerith’s Tabulating Machine Company would go on to become part of what is now the International Business Machines Corporation (IBM).