[Crossposted at SpreadingScience]
This post originally appeared on Cumulus Partners. It’s republished with permission.
Quentin Hardy’s recent post in the Bits blog of The New York Times touched on the gap between representation and reality that is a core element of practically every human enterprise. His post is titled “Why Big Data is Not Truth,” and I recommend it for anyone who feels like joining the phony argument over whether “big data” represents reality better than traditional data.
In a nutshell, this “us” versus “them” approach is like trying to pick a fight between oil painters and watercolorists. Neither oil painting nor watercolor is “truth”; both are forms of representation. And here’s the important part: representation is exactly that, a representation or interpretation of someone’s perceived reality. Pitting “big data” against traditional data is like asking whether Rembrandt is more “real” than Gainsborough. Both were artists, and both painted representations of the world they perceived around them.
Data by itself has no meaning. It does not matter if it is big or traditional. Data simply exists.
It requires interaction with human beings to be transformed into information: humans to provide context, humans to provide understanding. And it requires interactions between human beings to transform data into information and, beyond that, into knowledge.
As I wrote, “Information that is held by an individual, which is never revealed or acted upon, has no value. The greatest medical discovery in the world does little good if it dies with the discoverer.”
All big data does is allow humans to examine data that is too large, too complex or too difficult to examine by traditional means.
But the problems with any data – confirmation bias, cherry-picking, etc. – do not simply go away because the data is big. It still requires humans to transform this data into meaningful knowledge.
And that transformation still functions best when communication between people is open and transparent.