As computers, databases and networks come to intermediate almost every aspect of modern human life, the volume of electronic breadcrumbs generated from our daily activities and transactions has exploded. Academic researchers, marketers, governments, and social scientists are all clamoring to record, save and crunch all that information to gain new insights in order to move society, as well as their own interests, forward.
The big questions: How should they responsibly handle it all, and how can they intelligently interpret what they find?
These questions are so profound that they will take decades to answer, if they can be answered at all. Some people are calling this next stage of the information revolution the “industrial revolution of data.” The Economist notes, “The effect is being felt everywhere, from business to science, from government to the arts.”
Indeed, two influential researchers, Danah Boyd of Microsoft Research and Kate Crawford of the University of New South Wales in Australia, predict that “Big Data,” as the phenomenon is being called, will realign our fundamental assumptions and operations in life, just as Henry Ford’s revolutionary automation of manufacturing “produced a new understanding of labor, the human relationship to work, and society at large.”
“Big Data” and its implications for society are a major theme among the technorati these days. This week, for example, the Web 2.0 Summit in San Francisco focused on this very issue. When introducing the conference on Monday, co-chair and publisher Tim O’Reilly used a vivid example to illustrate the point: Google’s self-driving car can do what it does only because the equipment it is decked out with gathers and processes huge amounts of information to navigate.
“Google’s self-driving car is tethered to a database,” he said. “It’s a Google Maps driver driving the car.”
Another illustration of how huge sets of data can be used to change something fundamental in everyone’s life comes from Microsoft’s Chief Strategy and Research Officer Craig Mundie and Google Chairman Eric Schmidt, who sit on a presidential task force to reform American health care.
“Early on in this process, Eric and I both said: ‘Look, if you really want to transform health care, you basically build a sort of health-care economy around the data that relate to people,’” Mundie told The Economist.
Policymakers should crunch all the information that emerges from providing health services and use it to figure out how to improve “every aspect of healthcare,” he said.
But it’s easier said than done. Policymakers, social scientists and data crunchers need to question their own assumptions when approaching such large sets of information, argue Boyd and Crawford in a recent paper on the “Big Data” phenomenon. Failing to do so could lead to deeply flawed “insights” and to potential (if inadvertent) violations of privacy.
In the case of flawed insights, the two researchers point to the relatively recent rash of social science research that analyzes tweets to extrapolate what society at large is experiencing.
They point out the obvious flaw that not everyone uses Twitter, and in addition, many account holders (40 percent according to Twitter) don’t ever say anything on the micro-blogging service: They’re there just to listen.
These are just a couple of small examples of the problems that researchers reliant on big data are likely to encounter. The duo highlight many more in their paper, which is well worth the read.
“The era of Big Data has only just begun, but it is already important that we start questioning the assumptions, values, and biases of this new wave of research,” they write. “As scholars who are invested in the production of knowledge, such interrogations are an essential component of what we do.”