Obama Administration Bets Big On ‘Big Data’ Initiative

IBM's Blue Gene supercomputer, developed with the US Department of Energy's Lawrence Livermore National Laboratory in California.

The Obama Administration is betting big — some $200 million a year — on a trend that’s been the talk of the tech industry for the past decade or so: “big data,” a name given to the increasingly large and unwieldy sets of data about all sorts of systems — from Earth’s climate to consumer behavior to traffic patterns — that computers are now capable of collecting, storing and analyzing.

Putting these large sets of data to their most effective uses has been a challenge for engineers, companies and policymakers alike, but the White House is counting on six federal agencies to get a handle on the situation, even comparing the newly announced initiative to the federal funding that birthed the prototype of the internet.

“In the same way that past federal investments in information-technology R&D led to dramatic advances in supercomputing and the creation of the internet, the initiative we are launching today promises to transform our ability to use Big Data for scientific discovery, environmental and biomedical research, education, and national security,” said Dr. John P. Holdren, Assistant to the President and Director of the White House Office of Science and Technology Policy, in a statement on Thursday announcing the new effort.

The $200 million in annual federal funding comes from the 2013 budgets of the six agencies participating in the program, each of which is using the money to promote radically different projects. Those six agencies are: the Defense Advanced Research Projects Agency (DARPA) (which birthed the internet’s prototype, ARPANET), the Defense Department, the National Institutes of Health, the National Science Foundation, the Department of Energy, and the U.S. Geological Survey.

More information on the initiative will be outlined in a press conference at the American Association for the Advancement of Science in Washington, D.C. at 2 pm ET, which web users can tune in to via live webcast. But until then, here are the initial details on each agency’s distinct applications.

DARPA

DARPA, for one, will be using $25 million over the next four years to develop a program called XDATA, which aims to build “human-computer interaction tools for facilitating rapidly customizable visual reasoning for diverse missions” — in other words, an augmented reality system. That sure sounds similar to the proposed Google Glasses to us.

The Defense Department

Meanwhile, the Defense Department writ large will spend $60 million annually from the initiative and an additional $190 million annually on “truly autonomous systems that can maneuver and make decisions on their own,” e.g. smart drones and driverless vehicles, and on “a 100-fold increase in the ability of analysts to extract information from texts in any language, and a similar increase in the number of objects, activities, and events that an analyst can observe.”

The National Institutes of Health

The National Institutes of Health on Thursday announced that its own big data effort — “the world’s largest set of data on human genetic variation” — was already live and publicly accessible online. The data set contains the complete DNA of 1,700 study participants, and 900 more are scheduled to be added “as soon as possible.” Gleaned from public and private institution surveys, the set is designed to “identify genetic variation occurring in less than 1 percent of the study populations,” variation which NIH says may “make important genetic contributions to common diseases, such as cancer or diabetes.” The data is stored as an XML file on Amazon’s cloud computing servers, and NIH invites “other cloud computing providers who are interested in hosting the data.” Other hosts will have to be ready to handle the load, though: The set is 200 terabytes in size, which the NIH says is equal to 30,000 DVDs.

“The explosion of biomedical data has already significantly advanced our understanding of health and disease. Now we want to find new and better ways to make the most of these data to speed discovery, innovation and improvements in the nation’s health and economy,” said NIH Director Francis S. Collins, M.D. in a statement Thursday.

NIH is also working with the National Science Foundation on another research solicitation in imaging and other data sets on disease.

The National Science Foundation

The NSF is concentrating on funding universities to train the “next generation” of scientists and engineers to be able to handle big data. One $10 million NSF project at the University of California, Berkeley aims to further “machine learning, cloud computing, and crowd sourcing.” Another round of grants (amount unspecified) will go to “EarthCube – a system that will allow geoscientists to access, analyze and share information about our planet.”

“A wealth of information may be found within these sets, with enormous potential to shed light on some of the toughest and most pressing challenges facing the nation,” added Lisa-Joy Zgorski, a spokesperson for the National Science Foundation, in an email to TPM. “To capitalize on this unprecedented opportunity–to extract insights, discover new patterns and make new connections across disciplines–we need better tools to access, store, search, visualize, and analyze these data.”

The Department of Energy

The Energy Department is using $25 million to create an institute “to develop new tools to help scientists manage and visualize data on the Department’s supercomputers.” The project, called the Scalable Data Management, Analysis and Visualization (SDAV) Institute, will involve six national laboratories and seven universities.

The U.S. Geological Survey

The USGS is doling out grants for projects that will focus on “species response to climate change, earthquake recurrence rates, and the next generation of ecological indicators.”

Aside from these agency initiatives, the Obama Administration also called upon the private sector and non-governmental organizations to provide support and funding for the big data push.

“We also want to challenge industry, research universities, and non-profits to join with the Administration to make the most of the opportunities created by Big Data,” wrote Tom Kalil, deputy director at the Office of Science and Technology Policy, in a blog post. “Clearly, the government can’t do this on its own. We need what the President calls an ‘all hands on deck’ effort.”
