11 November 2008

Tony Hey (MSR) on e-science and OA - Berlin 6

Tony Hey of Microsoft Research, formerly of the UK e-Science initiative and former Dean of Engineering at Southampton, kicks off the discussion with the first keynote.

Researchers are facing a move from data to information to knowledge, at a very rapid pace. I think it's safe to say that the digital deluge in research is a largely uncontested reality, and there is an ever more important need for ways to collect, analyze, process, visualize and archive this mass of digital information.

Hey speaks about the fourth paradigm, one of data-centric science. He lists the paradigms / trends of scientific research as the following:

- Thousands of years ago - experimental science
- Last few hundred years - theoretical (e.g., Newton's Laws)
- Last few decades - computational (e.g., simulation of complex phenomena)
- Today - data-centric science (or as described at the SEED/WEF brainstorming session - a bioinformatics revolution, correlation-based science)

We're experiencing a new kind of publishing - where data is often published before the actual paper. This is a different paradigm for publishing.

Hey tosses out the example of the Sloan Digital Sky Survey, the first major astronomical survey project, with five-color images of 1/4 of the sky and pictures of 300 million celestial objects. The SkyServer, Hey says, is the poster child for 21st century data publishing, an example of what you can do if you make the data available, and also an example of citizen science (Galaxy Zoo). Galaxy Zoo is a project that asks the community to participate and help classify galaxies.

In speaking about his time as the Dean of Engineering at Southampton, Hey says that the traditional model of scholarly publishing is in crisis. Journal subscription prices are rising faster than library budgets, while web technology and digital media now make dissemination of knowledge "easy" and "free" without traditional paper journals.

His e-science work continues at MSR, colored also by his role on the NSF advisory committee on cyberinfrastructure (ACCI).

"In order to help catalyze and facilitate the growth of advanced CI, a critical component is the adoption of Open Access policy for data, public [literature] and software" - NSF ACCI

"Make sure we walk the talk." - Hey

Cloud computing is, to Hey, where we're headed.

We're moving towards a world where all data is linked ... and where data is stored/processed/analyzed in the cloud. Data is interconnected through machine-interpretable information, with social networks as a special case of these data networks, or "data meshes".

Cloud computing will affect all sorts of research, and OA as well: copies of articles can be entrusted to the cloud, with IRs perhaps still keeping local copies. Hey believes the OA movement will succeed, presenting a new rapprochement with publishers.

