IEEE eScience meeting at Oxford: 4th Paradigm

The first paradigm is experimental science. The second paradigm is theoretical science, and the third, computational science. The fourth is data-intensive science. This data-intensive science paradigm is also a feature of the emerging datafullness of the object of study. Satellites and sensorwebs, CCTVs and Streetviews, MRIs and CAT scans, Facebook and YouTube: what we study is no longer data-poor, but increasingly data-full. The question is no longer how to scrape up enough data to create a study, but rather how to winnow the emerging data deluge. Sociologists can no more ignore the data available from online social networks than meteorologists can ignore an emerging Mid-Atlantic tropical depression.

In his talk at the IEEE eScience meeting, Jeff Dozier also mentioned that the earth sciences are entering a new task horizon. In the 1800s and 1900s, the earth sciences were discipline-oriented sciences. From the 1980s on, we saw the development of earth system science. Emerging now: earth knowledge in service of policy to address planetary risks, such as climate change.

The eScience challenges are many here. The increase in observational data makes it possible to refine the resolution of climate models, which pushes the limits of available HPC resources. The data processing algorithms designed for science must be made robust enough to sustain resource and environmental enforcement decisions. New venues for communication between scientists, data providers, and policy decision makers need to be supported and used. This is a real opportunity for organizations such as the ESIP Federation to become active forums for problem solving.

Photo Credit: NASA Earth Observatory

Microsoft Research’s 4th Paradigm ebook is available under a CC license here:

#3 All Hands e-Science meeting at Oxford University

Software as a Service and Software as a science: keynote by Tony Hoare (Microsoft scientist from Cambridge).

The e-Science effort in the UK was meant to ensure that digital information technologies would have as great an impact on the practice of science as they were having in telecommunications, entertainment, and other aspects of society.

In the Human Genome Project, the people who were funded did not promise to cure a single patient in the first 15 years. The notion was that the overall knowledge gain was so significant that future advances in medical knowledge would ensue. In the same way, the growth of digital tools in science will not necessarily pay off in the short term, but will build, over time, those new tools that will move science to a new level of capability.

The computer engineers that are engaged in e-science research are not just of service to “real scientists” but are also engaged in a real engineering science. And so Professor Hoare argues that the software products are not just a service to others but also the outcome of a science as “real” as chemistry or physics.

Having browsed the booths and the breakouts, I can say that the entire meeting, 600 people talking and listening for 5 days, rolls on three wheels: high performance computing (and pooled data storage), with the means to distribute this capability to scientists in multiple locations; science tools and services built on top of this data/computing network; and collaboration practices that promote and manage a range of sharing, from data sharing, to shared experiments, to the (open access) publication of results.

The engineering of the HPC infrastructure and the building of the services on top of these are not the real transformative levers of e-science. They mostly add efficiency and distribute resources more widely, so that science does not need to happen in a few concentrated locations (research labs at selected universities and corporate locations). This distribution of effort extends regionally, and eventually, globally. But this capability and the tools that allow its use replace similar tools that scientists at selected universities already use.

The promise of new collaboration practices is where e-science has the potential to transform science in ways that are both intended and unintended. Last evening, after dining at “high table” at Christ Church, I had a spirited conversation with a fellow on the phenomenon of Wikipedia. He was astonished by the amount of trust that users had in the quality of Wikipedia. I countered that the main value of Wikipedia was its ability to cover an amazing number of topics, far more than any previous encyclopedia. The real value of Wikipedia was its range, I proposed. This value was achieved the only way possible: by reinventing the role of the author/editor. Similarly, e-science will fulfill its promise only when it reinvents what it means to do science: who can do it, how it’s reviewed, where it’s published, and how it’s used. Very little of this promise will simply grow from improvements in HPC and tools. Much of it will emerge as new users and new collaborative opportunities arise.

Photo Credit: NASA Earth Observatory

#2 All Hands e-Science meeting at Oxford University

Tom Rodden is looking at the history of e-Science, moving from infrastructure to collaborative tools (e.g., MyExperiment). After all, the digital world is in the foreground of people's lives: there were 1.5 billion Internet users in 2010. The more that our lives are performed on digital platforms, the larger the footprint we leave. Google uses this footprint to target advertising. The next stop is a ubiquitous computing lifestyle. How, then, do we build a contextual footprint as a conscious activity? Computers will be able to sense human activities and use this sense to enable new forms of interaction.

Some gathered quotes:

“Half the world’s people have never made a phone call: 1990s.”

“Half the World will use a Mobile phone by 2010.”

“By year end 2012, physical sensors will create 20 percent of non-video internet traffic.” (Gartner group).

Mobile phone use becomes a means of credit rating in countries with little credit history. Tom looks at the technology of amusement parks, where research is creating “fear sensors” that help park rides maintain an optimal amount of terror for each customer. Digital location services will help people find and share transportation services in real time. When DARPA released 10 red balloons, the main challenge was to create the reward system to get enough people to work together.

Crowdsourcing: reCAPTCHA and the search for Steve Fossett are examples of crowds enlisted for a common good. These are just the beginning of public engagement in digital crowd activities. As we become ever more embedded in digital activities, we need to remember: “What matters is not technology itself, but its relationship to us.”

Mark Weiser and John Seely Brown (1996).

Rodden is wary of the imbalance of knowledge/power when digital services can collect an ever-widening swath of information about our human endeavors. How do we track this information flow? How do we resist?

Photo Credit: NASA Earth Observatory

#1 All Hands e-Science meeting at Oxford University

Anne Trefethen from Oxford is opening up the All Hands e-Science meeting. The 186 submissions for presentations show the growth of interest and activity in the UK for e-Science research and practice. The meeting is on the outskirts of Oxford, at the football (soccer) stadium conference center. Next door (across the parking lot) is a bowling alley and multiplex cinema. There is no building older than 50 years anywhere in the vicinity, so the location looks more like Oxnard than Oxford. The crowd is appropriately geeky in an academic fashion.

The opening keynote speaker (Helen Bailey) is a dancer, talking about e-Science in practice-led research. Where does e-Science lie in the larger field of technology? Is it simply science research informatics? Is it centrally HPC? Is it science 101 (hint… ASCII)? The “e” stands for “electronic,” an extension from e-mail and/or e-commerce; both of the latter refer to internet-enabled transactions. Much of the “e” in e-science involves the use of networks of computers to enable collaborations across locations. The research “transactions” flow beyond single laboratories/universities.

Helen Bailey uses e-Science to build co-located dance performances in which dancers from multiple locations appear in a single dance arena (using video feeds). This research focusses on the synchronous capabilities of an HPC network to support multiple video feeds in order to assemble a real-time event.

Helen’s website:

Photo Credit: