Presentation: First thoughts on the experiment: Paul Berkman
Conveners: Mark Parsons, Dave Clark, Liao Shunbao and Paul Berkman
In 1957-58, the International Geophysical Year launched an era of international and interdisciplinary research on the Earth system. Beyond producing the first comprehensive view of the Earth system, its significant achievements included the establishment of the World Data Center system and the launch of the first artificial satellites. Today, we are being overwhelmed by the sheer volume of digital information and by the diversity of strategies for managing it. The need to share diverse digital data across boundaries is becoming acute, particularly when only about 15% of digital information is considered structured and the remaining 85% is unstructured for the purposes of knowledge discovery. Just as the first satellites demonstrated the potential for planetary observing systems with diverse payloads, dynamically integrating data and discovering knowledge from disparate data centers would be a demonstration of capacity.
This workshop will begin planning an international experiment with data from at least two highly disparate data centers (one of which is a World Data Center). The experiment will be designed to: (a) dynamically, comprehensively and objectively integrate these data; (b) derive meaningful relationships from the data; and (c) generate knowledge to address a well-defined Earth system science problem. The problem will be related to the Polar Regions, in recognition of the International Polar Year that will be convened from March 2007 to March 2009. To successfully design this international and interdisciplinary data experiment, we will need input from data managers, software engineers, metadata experts, Earth scientists and other individuals involved with data preservation, access and analysis.
This workshop will be convened as a panel with active audience participation. The panel members will include individuals from the CODATA sessions on: Steps Towards a System of Systems; Best Practices; Virtual Observatories in the Geosciences; and Data Mining, Data Integration and Knowledge Discovery. This workshop is a product of discussions from the International Polar Year (http://www.ipy.org) and Electronic Geophysical Year (http://www.egy.org) programs. Paul Berkman will serve as the panel moderator.
To apply the experiment to WTF-CEOP metadata, we will need granule-level (rather than collection-level) descriptions. The opportunity with digital resources is to utilize their inherent structure and patterns to implement granule-level descriptions in an automated manner that will dynamically identify objective relationships within and between resources. This approach is termed "automated granularity" (see http://www.jstage.jst.go.jp/article/dsj/5/0/84/_pdf).
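For illustration only, a minimal Python sketch of this idea follows; the sample record, element names and the colon boundary rule are assumptions made for the example, not taken from the WTF-CEOP metadata itself.

# Minimal sketch of "automated granularity": a collection-level metadata
# record is split into granule-level (element, value) pairs using its
# inherent structure (here, line breaks and the colon as the assumed
# boundary rule). The sample record is hypothetical.
def extract_granules(record_text):
    """Split a metadata record into element -> value granules."""
    granules = {}
    for line in record_text.splitlines():
        if ":" in line:
            element, value = line.split(":", 1)
            granules[element.strip().lower()] = value.strip()
    return granules

sample_record = """\
Title: Arctic river discharge observations
Parameter: discharge
Temporal coverage: 2002-2006
Region: Lena basin"""

print(extract_granules(sample_record))
# {'title': 'Arctic river discharge observations', 'parameter': 'discharge',
#  'temporal coverage': '2002-2006', 'region': 'Lena basin'}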
Potential Phases of the Experiment
Phase 1: Adding Value to WTF-CEOP Metadata
- Demonstrate that metadata can be repurposed in an interoperable manner
- Use repurposed metadata to identify relationships between datasets
- Link repurposed metadata to actual datasets in relational contexts
- Enhance granularity of datasets directly to interpret relationships within and between datasets
- Elaborate on the iterative process of adjusting the granularity for additional interpretations (see the sketch after this list)
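A minimal Python sketch of this iterative step is given below, assuming granules have already been extracted as element-to-value mappings; the records and element names are hypothetical. A coarse pass relates two records through shared element names, and a finer pass splits values into individual terms to expose additional relationships.

# Sketch of iterative granularity adjustment between two repurposed
# metadata records (hypothetical content).
def shared_elements(a, b):
    """Coarse relationship: metadata elements present in both records."""
    return sorted(set(a) & set(b))

def shared_terms(a, b):
    """Finer relationship: individual terms shared after splitting values."""
    def terms(granules):
        return {t.strip().lower() for v in granules.values() for t in v.split(",")}
    return sorted(terms(a) & terms(b))

record_a = {"parameter": "precipitation, runoff", "region": "Ob basin"}
record_b = {"parameter": "runoff, soil moisture", "region": "Lena basin"}

print(shared_elements(record_a, record_b))  # ['parameter', 'region']
print(shared_terms(record_a, record_b))     # ['runoff']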
Objective: Demonstrate the interoperability and added value of metadata that has been repurposed with automated granularity.
Rationale: Metadata is ubiquitous, contains subjective descriptions of content and context, requires significant effort that does not scale, and is designed to facilitate access rather than discovery of relationships.
Experimental Design: The metadata would refer to datasets, reports, policy documents or other information resources associated with the hydrological cycle. The metadata and associated digital objects would be selected according to a specific experimental framework based on the GEO-GEOSS societal benefit areas.
Experimental Methods: Utilize general structural features of metadata (e.g., the colon ":" as a boundary condition or rule set) as well as common elements (e.g., those defined by ISO standards) to automate the granularity and provide a framework for dynamically relating elements within and between metadata records.
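As a sketch of how these methods might be combined, the following Python example applies the colon rule together with a small, illustrative cross-walk of element names (stand-ins for common ISO-style elements, not an actual standard mapping) to relate hypothetical records from two disparate data centers.

# Sketch: colon rule for granularity, plus a hypothetical cross-walk to
# common element names, used to relate records dynamically.
COMMON_ELEMENTS = {
    "title": "title", "dataset name": "title",
    "region": "extent", "spatial coverage": "extent",
    "parameter": "keyword", "variable": "keyword",
}

def normalize(record_text):
    """Apply the colon rule, then map element names to common elements."""
    normalized = {}
    for line in record_text.splitlines():
        if ":" not in line:
            continue
        element, value = line.split(":", 1)
        common = COMMON_ELEMENTS.get(element.strip().lower())
        if common:
            normalized.setdefault(common, set()).add(value.strip().lower())
    return normalized

def relate(rec1, rec2):
    """Return the common elements whose values overlap between two records."""
    return {e: rec1[e] & rec2[e] for e in rec1.keys() & rec2.keys()
            if rec1[e] & rec2[e]}

wdc_record = "Dataset name: Sea ice extent\nVariable: sea ice\nSpatial coverage: arctic"
ceop_record = "Title: Reference site soil moisture\nParameter: soil moisture\nRegion: arctic"

print(relate(normalize(wdc_record), normalize(ceop_record)))  # {'extent': {'arctic'}}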