CODATA 2002: Frontiers of
Scientific and Technical Data

Montréal, Canada — 29 September - 3 October
 

Technical Demonstration Abstracts



Track II-D-2:
Technical Demonstrations

Chairs:
Richard Chinman, University Corporation for Atmospheric Research, Boulder, CO, USA
Robert S. Chen, CIESIN, Columbia University, USA

1. World Wide Web Mirroring Technology of the World Data Center System
David M. Clark, World Data Center Panel, NOAA/NESDIS, USA

The widespread implementation and acceptance of the World Wide Web (WWW) has changed many facets of the techniques by which Earth and environmental data are accessed, compiled, archived, analyzed and exchanged. The ICSU World Data Centers (WDCs), established over 50 years ago, are beginning to use this technology as they evolve toward a new mode of operation. One key element of this new technology is known as WWW “mirroring.” Strictly speaking, mirroring is reproducing the web content of one site exactly at another, physically separate location. However, there are other types of “mirroring” that use the same technology but differ in the appearance and/or content of the site. The WDCs are beginning to use these three types of mirroring technology to encourage new partners in the WDC system. These new WDC partners bring regional diversity or a discipline-specific enhancement to the WDC system. Currently there are ten sites on five continents mirroring a variety of data types using the different modes of mirroring technology. These include paleoclimate data mirrored in the US, Kenya, Argentina and France, and space environment data mirrored in the US, Japan, South Africa, Australia and Russia. These mirror sites have greatly enhanced the exchange and integrity of the respective discipline databases. A demonstration of this technology will be presented.
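As a rough illustration of mirroring in the strict sense (an exact copy at a second location), the sketch below compares two copies of a site by file checksum. It is not the WDC tooling, which the abstract does not describe. It assumes both copies are reachable as local directory trees, and the function names `tree_digests` and `mirror_differences` are invented for this example.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 digest of its bytes."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def mirror_differences(origin, mirror):
    """Report files that keep the mirror from being an exact copy of the origin."""
    origin_files = tree_digests(origin)
    mirror_files = tree_digests(mirror)
    shared = origin_files.keys() & mirror_files.keys()
    return {
        # Present at the origin but absent from the mirror.
        "missing": sorted(origin_files.keys() - mirror_files.keys()),
        # Present at the mirror but absent from the origin.
        "extra": sorted(mirror_files.keys() - origin_files.keys()),
        # Present at both sites but with different content.
        "changed": sorted(name for name in shared if origin_files[name] != mirror_files[name]),
    }
```

A strict mirror would yield three empty lists. The other modes described in the abstract, which differ in appearance or content by design, would be expected to show entries under "extra" or "changed".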


2. Factor Analysis Optimization: Applied in Natural Language Knowledge Discovery
Robert J. Watts, U.S. Army Tank-automotive and Armaments Command, National Automotive Center, USA
Alan L. Porter, Search Technology, Inc. and Georgia Tech, USA
Donghua Zhu, Beijing Institute of Technology, China

The Technology Opportunities Analysis of Scientific Information System (Tech OASIS), commercially available under the trade name VantagePoint, automates the identification and visualization of relationships inherent in sets (i.e., hundreds or thousands) of literature abstracts. A proprietary Tech OASIS approach applies principal components analysis (PCA), multi-dimensional scaling (MDS) and a path-erasing algorithm to elicit and display clusters of related concepts. However, the cluster groupings and visual representations for a given set of literature abstracts are not unique: the user's selection of the items to be clustered and of the number of factors to be considered generates alternative cluster solutions and relationship displays. Our current research, the results of which will be demonstrated, seeks to identify and automate the selection of a "best" cluster analysis solution for a set of literature abstracts. How then can a "best" solution be identified? Research on quality measures of factor/cluster groups indicates that the promising ones are entropy, F measure and cohesiveness. Our approach strives to minimize the entropy and F measures and maximize cohesiveness, and also considers set coverage. We apply this to automatically map conceptual (term) relationships for 1202 abstracts concerning "natural language knowledge discovery."
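To make two of the quality measures named above concrete, here is a minimal sketch of the common textbook definitions of cluster entropy and F measure. The abstract does not give the authors' formulas, so theirs may well differ. The sketch assumes each clustered item carries a reference topic label, which is a simplification introduced for illustration, and all function names are hypothetical.

```python
import math
from collections import Counter


def cluster_entropy(labels):
    """Shannon entropy, in bits, of the reference labels found inside one cluster."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def weighted_entropy(clusters):
    """Size-weighted mean entropy over all clusters; lower values mean purer clusters."""
    total = sum(len(cluster) for cluster in clusters)
    return sum(len(cluster) / total * cluster_entropy(cluster) for cluster in clusters)


def f_measure(cluster, label, all_labels):
    """Harmonic mean of one cluster's precision and recall with respect to one label."""
    hits = cluster.count(label)
    if hits == 0:
        return 0.0
    precision = hits / len(cluster)          # share of the cluster that has this label
    recall = hits / all_labels.count(label)  # share of this label captured by the cluster
    return 2 * precision * recall / (precision + recall)


# Two alternative cluster solutions for the same six labelled items.
solution_a = [["nlp", "nlp", "nlp"], ["mining", "mining", "mining"]]
solution_b = [["nlp", "mining", "nlp"], ["mining", "nlp", "mining"]]
print(weighted_entropy(solution_a), weighted_entropy(solution_b))
```

Scoring each candidate solution this way gives one number per measure, which is the kind of input an automated selection step could compare across alternative factor counts.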

 

3. ADRES: An online reporting system for veterinary hospitals
P.K. Sidhu and N.K. Dhand, Punjab Agricultural University, India

An animal husbandry department reporting system (ADRES) has been developed for online submission of the monthly progress reports of veterinary hospitals. It is a database built in Microsoft Access 2000 that holds records for all the veterinary hospitals and dispensaries of the Animal Husbandry Department, Punjab, India. Every institution has been given a separate ID. The codes for the various infectious diseases follow those assigned by the OIE (Office International des Epizooties). In addition to reports on disease occurrence, information can be recorded on the progress of the insemination program, animals slaughtered in abattoirs, animals exported to other states and countries, animal welfare camps held, farmer training camps organized, and so on. Records can easily be compiled at the sub-division, district and state levels, and reports can be prepared online for submission to the Government of India. It is envisaged that the system will make report submission digital, efficient and accurate. Although the database has been developed primarily for Punjab State, other states of India and other countries could also use it easily.
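The roll-up described above, from institution-level reports to district totals, can be sketched with two related tables. ADRES itself is a Microsoft Access database whose schema is not given in the abstract, so the SQLite tables, column names, institution names and disease codes below are all invented placeholders (the codes are not real OIE codes).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per hospital or dispensary, each with its own ID.
    CREATE TABLE institution (
        id           INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        sub_division TEXT NOT NULL,
        district     TEXT NOT NULL
    );
    -- One row per institution, month and disease code.
    CREATE TABLE monthly_report (
        institution_id INTEGER NOT NULL REFERENCES institution(id),
        month          TEXT NOT NULL,    -- 'YYYY-MM'
        disease_code   TEXT NOT NULL,    -- placeholder codes, not real OIE codes
        cases          INTEGER NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO institution VALUES (?, ?, ?, ?)",
    [
        (1, "Hospital 1", "Sub-division A1", "District A"),
        (2, "Dispensary 2", "Sub-division A2", "District A"),
        (3, "Hospital 3", "Sub-division B1", "District B"),
    ],
)
conn.executemany(
    "INSERT INTO monthly_report VALUES (?, ?, ?, ?)",
    [
        (1, "2002-08", "D01", 4),
        (2, "2002-08", "D01", 3),
        (3, "2002-08", "D01", 2),
        (3, "2002-08", "D02", 5),
        (1, "2002-09", "D01", 9),
    ],
)


def cases_by_district(conn, month):
    """Total reported cases per district and disease code for one month."""
    rows = conn.execute(
        """
        SELECT i.district, r.disease_code, SUM(r.cases)
        FROM monthly_report AS r
        JOIN institution AS i ON i.id = r.institution_id
        WHERE r.month = ?
        GROUP BY i.district, r.disease_code
        ORDER BY i.district, r.disease_code
        """,
        (month,),
    )
    return rows.fetchall()


print(cases_by_district(conn, "2002-08"))
```

Grouping by `sub_division` instead of `district`, or dropping the grouping column altogether, would give the sub-division and state summaries in the same way.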

 

4. PAU_Epi~AID: A relational database for epidemiological, clinical and laboratory data management
N.K. Dhand, Punjab Agricultural University, India

A veterinary database (Punjab Agricultural University Epidemiological Animal disease Investigation Database, PAU_Epi~AID) has been developed to meet the requirements of data management during outbreak investigations, monitoring and surveillance, and clinical and laboratory investigations. It is based on Microsoft Access 2000 and includes a databank of digitized information on all states and union territories of India. Information on the districts, sub-divisions, veterinary institutions and important villages of Punjab (India) has also been incorporated, with every unit represented by an independent numeric code. More than 60 interrelated tables have been prepared for registering information on animal disease outbreaks; farm data (housing, feeding, management, past disease history, vaccination history, etc.); and general, production, reproduction and disease data for individual animals. Findings from various laboratories, such as bacteriology, virology, pathology, parasitology, molecular biology, toxicology and serology, can also be documented. Data are entered through simple forms hyperlinked to one another, which allow queries and reports to be prepared at the click of a mouse. Flexibility has been provided for additional requirements arising from diverse needs. The database may be of immense use for data storage, retrieval and management in epidemiological institutions and veterinary clinics.

 

5. Archiving Technology for Natural Resources and Environmental Data in Developing Countries, A Case Study in China
Wang Zhengxing, Chen Wenbo, Liu Chuang, Ding Xiaoqiang, Chinese Academy of Sciences, China

Data archiving has long been treated as a low priority in China. As a result, there is no long-term commitment at the national level to preserve natural resources data, and budgets for data management are usually smaller than those for research. It is therefore essential to develop a feasible strategy and technology to manage the exponential growth of the data. The strategy and technology should be cost-saving, robust, user-friendly, and sustainable in the long run. A PC-based system has been developed to manage satellite imagery, Geographic Information System (GIS) maps, tabular attribute data, and text data. The data in text format include data policies compiled from international, national, and regional organizations. Full documentation on these data is on-line and free to download. For GIS maps and tabular data, only metadata and documentation are on-line; the full datasets are distributed by CD-ROM, e-mail, or ftp.

Remote sensing data are often too expensive for developing countries. An agreement has been reached between GCIRC and remote sensing receiving station vendors. Under the agreement, GCIRC can freely use the remote sensing data (MODIS) from the receiving station, on condition that the system is made available for demonstration to potential buyers. This secures the most important data source for archiving. Given the huge volumes of data and limited PC capacity, only quick-look images and metadata are permanently on-line. Users can search for data by date, geolocation, or granule. Full 1B images are updated daily and kept on-line for one week; users can download the recent data for free. All raw data (direct broadcast) and 1B images are archived on CD-ROMs, which are easy to read using a personal computer.
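The search and retention rules described above can be illustrated with a small metadata catalog. This is only a sketch under stated assumptions: the abstract does not describe the actual catalog format, so the `Granule` record, its bounding-box fields, and the functions `search` and `online_1b` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Granule:
    """Hypothetical metadata record for one image granule."""
    granule_id: str
    acquired: date
    west: float   # bounding box in degrees of longitude and latitude
    south: float
    east: float
    north: float


def search(catalog, start, end, lon, lat):
    """Granules acquired between start and end (inclusive) whose box contains the point."""
    return [
        g for g in catalog
        if start <= g.acquired <= end
        and g.west <= lon <= g.east
        and g.south <= lat <= g.north
    ]


def online_1b(catalog, today, keep_days=7):
    """Granules whose full 1B image is still on-line, assuming a keep_days-day window."""
    cutoff = today - timedelta(days=keep_days)
    return [g for g in catalog if cutoff < g.acquired <= today]


catalog = [
    Granule("G1", date(2002, 9, 1), 70.0, 20.0, 90.0, 40.0),
    Granule("G2", date(2002, 9, 28), 100.0, 20.0, 120.0, 40.0),
    Granule("G3", date(2002, 9, 30), 70.0, 20.0, 90.0, 40.0),
]
print([g.granule_id for g in search(catalog, date(2002, 9, 25), date(2002, 9, 30), 80.0, 30.0)])
print([g.granule_id for g in online_1b(catalog, date(2002, 9, 30))])
```

Older granules drop out of `online_1b` but remain findable through `search`, which mirrors the split between permanently on-line metadata and 1B images that move to CD-ROM after a week.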

 

6. Delivering interdisciplinary spatial data online: The Ramsar Wetland Data Gateway
Greg Yetman and Robert S. Chen, Columbia University, USA

Natural resource managers and researchers around the world are facing a range of cross-disciplinary issues involving global and regional environmental change, threats to biodiversity and long-term sustainability, and increasing human pressures on the environment. They must increasingly harness a range of socioeconomic and environmental data to better understand and manage natural resources at local, regional, and global scales.

This demonstration will illustrate an online information resource designed to help meet the interdisciplinary data needs of scientists and resource managers concerned with wetlands of international importance. The Ramsar Wetland Data Gateway, developed in collaboration with the Ramsar Bureau and Wetlands International, combines relational database technology with interactive mapping tools to provide powerful search and visualization capabilities across a range of data from different sources and disciplines. The Gateway is also being developed to support interoperable data access across distributed spatial data servers.

 

Last site update: 15 March 2003