Automatic Meta Data Collection System for Satellite and Ground-based Observation Data by the STARS RSS1.0: An approach for the STARS Semantic Web
Takuya KUBO¹, Ken T. MURATA², Eizen KIMURA³, Satoshi ISHIKURA¹ and Iku SHINOHARA⁴
¹ Department of Electrical and Electronic Engineering and Computer Science, Ehime University
² Center for Information Technology, Ehime University
³ Department of Medical Informatics Social Medicine and Medical Informatics Medical School of Ehime Univ
⁴ Institute of Space and Astronautical Science, Japan Aerospace Exploration Agency
In Solar-Terrestrial Physics (STP), it has been pointed out that the circulation and utilization of observation data among researchers are insufficient. One reason is that STP observation data do not share a common data format. This issue is not unique to STP; it affects other natural science data as well. To achieve interdisciplinary research, we need to overcome these circulation and utilization problems.
Against this background, the authors’ group has designed and developed the Solar-Terrestrial data Analysis and Reference System (STARS). The STARS has its own database that manages meta-data of satellite and ground-based observation data files, and it provides users with cross-over data file search and download services over the Internet. So far, however, retrieving meta-data from the observation data and registering them in the database have been carried out by hand in the STARS. With the limited manpower available, it is hard to handle a huge amount of observation data in this way.
We developed an automatic meta-data collection system for the observation data using the STARS RSS (RDF Site Summary) 1.0. The RSS1.0 is an XML-based markup language built on the RDF (Resource Description Framework) and designed for syndicating news and the content of news-like sites. By using the RSS1.0 as the meta-data distribution method, the workflow from retrieving meta-data to registering them in the database is automated. We applied this technique to the DARTS (Data Archive and Transmission System), a science database managed by the PLAIN Center at ISAS/JAXA in Japan, and succeeded in generating and collecting the meta-data automatically.
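As an illustration of how a data site might publish meta-data in this form, the following minimal sketch builds an RSS1.0 document with one item per observation data file. The RSS, RDF and Dublin Core namespaces are the standard ones; the `stars` namespace URI, the `satellite` property, the function names and the example URLs are assumptions made for this sketch and are not taken from the actual STARS RSS1.0 definition.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for RDF, RSS 1.0 and Dublin Core.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"
DC = "http://purl.org/dc/elements/1.1/"
# Hypothetical namespace standing in for the STARS vocabulary.
STARS = "http://example.org/stars/ns#"

for prefix, uri in (("rdf", RDF), ("rss", RSS), ("dc", DC), ("stars", STARS)):
    ET.register_namespace(prefix, uri)


def q(ns, tag):
    """Return a namespace-qualified tag in ElementTree's {uri}tag form."""
    return "{%s}%s" % (ns, tag)


def build_rss(channel_url, files):
    """Build an RSS 1.0 document with one <item> per observation data file."""
    root = ET.Element(q(RDF, "RDF"))
    channel = ET.SubElement(root, q(RSS, "channel"), {q(RDF, "about"): channel_url})
    ET.SubElement(channel, q(RSS, "title")).text = "Observation data files"
    ET.SubElement(channel, q(RSS, "link")).text = channel_url
    ET.SubElement(channel, q(RSS, "description")).text = "Meta-data feed"
    # The channel lists its items in an rdf:Seq, as RSS 1.0 requires.
    seq = ET.SubElement(ET.SubElement(channel, q(RSS, "items")), q(RDF, "Seq"))
    for f in files:
        ET.SubElement(seq, q(RDF, "li"), {q(RDF, "resource"): f["url"]})
        item = ET.SubElement(root, q(RSS, "item"), {q(RDF, "about"): f["url"]})
        ET.SubElement(item, q(RSS, "title")).text = f["title"]
        ET.SubElement(item, q(RSS, "link")).text = f["url"]
        ET.SubElement(item, q(DC, "date")).text = f["date"]
        # Hypothetical STARS-specific property beyond the RSS 1.0 vocabulary.
        ET.SubElement(item, q(STARS, "satellite")).text = f["satellite"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    example = [{"url": "http://example.org/data/a.cdf", "title": "a.cdf",
                "date": "2005-01-01", "satellite": "SAT-A"}]
    print(build_rss("http://example.org/feed.rdf", example))
```

Because each item is identified by the URL of its data file, a collector can treat the feed as a list of resources with attached properties rather than as free text.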
Our final goal is to establish the STARS Semantic Web. The Semantic Web provides a common framework that allows data to be shared and reused across applications, enterprises, and communities. The most fundamental issue in establishing it is who manages the meta-data in the Semantic Web. In the present study, we designed the meta-data of the STARS in accordance with the RSS1.0 document format. To describe STARS meta-data that go beyond the RSS1.0 vocabulary, we defined original vocabularies for the STARS resources using RDF Schema.
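To give a sense of what defining a vocabulary with RDF Schema involves, the sketch below generates a small schema document declaring one class and its properties. The `rdfs:Class`, `rdf:Property`, `rdfs:label`, `rdfs:comment` and `rdfs:domain` terms are standard; the namespace URI and the class and property names (`DataFile`, `satellite`, `instrument`) are hypothetical and do not reproduce the actual STARS vocabulary.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# Hypothetical namespace standing in for the STARS vocabulary.
STARS = "http://example.org/stars/ns#"

ET.register_namespace("rdf", RDF)
ET.register_namespace("rdfs", RDFS)


def q(ns, tag):
    """Return a namespace-qualified tag in ElementTree's {uri}tag form."""
    return "{%s}%s" % (ns, tag)


def build_schema(class_name, properties):
    """Declare one rdfs:Class and an rdf:Property for each entry of `properties`.

    `properties` maps a property name to a human-readable comment.
    """
    root = ET.Element(q(RDF, "RDF"))
    cls = ET.SubElement(root, q(RDFS, "Class"), {q(RDF, "about"): STARS + class_name})
    ET.SubElement(cls, q(RDFS, "label")).text = class_name
    for name, comment in properties.items():
        prop = ET.SubElement(root, q(RDF, "Property"), {q(RDF, "about"): STARS + name})
        ET.SubElement(prop, q(RDFS, "comment")).text = comment
        # rdfs:domain states that the property applies to instances of the class.
        ET.SubElement(prop, q(RDFS, "domain"), {q(RDF, "resource"): STARS + class_name})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_schema("DataFile", {
        "satellite": "Satellite that produced the data file",
        "instrument": "Instrument that produced the data file",
    }))
```

Declaring the properties in a schema of this kind lets any consumer of the RSS1.0 documents look up what each extension element means and which resources it applies to.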
Our system works as follows. The RSS1.0 documents generated at the data sites are automatically collected by a meta-data collection agent. The agent then extracts the meta-data and stores them in an XML database. The XML database provides advanced retrieval that takes properties and relations into account. In the future, we will develop an RDF database supported by an inference engine, which will enable automatic processing and high-level search not only of observation data but also of news and event information related to the STP.
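The collection step described above can be sketched roughly as follows. This is a simplified illustration, not the actual agent: network fetching is omitted and the feed is supplied as a string, a plain Python dictionary stands in for the XML database, and the `stars` namespace, the `satellite` property, the function names and the sample feed are all hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"

# Hypothetical feed as a data site might publish it (stars namespace is assumed).
SAMPLE_FEED = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://purl.org/rss/1.0/"
    xmlns:stars="http://example.org/stars/ns#">
  <channel rdf:about="http://example.org/feed.rdf">
    <title>Example data site</title>
    <link>http://example.org/</link>
    <description>Hypothetical meta-data feed</description>
    <items><rdf:Seq>
      <rdf:li rdf:resource="http://example.org/data/a.cdf"/>
      <rdf:li rdf:resource="http://example.org/data/b.cdf"/>
    </rdf:Seq></items>
  </channel>
  <item rdf:about="http://example.org/data/a.cdf">
    <title>a.cdf</title>
    <link>http://example.org/data/a.cdf</link>
    <stars:satellite>SAT-A</stars:satellite>
  </item>
  <item rdf:about="http://example.org/data/b.cdf">
    <title>b.cdf</title>
    <link>http://example.org/data/b.cdf</link>
    <stars:satellite>SAT-B</stars:satellite>
  </item>
</rdf:RDF>"""


def extract_items(rss_text):
    """Turn each RSS 1.0 <item> into a dict of property name -> value."""
    root = ET.fromstring(rss_text)
    records = []
    for item in root.findall("{%s}item" % RSS):
        record = {"about": item.get("{%s}about" % RDF)}
        for child in item:
            # Drop the namespace part of the tag and keep the local name as key.
            record[child.tag.split("}", 1)[-1]] = (child.text or "").strip()
        records.append(record)
    return records


def collect(feeds, store):
    """Register the meta-data of every feed in `store`, keyed by resource URI."""
    for text in feeds:
        for record in extract_items(text):
            # Keying by URI means collecting the same feed again overwrites
            # existing entries instead of duplicating them.
            store[record["about"]] = record
    return store


def search(store, **criteria):
    """Return the records whose properties match all given criteria."""
    return [r for r in store.values()
            if all(r.get(k) == v for k, v in criteria.items())]


if __name__ == "__main__":
    db = collect([SAMPLE_FEED], {})
    print(search(db, satellite="SAT-A"))
```

Keying the store by resource URI is one simple design choice for making repeated collection safe; a real XML or RDF database would additionally support queries over relations between resources, which this dictionary-based stand-in does not.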