19th International CODATA Conference
Category: Infoscience
Marine Project Data Management System
J.S. Sarupria (sarujs@darya.nio.org) and G.V. Reddy (reddy@darya.nio.org)
Data & Information Division, National Institute of Oceanography, India
Marine science is a multidisciplinary science that produces highly varied types
of data and information, including hydrographic, biological, chemical, fisheries,
geological and geophysical data, collected in the water column and at the bottom
of the ocean. It is therefore very difficult to incorporate the data into one database.
Project Data Management is the provision of data management services to active research projects and is sometimes known as ‘End-to-End’ data management.
Its purpose is to construct the project data management policies and to address issues such as long-term archival of the data, so that the project makes a lasting contribution to marine science. This requires establishing a coherent, credible, semi-distributed and scalable end-to-end Data Management (DM) plan with clear DM objectives, activities and timeline, together with a policy (e.g., delivery and exchange standards) that is widely agreed by the scientists and sponsors and supported by funding agencies.
It includes a range of services and products that take full advantage of best practices, standards and innovative approaches, along with several proactive strategies addressing all needs and requirements.
A Project Data Centre is designed to establish guidelines, provide advice, and facilitate the exchange of knowledge and expertise. It relies on several experienced, full-time national data managers/coordinators and, through existing infrastructures, capacity building and new international synergies, ensures compliance and an increase in the overall value of scientific research and derived outputs.
At least 2-3% of the project money should be spent directly on data management activities to ensure compliance.
Technological requirements should be continually examined to avoid both too-heavy immediate requirements and longer-term potential dead ends, which may leave present or future data users without access to data.
Sometimes a data centre may spend 90% of its effort on rescuing 10% of the data, but that is part of the point of having a data centre, especially if otherwise valuable data would go unreported. The best approach to minimizing this effort is to work more closely with data originators.
Data have various levels of quality. Initially, the project paid little attention to this question. As the project matured, a subjective ranking system was implemented in the data set to attempt to grade the quality of the various data sets. Future efforts at collecting and entering data should explicitly consider several dimensions, including accuracy, precision, reliability, and data source.
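As an illustration only, the quality dimensions named above could be recorded per data set along the following lines. This is a minimal sketch, not the ranking system used in the project: the names `Grade`, `QualityRecord` and `overall`, the four-step grading scale, and the rule of taking the weakest dimension as the overall grade are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Grade(IntEnum):
    """Hypothetical four-step subjective grading scale (assumed, not from the project)."""
    UNKNOWN = 0
    POOR = 1
    FAIR = 2
    GOOD = 3


@dataclass(frozen=True)
class QualityRecord:
    """Quality metadata for one data set, one field per dimension named in the text."""
    dataset_id: str
    accuracy: Grade
    precision: Grade
    reliability: Grade
    data_source: str  # free-text description of where the data came from

    def overall(self) -> Grade:
        # Assumed rule: a data set is only as good as its weakest graded dimension.
        return min(self.accuracy, self.precision, self.reliability)


# Example with made-up values.
record = QualityRecord(
    dataset_id="example-ctd-001",
    accuracy=Grade.GOOD,
    precision=Grade.FAIR,
    reliability=Grade.GOOD,
    data_source="research cruise",
)
```

Grading each dimension separately, instead of storing a single rank, lets a user see why a data set received its grade.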
Data have errors (in the sense of mistakes): some can be caught and corrected before they enter the database, some can be caught after they enter the database, and some persist in the database. We attempted to minimize these via a fairly conventional review process, but it would be worthwhile to assess this process and consider additional measures for quality assurance. Cooperative quality control through feedback from users and community experts is extremely important, and needs to be facilitated so that data can be both promptly used and promptly reviewed.
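The first of these stages, catching mistakes before data enter the database, can be sketched as a simple plausibility screen. The example below is hypothetical and does not describe the review process used in the project: the parameter names, the numeric limits, and the functions `pre_entry_status` and `screen` are assumptions chosen for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Observation(NamedTuple):
    parameter: str
    value: float


# Illustrative plausibility limits only; a real centre would set its own.
LIMITS: Dict[str, Tuple[float, float]] = {
    "temperature_c": (-2.5, 40.0),
    "salinity_psu": (0.0, 42.0),
}


def pre_entry_status(obs: Observation) -> str:
    """Classify one observation before it is entered into the database."""
    limits = LIMITS.get(obs.parameter)
    if limits is None:
        # No limits defined: pass the value on, but mark it for later review.
        return "unchecked"
    low, high = limits
    return "accepted" if low <= obs.value <= high else "rejected"


def screen(
    observations: List[Observation],
) -> Tuple[List[Observation], List[Observation]]:
    """Split observations into those to enter and those to return to the originator."""
    to_enter, to_return = [], []
    for obs in observations:
        if pre_entry_status(obs) == "rejected":
            to_return.append(obs)
        else:
            to_enter.append(obs)
    return to_enter, to_return


batch = [
    Observation("temperature_c", 28.4),
    Observation("temperature_c", 284.0),  # likely a misplaced decimal point
    Observation("chlorophyll_mg_m3", 0.3),
]
entered, returned = screen(batch)
```

A screen of this kind only addresses the first stage. Values marked "unchecked", and errors that pass the limits, still depend on the user and expert feedback described above.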
An attempt was made to assess the data and information available in the public domain. The result shows that more than 60% of marine data and information from observational programs is available in the public domain. The rest is either lost or only partially available in publications. There are many reasons for losing data and information, but the main ones are: (i) the data management component is missing from project proposals and observational programs; (ii) observational programs are not well linked with data management activities; (iii) there are communication gaps between data collectors (scientists) and data managers; and (iv) funding agencies are not monitoring data flow from data-generating agencies to the national data centres/archives. As a result, the data acquisition network partially fails to trace and monitor data flow effectively.
The overall objective of the DM is to help and support the Project Investigators (PIs), project sponsors, funding agencies and end-users, to learn from the past and achieve the best science, for today and tomorrow.