19th International CODATA Conference
Category: Data Archiving

Designating User Communities for Scientific Data: Challenges and Solutions

Mark A. Parsons (parsonsm@nsidc.org) and Ruth Duerr
National Snow and Ice Data Center/World Data Center for Glaciology, University of Colorado, USA


Explicitly defining a "designated user community" for a given data collection is essential to good scientific data stewardship. It enables data managers to determine what information needs to be developed and maintained to ensure the usability of the data now and into the future. A thorough understanding of the users helps managers define how to present and enable access to the data and may determine the actual format of the data. These considerations in turn have a direct impact on the long-term preservation of the data. Designating a specific user community is so important that it is explicitly called out as a mandatory responsibility of an Open Archival Information System by the Consultative Committee for Space Data Systems in their ISO standard reference model.

However, while defining a community may be essential, it is also extremely difficult, and it is impossible to predict how the use of a data collection may change over time. Indeed, with today's rapidly changing information technology and scientific understanding, new data applications are likely to be discovered more frequently than ever. This creates a series of data management problems for data stewards, be they traditional data archives or smaller nodes in a distributed or virtual data management system.

As data managers at the World Data Center for Glaciology, Boulder and the National Snow and Ice Data Center (NSIDC), we routinely confront and try to solve many of these problems. We certainly do not have all the answers, and it is likely that we will struggle with these problems forever. Yet we have begun to develop a set of best practices that mitigate these problems, and we believe these practices can be applied in a larger scientific context. We will use data collections held at NSIDC as case studies to illustrate this. We will also discuss how reexamining designated user communities might expand the use of science data.