NOAA’s Future Data Activities: Petabyte Archives, Metadata and Systems Integration

 

David M. Clark

National Oceanic and Atmospheric Administration

NESDIS/NGDC

325 Broadway

Boulder, Colorado  80305

David.M.Clark@noaa.gov

 

 

The National Oceanic and Atmospheric Administration (NOAA) currently operates and maintains over 100 observing systems. These systems include remote sensing and in-situ platforms on land, at sea, in the air, and in space. The volume of data collected by these very different platforms varies widely: some data sets are quite small, while others are very large. Because of the large influx of data into the NOAA environmental archive from both NOAA and non-NOAA observing systems, NOAA is beginning to design petabyte-sized archives. The Comprehensive Large Array-data Stewardship System (CLASS) is being developed by NOAA to address this challenge. CLASS is a web-based data archive and distribution system for NOAA’s environmental data. It is NOAA’s premier online facility for the distribution of NOAA and U.S. Department of Defense (DOD) operational environmental satellite data and derived data products. One of the key aspects of developing these petabyte archives is the incorporation of the observing systems’ metadata into a geospatial database. Metadata are all the information necessary for data to be independently understood by users, to ensure proper stewardship of the data, and to allow for future discovery. Developing metadata for all of the data sets that result from NOAA’s observing systems is the first step in integrating NOAA’s observing and data systems. The next step in meeting this challenge will be the development of the Global Earth Observation Integrated Data Environment (GEO-IDE). GEO-IDE is a NOAA-wide architecture that will integrate legacy systems and guide the development of future NOAA environmental data management systems. It is envisioned as a “system of systems”: a framework that provides effective and efficient integration of NOAA’s many quasi-independent systems. It will be built upon agreed standards (including metadata standards), principles, and guidelines, and will guide the evolution of existing systems into a service-oriented architecture.
Ultimately, a single NOAA system of systems will be in place to provide access to the data sets needed to address significant societal questions of this and future generations.