20th International CODATA Conference
Session: Computational informatics: integrating data science with materials modeling
Materials Research Groups, Digital Libraries, & Education: Metadata from Nanoscale Simulation Code
Laura Bartolo
Images of nanostructures can
provide important visual resources for scientific research and education.
However, associating appropriate metadata with these visual resources is
essential for their interpretation, especially when they are shared with
collaborators, students, or the broader scientific community. Within the
NSDL Materials Digital Library Pathway (MatDL),
information scientists at Kent State University (KSU) and materials scientists
at the
Input parameters and associated values that determine the output of nanoscale simulations are
recorded in order to capture the metadata that materials scientists consider
important. This strategy has the advantage of creating more granular,
consistent, and accurate metadata from input file parameters and values that
can be associated immediately with simulation output. This approach also
eliminates duplicate effort and possible recording errors that could occur if
the metadata were produced by a separate mechanism at a later time. Ultimately,
streamlining the process of submitting these resources to a digital library and
eliminating the necessity for hand-generating metadata should increase the likelihood of
submissions to outside repositories, such as digital libraries that are
sustained by user contributions. Capturing and associating metadata with
research data as it is generated will enable the data, including non-print
materials, to be more easily located and interpreted within a lab, across a group of
labs collaborating through a distributed network, and in a digital library.
Currently, input file parameters and values for the U-M research group's master
simulation code and all of its modules are captured as Dublin Core (DC) metadata in XML
format upon execution of the code. The DC metadata transparently
describes all associated parameters and values necessary to identify, describe,
or recreate the simulation. The metadata includes DC elements such as title,
creator, and date, as well as subject terms drawn from input parameters,
e.g., simulation type, code module, and integrator scheme. The DC description
element includes additional parameter names and associated values, such as number
of particles, system number density, system temperature as a function of time,
time step for integration of the equations of motion, and number of
dimensions. Subject terms also lay the foundation for the development of
a community-built, refereed, web-accessible dictionary, glossary, and thesaurus
which will serve as a reference resource on the assembly of nanosystems for
upper-level undergraduates as well as beginning graduate research lab assistants.
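The mapping from input parameters to DC elements described above can be sketched roughly as follows. This is a minimal illustration, not the research group's actual code: the function names, the parameter keys, the wrapping `metadata` element, and the `name=value` layout of the description are all assumptions, and the sample values are placeholders.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace of the Dublin Core Metadata Element Set, version 1.1
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical parameter keys whose values become dc:subject terms
SUBJECT_KEYS = ("simulation_type", "code_module", "integrator_scheme")


def build_dc_record(title, creator, params, run_date=None):
    """Build a simple DC record from a dict of simulation input parameters."""
    root = ET.Element("metadata")  # wrapper element name is an assumption

    def add(name, text):
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = str(text)

    add("title", title)
    add("creator", creator)
    add("date", (run_date or date.today()).isoformat())

    # Selected parameters are recorded as subject terms
    for key in SUBJECT_KEYS:
        if key in params:
            add("subject", params[key])

    # All remaining parameters are recorded in dc:description as name=value pairs
    rest = [f"{k}={v}" for k, v in params.items() if k not in SUBJECT_KEYS]
    if rest:
        add("description", "; ".join(rest))
    return root


def to_xml(record):
    """Serialize the record to an XML string."""
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values for illustration only
    example_params = {
        "simulation_type": "example-type",
        "code_module": "example-module",
        "integrator_scheme": "example-integrator",
        "number_of_particles": 1000,
        "number_density": 0.8,
        "time_step": 0.01,
        "dimensions": 3,
    }
    print(to_xml(build_dc_record("Example run", "Example Creator", example_params)))
```

Generating the record in the same step that reads the input parameters is what allows the metadata to be associated with the output immediately, without a separate hand-entry pass.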
This presentation will describe: 1) metadata capture for nanoscale computer
simulation output during research code execution; 2) pilot results of resource
submissions; and 3) development of a reference resource on nanosystem
assembly useful in research and educational settings. Next steps
include implementing metadata capture with national and international
collaborators of the pilot group and expanding the reference resource on
nanosystem assembly.