Physical Science Data Abstracts
1.
The Handling of Crystallographic Data
Brian McMahon, International Union of Crystallography, England

The Crystallographic Information File (CIF) was commissioned by the International Union of Crystallography in the late 1980s to provide a common exchange format for diffraction data and the structural models derived therefrom. It specifically addressed the requirements of an information exchange mechanism that would be portable, durable, extensible and easy to manipulate, and it has won widespread acceptance as a community standard. Nowadays, CIFs are created by diffractometer software, imported into and exported from structure solution, refinement and visualisation programs, and used as an electronic submission format for some structural science journals.

CIF employs simple tag-value associations in a plain ASCII file, where the meanings of the tags are stored in external reference files known as data dictionaries. These dictionaries are machine-readable (in fact conforming to the same format), and provide not only a human-readable definition of the meaning of a tag, but also machine-parsable directives specifying the type and range of permitted values, contextual validity (whether an item may appear once only or multiple times) and relationships between different items. In many ways this is similar to the separation between document instances and their structural descriptions (document type definitions, or DTDs) in XML, the extensible markup language that is increasingly used for document and data handling applications. However, while many existing XML DTDs describe rather general aspects of document structure, the tags defined in CIF dictionaries detail very specific pieces of information, and leave no room for ambiguity as these items are read into and written out from a variety of software applications.

Recognised tags in CIF include not only subject-specific items (e.g. the edge lengths of a crystal unit cell) but also general tags describing the creator of the file (including address and email), its revision history, related literature citations, and general textual commentary, either for formal publication or as part of a laboratory notebook record. The objective is to capture in a single file the raw experimental data, all relevant experimental conditions, and details of subsequent processing, interpretation and comment. From a complete CIF, specialist databases harvest the material they require. While such a database might be unable to store the entire content of the source file, the IUCr encourages databases to retain deposit copies of the source or to provide links from database records to the source (for example as a supplement to a published journal article). The richness of the tag definitions also allows automated validation of the results reported in a CIF by checking their internal consistency. At present validation software is built by hand from the published descriptions of data tags, but experiments are in hand to express the relationships between numeric tags in a fully machine-readable and executable formulation. While the CIF format is unique to crystallography (and a small number of related disciplines), it has much to contribute towards the design of similar data-handling mechanisms in other formats.
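As a concrete illustration of the tag-value idea, the sketch below parses a few simple CIF-style pairs in Python. The tag names are genuine CIF core dictionary items, but the parser is a deliberate simplification (it ignores loop_ constructs, multi-line text fields and dictionary-driven validation) and is not IUCr reference software.

    # Minimal sketch of CIF-style tag-value parsing (illustrative only).
    sample = """\
    data_example
    _cell_length_a    5.4307
    _cell_length_b    5.4307
    _cell_length_c    5.4307
    _journal_year     2002
    """

    def parse_cif_pairs(text):
        """Collect simple tag-value pairs; tags start with an underscore."""
        pairs = {}
        for line in text.splitlines():
            parts = line.split(None, 1)
            if len(parts) == 2 and parts[0].startswith("_"):
                pairs[parts[0]] = parts[1].strip()
        return pairs

    print(parse_cif_pairs(sample))   # {'_cell_length_a': '5.4307', ...}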
2.
Physical property data, equilibrium data and prediction models are essential parts of process synthesis, design, optimization and operation. Although efforts to collect and organize such data and models have been under way for decades, the demand for data and models and for their proper and efficient use is still growing. With the financial support of MOCIE (Ministry of Commerce, Industry and Energy) of Korea, four universities have collaborated to develop a thermophysical properties databank and to enhance their capacity for experimental data production. The databank (KDB) contains about 4000 pure components (hydrocarbons, polymers and electrolytes) and 5000 equilibrium data sets. Most of the data were collected along with their accuracy of measurement and/or experimental uncertainties. The data can be searched with a stand-alone program or via the Internet. This presentation will discuss the current status and features of KDB.

In process engineering applications, selecting the proper data, selecting the proper model, regressing the model parameters and using them properly are the most important aspects. CAPEC (Computer Aided Process Engineering Center, Technical University of Denmark) has been developing programs for years to support the proper use of thermodynamic property data and prediction models. A stepwise procedure for selecting data sets from property databases such as KDB and CAPEC-DB, generating problem-specific parameters, and using them properly through appropriate property models in process engineering problems has been developed at CAPEC. The presentation will also highlight the application of property models and data in specific process engineering problems.
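To make the parameter-regression step concrete, here is a minimal sketch of the kind of fit such tools perform: estimating Antoine-equation constants from vapor-pressure data by least squares. The data points and initial guesses are invented placeholders, not KDB or CAPEC-DB values, and real tools would also weight the fit by the reported uncertainties.

    # Sketch: fit Antoine constants A, B, C to vapor-pressure data.
    import numpy as np
    from scipy.optimize import curve_fit

    def antoine(T, A, B, C):
        # log10(P/kPa) = A - B / (T/K + C)
        return A - B / (T + C)

    T = np.array([300.0, 320.0, 340.0, 360.0, 380.0])   # K (placeholder)
    logP = np.array([0.40, 0.95, 1.42, 1.83, 2.19])     # log10(kPa) (placeholder)

    params, cov = curve_fit(antoine, T, logP, p0=(6.0, 1200.0, -50.0))
    A, B, C = params
    print(f"A={A:.3f}, B={B:.1f}, C={C:.1f}")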
3.
Reliability of Uncertainty Assignments in Generating Recommended
Data from a Large Set of Experimental Physicochemical Data

Experimental (raw) physicochemical property data are the fundamental building blocks for generating recommended data and for developing data prediction methods. The preparation of recommended data requires a well-designed raw data repository with complete supporting information (metadata) and reliable uncertainty assessments, together with a series of processes involving data normalization, standardization and statistical analysis, as well as anomaly identification and rectification. Since a large data collection contains many duplicate measurements, uncertainty assessments become a key factor in selecting high-quality data among related data sets. While other information in the database can help with the selection, the uncertainty estimates provide the most important information about the quality of property data. This presentation will focus on the assignment and assessment of uncertainty for a large set of experimental physicochemical property data, as well as on the impact of uncertainty assessments on generating recommended data.

Uncertainties are a crucial data-quality attribute. They are stored as a numerical value that is interpreted as a bias on the associated property value; adding and subtracting this bias from the property value defines a range. Without uncertainties, numerical property values cannot be evaluated, while inappropriate uncertainties can be misleading. In assessing uncertainty, all potential sources of error are propagated into the uncertainty of the property. In this process, complete information on the measurement technique, sample purity, the uncertainty assessment made by the investigator, the investigator's experience and record, etc. is essential for the database professionals establishing uncertainties.

Reliable provision of uncertainties for property values in databases establishes the basis for determining recommended values. However, the process of arriving at an appropriate judgment on uncertainties is rather complex. Correct assignment of uncertainty requires highly knowledgeable and skilled data professionals and, furthermore, includes a subjective component. A large-scale data collection such as TRC SOURCE makes this sophisticated task even more demanding. A recent statistical analysis of critical constants and the uncertainties assigned to them in TRC SOURCE reflected the difficulty of assigning reliable uncertainties and also revealed a decisive effect of uncertainties on the generation of recommended values. Based on this study, a computer algorithm has been developed at NIST/TRC to systematically evaluate uncertainty assessments.
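One standard way that uncertainties can drive the selection of a recommended value, offered here as a simplified stand-in for the NIST/TRC procedure described above, is an inverse-variance weighted mean over duplicate measurements. All numbers below are hypothetical.

    # Inverse-variance weighted mean: smaller uncertainty -> larger weight.
    import math

    values = [647.10, 647.30, 646.80]   # hypothetical duplicate measurements
    uncerts = [0.10, 0.50, 0.80]        # assigned uncertainties (same units)

    weights = [1.0 / u**2 for u in uncerts]
    recommended = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    rec_uncert = 1.0 / math.sqrt(sum(weights))

    print(f"recommended = {recommended:.2f} +/- {rec_uncert:.2f}")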
4.
In 1973, with a view to the synthesis and design of separation processes, the fitting and critical examination of model parameters used for process simulation, and the development of group contribution methods, a computerized data bank for phase equilibrium data was started at the University of Dortmund. While at the beginning mainly VLE data for non-electrolyte mixtures (Tb > 0 °C) were considered, later on VLE data (including compounds with Tb < 0 °C), LLE, hE, γ∞, azeotropic, cPE, SLE, vE, adsorption equilibrium and other data for non-electrolyte and electrolyte systems, as well as pure component properties, were also stored in computer-readable form. This data bank (the Dortmund Data Bank, DDB) now contains nearly all phase equilibrium data, excess properties and pure component properties available worldwide. To exploit the full potential of this comprehensive compilation, a powerful software package was developed by DDBST GmbH (www.ddbst.de) for verifying, storing, handling and processing the various pure component and mixture data. Programs for the correlation and prediction of pure component properties, phase equilibria and excess properties, as well as for graphical data representation, were also included. Together with the data from the Dortmund Data Bank, these programs make it possible to analyze the real mixture behavior of a system of interest and to fit reliable model parameters (gE-models, equations of state, group contribution methods) for the synthesis and design of chemical processes on the basis of the most current experimental data and estimation methods. The talk will give an overview of the development, structure and contents of the DDB and will highlight certain aspects of the accessibility and use of thermophysical data in the Internet age. Future plans concerning the development of the DDB and the software package DDBSP will be discussed.

1.
Gas hydrates in Siberian geological structures
Results of prospecting for gas hydrate accumulations in the continental regions of Siberia are discussed.
2.
Gas Hydrates - Where Are We Now?
Gas hydrates have been known for more than 200 years (Priestley, 1778). However, we have been studying hydrates in industrial systems for only about 70 years. There are more than 5000 publications related to research on gas hydrates. We have learned some of the properties of hydrates formed in the technological systems of gas production and transport. We know the conditions for the formation and dissociation of gas hydrates, the methods for removing hydrate plugs from pipelines, and the methods for preventing hydrate formation. The presentation discusses the areas of gas hydrate study that still need to be developed.
3.
Data on kinetics and thermodynamics of gas hydrates, application
to calculations of phase formation
4.
Gas Hydrates Management Program at GTI
Gas hydrates are an impediment to gas flow as well as a potential energy resource. When they form inside pipelines, hydrates can slow or completely block gas flow, a significant problem for producers striving to move gas from offshore wells to onshore processing facilities. Producers and gas storage and transmission companies spend millions of dollars each year on hydrate inhibitors and other measures to help prevent hydrate formation, trying to balance cost, environmental impact, efficiency and safety. A better understanding of the mechanisms that trigger hydrate formation and dissociation could lead to the creation of more effective hydrate inhibitors.

GTI is the premier industry-led natural gas research and development organization in the United States, dedicated to meeting current and future energy and environmental challenges. At its facilities near Chicago, Illinois, GTI has assembled state-of-the-art laboratories (laser imaging, acoustics and calorimetry) operated by an expert research team that is uniquely equipped to investigate the mechanisms of formation and dissociation of gas hydrates and the impact of drilling fluids, low-dosage inhibitors and anti-agglomerants on hydrates. Recent results from the facility are presented.
5.
Computer Modeling of the Properties of Gas Hydrates - The State of the Art
Various theoretical techniques for modelling the physical, thermodynamic and electronic properties of gas hydrates will be reviewed. Selected examples from the recent work of the author's group will be presented. Emphasis will be placed on the prediction of the dynamic properties, occupancy, and formation and dissociation mechanisms of gas hydrates. A perspective on using advanced simulation methods for the prediction of phase equilibria will be discussed.
6.
Natural Gas Hydrates Studies in China
Studies of natural gas hydrates are very important. The CODATA Task Group on Data on Natural Gas Hydrates was approved in October 2000. In China, gas hydrates are a promising field for study and exploration. China's permafrost regions account for 10% of the world's permafrost, notably in the mid-latitude, high-altitude mountainous regions of the Qinghai-Tibet Plateau. Oil and gas resources have been confirmed by exploration in the north of the Tibet Plateau, and observations in the Qinghai-Tibet Plateau have documented methane emissions and carbon dioxide uptake. This evidence suggests that substantial volumes of gas hydrate may exist there. In addition, China's extensive seas and long shoreline make offshore study and exploration of gas hydrates promising. In China's offshore seas, mainly the South China Sea and the East China Sea, clear signs of hydrates have been identified in seismic reflection profiles, and high seawater temperatures and high ratios of methane in fluids have been observed. All these signs and observations indicate that large amounts of gas hydrates may well exist in China's offshore seas.
7.
State of the CODATA Project on an Information System on Gas Hydrates
The previous CODATA General Assembly approved the establishment of a Task Group on Gas Hydrates Data. The most authoritative specialists in the field of gas hydrates were invited to be members of the group. Together they represent all the major fields of science and technology related to gas hydrates, and most of the countries where gas hydrate studies attract significant attention. The group has developed a concept and general recommendations on the structure of the system and its data requirements. More than a hundred groups in different countries have now been identified as prospective participants in the creation of the system.
1.
Molten Salt Database Project: Building Information and Predicting Properties
The genesis of the Molten Salt Database, realized as early as 1967 with the publication of the Molten Salt Handbook by George Janz, is as relevant today as it was over 30 years ago. New high-tech applications of molten salts have emerged, and the need for data is crucial for the development of new processes (pyrochemical reprocessing of nuclear fuel, nuclear reactors of a new generation, elaboration of new materials, new environment-friendly energy sources, ...). Building a world-class, critically evaluated database is a difficult and complex process, involving considerable time and money. Ultimately, the success of the project depends on positive interactions among a diverse group of people: support staff to identify and collect relevant literature, scientists to extract and evaluate the data, database experts to design and build the necessary data architecture and interfaces, database reviewers to ensure that the database is of the highest quality, and marketing staff to ensure the widest dissemination of the database. The advent of the World Wide Web (WWW) has provided another exciting component to this paradigm: a global database structure that enables direct data deposition and evaluation by the scientific community. New concepts in engineering data information systems are also emerging and make it possible to merge people, computers, databases and other resources in ways that were simply never possible before. These efforts are made in parallel with our current research activities on molten salts, but also in interaction with other related actions on materials and engineering. For instance, it is also intended to adapt and apply methodologies originally used for other purposes (the "human genome") to the field of molten salts, as recently demonstrated for other materials by K. Rajan at RPI using computational "informatics" tools.
2.
Development of Knowledge Base System Linked to Material Database
The distributed
material database system named 'Data-Free-Way' has been developed
by four organizations (the National Institute for Materials
Science, the Japan Atomic Energy Research Institute, the Japan
Nuclear Cycle Development Institute, and the Japan Science and
Technology Corporation) under a cooperative agreement in order
to share fresh and stimulating information as well as accumulated
information for the development of advanced nuclear materials,
for the design of structural components, etc. In the system
retrieved results are expressed as a table and/or a graph.
3.
Activity on Materials Databases in the Society of Materials
Science, Japan
4.
Role of MITS-NIMS in the Development of Materials Databases
The Material Information Technology Station (MITS) of the National Institute for Materials Science (NIMS), established in October 2001, aims to be a worldwide information center for materials science and engineering. Our main activities include fact-data production and publication, literature data acquisition, and database production. We have been running metal creep and fatigue experiments for 35 years, and the data are published and distributed as NIMS Data Sheets. In addition, starting this year, we have begun literature data acquisition on materials' structure and properties. Both the fact data and the literature data are stored and managed as databases. We are constructing more than 10 material databases, covering polymers, metals and alloys, nuclear materials, superconducting materials, etc. Online services for these databases will be available from next April.

Aware that a simple system with only a data retrieval function cannot provide enough information for materials research and industrial activities, in which not only data but also data-related knowledge and decision-support functions are needed, we have started several new research and development projects aiming to construct intelligent material information systems with data integration, data analysis and decision-support functions. One of our projects is to develop a material risk information platform. Based on material property databases, material life prediction theory, and accident information databases, this platform will provide users with material risk knowledge as well as fact data, for the safe use and correct selection of materials used in high-risk equipment, for example a power plant. Another system under construction is a decision-support system for composite material design: a composite design and property prediction system. With this system, a virtual composite can be composed with a chosen structure and component materials. Basic properties of the composite, such as thermal conductivity, can then be evaluated from its constitution and the properties of the constituents stored in the databases.
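As a rough sketch of how a composite property might be evaluated from its constituents, the snippet below applies the classical parallel and series rules of mixtures for thermal conductivity; the actual MITS-NIMS system may well use more elaborate micromechanical models, and the numbers are placeholders.

    # Rule-of-mixtures bounds on composite thermal conductivity.
    def k_parallel(f1, k1, k2):
        """Volume-fraction-weighted (upper bound) conductivity."""
        return f1 * k1 + (1 - f1) * k2

    def k_series(f1, k1, k2):
        """Harmonic-mean (lower bound) conductivity."""
        return 1.0 / (f1 / k1 + (1 - f1) / k2)

    # Placeholder constituents: 40% fiber (k=10 W/m.K) in matrix (k=0.5 W/m.K)
    print(k_parallel(0.4, 10.0, 0.5), k_series(0.4, 10.0, 0.5))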
1.
Thermodynamic Properties and Equations of State at High Energy
Densities
During the last century the range of accessible thermodynamic parameters was greatly broadened by the rapid development of technology. The thermodynamic properties of matter at high pressures and temperatures are very important for fundamental research in nuclear physics, astrophysics and the thermodynamics of dense plasmas. A number of applications, such as nuclear fusion, thermonuclear synthesis, the creation of new types of weapons, and comet and meteorite hazards, require knowledge of experimental data over a wide region of parameters.
2.
Internet Chemical Directory ChIN Helps Access a Variety of Chemical Databases on the Internet
3.
Graph-Theoretical Concepts and Physicochemical Data
4.
Progress in the Development of Combustion Kinetics Databases
for Liquid Fuels
5.
Database of Geochemical Kinetics of Minerals and Rocks
Data on the reaction rates of minerals and rocks in water at high temperature T and high pressure P are important for understanding water-rock interactions in the lithosphere and for dealing with the pollution of groundwater and with deeply buried nuclear waste. Reaction rates have been measured experimentally in the T range 25 to 300 °C at various pressures; a few kinetic experiments on mineral dissolution have been performed at T above 300 °C and P higher than 22 MPa. Experiments are usually carried out in flow reactors.

When a continuous stirred tank reactor (CSTR) is operated, steady-state dissolution rates r (mol·s⁻¹·m⁻²) are computed from the measured solution composition using

    r = F·ΔC_i / (ν_i·s)

where ΔC_i is the molar concentration difference between the inlet and outlet for the ith species in solution, F is the fluid mass flow rate, ν_i is the stoichiometric content of i in the mineral, and s is the total mineral surface area in the reactor (m²).

When a flow-through packed bed reactor (PBR) is operated, mineral particles are placed inside a vertical vessel. Within the PBR, a transient material balance on a column of length Z gives the axial dispersion model

    ∂C_i/∂t = D_L·∂²C_i/∂Z² − U·∂C_i/∂Z

which characterizes mass transfer in the axial direction in terms of an effective longitudinal diffusivity D_L superimposed on the plug-flow velocity U. The length Z and the velocity U are known, and D_L can be determined by measuring the residence-time distribution function of the flow system. If the boundary and initial conditions are known, the dissolution rate of the mineral is derived from the following mass balance for the concentration of the ith solute in a reactor cell:

    dC_i/dt = (C_i,in − C_i)/t̄ + ν_i·r·s/V

where C_i is the concentration of the ith species, t̄ is the average residence time, and V is the solution volume in the pressure vessel (ml). Rates are represented by a rate law of the form

    R_net = k₊·∏ a_i^m

where R_net is the net rate of reaction, k₊ is the rate constant of the forward reaction, and a_i is the activity of species i in the rate-determining reaction, raised to some power m. Other cases, e.g. incongruent dissolution, non-linear dissolution rates and non-linear dynamics in the reaction system (where they occur), are also included. This database will also provide simulation models for predicting water-rock interactions in nature.
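For illustration, the steady-state CSTR expression r = F·ΔC_i/(ν_i·s) can be evaluated directly; the numbers below are placeholders, not database values.

    # Evaluate the CSTR steady-state dissolution rate r = F*dC/(nu*s).
    F = 2.0e-3    # fluid mass flow rate, kg/s (placeholder)
    dC = 5.0e-5   # inlet-outlet molar concentration difference, mol/kg
    nu = 1.0      # stoichiometric content of species i in the mineral
    s = 0.85      # total mineral surface area in the reactor, m^2

    r = F * dC / (nu * s)          # mol s^-1 m^-2
    print(f"r = {r:.3e} mol s^-1 m^-2")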
1.
Development of a Large System of Clustered Engineering Databases
for Risk-Based Life Management
2.
Open Corrosion Expertise Access Network
The paper will describe concepts and results from the European Commission-supported "OCEAN" project (Open Corrosion Expertise Network), the first phase of which is due to be finalized during 2002. The objective of this project is the design and implementation of an open, extensible system for providing access to existing corrosion information. This will be achieved through a network of interested data providers, users and developers. Where available, existing standards and technologies will be used, with the partners developing informatics and commercial protocols to allow users single-point access to distributed data collections. One of the major difficulties of corrosion engineering is the
multi-dimensional nature of the corrosion problem. A very large
number of alloys are available, and these may then be exposed
to an almost infinite range of environments. Thus, although
many thousands of corrosion tests have been performed and numerous
papers published, it remains difficult for the individual corrosion
engineer to bring together the information that is relevant
to a specific situation. To some extent this problem has been
tackled by centralized collections of corrosion data and abstracts.
However, these are limited to published information, and tend
to be rather inaccessible to potential users. The latter problem
relates partly to the dedicated user interfaces that are typically
used with these data collections, and partly to the commercial
necessities of ensuring a reasonable return for the information
providers.
3.
Use of Database Technology for Saving and Rescuing Perishing Engineering Data and Information in Eastern Europe
4.
The Background and Development of MatML, a Markup Language for
Materials Property Data
1.
Requirements for Access to Technical Data -- An Industrial Perspective
The ultimate objective of any collaborative venture is to share understanding. Such collaboration is the fundamental basis for all social activity. The modern-day challenge is to collaborate across the globe in an environment where change is an ever-increasing factor. The digital information revolution both fuels this challenge and offers to alleviate it. However, the "Tower of Babel" remains a highly relevant parable.

Integration of computer systems is a multi-level problem. While integration is increasingly available across the foundation levels of hardware, software, user access and data, semantic integration is rarely based on an explicit, agreed information model. Such models control the representation of data. XML is now a major tool in the kit of system integrators. In order to control the content of an XML file, the necessary information model is either a DTD (Document Type Definition) or, increasingly, an XML Schema. Organisations are generating large numbers of different DTDs and XML Schemas to address the needs of individual projects. Creating information models for integration purposes causes a great deal of pain as different organisations meet to agree and define the terminology and the required information capability. The XML community is new to this challenge, whereas the ISO sub-committee TC184/SC4 <http://www.tc184-sc4.org/> has been working for almost twenty years to create (currently) six standards, including ISO 10303 ("Product data representation and exchange", or "STEP"). The ISO/TC184/SC4 family of information standards addresses a wide range of industrial requirements, and mature parts of the standards have delivered real business benefits to various projects.

Some challenges remain for such information standards: deployment in conjunction with project management requirements; facilitation of concurrent systems engineering; adoption by small to medium enterprises; security; intellectual property rights; legacy systems; and integration of multiple sources. Such requirements remain the barrier between the sources of high-quality scientific and technical data and the exploitation of those data within industry. The WWW and other communities have recognised that XML as a single prevalent representation format is not sufficient, and a current hot topic is ontologies. Potentially, ontologies offer a different route to integration, where unified definitions across the integration levels form the basis for automated analysis and the creation of integration solutions. However, in the short term, "ontology" is a label used in too many different guises, and projects such as the Standard Upper Ontology <http://suo.ieee.org/> will require further development before industry can effectively exploit the potential power of ontologies.
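As a small illustration of the point that an information model controls XML content, the sketch below validates a document against a DTD using the third-party lxml library; the element names are invented for this example, not drawn from any of the standards discussed above.

    # Validate an XML document against a DTD with lxml.
    from io import StringIO
    from lxml import etree

    dtd = etree.DTD(StringIO("""\
    <!ELEMENT part (name, mass)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT mass (#PCDATA)>
    """))

    doc = etree.XML("<part><name>bracket</name><mass>0.42</mass></part>")
    print(dtd.validate(doc))   # True: the document obeys the model
    print(dtd.validate(etree.XML("<part><name>bracket</name></part>")))  # False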
2.
To improve materials databases as an intelligent foundation, many databases have been developed covering a wide range of materials. However, most of them were built independently for each field of research and contain only numerical fact data; in reality, very few have been realized as fully utilizable database retrieval systems. If material databases and material data are regarded as common property, easy sharing and mutual use of material databases is required, along with their use outside specialized materials fields. To respond to this demand, a prototype platform system enabling mutual use across the boundaries between material databases was developed.
3.
XML data-description for data-sharing of material databases
By developing this data-sharing system, the various properties of materials stored in different databases can be linked on a generic platform with a standardized template, for use in the life-cycle design of products through comprehensive approaches such as DfE (Design for Environment) and DfS (Design for Safety) used in industry.
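A hedged sketch of what such a standardized template might look like in code: the snippet builds and serializes a minimal materials-property record with Python's standard xml.etree module. The element and attribute names are hypothetical, not the project's actual schema.

    # Build a minimal materials-property XML record (illustrative schema).
    import xml.etree.ElementTree as ET

    record = ET.Element("material", attrib={"name": "Alloy-X"})
    prop = ET.SubElement(record, "property",
                         attrib={"name": "thermal_conductivity",
                                 "unit": "W/(m.K)"})
    prop.text = "16.2"

    print(ET.tostring(record, encoding="unicode"))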
4.
A Prototype of an Interoperable System for Data Evaluation of Creep and Fatigue
Tetsuya Tsujikami, Faculty of Science and Engineering, Ryukoku University, Japan

From the early stages of computerization, creep and fatigue data have been stored in computers, and many materials databases have been built in this area. A number of numerical/statistical procedures for curve fitting of creep and fatigue data have also been proposed. But none of them are yet interoperable. Materials data systems in the Internet era should be interoperable not only for the factual data but also for the data evaluation modules.
1.
Materials Data on the Internet
The availability of the Internet has provided unprecedented opportunities for both data compilers and data users, which this presentation will explore with respect to materials data. Examples will be given of specific materials databases available on the Internet from a variety of materials data fields. While there is no question that large and widely varied bodies of data are accessible on the Internet, significant improvements are needed promptly, or prospective users will become so disillusioned that they abandon electronic access to data. The problems that need attention will be discussed.
2.
Physicochemical Data in Landolt-Börnstein Online
Nearly 120 years ago the data collection Landolt-Börnstein was founded in the field of physical chemistry. The broad scope of this expert data collection, in fields ranging from elementary particle physics to technology, and the strong increase in the number of primary articles forced a transition to the open New Series, whereas the former 1st to 6th editions were planned as closed editions. Since 1996, CD-ROMs have been produced in parallel with the printed volumes. In 2000, Landolt-Börnstein offered free access to all volumes published up to 1990. This prerelease was used heavily by the 10,000 registered test users; more than two million pages were downloaded in a short period. An electronic user survey showed that more than 80% of the users wanted a full electronic version of LB at their workplace. At the end of 2001 the complete Landolt-Börnstein collection went online. A full-text search engine allows searches for substances and properties within all 300 LB volumes, i.e. 150,000 pages and 25,000 documents. The search can be limited to a group of Landolt-Börnstein volumes; specific searches are possible in the author, document title and table of contents fields; and simultaneous search in LB and all Springer journals is possible. Users can receive automatic alerts according to their profile of interest.

Physicochemical data are collected systematically by specialists in the field, and various databases have been built up. LB has excellent cooperation with several database centers. First of all, they provide the raw data, which are then used by authors inside or outside those institutions to prepare selected, evaluated and recommended data for the printed version of Landolt-Börnstein. For the electronic version, additional data and references can be included. All of the material is double-checked by scientists and their assistants in the Landolt-Börnstein editorial office. Examples of physicochemical data are presented: 1) thermodynamic data of pure substances and their mixtures, in cooperation with TRC/NIST in the USA and SGTE in Europe; 2) the liquid crystals database LIQCRYST, with Scidex, including the development of a specific graphical structure search tool for organic substances (search by CAS registry number, molecular formula, chemical name etc. is of course included; for a given substance the search yields a dynamic combination of a variety of physical properties, e.g. NMR, NQR and density data); 3) high-quality phase equilibrium data, i.e. phase diagrams, crystallographic and other thermodynamic data, in a simple-to-use periodic table system.
3.
Expressing Measurements and Chemical Systems for Physical
Property Data
Physical property data are typically associated with a measurement of a particular chemical system in a particular state. In order for such data to be used effectively, both the measurement and the system must be appropriately documented. In the scientific literature this information is often presented in great detail, while in electronic databases it is often reduced to a minimal form. For example, a scientific paper may discuss the presence or absence of impurities in a reagent, while the entry in an electronic database may simply refer to the reagent as a pure compound. For many applications such an approach is reasonable, but for others it may limit the uses to which the database can be applied. A common response to criticism that electronic databases lack this sort of information is to note that researchers can always refer to the original literature from which the data were abstracted. While it may not be possible to match the detail of the original literature, providing richer information in this area could give researchers using electronic databases several advantages. If a researcher searches a database and finds that three values for a property have been measured, with two measurements quite close to each other, the researcher may conclude that the value lies near those two measurements, discarding the third. However, if the researcher is provided with information indicating that the third measurement was made by a more reliable method, that value may be chosen instead. A major obstacle to providing such information in electronic form is that it requires a grammar capable of expressing it. Since this sort of information is not always recorded, such a grammar must allow for the ability to state that the information is unknown or only known to a limited extent. This talk will discuss some possible approaches to improving the manner in which chemical systems and measurements are expressed in electronic form. It will include examples of problems encountered in the development of the NIST Chemistry WebBook, a web site which contains physical property data compiled from several databases.
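One possible shape for a grammar that can state that metadata is unknown or only partially known is sketched below; the field names are illustrative and do not reflect the NIST Chemistry WebBook's internal schema.

    # A record type that distinguishes "not recorded" from a real value.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Measurement:
        property_name: str
        value: float
        unit: str
        method: Optional[str] = None           # None = not recorded in source
        sample_purity: Optional[float] = None  # mole fraction, if reported

    m = Measurement("normal boiling point", 353.25, "K",
                    method="ebulliometry", sample_purity=None)
    print(m)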