Theme II-4: Quality Issues in Bioinformatics
T.N. Bhat, J. Rumble, G. Gilliland
The biotechnology sector
has generated vast amounts of data and will continue to do so.
Consistent schemas, uniform validation tools, and
standard database interfaces are needed to allow efficient queries
and exchange of data. Quality and uniformity are two major requirements
for dependable results. However, data diversity is
also an important consideration, because databases must accommodate
a wide range of applications. By diversity we mean inclusion of
the diverse nomenclature and description systems in place in biology.
The need for uniformity
amid diversity arises in several contexts. Examples include
molecule names, validation parameters, the definitions of homology
and of a domain, data formats, and database query interfaces.
The right balance between "order" and "disorder"
among these terminologies is crucial for successful data exchange
and user query interfaces. Another issue of great importance is
the distinction between user-deposited data and the value-added
information introduced by the organizers of the database. Most
present data validation efforts apply only at
the time of deposition and operate through the
regular channels of refereeing. Often, to improve
quality and data uniformity, databases introduce new information.
The aim of this session is to define and discuss these quality
issues in bioinformatics and to propose improvements and ‘preferred
validation procedures and guidelines’. The session is expected
to focus on both archival databases (e.g., GenBank, SWISS-PROT,
PDB) and derived databases (e.g., SCOP, ModBase). Databases often
provide a citation for the database. However, for practical
and evolutionary reasons, the citation may not adequately
document the scope of the database. It is hoped that this session
will address these issues as well.
Quality Issues in Biomacromolecular Structure Databases
T.N. Bhat and G. Gilliland, NIST, USA
Validation of Protein Name Assignments in Databases
W.C. Barker, F. Pfeiffer, National Biomedical Research Foundation, USA
Incorporation of Feedback for Quality Control in Automated Protein Sequence Annotation
John S. Garavelli, Protein Information Resource, National Biomedical Research Foundation, Washington
Organisation and Standardisation of Information in SWISS-PROT and TrEMBL
Michele Magrane, EMBL Outstation, Wellcome Trust Genome Campus
DANTE: A Workbench for Sequence Analysis
Javier Tamames* and Anna Tramontano
Problems with the denominator in epidemiological studies
Dirk J. van Schalkwyk, Faculty of Business Informatics, Cape Technikon, South Africa
Quality Issues in Data Banks for Molecular Biology
A. M. Lesk, University of Cambridge Clinical School, Wellcome Trust Institute for Molecular Mechanisms of Disease, Wellcome/MRC Building, Hills Road, Cambridge CB2 2XY, U.K.
Round Table Discussion II-A: Standards in Biological Information Systems
A.M. Lesk, M. Krichevsky, W.C. Barker, M. Magrane