Session: Primary Biological Databases
EMBL Nucleotide Sequence Database
Guy Cochrane (cochrane@ebi.ac.uk),
EMBL Outstation Hinxton, The European Bioinformatics
Institute, Wellcome Trust Genome Campus, Hinxton,
The EMBL Nucleotide Sequence Database sets
out to provide an archive of primary nucleotide sequence and annotation. Representing
some 74 million sequence entries from across the
living world collected over more than a quarter of a century, the database continues
to grow at an exponential rate. Along with collaborators DDBJ and GenBank, the EMBL Nucleotide Sequence Database aims
for comprehensive coverage of all publicly available sequence. On a nightly
basis, the three collaborating databases exchange data, such that archived
sequence and annotation are available through search and retrieval tools at all
three sites.
Data are recruited from
submitters through a variety of routes, tailored to the needs of the submitters
and their data. In-house curators work with submitters to strive for consistent
use of annotation structures across the whole body of data.
Presentation of EMBL Nucleotide Sequence Database data
at the EBI includes the provision of entry retrieval tools, whole database
releases for download, sequence homology search tools and the Sequence
Retrieval System, SRS, for building complex searches by specific field.
Furthermore, nucleotide sequence data are integrated, through cross-referencing,
with a host of other bioinformatics resources at the EBI and beyond.
In the
talk, I will introduce the database, highlight a number of recent developments
and discuss approaches to dealing with the ever-increasing
volume and diversity of data.
Keywords: nucleotide sequence, database, annotation, bioinformatics tool.