19th International CODATA Conference
Category: Interoperability

Natural Products Databases based on SciDex

Volkmar Vill
Institute of Organic Chemistry, University of Hamburg, Germany
http://liqcryst.chemie.uni-hamburg.de, www.lci-publisher.com


PhytoBase (Wessjohann/IPB-Halle) and NAPIS (Rauzah/Kuala Lumpur) are two new examples of integrated information systems for natural products. SciDex handles all data types these databases require, including assigned spectra, isolation procedures and images. Support for biological data is newly developed: taxonomic trees can be displayed dynamically from simple properties.

The SciDex database management system is a flexible and powerful object-oriented system that has previously been used for materials databases. The term 'object-oriented' here means that each kind of scientific information has its own specific way of being displayed, stored and searched, without any loss of quality imposed by SQL rules. SciDex is a complete information system for local PC, intranet and Internet use, including substructure comparison, data validation and numerical analysis. Search and comparison of structure properties, including NMR shifts, couplings, partial charges, etc., have been enhanced. Support for taxonomic relations is new.

A fully integrated information system means that chemicals and organisms, isolation and spectroscopy, journals and patents, etc. are kept in one consistent format and can be fully cross-linked and cross-referenced with each other. Existing data can be documented and the recorded data analyzed, ultimately leading to the prediction of selected properties from the chemical structure. All of this is also completely portable to an intranet system, in which the database resides on one computer and clients on multiple other computers access the same data concurrently and transparently.

SQL-based systems and fully object-oriented systems (OOS) differ in that the latter can define a property such as an NMR spectrum by its own object definition, comprising chemical shifts (as 'peak list' and as 'assigned structure'), coupling constants (as intervals with error bars), solvent, measurement frequency, etc. The objects define their own rules, and the solution to a query has the same symmetry as the problem. The registration process implies neither a loss of information nor the generation of 'pseudo-information'.
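As a rough illustration of such an object definition, the following Python sketch models an NMR property with its own sub-fields. It is not SciDex code: the class names (`Interval`, `NMRSpectrum`), the field names and the example values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Interval:
    """A measured value with symmetric error bars (hypothetical type)."""
    value: float
    error: float = 0.0

    def overlaps(self, lo: float, hi: float) -> bool:
        # True if [value - error, value + error] intersects [lo, hi].
        return self.value - self.error <= hi and self.value + self.error >= lo


@dataclass
class NMRSpectrum:
    """An NMR property that carries its own sub-fields (hypothetical type)."""
    nucleus: str
    solvent: str
    frequency_mhz: float
    # 'peak list': (chemical shift in ppm, intensity) pairs
    peak_list: List[Tuple[float, float]] = field(default_factory=list)
    # 'assigned structure': atom index -> chemical shift in ppm
    assigned_shifts: Dict[int, float] = field(default_factory=dict)
    # coupling constants in Hz, each stored as an interval with error bars
    couplings: List[Interval] = field(default_factory=list)

    def has_coupling_between(self, lo_hz: float, hi_hz: float) -> bool:
        # The query respects the error bars instead of comparing bare numbers.
        return any(j.overlaps(lo_hz, hi_hz) for j in self.couplings)


# Illustrative values only, not taken from any database.
spectrum = NMRSpectrum(
    nucleus="1H",
    solvent="CDCl3",
    frequency_mhz=400.0,
    peak_list=[(7.26, 1.0), (3.71, 3.0)],
    assigned_shifts={1: 3.71},
    couplings=[Interval(7.2, 0.3)],
)
print(spectrum.has_coupling_between(7.4, 8.0))
```

Because the coupling constant is stored as an interval rather than as a single number, a range query can take the measurement uncertainty into account, which is one way to read the statement that no information is lost on registration.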

The following diagram illustrates the concept of SciDex.

Fig 1: Concept of the SciDex database

In a natural products database, compounds, organisms and drugs are all handled in the same way, each containing links to the others. The taxonomic information is stored as well, creating yet another relationship that can be displayed in a 'BioTree' (Fig. 2). The tree shows the relations between species, family, order, class and phylum in a hierarchical display from which single entries or a whole tree can be selected as desired. This BioTree is calculated in real time from individual, even ambiguous, links between single taxa.

Fig 2: Display of a “BioTree”
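How a tree can be derived on demand from individual links may be sketched as follows. This is a minimal Python example under stated assumptions, not the SciDex algorithm: each record is assumed to contribute a simple (child taxon, parent taxon) pair, and the function names `build_tree` and `render` are hypothetical. A taxon linked to more than one parent (an ambiguous link) is simply shown under each of them.

```python
from collections import defaultdict


def build_tree(links):
    """Collect (child, parent) pairs into root names and a children map."""
    children = defaultdict(set)
    nodes, has_parent = set(), set()
    for child, parent in links:
        children[parent].add(child)
        has_parent.add(child)
        nodes.update((child, parent))
    # Roots are taxa that never appear as a child.
    roots = sorted(nodes - has_parent)
    return roots, children


def render(roots, children):
    """Return the tree as indented text lines, one taxon per line."""
    lines = []

    def walk(node, depth, path):
        lines.append("  " * depth + node)
        for child in sorted(children.get(node, ())):
            if child not in path:  # guard against circular links
                walk(child, depth + 1, path | {child})

    for root in roots:
        walk(root, 0, {root})
    return lines


# Illustrative links, as they might be collected from single entries.
links = [
    ("Mentha piperita", "Mentha"),
    ("Mentha", "Lamiaceae"),
    ("Salvia", "Lamiaceae"),
    ("Lamiaceae", "Lamiales"),
]
roots, children = build_tree(links)
print("\n".join(render(roots, children)))
```

Since the tree is rebuilt from the links each time it is displayed, adding or correcting a single taxonomic link in one entry changes the displayed hierarchy without any separate tree structure having to be maintained.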

SciDex can store all spectroscopic information in an efficient and simple way without resorting to overly specific object types. The subfields of NMR, for example, use 'structure' and 'peak list' as data types. Structures are objects not only of the compound but also of any property. Because every structure within SciDex can carry 3D data, NMR shifts and partial charges, it can easily be used to store, display and analyze these data. Substructure searches containing exact shifts, ranges of shifts or coupling constants are easy to carry out (see Fig. 3).
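The idea of a structure whose atoms carry their own data can be illustrated with a small Python sketch. It covers only the per-atom shift-range filter, not the substructure matching itself, which would require a graph-matching step that is omitted here. The `Atom` type, the function `atoms_in_shift_range` and the approximate 13C values for ethanol are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Atom:
    """One atom of a structure with optional per-atom data (hypothetical type)."""
    index: int
    element: str
    shift_ppm: Optional[float] = None      # assigned NMR shift, if any
    partial_charge: Optional[float] = None


def atoms_in_shift_range(atoms: List[Atom], element: str,
                         lo: float, hi: float) -> List[int]:
    """Return indices of atoms of `element` whose assigned shift lies in [lo, hi]."""
    return [
        a.index
        for a in atoms
        if a.element == element
        and a.shift_ppm is not None
        and lo <= a.shift_ppm <= hi
    ]


# Ethanol with approximate, illustrative 13C shifts; the oxygen has no shift.
ethanol = [
    Atom(1, "C", shift_ppm=18.0),
    Atom(2, "C", shift_ppm=58.0),
    Atom(3, "O"),
]
print(atoms_in_shift_range(ethanol, "C", 50.0, 70.0))
```

An exact-shift query is the special case in which `lo` and `hi` coincide, or differ only by a chosen tolerance.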

The 'peak list' is an open array of X-Y value pairs that can be used for all kinds of spectra, including IR, MS, UV and NMR data. A graphical display of the spectrum is generated automatically when such a property is viewed in 'Single Property' mode (see Fig. 4).
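A generic X-Y peak list and a display derived from it can be sketched in a few lines of Python. This is an illustration only: the `PeakList` alias, the function names and the example peaks are assumptions, and the text rendering stands in for the graphical display that SciDex generates.

```python
from typing import List, Tuple

# A peak list is just an open-ended sequence of (x, y) pairs, for example
# (m/z, intensity) for MS or (wavenumber, absorbance) for IR.
PeakList = List[Tuple[float, float]]


def base_peak(peaks: PeakList) -> Tuple[float, float]:
    """Return the (x, y) pair with the highest y value."""
    return max(peaks, key=lambda p: p[1])


def to_relative(peaks: PeakList) -> PeakList:
    """Scale intensities so that the base peak equals 100."""
    _, top = base_peak(peaks)
    return [(x, 100.0 * y / top) for x, y in peaks]


def text_plot(peaks: PeakList, width: int = 40) -> List[str]:
    """Render a crude stick spectrum as text, one line per peak."""
    return [
        f"{x:8.1f} | " + "#" * round(width * y / 100.0)
        for x, y in to_relative(peaks)
    ]


# Illustrative peaks, not measured data.
peaks = [(43.0, 50.0), (58.0, 200.0)]
print("\n".join(text_plot(peaks)))
```

Because the same pair structure serves every kind of spectrum, only the axis labels and units differ between, say, a mass spectrum and an IR spectrum; the storage and the plotting routine stay the same.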

References

http://liqcryst.chemie.uni-hamburg.de

http://www.lci-publisher.com

http://www.phytobase.org

http://www.glycoprojects.kimia.um.edu.my/website/Glyco/Intro1


Fig 3: NMR-data assigned to the atoms of a structure

Fig 4: Mass spectrum as 'peak list', together with the automatically generated graph

 

Keywords:  natural products, spectroscopy, database management systems, taxonomy