Track I-D-2:
The US National Library of Medicine's Visible Human Project®
Data Sets
Chair: Michael J. Ackerman, National Library of Medicine,
National Institutes of Health, USA
In the mid-1990s,
the US National Library of Medicine sponsored the acquisition
and development of the Visible Human Project®
database. This image database contains anatomical cross-sectional
images, which allow the reconstruction of three-dimensional
male and female anatomy to an accuracy of less than 1.0
mm. The male anatomy is contained in a 15 gigabyte database,
the female in a 40 gigabyte database.
This session
will consist of four papers. The first will summarize
the history of the Visible Human Project®
and the development of the Visible Human data sets. We
will then explore the problems encountered in the real-time
navigation of such large image databases. The third paper
will discuss the extraction of data from such a database,
and the final paper will cover the problems of validation.
1. The Visible Human Project® Image Data Sets: From Inception to Completion and Beyond
Richard A. Banvard, National Library of Medicine, National Institutes
of Health, USA
The Visible Human
Project® Data Sets resulted from a recommendation
of the National Library of Medicine (NLM) Board of Regents'
1987 Long Range Plan that stated the NLM should "thoroughly
and systematically investigate the technical requirements and
feasibility of instituting a biomedical images library."
Acting on the suggestion of an expert panel convened by the Board, which reported in April 1990 that "NLM should undertake a
first project, building a digital image library of volumetric
data representing a complete normal adult human male and female.
This 'Visible Human' project would include digital images derived
from computerized tomography, magnetic resonance imaging, and
photographic images from cryosectioning of cadavers,"
NLM contracted the University of Colorado in August 1991
to undertake collection of this "Visible Human"
image data set. In November 1994 the Visible Human Male data
set was announced and released to the public, followed one year
later by the Visible Human Female. The data sets are available
via FTP at no cost to anyone holding a no-cost license. Each
image (CT, MRI, and cryosection) is stored as a separate file;
images can be downloaded individually or in any number, up to the entire
data set. Several mirror sites have been established to facilitate
download for international license holders. The images can also
be purchased on tape for a fee from the National Technical
Information Service (NTIS). This session will include a discussion
of the genesis of the Visible Human Project®,
a description of the University of Colorado's cryosectioning
procedures, and descriptions of several of the more interesting
and notable outcomes developed by license holders who have used
The Visible Human Project® Data Sets.
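Because each CT, MRI, and cryosection image is stored as a separate file, a licensed user can retrieve only the slices of interest rather than the full 15 or 40 gigabyte set. As a minimal illustration of that access pattern, the Python sketch below fetches a single file over FTP with the standard ftplib module; the host name and file path are placeholders, not the actual NLM or mirror locations.

```python
# Minimal sketch: fetch one Visible Human image file over FTP.
# The host and path below are HYPOTHETICAL placeholders; a real download
# requires the (no-cost) license and the paths published by NLM or a mirror.
from ftplib import FTP

HOST = "ftp.example-mirror.org"                                   # placeholder mirror host
REMOTE_FILE = "/visible_human/male/cryosection/slice_1234.raw"    # placeholder path
LOCAL_FILE = "slice_1234.raw"

ftp = FTP(HOST)
ftp.login()                               # anonymous login, if the mirror allows it
with open(LOCAL_FILE, "wb") as out:
    ftp.retrbinary(f"RETR {REMOTE_FILE}", out.write)
ftp.quit()
print(f"saved {LOCAL_FILE}")
```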
2. Visible Human Explorer
Hao Le, Flashback Imaging Inc., Canada
Brian Wannamaker, Sea Scan International Inc., Canada
Imaging technology in medical applications continues to advance apace. This increases
the potential for improvements in medical research, diagnostic
procedures, and patient care. On the other hand, the increase
in imaging activity also increases the sheer volume of data
that must be dealt with. The imagery may be reviewed for immediate
diagnostic procedures and discarded. Or it may be stored or
archived for further use. However, storage or archiving is effectively
discarding unless effective means for recovering the data exist.
Accessibility is an essential component of developing and distributing
new knowledge from growing data volumes. This paper will discuss
specific approaches to improving accessibility of large image
databases like that of the Visible Human Project. Real-time
navigation of image databases in 2D and 3D, as well as user interfaces
designed for public and academic use, will be outlined. The presentation
will be illustrated with several thousand images from the Visible
Human Project.
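One generic way to keep 2D and 3D navigation responsive when the volume cannot fit in memory is to map the image stack on disk and read only the slice currently being viewed. The sketch below illustrates the idea with NumPy memory mapping; the file name, dimensions, and data type are assumptions made for the example and do not describe the actual Visible Human file layout or the authors' system.

```python
# Generic sketch of slice-by-slice navigation over a large image stack.
# File name, volume dimensions, and dtype are ASSUMED for illustration only.
import numpy as np

N_SLICES, HEIGHT, WIDTH, CHANNELS = 1878, 1216, 2048, 3   # assumed layout
volume = np.memmap("visible_human_stack.raw", dtype=np.uint8, mode="r",
                   shape=(N_SLICES, HEIGHT, WIDTH, CHANNELS))

def axial_slice(index):
    """Return one axial slice without loading the rest of the volume."""
    return np.array(volume[index])        # copies only this slice into memory

def coronal_slice(row):
    """Re-slice the same data along a second axis for 3D-style navigation."""
    return np.array(volume[:, row, :, :])

preview = axial_slice(1000)
print(preview.shape)                      # (1216, 2048, 3) under the assumed layout
```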
3. The NLM Insight Registration and
Segmentation Toolkit
William Lorensen, GE Research, USA
In 1999, the National
Library of Medicine (NLM) awarded six contracts to develop a
registration and segmentation toolkit. The overall objective
of the project is to produce an application programming interface
(API) implemented within a public domain toolkit. The NLM Segmentation
and Registration Toolkit supports image analysis research in
segmentation, classification and deformable registration of
medical images. This toolkit meets the following critical technical
requirements identified by the National Library of Medicine:
- Work with the Visible Human Male and Female data sets.
- Provide a foundation for future medical image understanding research.
- Become a self-sustaining code development effort.
- Accommodate periodic and incremental modifications and additions.
- Accommodate expansion to parallel implementations.
- Accommodate large memory requirements.
- Support a variety of visualization and/or rendering platforms.
In addition to the
technical challenges presented by these requirements, the selected
team and subcontractors had to work as a distributed group.
The software development experience of the groups also varied.
Some members had created software for a large community while
others had only developed software for their local groups. The
team defined a web-centric software development process modeled
after the Extreme Programming approach that relies on rapid
and parallel requirements analysis, design, coding, and testing.
Communication through web-based mailing lists and bug trackers
was supplemented with conventional telephone conferences.
The first public
version of the software is scheduled for release in October 2002.
This talk discusses
the chronology of the project and the core architecture and algorithms,
as well as the lightweight software engineering processes used
throughout the project. Finally, we present lessons learned
that will be of value to future distributed software development
projects.
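The toolkit itself is implemented in C++, but as a rough, hypothetical illustration of the kind of pipeline it is meant to support (filtering followed by segmentation), here is a short sketch using SimpleITK, a later simplified Python wrapping of the same Insight Toolkit; the file names and parameter values are placeholders rather than documented defaults.

```python
# Illustrative segmentation pipeline in the spirit of the NLM Insight Toolkit,
# written with SimpleITK (a later Python wrapping of ITK).  File names and
# parameter values are placeholders, not part of the toolkit's documentation.
import SimpleITK as sitk

image = sitk.ReadImage("vhp_slice_volume.mha")          # hypothetical input volume
image = sitk.Cast(image, sitk.sitkFloat32)              # smoothing filter expects floats

# Edge-preserving smoothing, then a simple intensity-threshold segmentation.
smoothed = sitk.CurvatureFlow(image, timeStep=0.125, numberOfIterations=5)
mask = sitk.BinaryThreshold(smoothed,
                            lowerThreshold=100, upperThreshold=255,
                            insideValue=1, outsideValue=0)

sitk.WriteImage(mask, "segmentation_mask.mha")
```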
4.
The Visible Human Data Sets: A Prototype and a Roadmap for
Navigating Medical Imaging Data
Peter Ratiu, Harvard Medical School, Brigham and Women's
Hospital, USA
The
Visible Human Data Sets are to date the most complete, multi-modality
data sets of human anatomy. The computational challenges
they pose have been widely discussed, and many of them have been
or are being solved by experts in various aspects of medical
image analysis and medical informatics. Their approach,
which has proved fruitful, is to regard the Visible Human
as a vast collection of bits, of single- and multi-channel
images, with little regard to its intrinsic content: human
anatomy. This approach allowed them to solve computational
problems that had appeared overwhelming at the inception
of the project: powerful servers can make the individual
images available, and can manipulate and display them in various
ways on end-users' desktops. An example of such a solution,
implemented by computer scientists, is the EPFL Server.
The more specific
problem of how to use this unique information in medical
research has been addressed less often. One reason is that
the data are vast and their manipulation seemingly unwieldy
for anatomists, who until now have been more versed in using
scalpels than mouse buttons. Another reason is the inherent
novelty of the data: for the first time, they open the possibility
of a quantitative approach to anatomy. However, this quantitative
approach can best be exploited by first defining problems
in anatomy, anthropology, and pathology in these terms.
I will discuss
two basic aspects of the Visible Human Project as a landmark
data set:
- The problem of establishing a universal anatomical coordinate system, with applications in basic research as well as clinical medicine (radiology, clinical imaging), and how the VHP can contribute to the solution (a toy coordinate-transform sketch follows this abstract).
- The need for a quantitative comparative anatomy, as this is becoming apparent in a broad array of disciplines, ranging from physical anthropology to gynecology. I will present how the VHP data can be employed as a roadmap for navigating diverse data.
The aim of this
presentation is to present the problems related to medical
imaging data to experts in other fields, in such a manner
that it may spark a mutually profitable dialog with hitherto
unconnected disciplines.
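As a toy illustration of what the first point, a universal anatomical coordinate system, implies computationally, the fragment below maps a point from one subject's voxel coordinates into a common reference frame with an affine transform; the matrix values are invented for the example and carry no anatomical meaning.

```python
# Toy sketch: mapping a subject-space point into a shared reference frame
# with a 4x4 affine transform.  The matrix below is INVENTED for illustration.
import numpy as np

# Rotation/scaling and translation taking subject voxel indices to reference mm.
subject_to_reference = np.array([
    [0.33, 0.00, 0.00, -120.0],
    [0.00, 0.33, 0.00,  -95.0],
    [0.00, 0.00, 1.00, -500.0],
    [0.00, 0.00, 0.00,    1.0],
])

def to_reference(point_voxel):
    """Apply the affine to an (x, y, z) voxel coordinate."""
    homogeneous = np.append(np.asarray(point_voxel, dtype=float), 1.0)
    return (subject_to_reference @ homogeneous)[:3]

print(to_reference((512, 300, 1000)))     # reference-frame coordinates in mm
```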
Track III-C-5:
Emerging tools and techniques for data handling in
developing countries
Chair: Julia Royall, Chief, International Programs, and
Director, Malaria Research Telecommunications Network,
for the National Library of Medicine, USA
This
session will feature three panelists, all working with
various tools and information technology to manage data
to improve health in Africa.
Allen
Hightower is Chief, Data Management Activity at CDC’s
National Center
for Infectious Disease and a pioneer in initiating NLM’s
malaria research network at a remote site on Lake
Victoria in Kenya. He has developed several tools for data collection
and management which will change the speed and quality
of data collection in Africa.
From
the KEMRI-Wellcome Trust research unit on the coast of
Kenya
comes Tom Oluoch, systems operator/data manager and co-creator
of a virtual library for this site, which brings together
researchers from Kenya Medical Research Institute and
Oxford University. His eyewitness case study is full of concrete
examples of how IT and data management have brought expansion
and change to this remote research unit.
Bob
Mayes is Chief, Health Informatics Section, Zimbabwe CDC AIDS Program.
CDC’s program of technical assistance to Zimbabwe
focuses on strengthening surveillance and laboratory measures,
scaling up promising prevention and care strategies, supporting
behavior change communication projects,
data mining, semantic management of data for systematic
review, and promoting technology transfer.
The
presenters will discuss individual examples and case studies,
as well as talk about how these tools can facilitate the
discovery process.
1.
Field Data Collection for the Malaria Research Network in Kenya
Allen Hightower, Centers for Disease Control, USA
Allen Hightower is Chief, Data Management Activity at CDC's
National Center for Infectious Disease and a pioneer in initiating
NLM's malaria research network at a remote site on Lake Victoria
in Kenya. He has developed several tools for data collection
and management which will change the speed and quality of data
collection in Africa. He is currently evaluating field data
collection using paperless GPS/data collection systems via Pocket
PC-based personal data assistants in two projects:
(1) collecting census and GPS data for a wash-durable bednet
study area and
(2) conducting a survey in a 15-village area on bednet usage
for linkage with other health-related data.
2.
Eyewitness account: the role of IT and data management in expansion
and change at a remote research unit in Kenya
Tom Oluoch, KEMRI-Wellcome Trust, Kenya
From the KEMRI-Wellcome
Trust research unit on the coast of Kenya comes Tom Oluoch,
systems operator/data manager and co-creator of a virtual library
for this site, which brings together researchers from Kenya
Medical Research Institute and Oxford University. His eyewitness
case study is full of concrete examples of how IT and data management
have brought expansion and change to this remote research unit.
3.
CDC in Zimbabwe: strengthening regional surveillance and laboratory
measures, supporting infrastructure development and promoting
technology transfer
Robert Mayes, CDC AIDS Program, Zimbabwe
Bob Mayes is Chief,
Health Informatics Section, Zimbabwe CDC AIDS Program. CDC's
program of technical assistance to Zimbabwe focuses on strengthening
surveillance and laboratory measures, scaling up promising prevention
and care strategies, supporting behavior change communication
projects, data mining, semantic management of data for systematic
review, and promoting technology transfer.
4.
Complex Data From Health Research
Themba Mohoto, Reproductive Research Unit, Chris Hani
Baragwanath Hospital, Soweto, South Africa
In the continuing search for better health for all, health
researchers are faced with numerous methodological problems
of a complex nature in their efforts to strengthen health
programs, evaluate health systems and measure the impact
of interventions. This in turn has posed greater challenges
for data analysts.
This paper investigates the types of data produced in health
research, including:
- Multi-stage survey data, e.g. the Demographic and Health Survey (DHS), in which data are collected at many levels (household, women, and children) and there is a need to link the data from these various levels (a minimal linkage sketch follows this abstract).
- Longitudinal or repeated-measures studies. Such data can arise either from cohort studies or from clinical trials. In this type of study there are repeated observations within individuals.
- In clinical trial databases there are also difficulties with recording adverse events or concomitant medications, as there will be a variable number of these per patient.
- A new area is that of cluster randomized trials, which combine features of multi-stage sample data with features of clinical trial data.
Statisticians
in this area are investigating ways of dealing with these
problems.
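As a concrete, minimal sketch of the linkage problem raised for multi-stage survey data, the fragment below joins household, women, and children tables on their shared identifiers using pandas; the column names and values are invented for illustration and do not reflect actual DHS file structures.

```python
# Minimal sketch of linking multi-level survey data (household -> women -> children).
# Column names and values are INVENTED; real DHS files use their own identifiers.
import pandas as pd

households = pd.DataFrame({"hh_id": [1, 2],
                           "region": ["north", "south"]})
women = pd.DataFrame({"woman_id": [10, 11, 12],
                      "hh_id": [1, 1, 2],
                      "age": [27, 34, 22]})
children = pd.DataFrame({"child_id": [100, 101],
                         "woman_id": [10, 12],
                         "weight_kg": [9.4, 7.8]})

# Each child record inherits its mother's and household's attributes.
linked = (children
          .merge(women, on="woman_id", how="left")
          .merge(households, on="hh_id", how="left"))
print(linked)
```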
Track III-D-6:
Données & Santé : utilisations et enjeux
(Data and Health: Usage and Issues)
Chair: Daniel Laurent, Université MLV, France; Jean-Pierre Caliste, UTC, France
The weight of the health sector in the world economy has become
decisive. Health expenditure now accounts for 15% of GDP in the
United States and 10% in France and Canada. The Internet has
transformed the field and opened it further to the general public
in terms of information: there are more than 17,000 sites devoted
entirely to health, and 40% of the queries of American Internet
users concern health sites.
The diversity of medical situations relying on the use of complex
data corresponds to a variety of angles of approach: institutional
and policy objectives, the circulation of medical information
within specialized networks and via the Internet, new medical
practices and support for isolated sites, and the evolution of
medical services for both practitioner and patient.
Relations between state components (social security systems)
and private ones (insurers, pharmaceutical companies, etc.) are
of growing importance, both at the organizational level (the
level at which medical procedures are defined) and in micro-
and macroeconomic terms.
The use of complex data (numerical data, imaging, etc.) and
knowledge management call on techniques for processing data
of a heterogeneous nature, relying on resolutely multidisciplinary
approaches that reinforce information theory.
On the basis of these observations, Codata France has made health
one of its three priority areas of activity. It proposes to organize
a thematic workshop on this question; if necessary, it could
be split into two specialized sessions. The following themes
could be presented:
- information networks, or the information "highways" of health
- data and the Internet (e-health): reliability, validity
- healthcare networks and coordinated care networks: new challenges for "managed care"
- health information systems: national and regional networks, networks of institutions, healthcare networks, the medical practice
- the challenges of telemedicine
- the use of data by call centers
- the patient medical record
- data protection and archiving
- data confidentiality
- interoperability of data and systems
1.
New information systems for the public healthcare insurance
organization: the Catalan Health Service (CatSalut) in Spain
TORT I BARDOLET (Jaume), Generalitat de Catalunya, CatSalut,
Barcelona, Spain
Key words: information systems, health care organisation, health system management, insurance, risk management, data.
Ten years after
its creation, the Catalan Health Service (SCS) is initiating
a reorientation process aimed at consolidating its role as the
public healthcare insurance organization for all citizens of
Catalonia. This reorientation involves generating a series of
actions oriented more towards attention to the insurance holder/citizen,
while maintaining a close relation with the suppliers of healthcare
services from the public network.
This transformation coincides with the intention of achieving
a qualitative and quantitative advance in the structure
of the information systems available up to now. Thus a new Systems
Plan is being drawn up, oriented towards the SCS's function
as a public healthcare insurance organization.
1. Definition of
the SCS's management needs
Aims:
1.1 To manage resources efficiently
1.2 To implement processes for continuous improvement in service
quality
1.3 To bring about active client management
1.4 To manage risk
1.5 To implement efficient administrative processes
These aims involve
a series of needs that must be taken into account when developing
new management and information systems.
- To back the management aims of the major working areas (demand, offer, and internal administration) and the lines of action for each of these (services).
- To facilitate the systematic drawing up of management reports, based on parameters enabling the executive structure to make decisions concerning steps to be taken.
- To collect all necessary information properly and in good time, by means of the most appropriate software.
In order to specify
these aims, a series of management levers has been devised:
To manage resources
To provide activity follow-up
To provide cost follow-up
To manage the quality level
To establish communication with clients
To manage risk
To improve the health of the population
To rationalize processes
To improve claim procedure for damages
Moreover, this has
to be specified using pre-established follow-up parameters for
drawing up the management reports.
2. Evaluating the
developments and structure required
The proposal for
the basic structure of the new information systems is based
on three concepts and their corresponding identifiers:
- The insurance
holder = personal identification code (CIP)
- The service providing unit = productive unit code (UP)
- The service / activity = service code
It is planned
that the different computer applications will work on a large
data warehouse that compiles all activity (contracts of
insurance holders with the productive structure) and that must
make it possible to generate different views for each of
the functions (see Graph 1).
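A minimal way to picture the warehouse described above is a single activity table whose rows carry the three identifiers (CIP, UP, and service code) and from which function-specific views are derived. The sketch below is only an illustration of that keying, with invented field names and values; it is not the SCS schema.

```python
# Illustrative sketch of the three-key activity record behind the data warehouse:
# insurance holder (CIP), productive unit (UP), and service code.
# Field names and example values are INVENTED; this is not the SCS schema.
from dataclasses import dataclass
from datetime import date

@dataclass
class ActivityRecord:
    cip: str            # personal identification code of the insurance holder
    up: str             # productive unit (service-providing unit) code
    service_code: str   # service / activity code
    activity_date: date

warehouse = [
    ActivityRecord("CIP000123", "UP0457", "SVC-EMERG", date(2002, 7, 14)),
    ActivityRecord("CIP000123", "UP0901", "SVC-LAB", date(2002, 7, 15)),
]

# A "view" for one function, e.g. all activity for a given insurance holder.
holder_view = [r for r in warehouse if r.cip == "CIP000123"]
print(len(holder_view))
```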
The system has been
graphically represented as a pyramid divided transversally into
three parts. The lower trapezium shows the database (information);
the middle, the computer applications (the treatment of information);
and the upper triangle, the management information system.
The design of the
information system is structured around four basic areas: demand,
offer, activity and economy-finance (see Graph 2).
2.
The planning and management of emergency treatment in Catalonia,
by means of a specific information system
TORT I BARDOLET (Jaume), Generalitat de Catalunya, CatSalut,
Barcelona, Spain
Key words: information systems, emergency, health care organisation, health system management, planning, data.
The Overall Emergency
Plan has been used in Catalonia for the past three years. This
is a global scheme that includes planning, precaution and prevention,
management and supervision of emergency healthcare attention.
It was created, above all, for those times of the year when
there is an increased demand for healthcare attention for a
variety of reasons.
The Plan includes:
- The analysis of the population requiring emergency attention: user characteristics, reasons for the examination, analysis of user expectations and motivations.
- Preventative actions: increased homecare coverage, increased influenza vaccine coverage, follow-up of users who have repeatedly requested emergency attention.
- Organizational actions: the drawing up by the hospitals of annual working plans for emergency attention, telephone-based back-up for mental health professionals, and coordination among healthcare mechanisms.
- An increase in the offer of contracted hospital discharges, and reinforcements in the summer and during periods of sustained growth in demand.
The information system
The Overall Emergency Plan is based on a specific information
system (an extranet) that makes it possible for a group of productive
units from different healthcare areas to register, on a daily
basis, emergency activity data from their centers, as well
as other relevant information that allows increased demand
to be forecast and corrective measures to be adopted quickly
and effectively (a small illustrative sketch follows the list
below). The extranet includes information regarding:
- Specialized attention (hospitals):
  - data concerning activity: emergency cases attended and admitted, hospital admissions, discharges and transfers to other centers
  - data concerning resource availability: patients awaiting admission, waiting period, available beds
- Continued primary attention: emergency activity of these centers
- Primary attention: data concerning continued attention, number of house calls
- Specific emergency services (061): number of telephone calls attended and services carried out.
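As the rough sketch promised above, the fragment below shows how such daily figures could feed a simple early-warning rule that flags a day whose emergency attendances exceed a recent moving average; the counts and the threshold are invented and are not part of the Plan.

```python
# Rough sketch: flag days when emergency attendances exceed the recent average.
# The counts and the 20% margin are INVENTED for illustration only.
daily_attendances = [412, 398, 430, 405, 441, 620, 615]   # one value per day

def demand_alerts(counts, window=5, margin=1.20):
    """Yield (day_index, count) where count > margin * mean of previous `window` days."""
    for i in range(window, len(counts)):
        baseline = sum(counts[i - window:i]) / window
        if counts[i] > margin * baseline:
            yield i, counts[i]

for day, count in demand_alerts(daily_attendances):
    print(f"day {day}: {count} attendances exceeds the recent baseline")
```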
3.
Etude d'un système d'aide à la gestion de l'information dans la santé, appliqué au domaine cardiovasculaire
(Study of a system to support information management in healthcare, applied to the cardiovascular domain)
Elisabeth Scarbonchi, Daniel Laurent, Christian Recchia,
Université de Marne-la-Vallée, Institut Francilien
d'Ingénierie des Services (I.F.I.S.), France
Within a care network, the practitioner and the user have access
to a body of information concerning the patient. This information
is spread across different departments of a single hospital,
or even across several institutions. In such a system, the nature
of the information (its typology), its location, and the volumes
involved all come into play, particularly for digital data in
the case of medical imaging.
High-speed networks can offer connection possibilities between
these different sources for optimal use in cardiology departments.
For numerical and textual data, data mining and text mining
techniques can be used to produce value-added information, whether
in an operational setting or in a research context.
For image sources, their immediate and interactive availability
opens possibilities and prospects for animation and visualization
in an operational context.
Setting up a networked, multi-source information system will
require particular attention to the problems of data security
and data ownership.
4.
Données et santé : propriété, accès, protection, transmission. Les enjeux des réseaux de santé
(Data and health: ownership, access, protection, and transmission. The issues facing healthcare networks)
Christian
Bourret, Université de Marne-la-Vallée, France
Serge Chambaud, Institut National de la Propriété
Industrielle (I.N.P.I.), France
Elisabeth Scarbonchi, Université de Marne-la-Vallée,
France
Daniel Laurent, Université de Marne-la-Vallée,
France
Key words: data ownership, confidentiality, patient medical record, health care networks, information highways, business methods, patents, health care management, information networks.
The ownership and protection of data are among the major issues
of a post-industrial society founded on intangible goods: services
and the dissemination of information. In the context of the
development of the information industry, medical data represent
a very significant commercial stake. These data are highly specific:
they are above all personal, sensitive, and confidential data,
subject to specific legislation. To frame the problem, we will
draw on the French experience with healthcare networks, which
we will then broaden through comparisons with the United States.
The first issue examined will be the ownership and use of the
data produced by healthcare networks, analyzed through the patient
medical record. Do the data it contains belong to the patient?
To the various physicians and organizations (hospitals, clinics,
health insurance funds, etc.) taken individually? To the network?
To the entity that hosts it: an information notary, an infomediary,
or a hosting provider? The answer is far from obvious. Does
the complete set of data, the shared global record, constitute
in terms of ownership a whole that is different from the sum
of its parts? Can the use of data really be strictly separated
from its ownership? We will analyze the various answers currently
possible to these questions.
We will then address another decisive question: patients' access
to their own personal health data. In France, the new law of
4 March 2002 laid down broad principles but left many questions
open. Will this access be direct, or indirect through a physician?
To which data will the patient have access: the entire record
or a summary? Will the patient also have access to practitioners'
comments? We will outline lines of reflection to shed light
on all these questions.
Matters become even more complicated when, as is largely the
case in the United States, patients compile their own medical
records. In that case, how reliable are those records? Can they
be used by professionals, who would thereby engage their own
liability?
In terms of industrial and intellectual property, there is also
the question of the patentability and protection of the software
used to create, manage, or disseminate the patient medical record.
Can patient medical records themselves be protected? Do the
classic patentability criteria apply to them or not? Or do they
constitute "business methods," and in that case how can they
be protected? The answers may vary from country to country.
We will address these questions through a comparison of the
possibilities offered in France and in the United States.
The transmission of medical data is another major issue, that
of the information highways. We will examine two essential aspects
of the current evolution, particularly in France: the gradual
withdrawal of the state in favor of private actors, and the
fundamental choice between securing a medical data transmission
network (the Réseau Santé Social of Cégétel-Vivendi) or securing
the data themselves (France Télécom or Cegedim).
5.
Les réseaux de santé : une expérimentation française centrée sur le partage de l'information
(Healthcare networks: a French experiment centered on information sharing)
Gabriella Salzano, Université de Marne-la-Vallée,
France
Christian Bourret, Université de Marne-la-Vallée,
France
Jean-Pierre Caliste, Université de Technologie de Compiègne
(UTC), France
Daniel Laurent, Université de Marne-la-Vallée,
France
Key words: health care networks, health care management, information systems, data, shared information.
Since the early 1980s, all the major industrialized countries
have faced the problem of controlling the costs of their health
systems, and of hospital care in particular. One solution considered
has been the "ambulatory shift," which favors community-based
medicine supported by the new information and communication
technologies (ICT). In France, an original path has been tried:
healthcare networks (réseaux de santé). This approach was given
legal standing by the law of 4 March 2002 on patients' rights
and the quality of the health system.
Healthcare networks are resolutely intended to serve the patient.
Their objectives are to break down barriers within the health
system by improving the essential relationship between community
medicine and the hospital, as well as the relationships between
the different professionals caring for the same patient. The
aim is to ensure the quality and continuity of care by putting
in place an innovative organization founded on shared values,
such as the construction of collegial rather than individual
or hierarchical practices, and better sharing of information.
Information systems are the linchpin of healthcare networks.
They must first of all ensure the interoperability (coordination
and integration) of various other subsystems, in particular
the information systems of hospitals and clinics and the management
software of medical practices and other professionals. They
must also provide access to databases and to decision-support
software (reference guidelines, etc.) as well as to telemedicine
services. They must further manage services specific to the
network: an emergency triage platform and/or call center, a
patient record shared within the network, and so on. We will
analyze the main problems to be solved, in terms of both organization
and applications.
Healthcare networks respond to strong needs for change, and
their implementation and performance must be evaluated. Evaluation
strongly influences the design of the information system, since
the system must supply the data needed to track evaluation indicators
and meet quality requirements specific to the networks' objectives.
In this paper, we will discuss the issues and methodologies
involved in evaluating healthcare networks, underlining their
interactions with the methodologies for designing information
systems, within a framework of complex project management.
Behavioral
and Social Science Data
Track I-C-4:
Government as a Driver in Database Development in the
Behavioral Sciences
Chair: David Johnson, Building Engineering and Science
Talent, USA
The behavioral
sciences have not had a tradition of data sharing. Thus
they have been somewhat behind other sciences in the development
of databases. Officials in several science agencies of
the US federal government have been concerned about this
lack of data sharing and have taken measures to stimulate
development. The purpose of this panel is to explore the
ways that government agencies can arrange funding opportunities
to stimulate innovation in areas that scientists within
given fields have been reluctant to address. The work of
three US agencies will be highlighted: The National Science
Foundation, the National Institutes of Health, and the
Federal Aviation Administration.
Government and science often exercise reciprocal influences
on each other. The three examples that will be explored
in this panel session represent three discrete models
by which governments may stimulate a science to produce
knowledge in a way that it would not have in the absence
of the government's effort.
1.
Sharing data collection and sharing collected data: The NICHD
Study of Early Child Care and Youth Development
Sarah L. Friedman, The NICHD Study of Early Child Care and
Youth Development, USA
The NICHD Study
of Early Child Care and Youth Development came to life as a
result of a 1988 NICHD solicitation (RFA) and is scheduled to
terminate at the end of 2009. The aim of the solicitation was
to bring together investigators from different universities
or research institutions to collaborate with NICHD staff on
the planning and execution of one longitudinal study with data
to be collected across sites. The idea for such a collaborative
study was unprecedented in the scientific field of developmental
psychology.
Ten data collection
sites were selected on a competitive basis and the affiliated
investigators, in collaboration with NICHD staff, have designed
the different phases of the solicited longitudinal study and
have implemented it. While the data collected at each of the
sites belongs to the site, NICHD required that each of the 10
sites would send its data to a central location, the Data Acquisition
and Analysis Center, for data editing, data reduction and data
analyses. The study investigators, in collaboration with the data
center staff, guide the data acquisition and analyses. Upon
completion of an agreed upon quota of network authored scientific
papers for a given phase of the study, individual study investigators
get access to the data sets of the entire sample. A few months
after the data sets and supporting documentation are available
to individual study investigators for their exclusive use, the
same data sets are made available to interested and qualified
others in the scientific community.
While the archiving
of the data is done by an NICHD grantee, the Murray Center at
Radcliffe College has expressed interest in archiving the data
and supporting their use by interested and qualified investigators.
If the grantee institutions accept the Murray Center's request,
the data collected by the grantees will be available to the
scientific community beyond the life of the grant.
2.
Data Sharing at NIH and NIA
Miriam F. Kelty, National Institute on Aging, Office of
Extramural Activities, USA
NIH published its policy mandating sharing of unique biological
resources in 1986. Sixteen years later NIH published a draft
policy. It states that NIH expects the timely release and sharing
of final research data for use by other researchers. Further,
NIH will require extramural and intramural investigators to
promulgate a data sharing plan in their research proposals or
to explain why a plan to share data is not possible. The policy
is available for comment until June 1. The presentation will
provide background information and summarize public comments.
NIA staff have been
leading advocates for data sharing and have encouraged it among
grantees, particularly when research involves large data sets
that are valuable research resources and impractical to replicate.
NIA will provide funds to make data that are well documented
and user-friendly available to other researchers. Some examples
of NIA supported activities in support of data sharing are described
below:
The National Archive
of Computerized Data on Aging (NACDA), located within the Interuniversity
Consortium for Political and Social Research (ICPSR), is funded
by the National Institute on Aging. NACDA's mission is to advance
research on aging by helping researchers to profit from the
under-exploited potential of a broad range of datasets. NACDA
acquires and preserves data relevant to gerontological research,
processes them as needed to promote effective research use, disseminates
them to researchers, and facilitates their use. By preserving
and making available the largest library of electronic data
on aging in the United States, NACDA offers opportunities for
secondary analysis on major issues of scientific and policy
relevance.
NACDA supports a
data analysis system that allows the user to access subsets of variables
or cases. The system can be used with a variety of data sets,
including the Longitudinal Survey on Aging, National Survey
of Self-Care and Aging, National Health and Nutrition Survey,
National Hospital Discharge Survey, and the National Health
Interview Survey.
NIA supports a range
of studies that have agreed to make data available to researchers.
An example is the Health and Retirement Study, a nationally
representative study that collects data on aging and retirement.
The study is based at the University of Michigan and the Michigan
Center on Demography of Aging makes data available to a range
of researchers. Some data is available to anyone for analysis
while other data sets are restricted and require contractual
agreements prior to being made available for use.
The presentation
will address NIA's experience with the use of available data
sets and raise some issues surrounding data sharing.
3. Data Archiving for Animal Cognition
Research: The NIMH Experience
Howard S. Kurtzman, Cognitive Science Program, National
Institute of Mental Health, USA
In July 2001, the National Institute of Mental Health (a component
of the U.S. National Institutes of Health) sponsored a workshop
on "Data Archiving for Animal Cognition Research."
Participants included leading scientists as well as experts
in archiving, publishing, policy, and law. Due to the focus
on non-human research, participants were able to devote primary
attention to important issues aside from protection of confidentiality,
which has dominated most previous discussions of behavioral
science archiving. The further limitation of the workshop's
scope to animal cognition research allowed archiving to be examined
realistically in the context of one particular scientific community's
goals, methods, organization, and traditions.
The workshop produced a set of conclusions, detailed in a formal
report, concerning: (1) the likely impacts of archiving on research
and education, (2) guidelines for incorporating archiving into
research practice, (3) contents of archives, (4) technical standards,
and (5) organizational and policy issues. The presentation will
review these conclusions and describe activities following up
on the workshop. Also discussed will be the applicability of
the workshop's conclusions to other areas of behavioral science
and how this workshop's approach to stimulating archive development
might serve as a model for other fields.
4.
Data Sharing and the Social and Behavioral Sciences at the National
Science Foundation
Philip Rubin, Division of Behavioral and Cognitive Sciences,
USA
At the heart of
the National Science Foundation's (NSF) strategic plan are people,
ideas, and tools. In the latter area, our goal is to provide
broadly accessible, state-of-the-art information-bases and shared
research and education tools. We actively encourage data sharing
across all of our fields of study. This presentation will provide
examples from the social and behavioral sciences. As data sharing
is encouraged and increased, however, there are growing concerns
and issues related to privacy and confidentiality. These issues
will also be discussed, as will future directions in information
sharing.
At the NSF, the
Directorate for Social, Behavioral, and Economic Sciences (SBE)
participates in special initiatives and competitions on a number
of topics, including infrastructure to improve data resources,
data archives, collaboratories, and centers.
The breadth of fields is wide in our Directorate, ranging from
Anthropology through Political Science and Economics. However,
common to many of the disciplinary areas that we support is
a rapid change in how the science is being done. What is emerging
is a large scale social science, driven by computational progress,
the need for scientific expertise across a number of domains,
growing bodies of data and other information, and theoretical
and practical issues that require for their understanding a
broader view than has been taken in the past.
This change will
be illustrated by some examples of recent or continuing projects
that we are supporting. For example, physical anthropologists
utilize tools from a wide range of overlapping disciplines ranging
from molecular biology (population genetics) to field ecology
to remote sensing (paleoanthropology). In all of these areas
large amounts of data are generated that are conducive to the
establishment of digital libraries, databases, web-based archives
and the like. A recent SBE Infrastructure award will be described
that supports a number of interrelated activities that will
advance research in physical anthropology, evolutionary biology,
neuroscience and any others that may require information and/or
biomaterials from nonhuman primates.
An example in geography
is the National Historical Geographic Information System (NHGIS)
at the University of Minnesota, Twin Cities. This project upgrades
and enhances U.S. Census databases from 1790 to the present,
including the digitization of all census geography so that place-specific
information can be readily used in geographic information systems.
We expect that the NHGIS will become a resource that can be
used widely for social science training, by the media, for policy
research at the state and local levels, by the private sector,
and in secondary education.
Last year the National
Science Board approved renewal of NSF support for the Panel
Study of Income Dynamics (PSID). The PSID is a longitudinal
survey initiated in 1968 of a nationally representative sample
of U.S. individuals and the family units in which they reside.
The major objective of the panel is to provide shared-use databases,
research platforms and educational tools on cyclical, intergenerational
and life-course measures of economic and social behavior. With
thirty-plus years of data on the same families, the PSID can
justly be considered a cornerstone of the infrastructure support
for empirically based social science research.
Additional examples
abound, and will be discussed. These include CSISS, the Center
for Spatially Integrated Social Science, at the University of
California, Santa Barbara; the fMRI Data Center at Dartmouth College, a
national cognitive neuroscience resource; data-rich linguistics
projects that support both the preservation of knowledge of
disappearing languages and statistically-guided approaches to
increasing our understanding of ongoing language use; systems
for storage and dissemination of multimodal (audio, visual,
haptic, etc.) data; and systems and techniques for the meta-analysis
of large scale data sets.
Data sharing is at the heart of NSF's mission and of our vision
of the social and behavioral sciences. This presentation is
intended to provide an overview of that vision.
Track I-D-6:
Database Innovation in the Behavioral Sciences and
the Debate Over What Should Be Stored
Session organizer: US National Committee for the International
Union of Psychological Sciences, National Academy of Sciences,
Washington, D.C., USA
Chair: Merry
Bullock, American Psychological Association
Data sharing
is not the norm in behavioral science, although there
are pockets of change and innovation. At the same time,
a debate is underway regarding what data from experiments
are worth placing in databases to be available for others.
As it becomes possible to store huge quantities of data,
it is becoming more necessary to assure that databases
grow into useful tools rather than clogged informational
arteries. This panel has two objectives: to inform attendees
of innovations and to discuss the possible criteria for
determining what should be included in databases.
Panelists will discuss several innovative databases that
are proving transformational for the fields they touch.
For example, a database of functional magnetic resonance
images of the brain created at Dartmouth College is making
it possible to test hypotheses about brain-behavior relations
on data pooled across many individual studies; a database
of geographic information based at the University of California,
Santa Barbara is allowing those in a variety of disciplines
to look at the influence of location on such things as
health behaviors, social development, and wealth accumulation.
A database of aptitude test scores at the University of
Virginia is a test bed for statistical innovations that
are making it possible to legitimately compare data and
not just outcomes from disparate studies.
The Panel will describe several of these innovations in
behavioral and other sciences, and will address important
emerging issues. For example, the fMRI database (originally
envisioned as capturing all the images from most of the
major neuroscience journals) is constrained because of
file size-images from a single journal consume terabytes
of storage space and raise important questions of accessibility.
As the behavioral sciences evolve toward more common acceptance
of data sharing, those in the behavioral sciences must
evolve toward a more common understanding of what should
be contained in a database and what sorts of data are
appropriate for archiving. Examples and issues from other
disciplines will help inform the discussion.
1.
Acquisition Criteria at the Murray Research Center: A Center
for the Study of Lives
Jacquelyn B. James, Murray Research Center
The Murray Research
Center is a repository for social and behavioral sciences data
on the in-depth study of lives over time, and issues of special
concern to American women. The center acquires data sets that
are amenable to secondary analysis, replication, or longitudinal
follow-up. In determining whether or not to acquire a new data
set for the archive, several kinds of criteria are used. The
criteria can be roughly grouped into five general categories:
content of the study, methodology, previous analysis and publication,
historical value, and cost of acquiring and processing the data.
Each of these will be described with an indication of the relative
importance of each criterion, where possible.
2.
What Functional Neuroimaging Data is 'Worth' Sharing and the
Scope of Large-Scale Study Data Archiving
John Darrell Van Horn, The fMRI Data Center, Dartmouth College,
USA
Functional neuroimaging studies routinely produce large sets
of raw data that comprise both functional image time series
as well as high-resolution anatomical brain volumes. It is often
the case that these data are then passed through several steps
of processing and then only a limited set of the statistical
output is presented in papers published in the peer-reviewed
literature. Arguments for archiving only these summary results
have suggested that they are of greater value than that of the
raw data itself. However, since with each step of processing
the information content of a data set remains constant or is
reduced, it is difficult to see the source of any increased
scientific value. The fMRI Data Center (fMRIDC) strives to archive
complete raw functional neuroimaging data sets accompanied by
enough information that anyone else would be able to reconstruct
the steps in processing of the data and arrive at the same statistical
brain map as the original authors. To achieve this, the fMRIDC
requests that authors of published studies provide considerably
more study 'meta' and raw data than is typically presented in
their published article. As such, several studies currently
in the fMRIDC archive rival the size of the entire human genome
database (~20GB compressed). By storing complete
study data sets, the fMRIDC effort will serve not only to advance
thinking about fundamental concepts of brain function, by permitting
researchers to examine the published neuroimaging data of others,
but also to document more thoroughly the scientific record of
work in the fields of functional brain imaging and cognitive
neuroscience.
3.
Accession and Sharing of Geographic Information
Michael F. Goodchild, University of California, Santa Barbara,
USA
Geographic information
is a well-defined type, with complex uses and production systems.
The Alexandria Digital Library began as an effort to provide
remote access to a large collection of geographic information
(maps and images), but has evolved into a functional geolibrary
(a digital library that can be searched using geographic location
as the primary key). I use ADL to illustrate many of the issues
and principles inherent in sharing geographic information, and
in policies regarding its acquisition by archives, including
granularity, metadata schema, support for search across distributed
archives, portals and clearinghouses, and interoperability.
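To make the idea of geographic location as the primary search key concrete, here is a tiny sketch of a footprint query over a catalog of bounding boxes; the entries are invented and the logic is far simpler than ADL's actual gazetteer and metadata machinery.

```python
# Tiny sketch of location-as-primary-key search: return catalog items whose
# bounding boxes intersect a query box.  Entries are INVENTED; ADL's real
# gazetteer and metadata handling are far richer than this.
from dataclasses import dataclass

@dataclass
class Item:
    title: str
    west: float
    south: float
    east: float
    north: float

catalog = [
    Item("Topographic map, Santa Barbara", -120.1, 34.3, -119.5, 34.6),
    Item("Landsat scene, Mojave Desert", -117.5, 34.5, -115.8, 35.8),
]

def intersects(item, west, south, east, north):
    """True if the item's bounding box overlaps the query box."""
    return not (item.east < west or item.west > east or
                item.north < south or item.south > north)

hits = [i.title for i in catalog if intersects(i, -120.5, 34.0, -119.0, 35.0)]
print(hits)
```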