The Data Conservancy: Building a Sustainable System for Interdisciplinary Scientific Data Curation and Preservation
Date
2009
Abstract
The Data Conservancy (DC) is one of two awards through the US National Science
Foundation’s DataNet program. The goal of the DataNet program is to create “a set of
exemplar national and global data research infrastructure organizations (dubbed DataNet
Partners) that provide unique opportunities to communities of researchers to advance
science and/or engineering research and learning.”
The DC embraces a shared vision: data curation is not an end, but rather a means to
collect, organize, validate, and preserve data to address the grand research challenges that
face society. The overarching goal of The Data Conservancy is to support new forms of
inquiry and learning to meet these challenges through the creation, implementation, and
sustained management of an integrated and comprehensive data curation strategy. DC
will address this overarching goal with a comprehensive project comprising four
interdependent threads: 1) infrastructure research and development, 2) computer science and
information science research, 3) broader impacts, and 4) sustainability.
The DC is led by the Sheridan Libraries at Johns Hopkins University. Working with the
Sloan Digital Sky Survey data and the US National Virtual Observatory, the Sheridan
Libraries have developed an initial architectural design, data models and metadata
profiles, and organizational models to support data curation. The DC will build upon these
initial lessons learned from the partnership between the library and astronomy
community and extend them into the life sciences, earth sciences, and social sciences. Use cases
will provide the initial framework for technical requirements. A robust information
science and computer science research agenda will highlight the scientific requirements and
inform the development of a data framework for observations and a theoretical
framework for data curation. These activities will guide the development of new curricula at
library and information science schools, thereby building capacity for a new generation of
data scientists.
One of the central tenets of DC’s sustainability plan relates to the leadership role of the
library. The Sheridan Libraries at Johns Hopkins University have established a
leadership position in prototyping data curation systems and services, especially as they relate
to astronomy. One of the key outcomes of DC will be a new model for libraries in the
digital age. There are several fundamental implications for libraries in the realm of data
curation as they relate to collections, services, and infrastructure. The North American
Association of Research Libraries has already engaged the DC in its effort to consider
these implications strategically as a means to transform the library’s role and
contributions toward building and sustaining data curation infrastructure.
Description
Presentation at the PV 2009 conference in Madrid, Spain
Keywords
Data curation, Data Conservancy, Curation, Libraries, Social sciences, Earth sciences, Life sciences, Astronomy