The Data Conservancy: Building a Sustainable System for Interdisciplinary Scientific Data Curation and Preservation

Author: Hanisch, Robert; Choudhury, Sayeed
Abstract: The Data Conservancy (DC) is one of two awards through the US National Science Foundation’s DataNet program. The goal of the DataNet program is to create “a set of exemplar national and global data research infrastructure organizations (dubbed DataNet Partners) that provide unique opportunities to communities of researchers to advance science and/or engineering research and learning.” The DC embraces a shared vision: data curation is not an end, but rather a means to collect, organize, validate, and preserve data to address the grand research challenges that face society. The overarching goal of The Data Conservancy is to support new forms of inquiry and learning to meet these challenges through the creation, implementation, and sustained management of an integrated and comprehensive data curation strategy. DC will address this overarching goal with a comprehensive project comprising four interdependent threads: 1) infrastructure research and development, 2) computer science and information science research, 3) broader impacts, and 4) sustainability. The DC is led by the Sheridan Libraries at Johns Hopkins University. Working with the Sloan Digital Sky Survey data and the US National Virtual Observatory, the Sheridan Libraries have developed an initial architectural design, data models and metadata profiles, and organizational models to support data curation. The DC will build upon these initial lessons learned from the partnership between the library and astronomy community and extend them into the life sciences, earth sciences, and social sciences. Use cases will provide the initial framework for technical requirements. A robust information science and computer science research agenda will highlight the scientific requirements and inform the development of a data framework for observations and a theoretical framework for data curation.
These activities will guide the development of new curriculum at library and information science schools thereby building capacity for a new generation of data scientists. One of the central tenets of DC’s sustainability plan relates to the leadership role of the library. The Sheridan Libraries at Johns Hopkins University have established a leadership position in prototyping data curation systems and services, especially as they relate to astronomy. One of the key outcomes of DC will be a new model for libraries in the digital age. There are several fundamental implications for libraries in the realm of data curation as they relate to collections, services, and infrastructure. The North American Association of Research Libraries has already engaged the DC in its effort to consider these implications strategically as a means to transform the library’s role and contributions toward building and sustaining data curation infrastructure.
Description: Presentation at the PV 2009 conference in Madrid, Spain
Date: 2009
Subject: Data curation
Data Conservancy
Social sciences
Earth sciences
Life sciences

