Online Scientific Data Curation, Publication, and Archiving
Thakar, Ani R.
Szalay, Alexander S
Science projects are data publishers. The scale and complexity of current and future science data change the nature of the publication process. Publication is becoming a major project component. At a minimum, a project must preserve the ephemeral data it gathers. Derived data can be reconstructed from metadata, but metadata is ephemeral. Longer term, a project should expect some archive to preserve the data. We observe that published scientific data needs to be available forever – this gives rise to the data pyramid of versions and to data inflation where the derived data volumes explode. As an example, this article describes the Sloan Digital Sky Survey (SDSS) strategies for data publication, data access, curation, and preservation.