Data Curation

Active and ongoing management of data through its lifecycle of interest and usefulness to scholarship, science, and education.

Related Research Projects

Collaborative Research: ABI Development: Kurator: A Provenance-enabled Workflow Platform and Toolkit to Curate Biodiversity Data

Bertram Ludäscher
Funding agency: National Science Foundation

Data curation is a critical step in scientific data digitization, sharing, integration and use. The considerable resources allocated to digitization of natural science collections in the U.S. and globally require a focus on both digitization efficiencies and the utility of the generated data. One way to address both issues is to employ workflow software to automate and streamline data curation…


Socio-technical Data Analytics (SODA) Education

Catherine Blake
Funding agency: Institute of Museum and Library Services

This project will create both a master's and a doctoral-level specialization in Socio-technical Data Analytics (SODA). Partnerships with local researchers and businesses that already work with large datasets will give master's graduates first-hand experience with both the social and technical implications of large digital data collections, and thus be well-prepared for leadership…

The Internet of Musical Events Digital Scholarship Community and the Archiving of Performance (InterMuse)

J. Stephen Downie
Funding agency: University of York

This project arises from longstanding recognition of the challenges associated with the documentation of, and access to, collections of performance ephemera, for which the British Library is a key repository in the UK. Live musical events play a vital role in community life across the globe, yet they often leave only faint traces on the historical record, even in modern times. Sources can be…

The Reading Time Machine: Transforming Astrophysical Literature into Actionable Data

Jill Naiman

This project is a collaboration with Harvard University and the Astrophysics Data System (ADS), a digital library portal operated by the Smithsonian Astrophysical Observatory (SAO) under a NASA grant. With over 15 million records, ADS is one of the most important archives in the scientific field of astronomy.

"Newer documents are ‘born digital,’ making them machine-readable and…

Weakly Supervised Graph Neural Networks

Jingrui He
Funding agency: National Science Foundation

Graph neural networks have proven to be a powerful tool for harnessing graph data, which is widely used to represent rich relational information in many areas. However, the performance of graph neural networks depends largely on the amount of labeled data, and labeling requires an expensive and time-consuming annotation process. The result is label scarcity: data with few or no labels.…
