Workset Creation for Scholarly Analysis+Data Capsule

Time Frame


Total Funding to Date



  • J. Stephen Downie

This project builds upon, extends, and integrates two developmental research threads within the HathiTrust Research Center (HTRC). The first thread originates from work conducted in the Workset Creation for Scholarly Analysis (WCSA): Prototyping Project. The second thread continues the work of the Data Capsules (DC) project, previously supported by the Alfred P. Sloan Foundation (2011-2014). The primary objective of the WCSA+DC project is to integrate the workset model and tools seamlessly with the Data Capsule framework, providing non-consumptive research access to HathiTrust's massive corpus of data objects, securely and at scale, regardless of copyright status. The key outcomes and benefits of the WCSA+DC, Phase I project are:

  • The deployment of a new Workset Builder tool that enhances search and discovery across the entire HathiTrust Digital Library (HTDL) by complementing traditional volume-level bibliographic metadata with new metadata derived from a variety of sources at various levels of granularity.
  • The creation of Linked Open Data resources to help scholars find, select, integrate and disseminate a wider range of data as part of their scholarly analysis life-cycle.
  • A new Data Capsule framework that integrates worksets, runs at scale, and does both in a secure, non-consumptive manner.
  • A set of exemplar pre-built Data Capsules incorporating tools commonly used by both the digital humanities (DH) and computational linguistics (CL) communities, which scholars can then customize to their specific needs.


  • Beth Plale, Co-Principal Investigator, Indiana University

Funding Agencies

  • Andrew W. Mellon Foundation, 2016 – $1,170,000.00