The iSchool at Illinois is part of a partnership that has received a research grant from the Institute of Museum and Library Services to extend the Data Capsule service, currently used by the HathiTrust Digital Library to enable remote access, to other collections managed by research libraries. The partnership is led by the School of Informatics and Computing at Indiana University.
As the volume of digital content has expanded exponentially over the past several years, researchers and educators have recognized the potential of big data techniques to analyze, access, and organize digital scholarly collections. The Data Capsule service, which was developed for use in the HathiTrust Research Center (HTRC), creates virtual computers through which users can access a restricted collection. Within HTRC, the Data Capsule service is used for non-consumptive analytics, which allow the computer to analyze the text but do not allow the user to read or disseminate copyrighted content. Non-consumptive analytics include text extraction, textual analysis and information extraction, linguistic analysis, automated translation, image analysis, file manipulation, OCR correction, and indexing and search capabilities.
"Enabling greater library and archival community use of the HTRC Data Capsule service will open some very unique possibilities for use of born-digital content within many different types of libraries and archives," said Beth Plale, professor at Indiana University, who is leading the initiative. "The grant draws from years of experience of providing a similar service within HathiTrust and proposes to evaluate the needs of research libraries in other cases of restricted data requiring safeguarding the interests of right holders and protecting privacy."
The project will partner with eight academic libraries across the country to understand current library needs and practices in provisioning library services for computational access to special collections having constraints due to sensitivity or restrictions. It also will extend the Data Capsule service to broader needs of provisioning for analytical access to restricted collections across a range of collections and uses; study extensions of Data Capsule to cloud computing environments for broader uses; and identify gaps in skills needed for librarians to enable secure data analytics and provide resources that can address those gaps.
Funded partners include Illinois, Indiana, the University of California at Berkeley, and the University of Virginia. Lafayette College, MIT, Rutgers University, Swarthmore College, and UCLA are also engaged in the project.
The two-year grant is for $360,000.
"We are delighted to be part of the partnership that is bringing the Data Capsule technology to the broader library world," said J. Stephen Downie, iSchool professor, associate dean for research, and co-director of the HTRC. "This exciting technology opens up analytic access to new collections that would otherwise have been restricted for researchers."
