Digital Libraries

RELATED RESEARCH PROJECTS

National Endowment for the Humanities

The HathiTrust Research Center (HTRC) is partnering with the Cultural Observatory team that developed the Google Books Ngram Viewer together with Google. The goal of this collaboration is to implement a greatly enhanced open-source version of the Cultural Observatory’s “Bookworm” text analysis and visualization tool, designed to help scholars meet the challenges posed by the massive scale of the HathiTrust (HT) corpus. We are calling our multidisciplinary, multi-institutional collaboration the HathiTrust + Bookworm (HT+BW) Project. Participating institutions include the University of Illinois, Indiana University, Northeastern University, Baylor College of Medicine, and Rice University.

Bookworm is a tool that visualizes language usage trends in repositories of...

HathiTrust

HathiTrust has provided funding for the HathiTrust Research Center (HTRC), co-located at the University of Illinois and Indiana University, to serve as the research arm of HathiTrust and create an agile, technology-rich service for researchers in the digital humanities, social sciences, natural sciences, and informatics. This service will help researchers conduct non-consumptive research on the HathiTrust digital library database, a collection of just under 14 million digitized volumes, equating to 4.9 billion pages, 60% of which is under some copyright restriction. At the same time, center staff will develop and refine tools to aid in digital humanities and text mining research over large databases and will operate the secure, large-scale computation environment required by this...

Andrew W. Mellon Foundation

This project builds upon, extends, and integrates two developmental research threads within the HathiTrust Research Center (HTRC). The first thread originates from work that was conducted in the Workset Collections for Scholarly Analysis (WCSA): Prototyping Project. The second thread continues the work of the Data Capsules (DC) project, previously supported by the Alfred P. Sloan Foundation (2011-2014). The primary objective of the WCSA+DC project is the seamless integration of the workset model and tools with the Data Capsule framework to provide non-consumptive research access to HathiTrust's massive corpus of data objects, securely and at...

IN THE NEWS

Sep. 18, 2017

Three recipients of the 2017 Digital Library Federation (DLF) Forum Fellowships have ties to the iSchool. Jane Kelly, an MS/LIS student in the Leep program, and Nushrat Khan (MS '16) were awarded DLF Forum Fellowships for Students and New Professionals. Richard J. Urban (PhD '12) was named a KRESS+DLF Forum Fellow. The awards recognize the recipients' dedication to their work and the field of digital libraries.

Kelly is the historical and special collections assistant at the Harvard Law School Library. "The work I'm most excited about these days is the HLS Community Capture Project, a grant-funded project that I'm managing to prototype a tool to facilitate born-digital collecting from student organizations at Harvard Law School," she said. "When I'm not working on that, you'll likely find me with researchers in our reading room, managing our print collection of institutional, student, and faculty publications, or doing a little web archiving. I'm interested in the ways in...

Aug. 4, 2017

The iSchool at Illinois is involved in a partnership that has received a research grant from the Institute of Museum and Library Services to extend the Data Capsule service, which enables remote access by the HathiTrust Digital Library, to other collections managed by research libraries. The partnership is led by the School of Informatics and Computing at Indiana University.
  
As the volume of digital content has expanded exponentially over the past several years, researchers and educators have recognized the potential of big data techniques to analyze, access, and organize digital scholarly collections. The Data Capsule service, which was developed for use in the HathiTrust Research Center (HTRC), creates virtual computers for users to access a restricted collection. Within HTRC, the Data Capsule service is used for non-consumptive analytics, which allow the computer to analyze the text but doesn’...

Jun. 16, 2017

iSchool faculty, staff, and students will present their research at the Joint Conference on Digital Libraries (JCDL), which will be held on June 19-23 in Toronto. The event brings together international scholars focusing on digital libraries and associated technical, practical, organizational, and social issues. The goal is to provide a forum for shared learning and facilitate the application of knowledge for research, development, construction, and utilization in digital libraries.

Papers to be presented at JCDL 2017 include:

  • "Information-Seeking in Large Scale Digital Libraries: Strategies for Scholarly Workset Creation"
    Authors include J. Stephen Downie, professor and associate dean for research, and Peter Organisciak (PhD '15), postdoctoral research associate
     
  • "Uncertainty about the Long-Term: Digital Libraries, Astronomy Data, and Open Source Software"
    Authors include...

Jun. 12, 2017

The iSchool is co-organizing a workshop on digital scholarship with the Beijing Institute of Technology (BIT) Library on June 14-16 in Beijing. The workshop, Digital Scholarship Centers: Building Library Services for Data-Driven Scholarship, will instruct participants in library service models for digital scholarship and discuss concepts in digital humanities and computational social science. Dean Allen Renear will give opening remarks. Other iSchool presenters include J. Stephen Downie, professor and codirector of the HathiTrust Research Center (HTRC); Peter Organisciak (PhD '15), postdoctoral research associate; Eleanor Dickson, visiting HTRC digital humanities specialist; and Nic Weber (PhD '15), assistant professor at the University of Washington.

Downie will give the talks:

  • "Text Mining Concepts and Methods: HTRC and Non-Consumptive Research"
  • "Quick and Painless Introduction to Machine Learning"
  • "WEKA Machine Learning Tools: A Friendly...

Apr. 26, 2017

The iSchool and University Library are partners on a National Leadership Grant for Libraries awarded by the Institute of Museum and Library Services (IMLS). The grant supports work to hold a national forum and develop a white paper aimed at simplifying scholars' access to in-copyright and access-restricted texts for computational analysis and data mining research.

Text data mining and analysis are important research methods for scholars. However, efforts to access and analyze data sets are frequently complicated when texts are protected by copyright or other intellectual property restrictions.

The forum will bring together stakeholders in the areas of libraries, research, and publishing to discuss and recommend a research, policy, and practice framework that guides scholarly access to protected texts for data mining and other analyses. Thereafter, the grant partners will produce a white paper to summarize the discussions and present best practices and policy...

Jan. 24, 2017

J. Stephen Downie, professor and associate dean for research, participated in the Center for Open Data in the Humanities (CODH) seminar, "Big Data and Digital Humanities," on January 23 at the National Institute of Informatics in Tokyo, Japan.

Started in April 2016, the CODH will be formally established as a center in April 2017. It involves faculty from the National Institute of Informatics and The Institute of Statistical Mathematics, both in Japan, who collaborate with computer scientists and humanities scholars around the globe. CODH promotes research and development to improve access to humanities data, using the concept of open science along with the latest technology in informatics and statistics.

Downie gave the presentation, "Digital humanities using both closed and open data: Use cases from the HathiTrust Research Center":

The HathiTrust Digital...

Dec. 5, 2016

Unique in its sheer size and breadth, a new open dataset released by the HathiTrust Research Center (HTRC) will provide researchers with access to otherwise restricted information. The HTRC Extracted Features (EF) Dataset reports counts of words, lines, parts of speech, and other details extracted from each page of the more than thirteen million volumes found in the HathiTrust Digital Library.

An earlier release of the EF Dataset, drawn from a subset covering only the five million volumes in HathiTrust's public domain collection, has enabled novel research from scholars in economics, history, linguistics, literary studies, and sociology, among other fields. The new EF dataset, released under a Creative Commons Attribution license, provides access to features drawn from the remaining eight million volumes that otherwise would be...
