Digital Libraries

RELATED RESEARCH PROJECTS

National Endowment for the Humanities

The HathiTrust Research Center (HTRC) is partnering with the Cultural Observatory team that developed the Google Books Ngram Viewer together with Google. The goal of this collaboration is to implement a greatly enhanced open-source version of the Cultural Observatory’s open-source “Bookworm” text analysis and visualization tool, designed to help scholars meet the challenges posed by the massive scale of the HT corpus. We are calling our multidisciplinary, multi-institutional collaboration the HathiTrust + Bookworm (HT+BW) Project. Participating institutions include the University of Illinois, Indiana University, Northeastern University, Baylor College of Medicine, and Rice University.

Bookworm is a tool that visualizes language usage trends in repositories of...

HathiTrust

HathiTrust has provided funding for the HathiTrust Research Center (HTRC), co-located at the University of Illinois and Indiana University, to serve as the research arm of HathiTrust and create an agile, technology-rich service for researchers in the digital humanities, social sciences, natural sciences, and informatics. This service will help researchers conduct non-consumptive research on the HathiTrust digital library database, a collection of just under 14 million digitized volumes, equating to 4.9 billion pages, 60% of which are under some copyright restriction. At the same time, center staff will develop and refine tools to aid in digital humanities and text mining research over large databases and will operate the secure, large-scale computation environment required by this...

Andrew W. Mellon Foundation

This project builds upon, extends, and integrates two developmental research threads within the HathiTrust Research Center (HTRC). The first thread originates from work that was conducted in the Workset Collections for Scholarly Analysis (WCSA): Prototyping Project. The second thread continues the work of the Data Capsules (DC) project, previously supported by the Alfred P. Sloan Foundation (2011-2014). The primary objective of the WCSA+DC project is the seamless integration of the workset model and tools with the Data Capsule framework to provide non-consumptive research access to HathiTrust's massive corpus of data objects, securely and at...

IN THE NEWS

Jun. 16, 2017

iSchool faculty, staff, and students will present their research at the Joint Conference on Digital Libraries (JCDL), which will be held on June 19-23 in Toronto. The event brings together international scholars focusing on digital libraries and associated technical, practical, organizational, and social issues. The goal is to provide a forum for shared learning and facilitate the application of knowledge for research, development, construction, and utilization in digital libraries.

Papers to be presented at JCDL 2017 include:

  • "Information-Seeking in Large Scale Digital Libraries: Strategies for Scholarly Workset Creation"
    Authors include J. Stephen Downie, professor and associate dean for research, and Peter Organisciak (PhD '15), postdoctoral research associate
     
  • "Uncertainty about the Long-Term: Digital Libraries, Astronomy Data, and Open Source Software"
    Authors include...

Jun. 12, 2017

The iSchool is co-organizing a workshop on digital scholarship with the Beijing Institute of Technology (BIT) Library on June 14-16 in Beijing. The workshop, Digital Scholarship Centers: Building Library Services for Data-Driven Scholarship, will instruct participants in library service models for digital scholarship and discuss concepts in digital humanities and computational social science. Dean Allen Renear will give opening remarks. Other iSchool presenters include J. Stephen Downie, professor and co-director of the HathiTrust Research Center (HTRC); Peter Organisciak (PhD '15), postdoctoral research associate; Eleanor Dickson, visiting HTRC digital humanities specialist; and Nic Weber (PhD '15), assistant professor at the University of Washington.

Downie will give the following talks:

  • "Text Mining Concepts and Methods: HTRC and Non-Consumptive Research"
  • "Quick and Painless Introduction to Machine Learning"
  • "WEKA Machine Learning Tools: A Friendly...

Apr. 26, 2017

The iSchool and University Library are partners on a National Leadership Grant for Libraries awarded by the Institute of Museum and Library Services (IMLS). The grant supports work to hold a national forum and develop a white paper aimed at simplifying scholars' access to in-copyright and access-restricted texts for computational analysis and data mining research.

Text data mining and analysis are important research methods for scholars. However, efforts to access and analyze data sets are frequently complicated when texts are protected by copyright or other intellectual property restrictions.

The forum will bring together stakeholders in the areas of libraries, research, and publishing to discuss and recommend a research, policy, and practice framework that guides scholarly access to protected texts for data mining and other analyses. Thereafter, the grant partners will produce a white paper to summarize the discussions and present best practices and policy...

Jan. 24, 2017

J. Stephen Downie, professor and associate dean for research, participated in the Center for Open Data in the Humanities (CODH) seminar, "Big Data and Digital Humanities," on January 23 at the National Institute of Informatics in Tokyo, Japan.

Started in April 2016, the CODH will be formally established as a center in April 2017. It involves faculty from the National Institute of Informatics and The Institute of Statistical Mathematics, both in Japan, who collaborate with computer scientists and humanities scholars around the globe. CODH promotes research and development to improve access to humanities data, using the concept of open science along with the latest technology in informatics and statistics.

Downie gave the presentation, "Digital humanities using both closed and open data: Use cases from the HathiTrust Research Center":

The HathiTrust Digital...

Dec. 5, 2016

Unique in its sheer size and breadth, a new open dataset released by the HathiTrust Research Center (HTRC) will provide researchers with access to otherwise restricted information. The HTRC Extracted Features (EF) Dataset reports quantitative counts of words, lines, parts of speech, and other details extracted from each page of the more than thirteen million volumes found in the HathiTrust Digital Library. 

An earlier release of the EF Dataset, drawn from a subset covering only the five million volumes in HathiTrust's public domain collection, has enabled novel research from scholars in economics, history, linguistics, literary studies, and sociology, among other fields. The new EF dataset, released under a Creative Commons Attribution license, provides access to features drawn from the remaining eight million volumes that otherwise would be...

Jun. 14, 2016

Several students and faculty members will share their research at the 2016 Joint Conference on Digital Libraries (JCDL), held on June 19-23 in Newark, New Jersey. The event brings together international scholars focusing on digital libraries and associated technical, practical, organizational, and social issues. The goal is to provide a forum for shared learning and facilitate the application of knowledge for research, development, construction, and utilization in digital libraries.

Papers presented at JCDL 2016 include:

  • "Enhancing Scholarly Use of Digital Libraries: A Comparative Survey Review of Bibliographic Metadata Ontologies"
    Presenters include doctoral student Jacob Jett, faculty affiliate Timothy W. Cole, and Professor J. Stephen Downie

  • "Low-cost Semantic...

May 5, 2016

Who influenced Charles Darwin when he was writing his pioneering theory of evolution, On the Origin of Species? Indiana University (IU) professor Colin Allen wants to know, and the HathiTrust Research Center may now hold the answer.

The HathiTrust Research Center (HTRC), a cooperative service of Indiana University, the University of Illinois, and HathiTrust, has expanded its services to support computational research on the entire collection of HathiTrust, one of the world’s largest digital libraries. HathiTrust’s collections comprise over 14 million digitized volumes, among them more than 7 million books, more than 725,000 US federal government documents, and more than 350,000 serial publications. The collections are drawn from some of the largest research libraries in North America, including Indiana University and the University of...
