Digital Humanities


National Endowment for the Humanities

The HathiTrust Research Center (HTRC) is partnering with the Cultural Observatory team that developed the Google Books Ngram Viewer together with Google. The goal of this collaboration is to implement a greatly enhanced version of the Cultural Observatory’s open-source “Bookworm” text analysis and visualization tool, designed to help scholars meet the challenges posed by the massive scale of the HathiTrust (HT) corpus. We are calling our multi-disciplinary, multi-institutional collaboration the HathiTrust + Bookworm (HT+BW) Project. Participating institutions include the University of Illinois, Indiana University, Northeastern University, Baylor College of Medicine, and Rice University.

Bookworm is a tool that visualizes language usage trends in repositories of...

Social Sciences and Humanities Research Council of Canada

This HathiTrust Research Center (HTRC) project seeks to produce the first large-scale cross-cultural study of the novel according to quantitative methods. Ever since its putative rise in the eighteenth century, the novel has emerged as a central means of expressing what it means to be modern. And yet despite this cultural significance, we still lack a comprehensive study of the novel’s place within society that accounts for the vast quantity of novels produced since the eighteenth century, the period most often identified as marking the origins of the novel’s quantitative rise. Our aim is thus twofold: 1) to enliven our understanding of one of the most culturally significant modern art forms according to new computational means, and 2) to establish the methodological foundations of a...

Social Sciences and Humanities Research Council of Canada

Music prints and manuscripts created over the past thousand years sit on the shelves of libraries and museums around the globe. As these organizations digitize their collections, images of these scores are increasingly accessible online. However, the musical content remains difficult to search.

Google Books and HathiTrust have already made it possible to search the content of text documents through Optical Character Recognition (OCR), which transforms digital images of texts into a symbolic representation that can be searched by computers. For digital images of musical scores, the analogous technology is Optical Music Recognition (OMR).

The research team is working to improve OMR technology so that computers can recognize the musical symbols in these images, enabling us...

Andrew W. Mellon Foundation

“Understanding the Needs of Scholars in a Contemporary Publishing Environment,” better known as Publishing Without Walls (PWW), is a digital scholarly publishing initiative that is scholar-driven, openly accessible, scalable, and sustainable. PWW will directly engage with scholars throughout the research process. It aims to build publishing models that can be supported locally by a university’s library, while also opening new avenues toward publication through university presses and other publishers. PWW is here to help scholars navigate the new opportunities presented by collaborative, multimodal, and interim phase works. PWW is launching two new series: one focusing on the outcomes of the Humanities Without...


The HathiTrust has provided funding for the HathiTrust Research Center (HTRC), co-located at the University of Illinois and Indiana University, to serve as the research arm of the HathiTrust and create an agile, technology-rich service for researchers in the digital humanities, social sciences, natural sciences, and informatics. This service will help researchers conduct nonconsumptive research on the HathiTrust digital library database, a collection of just under 14 million digitized volumes, equating to 4.9 billion pages, 60% of which are under some copyright restriction. At the same time, center staff will develop and refine tools to aid in digital humanities and text mining research over large databases and will operate the secure, large-scale computation environment required by this...

Andrew W. Mellon Foundation

This project builds upon, extends, and integrates two developmental research threads within the HathiTrust Research Center (HTRC). The first thread originates from work that was conducted in the Workset Collections for Scholarly Analysis (WCSA): Prototyping Project. The second thread continues the work of the Data Capsules (DC) project, previously supported by the Alfred P. Sloan Foundation (2011-2014). The primary objective of the WCSA+DC project is the seamless integration of the workset model and tools with the Data Capsule framework to provide non-consumptive research access to HathiTrust's massive corpus of data objects, securely and at...


Jan. 24, 2017

J. Stephen Downie, professor and associate dean for research, participated in the Center for Open Data in the Humanities (CODH) seminar, "Big Data and Digital Humanities," on January 23 at the National Institute of Informatics in Tokyo, Japan.

Started in April 2016, the CODH will be formally established as a center in April 2017. It involves faculty from the National Institute of Informatics and The Institute of Statistical Mathematics, both in Japan, who collaborate with computer scientists and humanities scholars around the globe. CODH promotes research and development to improve access to humanities data, using the concept of open science along with the latest technology in informatics and statistics.

Downie gave the presentation, "Digital humanities using both closed and open data: Use cases from the HathiTrust Research Center":

The HathiTrust Digital...

Dec. 5, 2016

Unique in its sheer size and breadth, a new open dataset released by the HathiTrust Research Center (HTRC) will provide researchers with access to otherwise restricted information. The HTRC Extracted Features (EF) Dataset reports quantitative counts of words, lines, parts of speech, and other details extracted from each page of the more than thirteen million volumes found in the HathiTrust Digital Library. 

An earlier release of the EF Dataset, drawn from a subset covering only the five million volumes in HathiTrust's public domain collection, has enabled novel research from scholars in economics, history, linguistics, literary studies, and sociology, among other fields. The new EF dataset, released under a Creative Commons Attribution license, provides access to features drawn from the remaining eight million volumes that otherwise would be...

Nov. 9, 2016

Associate Professor Bonnie Mak has been invited to share her expertise at a National Science Foundation (NSF) workshop on "Social Facets of Data Science." Organized by faculty from California Polytechnic State University, Cornell University, North Carolina State University, the University of Alabama, and the University of Texas at Austin, the workshop will examine data science as an important and growing profession that sits at the intersection of the STEM fields and the liberal and creative arts. 

Topics to be covered include data and society; data infrastructures; the environmental implications of data science; as well as scholarly method and craft in data science, which will include discussions of art, design, and film. 

Mak is one of six scholars selected from the country's leading research and teaching institutions who will be featured at the workshop.  

"I am honored to be part of this exciting initiative that explores links between the arts and data...

Oct. 24, 2016

Associate Professor Bonnie Mak will return to The Pennsylvania State University to participate in the inaugural Information + Humanities conference on October 28-29. The conference is sponsored by the Center for Humanities and Information, where Mak was visiting senior fellow in 2015-2016.

Mak is among twelve invited speakers from across the country who will offer their perspectives on a set of terms especially associated with information, including infrastructure, classification, interface, keyword, and design. In her presentation on the topic of metadata, Mak will discuss how the descriptive practices of natural historians in the sixteenth and seventeenth centuries can shed light on questions about metadata in the twenty-first century. 

"I look forward to joining my colleagues to discuss how the notion of information...

Sep. 19, 2016

Most academic librarians stepping into a position can model their work on that of their predecessors. But not Thomas Padilla (MS '14). On his appointment in April as the first humanities data curator at the University of California, Santa Barbara (UCSB) Library (and the first in the entire University of California system), Padilla has had to draw on a number of different disciplines to shape his role of working with data throughout its life cycle, creating a support plan for digital humanities researchers, and providing research data consultation. Formerly the digital scholarship librarian at Michigan State University Libraries, Padilla is pioneering a new niche for academic...

Aug. 17, 2016

The iSchool is pleased to announce that Ted Underwood has joined the faculty, effective August 16. Professor Underwood also holds a joint appointment with the Department of English in the College of Liberal Arts and Sciences.

Underwood's shift toward the iSchool happened gradually. "I found myself collaborating more and more closely with students and faculty at the iSchool and eventually came to see it as a second intellectual home," he said. Since 2003, he has been a faculty member in the English Department. He looks forward to helping students there and in the iSchool understand "how one builds a bridge between domain knowledge and information science."

Underwood works in the broad collection of fields known as digital humanities. "Specifically, I use digital libraries to cast new light on the literary past, showing how genres and assumptions (about gender, for instance) have changed across long timelines in ways that are sometimes too gradual to see if we’re just...

Jul. 8, 2016

Several iSchool representatives will speak next week at Digital Humanities 2016, the annual conference of the Alliance of Digital Humanities Organizations. The event will be held in Krakow, Poland, on July 11-17.

Presentations by iSchool faculty, staff, and students include:

“Mining Texts with the Extracted Features Dataset”
Presented by Postdoctoral Research Associate Peter Organisciak and Professor J. Stephen Downie

“A Comparative Analysis of Bibliographic Ontologies: Implications for Digital Humanities”
Presenters include doctoral student Jacob Jett, faculty affiliate Timothy W. Cole, and...