Downie to discuss HTRC findings at Harvard Library

J. Stephen Downie, Professor and Associate Dean for Research

Professor and Associate Dean for Research J. Stephen Downie will present his recent work with the HathiTrust Research Center (HTRC) on April 30 at Harvard Library. Downie is codirector of HTRC, a collaboration between the University of Illinois, Indiana University, and the HathiTrust to enable advanced computational access to text found in the HathiTrust (HT) Digital Library.

His talk, "Creating Universal Open Access to Closed Textual Data at Scale: Use Cases from the HathiTrust Research Center," will discuss how the HTRC is creating a set of non-consumptive research services to make HT Digital Library volumes that are under copyright restrictions more open and useful to scholars.

"The creation and publication of the HTRC 'Extracted Features' (EF) dataset provides unigram counts and Part-of-Speech (POS) information for each of the 5.6 billion pages in the HT Digital Library," explained Downie. "In my talk, I will introduce two use cases that leverage the EF dataset: the 'HathiTrust + Bookworm' visualization and analysis tool; and the Workset Building environment developed to provide researchers fine-grained access to the entire HT collection (both public domain and in-copyright) via the EF dataset."

Downie leads the HathiTrust + Bookworm text analysis project, which is creating tools to visualize how term usage changes over time. He is also the principal investigator on the Workset Creation for Scholarly Analysis + Data Capsules project, which integrates workset models and tools, and he represents the HTRC on the NovelTM text mining project as well as the Single Interface for Music Score Searching and Analysis project. All of these projects strive to provide large-scale analytic access to copyright-restricted cultural data.

Related News

Diesner appointed R.C. Evans Data Analytics Fellow

Associate Professor and PhD Program Director Jana Diesner has been appointed as a 2018-19 R.C. Evans Data Analytics Fellow in the University of Illinois-Deloitte Foundation Center for Business Analytics. Launched in 2016, the Center is part of the Gies College of Business at the Urbana campus.

Jones to present at digital preservation conference

Doctoral candidate Jimi Jones will discuss his dissertation research at the National Digital Stewardship Alliance (NDSA) Digital Preservation 2018, which will be held October 17-18 in Las Vegas. NDSA is a consortium of more than 220 organizations committed to the long-term preservation and stewardship of digital information and cultural heritage, for the benefit of present and future generations.

Kahyun Choi defends dissertation

Doctoral candidate Kahyun Choi successfully defended her dissertation, "Computational Lyricology: Quantitative Approaches to Understanding Song Lyrics and Their Interpretations."

Cooke to present research at Harvard summit

Associate Professor and MSLIS Program Director Nicole A. Cooke will discuss her research on fake news, misinformation, and disinformation at the 2018 Public Interest Technology Summit, which will be held on October 13 at Harvard University. The summit is hosted by digital HKS, an independent project at Harvard's Belfer Center for Science and International Affairs that is committed to teaching public leaders to understand how to design, build, and engage with digital technologies as they relate to civic participation, digital equity and inclusion, governance of government platforms, and accountability.
