School of Information Sciences

Underwood receives NEH grant to investigate consequences of error in digital libraries

Ted Underwood, Professor

Professor Ted Underwood has received a $73,122 grant from the National Endowment for the Humanities to investigate the consequences of error in digital libraries. While digital libraries represent an immense storehouse of knowledge, the texts are full of errors because of the imperfect process by which they are transcribed optically.

"It isn't unusual for five percent of the words in volumes to be mistranscribed, with the level of error much higher in some volumes," said Underwood. "Simply measuring the fraction of mistranscribed words is easy. It’s harder to know how much difference those errors make for the methods and questions that actually interest researchers. Some forms of analysis are undisturbed by high levels of error; others may be quite sensitive, especially when errors are distributed unevenly across different historical periods and genres."

Underwood will work with graduate students from the iSchool and English Department to construct parallel collections that pair each "clean" text with a realistically error-ridden version of the same book drawn from a digital library. The team will build collections of Chinese texts as well as English texts ranging from 1700 to the present, because different character sets and printing technologies produce different kinds of error. Then the team will apply a wide range of data-mining methods to both the clean and error-ridden collections and measure the distortion produced by transcription error and other common sources of noise. The project will provide tools that help other researchers estimate the level of uncertainty in their own conclusions.
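The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the project's code: the function names, the sample sentences, and the choice of cosine similarity over word counts as the "downstream" measurement are all assumptions. The sketch aligns a clean passage with a noisy transcription of it, estimates the fraction of mistranscribed words, and then checks how far one simple analysis shifts because of those errors.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def word_error_rate(clean, noisy):
    """Estimate the fraction of words in `clean` with no aligned match in `noisy`."""
    clean_words, noisy_words = tokenize(clean), tokenize(noisy)
    if not clean_words:
        raise ValueError("clean text must contain at least one word")
    matcher = SequenceMatcher(None, clean_words, noisy_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 1 - matched / len(clean_words)


def cosine_similarity(clean, noisy):
    """Cosine similarity between the word-count vectors of the two versions."""
    a, b = Counter(tokenize(clean)), Counter(tokenize(noisy))
    dot = sum(count * b[word] for word, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical paired passage: the second string imitates optical transcription errors.
clean = "the quick brown fox jumps over the lazy dog"
noisy = "the qnick brown fox jumps ovcr the lazy dog"

print(f"word error rate:   {word_error_rate(clean, noisy):.3f}")
print(f"cosine similarity: {cosine_similarity(clean, noisy):.3f}")
```

In this toy pair, two of the nine words are mistranscribed, yet the count of a common word such as "the" is unchanged. That is the kind of contrast the project aims to quantify at scale: the same error rate can leave one measurement untouched while distorting another.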

"No data is perfect. There's always some kind of error. The question is whether the error is of a kind and magnitude likely to matter for a particular question," he said.

Underwood is a professor in the iSchool and also holds an appointment with the Department of English in the College of Liberal Arts and Sciences. He has authored three books about literary history: Distant Horizons (University of Chicago Press, 2019), Why Literary Periods Mattered: Historical Contrast and the Prestige of English Studies (Stanford University Press, 2013), and The Work of the Sun: Literature, Science and Political Economy 1760-1860 (Palgrave, 2005). His articles have appeared in PMLA, Representations, MLQ, and Cultural Analytics. Underwood earned his PhD in English from Cornell University.

Related News

Vaez Afshar named APT Student Scholar

Informatics PhD student Sepehr Vaez Afshar has been named a Student Scholar by the Association for Preservation Technology (APT). Each year, around ten students are selected worldwide for the scholarship program based on the quality and innovation of their research abstracts, as well as their contribution to the field of preservation technology. Scholars are paired with mentors from the APT College of Fellows, prepare and present their research during the association's annual conference, and enjoy opportunities for long-term professional networking and mentorship within the preservation community.

iSchool well represented at ASIS&T 2025

iSchool faculty, staff, and students will participate in the 88th Annual Meeting of the Association for Information Science and Technology (ASIS&T), which will be held on November 14-18 in Arlington, Virginia. ASIS&T will also host a Virtual Satellite Meeting on December 11-12. 

PhD students receive scholarships from IAPP

Information Sciences PhD students Mubarak Raji, Eryclis Rodrigues Silva, and Eryue Xu, and Informatics PhD student Muhammad Hussain have received A. Serwin Conference Scholarships from the International Association of Privacy Professionals (IAPP). The award, which recognizes outstanding students in the areas of privacy, AI governance, and digital responsibility, consists of $1,000 and complimentary conference registration. The IAPP’s annual conference, Privacy. Security. Risk., will be held October 30-31 in San Diego, California.

Perkins defends dissertation

PhD candidate Jana M. Perkins successfully defended her dissertation, "Scholarship writ large: A data-rich analysis of professionalization in English literary scholarship from 1940 to the present."

Yu receives 2025 Google PhD Fellowship

PhD student Yaman Yu has been named a recipient of the 2025 Google PhD Fellowship in Privacy, Safety, and Security. The fellowship program recognizes outstanding graduate students who are conducting exceptional and innovative research in computer science and related fields, with a special focus on candidates who seek to influence the future of technology. Google PhD fellowships include tuition and fees, a stipend, and mentorship from a Google Research Mentor for up to two years. Google.org is providing over $10 million to support 255 PhD students across 35 countries and 12 research domains.
