Underwood receives NEH grant to investigate consequences of error in digital libraries

Ted Underwood, Professor

Professor Ted Underwood has received a $73,122 grant from the National Endowment for the Humanities to investigate the consequences of error in digital libraries. While digital libraries represent an immense storehouse of knowledge, the texts are full of errors because of the imperfect process by which they are transcribed optically.

"It isn't unusual for five percent of the words in volumes to be mistranscribed, with the level of error much higher in some volumes," said Underwood. "Simply measuring the fraction of mistranscribed words is easy. It’s harder to know how much difference those errors make for the methods and questions that actually interest researchers. Some forms of analysis are undisturbed by high levels of error; others may be quite sensitive, especially when errors are distributed unevenly across different historical periods and genres."

Underwood will work with graduate students from the iSchool and English Department to construct parallel collections that pair each "clean" text with a realistically error-ridden version of the same book drawn from a digital library. The team will build collections of Chinese texts as well as English texts ranging from 1700 to the present, because different character sets and printing technologies produce different kinds of error. Then the team will apply a wide range of data-mining methods to both the clean and error-ridden collections and measure the distortion produced by transcription error and other common sources of noise. The project will provide tools that help other researchers estimate the level of uncertainty in their own conclusions.

"No data is perfect. There's always some kind of error. The question is whether the error is of a kind and magnitude likely to matter for a particular question," he said.

Underwood is a professor in the iSchool and also holds an appointment in the Department of English in the College of Liberal Arts and Sciences. He has authored three books about literary history: Distant Horizons (University of Chicago Press, 2019), Why Literary Periods Mattered: Historical Contrast and the Prestige of English Studies (Stanford University Press, 2013), and The Work of the Sun: Literature, Science and Political Economy 1760-1860 (Palgrave, 2005). His articles have appeared in PMLA, Representations, MLQ, and Cultural Analytics. Underwood earned his PhD in English from Cornell University.

