School of Information Sciences

Underwood receives NEH grant to investigate consequences of error in digital libraries

Ted Underwood, Professor

Professor Ted Underwood has received a $73,122 grant from the National Endowment for the Humanities to investigate the consequences of error in digital libraries. While digital libraries represent an immense storehouse of knowledge, their texts are full of errors introduced by the imperfect optical process used to transcribe them.

"It isn't unusual for five percent of the words in volumes to be mistranscribed, with the level of error much higher in some volumes," said Underwood. "Simply measuring the fraction of mistranscribed words is easy. It’s harder to know how much difference those errors make for the methods and questions that actually interest researchers. Some forms of analysis are undisturbed by high levels of error; others may be quite sensitive, especially when errors are distributed unevenly across different historical periods and genres."

Underwood will work with graduate students from the iSchool and English Department to construct parallel collections that pair each "clean" text with a realistically error-ridden version of the same book drawn from a digital library. The team will build collections of Chinese texts as well as English texts ranging from 1700 to the present, because different character sets and printing technologies produce different kinds of error. Then the team will apply a wide range of data-mining methods to both the clean and error-ridden collections and measure the distortion produced by transcription error and other common sources of noise. The project will provide tools that help other researchers estimate the level of uncertainty in their own conclusions.

"No data is perfect. There's always some kind of error. The question is whether the error is of a kind and magnitude likely to matter for a particular question," he said.

Underwood is a professor in the iSchool and also holds an appointment with the Department of English in the College of Liberal Arts and Sciences. He has authored three books about literary history: Distant Horizons (University of Chicago Press, 2019), Why Literary Periods Mattered: Historical Contrast and the Prestige of English Studies (Stanford University Press, 2013), and The Work of the Sun: Literature, Science and Political Economy 1760-1860 (New York: Palgrave, 2005). His articles have appeared in PMLA, Representations, MLQ, and Cultural Analytics. Underwood earned his PhD in English from Cornell University.


Related News

BIG: Solving real problems for real organizations

Students in the Business Intelligence Group (BIG)—the experiential learning consultancy program affiliated with Associate Professor Yoo-Seong Song's Applied Business Research courses (IS 494 and IS 514)—spent the spring semester working directly with organizations across industries, including health care, financial services, aviation, gaming, community services, and higher education. 


Cao and Liu receive Best Paper Award for FreeOrbit4D

PhD student Wei Cao and Assistant Professor Yaoyao Liu received a Best Paper Award at the 4th Workshop on Generative Models for Computer Vision, which was held during the 2026 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 

Wang group receives ICWSM Best Dataset Paper Award

A paper from Professor Dong Wang's Social Sensing & Intelligence Lab received the Best Dataset Paper Award at the International AAAI Conference on Web and Social Media (ICWSM) held in May 2026 in Los Angeles, California. According to Wang, the paper was accepted in the first review round, which had an acceptance rate of 4.7 percent (14 of 298 submissions). 

Adler and Wang to present at RESPECT 2026

Associate Professor Rachel Adler and Informatics PhD student Olive Wang will present their work at the Association for Computing Machinery Special Interest Group on Computer Science Education Conference on Research on Equity and Sustained Participation in Engineering, Computing, and Technology (RESPECT), which will be held in Chicago this week.

Bashir group presents work at PEPR 2026

PhD students Ramazan Yener, Eryue Xu, and Mubarak Raji presented their research this week at the 2026 USENIX Conference on Privacy Engineering Practice and Respect (PEPR) in Santa Clara, California. PEPR is focused on designing and building products and systems with privacy and respect for their users and the societies in which they operate. The students received USENIX grants covering their conference registration and providing travel support to attend the conference. 

