Underwood’s research shows paradox of women’s representation in literature through the ages

Posted: February 15, 2018

While gender equality is a far more prominent issue today than it was in the Victorian era, a new study shows that in literature, the number of women characters and women authors has declined rather than grown over the years. Professor Ted Underwood led the research, which used machine learning to analyze the presentation of gender in more than 100,000 novels from 1703 to 2009 in the HathiTrust Digital Library.

According to Underwood, "By 1960, women had lost half the space they occupied in nineteenth-century fiction, even though gender roles had become more flexible."

He and his fellow researchers, David Bamman, assistant professor of information science at the University of California, Berkeley, and Sabrina Lee, a graduate student in English at Illinois, recently published their findings, "The Transformation of Gender in English-Language Fiction," in the journal Cultural Analytics. Using an algorithm Underwood and Bamman had built for another characterization project, they discovered shifts in the words that characterize gender as well as a decrease in the number of gendered words. 

Their work was recently featured in the Smithsonian.com article, "Women Were Better Represented in Victorian Novels than Modern Ones." As Underwood points out in the article, "Although literary historians have talked about women's departure from the novel at certain points before, nobody's done the kind of broad-scale work that would demonstrate continuous trends. That’s where machine learning comes in."

This research was funded by the Workset Creation for Scholarly Analysis and Data Capsule (WCSA+DC) grant through the HathiTrust Research Center (HTRC). The HTRC is a collaboration between the University of Illinois, Indiana University, and the HathiTrust to enable advanced computational access to the HathiTrust Digital Library database, a collection of just under 14 million digitized volumes.

Underwood is a professor in the iSchool and also holds an appointment with the Department of English in the College of Liberal Arts and Sciences. He is the author of two books about literary history, including, most recently, Why Literary Periods Mattered (Stanford, 2013). His articles have appeared in PMLA, Representations, MLQ, and Cultural Analytics. He is currently finishing his next book, Distant Horizons: Digital Evidence and Literary Change.
