iSchool presents research at JCDL 2022

iSchool students, faculty, and staff presented their research at the ACM/IEEE Joint Conference on Digital Libraries (JCDL 2022), which was held in a hybrid format on June 20-24.

In the paper presentation "Complexities Associated with User-generated Book Reviews in Digital Libraries: Temporal, Cultural, and Political Case Studies," PhD student Yuerong Hu, Assistant Professor Zoe LeBlanc, Associate Professor Jana Diesner, Professor Ted Underwood, Professor J. Stephen Downie, and Glen Worthey, associate director for research support services at the HathiTrust Research Center, discussed their study of user-generated book reviews. The study examines these reviews through three lenses: temporal changes in user-generated book lists, cross-cultural differences in user-generated book ratings, and user power dynamics reflected in the review texts.

"In the last two decades, user-generated book reviews have opened up new opportunities for computational and empirical studies on readership, reception, and books," said Hu. "As iSchool professionals, we want to leverage these newly affordable research resources to empirically map the dynamics between books and readers online. We also want to make a timely contribution to this burgeoning area by filling two existing gaps: a lack of non-Anglophone perspectives and a dearth of attention to the real-world complexities associated with such web data provisions."

In the presentation, "A Prototype Gutenberg-HathiTrust Sentence-level Parallel Corpus for OCR Error Analysis: Pilot Investigations," Ming Jiang, PhD student in informatics; Ryan Dubnicek, digital humanities specialist; Worthey; Underwood; and Downie discussed the use of a prototype sentence-level parallel corpus to fill in the gaps resulting from optical character recognition (OCR) errors.

According to Jiang, "This research provides a novel dataset that can assist scholars who are exploring the impact of OCR noise on fine-grained semantic understanding tasks, such as next sentence prediction, chapter segmentation, and word-level semantic encoding. The ultimate goal of this research is to advance the understanding of the capability of NLP (natural language processing) tools to process OCR'd texts, hoping to facilitate downstream computational research on digitized library collections with trustworthy NLP support."

In addition to these presentations, Hu participated in the JCDL Doctoral Consortium, and Downie was featured in the “Meet the Experts” session.