Sherman defends dissertation

Garrick Sherman

Garrick Sherman successfully defended his PhD dissertation, "Document Expansion and Language Model Re-estimation for Information Retrieval," on August 22.

His committee included Associate Professor Jana Diesner, chair and director of research; Professor J. Stephen Downie; Professor Ted Underwood; and Associate Professor Jaime Arguello of the University of North Carolina at Chapel Hill.

From the abstract: Document expansion is the process of augmenting the text of a document with text drawn from one or more other documents. The purpose of this expansion is to increase the size of the term sample from which document representations, such as language models, may be estimated. While document expansion has been shown to improve the effectiveness of ad-hoc document retrieval, our work differs from previous work in a variety of ways. We propose a consistent language modeling approach to document expansion of full length documents. We also explore the use of one or more external document collections as sources of data during the expansion process. Our proposed methods prove successful in improving retrieval effectiveness over baselines. We also acknowledge that existing document expansion work, including our own, has relied on intuitive assumptions about the mechanisms by which it achieves its effects. In this thesis, we quantify aspects of document language model change resulting from expansion . . . Recognizing the potential for further retrieval effectiveness improvement by means of selective application of our model, we investigate methods for automatically predicting whether or not to expand individual documents and, if so, which expansion collection may yield the optimal document representation. We find that, although the document expansion retrieval model has proven effective overall, accurate prediction concerning the expansion of a given document depends too heavily on predicting the document's relevance.
