Sherman defends dissertation

Garrick Sherman successfully defended his PhD dissertation, "Document Expansion and Language Model Re-estimation for Information Retrieval," on August 22.

His committee included Associate Professor Jana Diesner, chair and director of research; Professor J. Stephen Downie; Professor Ted Underwood; and Associate Professor Jaime Arguello of the University of North Carolina at Chapel Hill.

From the abstract: Document expansion is the process of augmenting the text of a document with text drawn from one or more other documents. The purpose of this expansion is to increase the size of the term sample from which document representations, such as language models, may be estimated. While document expansion has been shown to improve the effectiveness of ad-hoc document retrieval, our work differs from previous work in a variety of ways. We propose a consistent language modeling approach to document expansion of full length documents. We also explore the use of one or more external document collections as sources of data during the expansion process. Our proposed methods prove successful in improving retrieval effectiveness over baselines. We also acknowledge that existing document expansion work, including our own, has relied on intuitive assumptions about the mechanisms by which it achieves its effects. In this thesis, we quantify aspects of document language model change resulting from expansion . . . Recognizing the potential for further retrieval effectiveness improvement by means of selective application of our model, we investigate methods for automatically predicting whether or not to expand individual documents and, if so, which expansion collection may yield the optimal document representation. We find that, although the document expansion retrieval model has proven effective overall, accurate prediction concerning the expansion of a given document depends too heavily on predicting the document's relevance.


Related News

CCB contributes to new Books to Parks site on Lyddie

The Center for Children's Books (CCB) collaborated with the National Park Service (NPS) to launch a new Books to Parks website on Lyddie, a 1991 novel by Katherine Paterson that highlights the experiences of young women working in textile mills in nineteenth-century Lowell, Massachusetts. 


Layne-Worthey edits book on digital humanities and LIS

Glen Layne-Worthey, associate director for research support services for the HathiTrust Research Center (HTRC), and Isabel Galina, researcher at the Institute for Bibliographic Studies at the National University of Mexico, have edited a new book, The Routledge Companion to Libraries, Archives, and the Digital Humanities, which was recently released by Routledge.


Wang group to present at BigData 2024

Members of Associate Professor Dong Wang's research group, the Social Sensing and Intelligence Lab, will present their research at the 2024 IEEE International Conference on Big Data (BigData 2024), which will be held December 15-18 in Washington, D.C. BigData 2024 is the premier venue to present and discuss progress in research, development, standards, and applications of topics in artificial intelligence, machine learning, and big data analytics.


Book co-edited by Sayuno wins national award in Philippines

A book edited by Postdoctoral Research Associate Cheeno Marlo Sayuno and Eugene Evasco has received a National Book Award from the Republic of the Philippines. The award, sponsored by the National Book Development Board and the Manila Critics Circle, is an annual prize that honors the most outstanding titles written, designed, and published in the Philippines. 
