Efron uses humanist approach to solving problems in search

Every month, Google alone fields billions of search requests. The staggering demand for information, coupled with the exponentially growing amount of information available, means that reliable search results are key to maneuvering a flooded information landscape.

Associate Professor Miles Efron is among the leading scholars investigating ways to improve search. With research projects funded by the National Science Foundation as well as by industry partners such as Google, he looks at the issue from a variety of angles, including questions of query representation and how temporal factors affect the relationship between queries and relevant information.

Though his research is thick with writing code and creating algorithms, Efron approaches his work through the lens of a humanist, incorporating his academic background in classics and medieval studies. “My goal is to translate familiar humanist concerns and see how they resonate in the kinds of domains that are of immediate practical concern to information professionals, like building search engines or translation systems,” he said. “I try to show my students that even if we’re using statistics and probability to model what a document means, ultimately we’re still making assertions and commitments about how pieces of text achieve meaning and how they communicate meaning.”

“What I love about LIS is it lets you bring together these different approaches. Information is this weird nexus of things like language and probability, and it is easy to think, what could these things possibly have to do with one another? Turns out they interact in many ways. We live in this very wonderful and strange period of time when people who have interests in both big data and humanistic questions can satisfy both of those interests simultaneously,” he said.

Efron’s current work looks at the natural evolution of content building on the web and innovates ways to use that content to improve existing functions. “There is so much knowledge in Wikipedia, for example, and I want to get as much out of this huge, wonderful resource as we can,” he said. “There is a lot of linked information in Wikipedia that will help information retrieval by improving query understanding, document representation, and many other language technologies.”

Recently, Efron received a Google Faculty Research Award, which allows him to work closely with senior engineers and researchers at Google on the project, Query Modeling Using Intra-Entity Knowledge Base Structure. He hopes to improve the effectiveness of search by using structured data found in knowledge bases like Wikipedia to improve information retrieval over unstructured documents. By analyzing the top documents returned on searches, Efron hopes to isolate what makes the top documents especially useful or unique, and use that information to expand upon and improve the user’s original query, thereby improving the quality of the results. “This project mines those top documents and tries to find the hallmarks of relevance in them. Then we can try to extract those bits of data and add them to the query. What we end up with is an augmented (hopefully improved) query that gets resubmitted to the search engine, and the results from that refined query are the ones the user finally sees,” he said.
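The loop Efron describes (retrieve, mine the top documents, add what is found to the query, and search again) resembles what the information retrieval literature calls pseudo-relevance feedback. The Python sketch below is only an illustration of that general idea, not the project's actual method: the term-count scoring, the tiny stopword list, the sample documents, and every function name are assumptions made for the example, and it works on plain text rather than on knowledge base structure.

```python
from collections import Counter

# Hypothetical, deliberately tiny stopword list for the illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on"}

# Hypothetical toy collection standing in for a real document index.
SAMPLE_DOCS = [
    "jaguar cat habitat rainforest",
    "jaguar cat prey rainforest",
    "jaguar car engine speed",
    "python snake habitat",
]


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]


def score(query_terms, doc):
    """Toy relevance score: how often the query terms occur in the document."""
    counts = Counter(tokenize(doc))
    return sum(counts[t] for t in query_terms)


def search(query_terms, docs, k):
    """Return the k highest-scoring documents (ties keep their input order)."""
    ranked = sorted(docs, key=lambda d: score(query_terms, d), reverse=True)
    return ranked[:k]


def expand_query(query, docs, top_k=2, n_terms=2):
    """Add the most frequent new terms from the top-ranked documents."""
    terms = tokenize(query)
    top_docs = search(terms, docs, top_k)
    pool = Counter()
    for doc in top_docs:
        pool.update(t for t in tokenize(doc) if t not in terms)
    # Sort by descending count, then alphabetically, so the result is stable.
    candidates = sorted(pool.items(), key=lambda kv: (-kv[1], kv[0]))
    return terms + [term for term, _ in candidates[:n_terms]]


def feedback_search(query, docs, top_k=2, n_terms=2, k=3):
    """Resubmit the augmented query; these are the results the user sees."""
    return search(expand_query(query, docs, top_k, n_terms), docs, k)
```

Calling `feedback_search("jaguar cat", SAMPLE_DOCS)` first ranks the documents for the original query, borrows frequent terms from the top two, and then ranks again with the augmented query. A production system would replace the toy scoring with a proper retrieval model and weight the added terms rather than simply appending them.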

“Over the past few years, Google has released open-source data to support academic research in this space, and this project will capitalize on that data,” said Efron. “The field is emerging, but I think there is a lot to be gained by bringing structured and unstructured data together.”

Making connections with industry has not only benefited Efron’s research, but has also had an impact on the careers of graduate students who work on Efron’s projects. A number of them have gone on to find positions at Google or Microsoft Research.

“There is a big need in industry for people with expertise in information retrieval research and development,” said Efron. “Getting students involved in that kind of work is good for everybody. The students hone skills and form connections that make finding a satisfying career much easier. Folks in industry get to hire the best of the best. And researchers like me keep our knowledge of the state of the art fresh by working with the services, data, and people that drive people’s everyday information interactions.”

