Every month, Google alone fields billions of search requests. The staggering demand for information, coupled with the exponentially growing amount of information available, means that reliable search results are key to maneuvering a flooded information landscape.
Associate Professor Miles Efron is among the leading scholars investigating ways to improve search. With research projects funded by the National Science Foundation and by industry partners such as Google, he looks at the issue from a variety of angles, including questions of query representation and how temporal factors affect the relationship between queries and relevant information.
Though his research is thick with writing code and creating algorithms, Efron approaches his work through the lens of a humanist, incorporating his academic background in classics and medieval studies. “My goal is to translate familiar humanist concerns and see how they resonate in the kinds of domains that are of immediate practical concern to information professionals, like building search engines or translation systems,” he said. “I try to show my students that even if we’re using statistics and probability to model what a document means, ultimately we’re still making assertions and commitments about how pieces of text achieve meaning and how they communicate meaning.”
“What I love about LIS is it lets you bring together these different approaches. Information is this weird nexus of things like language and probability, and it is easy to think, what could these things possibly have to do with one another? Turns out they interact in many ways. We live in this very wonderful and strange period of time when people who have interests in both big data and humanistic questions can satisfy both of those interests simultaneously,” he said.
Efron’s current work looks at the natural evolution of content on the web and develops new ways to use that content to improve existing functions. “There is so much knowledge in Wikipedia, for example, and I want to get as much out of this huge, wonderful resource as we can,” he said. “There is a lot of linked information in Wikipedia that will help information retrieval by improving query understanding, document representation, and many other language technologies.”
Recently, Efron received a Google Faculty Research Award, which allows him to work closely with senior engineers and researchers at Google on the project Query Modeling Using Intra-Entity Knowledge Base Structure. He hopes to make search more effective by using structured data found in knowledge bases like Wikipedia to improve information retrieval over unstructured documents. By analyzing the top documents returned for a search, Efron hopes to isolate what makes those documents especially useful or unique, and to use that information to expand and improve the user’s original query, thereby improving the quality of the results. “This project mines those top documents and tries to find the hallmarks of relevance in them. Then we can try to extract those bits of data and add them to the query. What we end up with is an augmented (hopefully improved) query that gets resubmitted to the search engine, and the results from that refined query are the ones the user finally sees,” he said.
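For readers curious about what that loop looks like in practice, the sketch below illustrates the general idea, often called pseudo-relevance feedback: run the query, mine the top documents for distinctive terms, add those terms to the query, and search again. It is a minimal illustration under stated assumptions, not Efron’s method. The toy corpus, the function names (`search`, `expand_query`), and the simple frequency-based weighting are all hypothetical, and the knowledge base structure at the heart of the actual project is not modeled here.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would do far more.
    return text.lower().split()


def idf_table(tokenized_docs):
    # Smoothed inverse document frequency for every term in the collection.
    n = len(tokenized_docs)
    df = Counter(t for toks in tokenized_docs for t in set(toks))
    return {t: math.log(1 + n / count) for t, count in df.items()}


def search(query_terms, docs, top_k):
    # Rank documents by a simple TF-IDF sum and return the top_k indices.
    tokenized = [tokenize(d) for d in docs]
    idf = idf_table(tokenized)
    scored = []
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        s = sum(tf[t] * idf[t] for t in query_terms if t in tf)
        scored.append((s, i))
    ranked = sorted(scored, key=lambda p: (-p[0], p[1]))
    return [i for s, i in ranked[:top_k] if s > 0]


def expand_query(query_terms, docs, feedback_docs=2, new_terms=2):
    # Assumption: the top-ranked documents are treated as relevant, and their
    # highest-weighted non-query terms are appended to the query.
    tokenized = [tokenize(d) for d in docs]
    idf = idf_table(tokenized)
    weights = Counter()
    for i in search(query_terms, docs, feedback_docs):
        for term, count in Counter(tokenized[i]).items():
            if term not in query_terms:
                weights[term] += count * idf[term]
    best = sorted(weights.items(), key=lambda p: (-p[1], p[0]))[:new_terms]
    return list(query_terms) + [term for term, _ in best]


# Hypothetical toy collection, invented for this example.
docs = [
    "jaguar big cat habitat rainforest",
    "jaguar cat prey rainforest hunting",
    "jaguar car engine speed",
    "python snake rainforest",
]
query = ["jaguar", "cat"]
expanded = expand_query(query, docs)
final_results = search(expanded, docs, top_k=2)  # what the user would see
```

In this toy collection the expanded query picks up “rainforest” from the two top-ranked documents, which nudges the second search toward the animal and away from the car. Real systems choose and weight expansion terms with considerably more care.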
“Over the past few years, Google has released open-source data to support academic research in this space, and this project will capitalize on that data,” said Efron. “The field is emerging, but I think there is a lot to be gained by bringing structured and unstructured data together.”
Making connections with industry has not only benefited Efron’s research, but has also had an impact on the careers of graduate students who work on Efron’s projects. A number of them have gone on to find positions at Google or Microsoft Research.
“There is a big need in industry for people with expertise in information retrieval research and development,” said Efron. “Getting students involved in that kind of work is good for everybody. The students hone skills and form connections that make finding a satisfying career much easier. Folks in industry get to hire the best of the best. And researchers like me keep our knowledge of the state of the art fresh by working with the services, data, and people that drive people’s everyday information interactions.”