Information Retrieval

Finding relevant documents and other information resources to satisfy an information need or desire

Related Research Projects

Changing How We Conduct Inquiry

Time frame
2014-2022
Investigator
Matthew Turk
Total funding to date
$1,500,000.00
Funding agency
Gordon & Betty Moore Foundation

This project aims to develop infrastructure that transforms the process of discovery through semantically-aware analysis and visualization of data from the physical sciences, in the hopes of demonstrating the high impact that interdisciplinary data-driven research can have on scientific research as a whole.

Crowd-Assisted Human-AI Teaming with Explanations

Time frame
2024-Present
Investigator
Dong Wang
Total funding to date
$599,999.00
Funding agency
National Science Foundation

This project investigates the problem of information integrity, that is, identifying faulty or ungrounded information online. It focuses on a specific domain, that of information produced during the COVID-19 pandemic, and processes both text and image data. While significant efforts in artificial intelligence (AI) and machine learning (ML) have addressed information integrity in this type of…

Data Capsule Appliance for Research Analysis of Restricted and Sensitive Data in Academic Libraries

Time frame
2017-Present
Investigator
J. Stephen Downie
Total funding to date
$32,500.00
Funding agency
Institute of Museum and Library Services

Indiana University, in partnership with eight other academic libraries, will enable new kinds of computational research while ensuring librarians remain expert stewards of information collections. In the last decade, there has been a nearly exponential increase in the volume of digital content, much of which could be valuable for computational research. However, not all datasets can be made…

Digging Deeper, Reaching Further: Libraries Empowering Users to Mine the HathiTrust Digital Library Resources

Time frame
2015-2018
Investigator
J. Stephen Downie
Total funding to date
$398,844.00
Funding agency
Institute of Museum and Library Services

Librarians and digital humanities scholars from the University of Illinois, in partnership with colleagues at Indiana University, Northwestern University, Lafayette College, the University of North Carolina, and the HathiTrust Research Center, will develop a shared curriculum for use in academic libraries and a train-the-trainer series designed to assist librarians in getting started with the…

HathiTrust + Bookworm Project

Time frame
2014-Present
Investigator
J. Stephen Downie
Total funding to date
$504,373.00
Funding agency
National Endowment for the Humanities

The HathiTrust Research Center (HTRC) is partnering with the Cultural Observatory team that developed the Google Books Ngram Viewer together with Google. The goal of this collaboration is to implement a greatly enhanced version of the Cultural Observatory’s open-source “Bookworm” text analysis and visualization tool, designed to help scholars meet the challenges posed by the…

HathiTrust Research Center Phase 2

Time frame
2019-Present
Investigator
J. Stephen Downie
Total funding to date
$535,292.00
Funding agency
HathiTrust

The HathiTrust has provided funding for the HathiTrust Research Center (HTRC), co-located at the University of Illinois and Indiana University, to serve as the research arm of the HathiTrust and create an agile, technology-rich service for researchers in the digital humanities, social sciences, natural sciences, and informatics. This service will help researchers conduct nonconsumptive research on…

HathiTrust Research Center: New Opportunities through Computational Analysis of HathiTrust Digital Library 2014-2018

Time frame
2014-2018
Investigator
J. Stephen Downie
Total funding to date
$1,000,000.00
Funding agency
HathiTrust

The HathiTrust has provided funding for the HathiTrust Research Center (HTRC), co-located at the University of Illinois and Indiana University, to serve as the research arm of the HathiTrust and create an agile, technology-rich service for researchers in the digital humanities, social sciences, natural sciences, and informatics. This service will help researchers conduct nonconsumptive research on…

Language Change in Text Retrieval

Time frame
2010-2016
Total funding to date
$49,429.00
Funding agency
Google

For older texts to be searchable, contemporary English needs to be translated into the language of various historical periods. The project will develop software that will let people enter a query in contemporary English and search over English texts throughout history, from medieval times to the present day. The project will mostly involve training statistical models that assign…

Query Modeling Using Intra-Entity Knowledge Base Structure

Time frame
2015-2016
Total funding to date
$22,130.00
Funding agency
Google

This project aims to improve search engine effectiveness by using knowledge base (KB) entries to inform query expansion. While the intersection of KBs and information retrieval (IR) is a growing research area, this project proposes a novel approach to KB-based query modeling. In particular, this project proposes to let the structure that KB authors impose within individual KB entries guide the…

Scholar-Curated Worksets for Analysis, Reuse & Dissemination (SCWAReD)

Time frame
2021-Present
Investigator
J. Stephen Downie
Total funding to date
$1,031,655.00
Funding agency
Indiana University

The Scholar-Curated Worksets for Analysis, Reuse & Dissemination (SCWAReD, pronounced “squared”) project is intended to produce a suite of curated, targeted HTRC (HathiTrust Research Center) worksets and illustrative, reusable research models that demonstrate the collaborative workset-building, textual analysis, workflow development, and dataset creation activities typically carried out by…

Single Interface for Music Score Searching and Analysis

Time frame
2015-Present
Investigator
J. Stephen Downie
Total funding to date
$15,000.00
Funding agency
Social Sciences and Humanities Research Council of Canada

Music prints and manuscripts created over the past thousand years sit on the shelves of libraries and museums around the globe. As these organizations digitize their collections, images of these scores are increasingly accessible online. However, the musical content remains difficult to search.

Google Books and HathiTrust have already made it possible to search the content of text…

Temporal Factors

Time frame
2012-2016
Total funding to date
$408,908.00
Funding agency
National Science Foundation

Time affects information retrieval in many ways. Collections of documents change as new items are indexed. The content of documents themselves may change. Users submit queries at particular moments in time. And perhaps most importantly, people’s assessment of a document’s relevance to a query is often time-dependent. For example, searchers of news archives might seek information on a past…

Textual Geographies

Time frame
2016-2019
Investigator
J. Stephen Downie
Total funding to date
$15,536.00
Funding agency
National Endowment for the Humanities

Textual Geographies uses named entity recognition and geolocation to extract place names from multilingual (English, German, Spanish, and Chinese) printed volumes held by the HathiTrust digital library and to associate those names with detailed geographic information. The project corpus currently includes about 10 million volumes published between 1700 and the present day.

The Whole Tale

Time frame
2016-Present
Investigators
Bertram Ludäscher, Matthew Turk
Total funding to date
$4,986,951.00
Funding agency
National Science Foundation

Scholarly publications today are still mostly disconnected from the underlying data and code used to produce the published results and findings, despite an increasing recognition of the need to share all aspects of the research process. As data become more open and transportable, a second layer of research output has emerged, linking research publications to the associated data, possibly along…

Understanding Search Literacy and Search Skills Adoption: How People Solve Technical Problems via Search

Time frame
2015-2016
Investigator
Michael Twidale
Total funding to date
$65,000.00
Funding agency
Google

Despite the ubiquity of search in many people’s daily lives, a lack of search literacy can make it difficult to find solutions to technical problems, such as completing software-based tasks like troubleshooting program installations. iSchool Professor Michael Twidale and Assistant Professor Max Wilson of the University of Nottingham have received funding from Google for a project that aims to…

WCSA+DC

Time frame
2016-Present
Investigator
J. Stephen Downie
Total funding to date
$1,170,000.00
Funding agency
Andrew W. Mellon Foundation

This project builds upon, extends, and integrates two developmental research threads within the HathiTrust Research Center (HTRC). The first thread originates from work that was conducted in the Workset Collections for Scholarly Analysis (WCSA): Prototyping Project. The second thread continues the work of…
