School of Information Sciences

Blake builds Claim Framework to analyze and synthesize medical research

Catherine Blake, Professor

The news is filled with stories about the latest medical innovations: new prescription drugs that promise relief from symptoms and treatments that offer improved quality of life. Journalists tend to focus on a single study. By contrast, people with a chronic health condition who are trying to find the best treatment, and policy makers who are trying to make the best recommendations for a community, spend much of their time aggregating results from multiple studies.

“There are 24 million abstracts in MEDLINE that capture the best scientific evidence that is available to us. Although anyone can now access this information, rather than clarity, we end up suffering from information overload,” said Associate Professor Catherine Blake. “What we need is a way to analyze and synthesize the results from a collection of articles.”

To ameliorate this problem and work toward improving scientific analysis, Blake has developed the Claim Framework, a rhetorical structure that captures how scientists make claims and a set of tools that use natural language processing to pull out claims made within the journal articles. By attempting to solve the information problem behind this glut of published scientific data, Blake can reveal where there are uncertainties or gaps in the medical literature, as well as provide a detailed analysis of comparative claims.

In evidence-based medicine, a detailed analysis of medical research serves as the foundation for decisions about care and evolutions of policy. Working in teams, scientists analyze the multitude of studies and conclusions about a particular drug. Scientists in toxicology use a similar process to conduct risk assessments to understand the risks associated with environmental toxins. Although these groups provide the highest quality evidence, it can take years to conduct a comprehensive review, thus leaving a gap between when a new finding is made and when that finding is factored into public policy.

“We are trying to build a complete picture of what the literature is telling us,” said Blake. “In evidence-based medicine you need to see where there is consensus, but we are also interested in inconsistencies and gaps in the literature. The Claim Framework doesn’t just sweep them under the table, but instead reveals them to us, and these gaps are precisely where we need to focus new research.”

By delving deep into the full text of the article, Blake and her research team capture nuances in the claims that might not be made clear in the abstract. “What is really interesting about this work is that the method does not confuse quantity with quality,” said Blake. “Scientists state their key findings in different ways but just looking for the most frequent word or phrase is not going to give you the key result.”

In a recent study, the research team pulled over 2 million sentences from 9,600 full-text journal articles and then focused on the sentences that talked about Metformin, a drug that is often used to treat diabetes. An automated function pulled out comparisons within the text and then presented a summary of those sentences.
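As a rough illustration of the filtering step described above, the following Python sketch keeps only sentences that mention a given drug and contain a comparison cue. It is a toy keyword approach under stated assumptions, not the natural language processing method used by Blake's team: the cue list, the function names, and the example sentences are all invented for illustration.

```python
import re

# Hypothetical cue phrases that often signal a comparison.
# Assumption: the real Claim Framework tools use richer NLP than this.
COMPARISON_CUES = re.compile(
    r"\b(compared (?:with|to)|versus|vs\.?|than)\b", re.IGNORECASE
)


def mentions(sentence, term):
    """Return True if the sentence contains the term as a whole word."""
    pattern = rf"\b{re.escape(term)}\b"
    return re.search(pattern, sentence, re.IGNORECASE) is not None


def find_comparison_sentences(sentences, term):
    """Keep sentences that mention the term and contain a comparison cue."""
    return [
        s for s in sentences
        if mentions(s, term) and COMPARISON_CUES.search(s)
    ]


# Invented example sentences; they do not report real findings.
sentences = [
    "Metformin lowered outcome X more than placebo.",
    "Metformin is often used to treat diabetes.",
    "Participants were followed for twelve weeks.",
    "Outcome Y improved with intervention Z compared with metformin.",
]

hits = find_comparison_sentences(sentences, "metformin")
share = len(hits) / len(sentences)
```

Here `hits` holds the first and last example sentences, and `share` is the fraction of sentences flagged as comparisons. A keyword filter like this would miss comparisons phrased without these cues, which is one reason a dedicated framework is needed.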

“Only five percent of sentences report a direct comparison, which is a very small percentage, and yet those five percent of sentences contain an abundance of information that captures how effective that drug is both compared with other drugs, and to non-drug interventions and placebos,” said Blake. The resulting analysis conducted by Blake and her team showed what comparisons had not yet been made and thus where new research efforts should be focused.

Not only does this kind of analysis provide details about which specific medical outcomes are improved and under which conditions, it reveals contradictions between study results and shows where new studies should be conducted.
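The idea of finding comparisons that have not yet been made can be sketched as a simple set difference: list every possible pair of interventions, then remove the pairs already observed in the literature. This is a minimal sketch with hypothetical names and placeholder data, not the analysis the research team performed.

```python
from itertools import combinations


def uncompared_pairs(interventions, observed):
    """Return intervention pairs that never appear in the observed comparisons.

    Pairs are treated as unordered, so ("a", "b") and ("b", "a") are the same.
    """
    seen = {frozenset(pair) for pair in observed}
    return [
        pair
        for pair in combinations(sorted(interventions), 2)
        if frozenset(pair) not in seen
    ]


# Placeholder interventions and comparisons, invented for illustration.
interventions = ["drug A", "drug B", "placebo", "exercise"]
observed = [
    ("drug A", "placebo"),
    ("drug B", "drug A"),
    ("drug A", "exercise"),
]

gaps = uncompared_pairs(interventions, observed)
for left, right in gaps:
    print(f"No direct comparison found: {left} vs. {right}")
```

With this placeholder data, three pairs remain uncompared: drug B and exercise, drug B and placebo, and exercise and placebo.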

The initial framework was developed as part of a grant Blake received from the National Science Foundation and was subsequently published in the Journal of Biomedical Informatics in 2010. In recent years, Blake has continued to explore the framework in different domains, including an early application to social science that received an honorable mention at the 2013 iConference poster session. Blake presented detailed examples of the framework in medicine, toxicology, and epidemiology at the 2014 Inconsistency Robustness conference held at Stanford University, as well as at a meeting of the Research Group for Information Retrieval at the University of Wisconsin-Milwaukee School of Information Studies. The work also appears as a chapter in the book Inconsistency Robustness, published by College Publications.

“The beauty of this framework is that it is domain-independent. We have thus far demonstrated the automated methods in medicine, toxicology, and epidemiology, but we are now working on how to apply the framework to other fields where science is informed by empirical evidence,” said Blake. “We need a translational language in order to go from the way the science is written in full text articles to actually summarizing them in a meaningful way. It’s not about the quantity of data that we analyze. It’s about getting at what the result is.”

Ultimately, the Claim Framework will synthesize evidence and highlight opportunities where further research is needed, either to fill a gap or to resolve inconsistent findings.

“It is easy to find your lost keys under a street light, but much harder in the dark. By honing in on the gaps in the scientific literature, on what is missing, we can identify where we need to do more research. And ultimately targeted research will make for better health outcomes for all of us,” said Blake.

Blake's research team, from left: master's student Maxwell Isenholt, doctoral student Ana Lucic, Blake, master's student Max McKittrick, and doctoral student Henry Gabb.
