Blake builds Claim Framework to analyze and synthesize medical research

Thursday, June 4, 2015

The news is filled with stories about the latest medical innovations: new prescription drugs that promise relief from symptoms, treatments that offer improved quality of life. Journalists tend to focus on a single study. By contrast, people with a chronic health condition who are trying to find the best treatment, and policy makers who are trying to make the best recommendations for a community, spend much of their time aggregating results from multiple studies.

“There are 24 million abstracts in MEDLINE that capture the best scientific evidence that is available to us. Although anyone can now access this information, rather than clarity, we end up suffering from information overload,” said Associate Professor Catherine Blake. “What we need is a way to analyze and synthesize the results from a collection of articles.”

To ameliorate this problem and work toward improving scientific analysis, Blake has developed the Claim Framework, a rhetorical structure that captures how scientists make claims and a set of tools that use natural language processing to pull out claims made within the journal articles. By attempting to solve the information problem behind this glut of published scientific data, Blake can reveal where there are uncertainties or gaps in the medical literature, as well as provide a detailed analysis of comparative claims.

In evidence-based medicine, a detailed analysis of medical research serves as the foundation for decisions about care and changes in policy. Working in teams, scientists analyze the multitude of studies and conclusions about a particular drug. Scientists in toxicology use a similar process to assess the risks associated with environmental toxins. Although these groups provide the highest quality evidence, a comprehensive review can take years, leaving a gap between when a new finding is made and when it is factored into public policy.

“We are trying to build a complete picture of what the literature is telling us,” said Blake. “In evidence-based medicine you need to see where there is consensus, but we are also interested in inconsistencies and gaps in the literature. The Claim Framework doesn’t just sweep them under the table, but instead reveals them to us, and these gaps are precisely where we need to focus new research.”

By delving deep into the full text of the article, Blake and her research team capture nuances in the claims that might not be made clear in the abstract. “What is really interesting about this work is that the method does not confuse quantity with quality,” said Blake. “Scientists state their key findings in different ways but just looking for the most frequent word or phrase is not going to give you the key result.”

In a recent study, the research team pulled more than 2 million sentences from 9,600 full-text journal articles and then focused on the sentences that discussed metformin, a drug that is often used to treat diabetes. An automated function pulled out the comparisons within the text and then presented a summary of those sentences.
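
As a rough illustration of this kind of filtering, the sketch below keeps only the sentences that mention a drug of interest and also contain a comparison cue. It is not the Claim Framework's actual natural language processing, which is not detailed here. The cue list, the function name find_comparison_sentences, and the sample sentences are all assumptions made for this example.

```python
import re

# Hypothetical cue phrases that often signal a direct comparison.
# This list is an assumption for illustration, not the framework's lexicon.
COMPARISON_CUES = re.compile(
    r"\b(compared (?:with|to)|versus|vs\.?|than)\b", re.IGNORECASE
)


def find_comparison_sentences(sentences, drug):
    """Return sentences that mention `drug` and contain a comparison cue."""
    drug_pattern = re.compile(rf"\b{re.escape(drug)}\b", re.IGNORECASE)
    return [
        s for s in sentences
        if drug_pattern.search(s) and COMPARISON_CUES.search(s)
    ]


# Invented sample sentences, used only to exercise the function.
sample = [
    "Metformin reduced fasting glucose more than placebo.",
    "Metformin is often used to treat diabetes.",
    "Patients on insulin were compared with those on diet alone.",
]
matches = find_comparison_sentences(sample, "metformin")
```

Simple cue matching like this would miss comparisons phrased in other ways and would flag some sentences that are not real comparisons, so a real system would need considerably richer linguistic analysis.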

“Only five percent of sentences report a direct comparison, which is a very small percentage, and yet those five percent of sentences contain an abundance of information that captures how effective that drug is both compared with other drugs, and to non-drug interventions and placebos,” said Blake. The resulting analysis showed which comparisons had not yet been made and thus where new research efforts should be focused.

Not only does this kind of analysis provide details about which specific medical outcomes are improved and under which conditions, it also reveals contradictions between study results and shows where new studies should be conducted.
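
The idea of surfacing comparisons that have not yet been made can be sketched as a set difference: list every possible pair of interventions, then remove the pairs that the literature already compares. This is a simplified assumption about how such gaps might be listed, not the team's method. The function name missing_comparisons and the placeholder labels are hypothetical.

```python
from itertools import combinations


def missing_comparisons(interventions, observed_pairs):
    """Return unordered intervention pairs with no reported comparison."""
    # frozenset makes ("a", "b") and ("b", "a") count as the same pair.
    seen = {frozenset(pair) for pair in observed_pairs}
    return [
        pair for pair in combinations(sorted(interventions), 2)
        if frozenset(pair) not in seen
    ]


# Hypothetical inputs: labels and pairs are placeholders, not study results.
interventions = ["drug A", "drug B", "placebo"]
observed = [("drug A", "placebo"), ("drug B", "drug A")]
gaps = missing_comparisons(interventions, observed)
```

In this placeholder example, the only pair left over is drug B and placebo, which is where a new study would be directed.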

The initial framework was developed as part of a grant Blake received from the National Science Foundation and was subsequently published in the Journal of Biomedical Informatics in 2010. In recent years, Blake has continued to explore the framework in different domains, including an early application to social science that received an honorable mention at the 2013 iConference poster session. Blake presented detailed examples of the framework in medicine, toxicology, and epidemiology at the 2014 Inconsistency Robustness conference held at Stanford University, as well as at a meeting of the Research Group for Information Retrieval at the University of Wisconsin-Milwaukee School of Information Studies. The work also appears as a chapter in the book Inconsistency Robustness, published by College Publications.

“The beauty of this framework is that it is domain-independent. We have thus far demonstrated the automated methods in medicine, toxicology, and epidemiology, but we are now working on how to apply the framework to other fields where science is informed by empirical evidence,” said Blake. “We need a translational language in order to go from the way the science is written in full text articles to actually summarizing them in a meaningful way. It’s not about the quantity of data that we analyze. It’s about getting at what the result is.”

Ultimately, the Claim Framework will synthesize evidence and highlight where further research is needed, either to fill a gap or to resolve inconsistent findings.

“It is easy to find your lost keys under a street light, but much harder in the dark. By honing in on the gaps in the scientific literature, on what is missing, we can identify where we need to do more research. And ultimately targeted research will make for better health outcomes for all of us,” said Blake.

Blake's research team, from left: master's student Maxwell Isenholt, doctoral student Ana Lucic, Blake, master's student Max McKittrick, and doctoral student Henry Gabb.