Dissertation Proposal Defense: Jinlong (Kenney) Guo

Friday, May 4, 2018 9:00 AM - 12:00 PM

Room 131

Jinlong (Kenney) Guo will defend his dissertation proposal titled "Extracting Outcome Claims from Randomized Control Trial Abstracts." Guo's committee includes Associate Professor Catherine Blake (chair and research director); Assistant Professor Jodi Schneider; Professor Michael Twidale; and Corina (Roxanna) Girju, associate professor of linguistics. The full proposal is available at the iSchool front office.

Abstract: With the rapid growth of the medical literature, it is becoming increasingly difficult to keep up with the current state of medical evidence. Many automatic approaches have been developed to facilitate the evidence synthesis process and reduce the time needed to extract key information from the literature. Clinical outcome is one of the most important pieces of information about a clinical study, and clinical decision making is based on it. However, clinical outcome is known to be difficult to define and extract, and the strategies for extracting outcome information might differ greatly among different genres of text. In this dissertation proposal, we compare existing approaches for outcome extraction and clarify some key concepts related to outcome information. We then propose a claim-based approach to extract key claims about outcomes from the primary literature on treatment studies. We identify four main types of outcome claims: explicit claim, correlation claim, evaluative claim, and comparison claim.

We further demonstrate that these claim types could be automatically extracted using entity and relation extraction technologies. For treatment entity extraction, we propose a machine learning (ML) model trained on data automatically labeled by MetaMap, and we demonstrate that a model that combines ML and UMLS semantic types achieves the best performance for treatment entity extraction. For outcome entity extraction, because outcomes are diverse in nature, we propose different approaches for different types of outcomes. For outcomes in the form of a noun (phrase), which make up the majority of outcome types, we propose an ML model that uses a combination of context-word, part-of-speech (POS), cue-word, and semantic-type features. The results are promising and are better than those reported in the literature. For outcomes not in the form of a noun, we propose to use a rule-based approach because the types and number of such outcomes are relatively few. In terms of relation extraction, we distinguish between comparative and non-comparative relations. We plan to use a combination of ML and rule-based approaches for the relation extraction task. This part is still under development.

The main contributions of this study are (1) a systematic comparison of outcome extraction approaches in the current literature and a clarification of key concepts; (2) an annotation scheme for extracting the fine-grained relation between treatment and outcome in a sentence; and (3) models and features we developed for automatically extracting treatments, outcomes, and their relations.

Questions? Contact Jinlong (Kenney) Guo