PhD student Xiaoliang Jiang will present his proposal defense, "Contextualized relevance of place names in scientific writing based on language models, linked data, and metadata." Jiang's preliminary examination committee includes Associate Professor Vetle Torvik (chair); Professor Stephen Downie; Assistant Professor Nigel Bosch; and Assistant Professor Meicen Sun.
Abstract: There is growing interest in methods for extracting geographic references, or place names, from unstructured scientific text, with practical applications across disciplines such as health care, epidemiology, earth science, ecology, and botany. Named Entity Recognition (NER) research has focused on identifying and disambiguating place names, while less attention has been given to assessing the relevance of a place, that is, the degree to which a text is topically about it. This knowledge gap poses a challenge, e.g., when trying to determine whether a natural or social phenomenon observed in a particular location generalizes to other (nearby) geographic locations. The degree of relevance varies with purpose and context, e.g., a scientific article might be about a specific place, or the place mentioned might be merely incidental to the subject matter. The goal of the proposed research is to probabilistically model the relevance of place, explore its various levels and contexts, and identify the factors that influence relevance. Places will be studied along several dimensions of metadata, linked data, and natural language, such as language embeddings, the IMRaD (Introduction, Methods, Results, and Discussion) structure, MeSH (Medical Subject Headings) terms, author affiliations, and citations. The potential methods include biomedical domain-specific pre-trained language models/embeddings, well-established tools for author/affiliation name disambiguation, and MeSH predictors. The findings of this research should enhance the overall quality of text mining applications that include place, across literature-based discovery, information extraction, and information retrieval.
Questions? Contact Xiaoliang Jiang.