CIRSS Seminar: ASIS&T Paper Previews

iSchool doctoral students Yi-Yun (Jessica) Cheng, Tzu-Kun (Esther) Hsiao, Jenna Kim and Janina Sarol will present previews of four ASIS&T papers related to citation analysis and text analytics.

1) Presented by Tzu-Kun (Esther) Hsiao
Title: Knowledge Transfer from Technology to Science: The Longevity of Paper-to-Patent Citations
Authors: Tzu-Kun (Esther) Hsiao and Vetle I. Torvik

Abstract: Citations between papers and patents reflect transfer of knowledge between science and technology. Patents commonly cite papers but papers rarely cite patents. Here, we identified 6,033 paper-to-patent citations in a collection of 1.5 million PubMed Central open access articles. These citing papers and cited patents contained 132,536 paper-to-paper citations and 200,339 patent-to-patent citations. These three citation datasets were used to model the temporal patterns of knowledge transfer within and across patents and papers. We found that the cited patents are generally much older than the cited papers, regardless of whether they are cited by papers or patents. Discipline, affiliation type, and self-citation also affect the age of the cited papers and patents. The recency of the citations partly explains the asymmetry in citations between papers and patents.
 
Speaker bio: Tzu-Kun Hsiao (Esther) is a second-year PhD student in the School of Information Sciences at the University of Illinois at Urbana-Champaign. Her advisor is Professor Vetle Torvik. Esther's research interests are broadly related to scientometrics, scholarly communication, and data mining. In the past year, she has worked with Professor Torvik on studying knowledge transfer between science and technology from the perspective of citations appearing in both scientific and technological realms. Currently, she is studying the motives behind these citations and exploring the reasons behind the knowledge transfer between science and technology.


2) Presented by Janina Sarol
Title: Systematic Examination of Pre- and Post-Retraction Citations
Authors: Ly Dinh, Janina Sarol, Yi-Yun Cheng, Tzu-Kun Hsiao, Nikolaus Parulian, Jodi Schneider

Abstract: Scientific retractions occur for a multitude of reasons. A growing body of research has studied the phenomenon of retraction through systematic analyses of the characteristics of retracted articles and their associated citations. In our study, we focus on the characteristics of articles that cite retracted articles, and the changes in citation dynamics pre- and post-retraction. We leverage descriptive statistics and ego-network methods to examine 4,871 retracted articles and their citations before and after retraction. Our retracted articles data was obtained from PubMed, Scopus, and Retraction Watch and their citing articles from Scopus. Our findings indicate a stark decrease in post-retraction citations and that most of these citations came from countries different from the retracted article's country of publication. Citation context analyses of a subset of retracted articles also reveal that post-retraction citations came from articles with disciplinary and geographical boundaries different from those of the retracted article.

Speaker bio: Janina Sarol is a second-year Informatics PhD student at the University of Illinois at Urbana-Champaign working with Prof. Jana Diesner. Her research interests are in applying methods from network analysis, text mining, and natural language processing to bibliometrics and social science problems, and in understanding the effects of algorithmic choices and data preprocessing on the results of computational methods.
 

3) Presented by Yi-Yun (Jessica) Cheng
Title: ReTracker: Actively and Automatically Matching Retraction Metadata in Zotero 
Authors: Yi-Yun (Jessica) Cheng, Nikolaus Parulian, Tzu-Kun Hsiao, Ly Dinh, Janina Sarol, Jodi Schneider

Abstract: Retraction removes seriously flawed papers from the scientific literature. However, even papers retracted for scientific fraud continue to be cited and used as valid after their retraction. Retracted papers are inadequately identified on publisher pages and in scholarly databases, and scholars’ personal libraries frequently contain retracted papers. To address this, we are developing a tool called ReTracker (https://github.com/nikolausn/ReTrackers) that automatically checks a user’s Zotero library for retracted articles, and adds retraction status as a new metadata field directly in the library. In this paper, we present the current version of ReTracker, which automatically flags retracted articles from PubMed. We describe how we have iteratively improved ReTracker’s matching performance through its initial two versions. Our tests show that the current version of ReTracker is able to flag retracted articles from PubMed with high precision and recall, and to distinguish retracted articles from articles about retraction. In its current state, ReTracker can actively and automatically bring retraction metadata into Zotero, and in future work we will test its usability with scholars.

Speaker bio: Yi-Yun Cheng (Jessica) is a fourth-year PhD student in the School of Information Sciences at the University of Illinois at Urbana-Champaign, advised by Professor Bertram Ludaescher. Jessica’s research interests lie at the intersection of information organization and data science methods. Specifically, Jessica is interested in topics related to knowledge organization, semantic web technologies, ontologies, and, last but not least, taxonomy alignment. She has worked with Professor Ludaescher on the NSF-funded project Exploring Taxon Concepts (ETC), in which they employed a logic-based approach to solve taxonomy interoperability problems in biodiversity informatics. Jessica has also worked in collaboration with Professor Nico Franz, an entomologist and taxon-concept expert from ASU, to explore the use of data science methods for taxonomy alignment problems in the WholeTale reproducibility in biodiversity project. Recently, Jessica has been exploring ideas about geopolitical realities in different taxonomies in her research. Jessica is a student member of ASIS&T, and this will be her fourth time attending ASIS&T.
 

4) Presented by Jenna Kim
Title: Empowering Citizens to Manage their Chemical Exposures: Step 1 - Identify Ingredients in Consumer Products 
Authors: Jenna Kim, Catherine Blake, Henry A. Gabb

Abstract: Our choices around consumer products directly influence our amount of chemical exposure. Although access to chemicals within individual products is available, we often use multiple products, so ingredient names must be harmonized to accurately estimate cumulative exposure. We evaluated the accuracy and coverage of two strategies, PubChem and tmChem, with respect to a database of 55K products. More than half of the ingredients identified by PubChem were specific chemical names (55%), followed by natural or artificial colors (20%) and plants or plant derivatives (13%). The majority of ingredients identified by tmChem were chemical names (83.9%). Only 1,696 of the 8,247 (20.56%) were identified by both systems. Although tmChem had better coverage, ~70% of ingredients identified by tmChem need further work to align with a specific chemical. Both strategies are needed to provide an accurate, personalized, and cumulative measure of chemical exposure.

Speaker bio: Jenna Kim is a second-year doctoral student in the School of Information Sciences at the University of Illinois at Urbana-Champaign, advised by Dr. Catherine Blake. Her research seeks to characterize the underlying mechanisms that impact data quality, particularly in the context of information integration. She gives special attention to health-related databases (e.g., clinical and biomedical), considering the severe impact of data quality problems. Her work improves data quality by formalizing processes to mitigate information behaviors that limit data reuse and by developing methods for improvement at scale. She has explored data quality across 2.4 million products in order to unify chemical names and is pursuing a joint project with the US Department of Veterans Affairs that focuses on data quality in health-related databases in a clinical setting and its impact on operations and research. The goal of Jenna’s doctoral research is to formulate a novel framework for resolving these issues from a socio-technical perspective and to propose both theoretical and practical solutions that help researchers and practitioners cope with data quality problems that hinder reliable knowledge discovery from data.
 

This event is sponsored by CIRSS