Sara Lafia Presentation

Sara Lafia, a research methodologist at NORC at the University of Chicago, will present "Aligning Data Curation and Data Discovery for Research Impact."

Abstract: 
As scientific research becomes more data-intensive, the demand to share and preserve research data is growing. To respond to this demand, archives need to develop evidence-based data curation efforts that maximize scholarly impact. I investigate data discovery and use in order to understand the impact of data curation decisions. In this talk, I describe computational approaches I have developed to analyze data curation work and measure data use through citations. The first study I present leverages text classification to analyze curatorial workflows at a large data archive and provides insights into the impacts of data curation decisions at scale. My second study develops a named entity recognition model to detect informal data references in text and constructs a network of data citations to identify data use patterns. Together, these projects provide a foundation for studying how curation decisions influence data use and respond to the evolving needs of diverse research communities.

Bio: 
Sara Lafia is a research methodologist in the Methodology & Quantitative Social Sciences Department at NORC at the University of Chicago. She holds a PhD in geography with an emphasis in information technology and society from UC Santa Barbara. From 2020 to 2023, she worked as a postdoctoral research fellow at ICPSR, a leading social science data archive at the University of Michigan, where she investigated data curation practices.

In her research, Lafia applies computational methods including machine learning, natural language processing, and statistical analysis. She has also developed interactive visualizations and designed vocabularies to improve data discovery in open science data catalogs (e.g., DataONE) and geospatial data-sharing platforms (e.g., Esri’s ArcGIS Hub). Lafia’s work has been supported by awards from the National Science Foundation, the Institute of Museum and Library Services, and the Michigan Institute for Data Science. She has led the publication of twelve peer-reviewed articles in outlets such as the Journal of Documentation, Data Science Journal (DSJ), and Quantitative Science Studies (QSS).