Parulian defends dissertation

Doctoral candidate Nikolaus Parulian successfully defended his dissertation, "A Conceptual Model for Transparent, Reusable, and Collaborative Data Cleaning," on June 29.

His committee included Professor Bertram Ludäscher (chair), Professor J. Stephen Downie, Associate Professor Jana Diesner, and Assistant Professor Nigel Bosch.

Abstract: Data cleaning is an essential component of data preparation in machine learning and other data science workflows. It is a time-consuming and error-prone task that can greatly affect the reliability of subsequent analyses. Tools must capture provenance information to ensure transparent and auditable data-cleaning processes. However, existing provenance models have limitations in tracing and querying changes at different levels of granularity. To address this, we proposed a new conceptual model that captures fine-grained retrospective provenance and extends it with prospective provenance to represent operations or workflows that change the datasets. This hybrid model allows powerful queries and supports advanced use cases like auditing data cleaning workflows. Additionally, we extended the model to present a conceptual model focusing on reusability and collaboration in data cleaning. It addresses scenarios where multiple users contribute to dataset changes and enables tracking of curator actions, identifying dependencies between cleaning operations, and facilitating collaboration. Through an experimental case study, we demonstrated the reusability of data-cleaning workflows, different users' contributions, and collaboration's effectiveness in improving data quality.


Related News

Get to know Cadence Cordell, MSLIS student

Cadence Cordell was inspired by her undergraduate work experience to pursue a degree in library and information science. She followed in her mother’s footsteps by selecting the iSchool for her MSLIS. After completing a recent research poster presentation, she combined her scholarly pursuit with her hobby by sewing her fabric poster into a squirrel plushie.


Recent graduate committed to making libraries accessible and inclusive

Joshua Short knows firsthand the barriers to public library access that patrons living on modest wages experience. Having grown up in a self-professed "low-income environment," Short has made it his mission to reduce these barriers, such as library fines, inadequate transportation, and limited computer literacy.


Spectrum Scholar Spotlight: Leslie Lopez

Twelve iSchool master's students were named 2024–2025 Spectrum Scholars by the American Library Association (ALA) Office for Diversity, Literacy, and Outreach Services. This “Spectrum Scholar Spotlight” series highlights the School’s scholars. MSLIS student Leslie Lopez graduated from the University of North Texas with a BA in psychology.


SafeRBot to assist community, police in crime reporting

Across the nation, 911 dispatch centers are facing a worker shortage. Unfortunately, this understaffing, plus the nature of the job itself, leads to dispatchers who are often overworked and stressed. Meanwhile, when community members need to report a crime, their options are to contact 911 for an emergency or, in a non-emergency situation, call a non-emergency number or fill out an online form. A new chatbot, SafeRBot, designed and developed by Associate Professor Yun Huang, Informatics PhD student Yiren Liu, and BSIS student Tony An, seeks to improve the reporting process for non-emergency situations for both community members and dispatch centers.


New digital collection sheds light on queer nightlife in Champaign County

Adam Beaty decided to pursue an MSLIS degree to combine his love of history, the arts, and community-centered spaces. This combination of interests culminated in a 244-item digital collection that showcases digitized materials depicting nearly thirty years of queer nightlife in Champaign County. 
