Parulian defends dissertation

Doctoral candidate Nikolaus Parulian successfully defended his dissertation, "A Conceptual Model for Transparent, Reusable, and Collaborative Data Cleaning," on June 29.

His committee included Professor Bertram Ludäscher (chair), Professor J. Stephen Downie, Associate Professor Jana Diesner, and Assistant Professor Nigel Bosch.

Abstract: Data cleaning is an essential component of data preparation in machine learning and other data science workflows. It is a time-consuming and error-prone task that can greatly affect the reliability of subsequent analyses. Tools must capture provenance information to ensure transparent and auditable data-cleaning processes. However, existing provenance models have limitations in tracing and querying changes at different levels of granularity. To address this, we proposed a new conceptual model that captures fine-grained retrospective provenance and extends it with prospective provenance to represent operations or workflows that change the datasets. This hybrid model allows powerful queries and supports advanced use cases like auditing data cleaning workflows. Additionally, we extended the model to present a conceptual model focusing on reusability and collaboration in data cleaning. It addresses scenarios where multiple users contribute to dataset changes and enables tracking of curator actions, identifying dependencies between cleaning operations, and facilitating collaboration. Through an experimental case study, we demonstrated the reusability of data-cleaning workflows, different users' contributions, and collaboration's effectiveness in improving data quality.

Related News

iSchool to present research at TPRC 2025

iSchool faculty, staff, and students will participate in the Research Conference on Communications, Information and Internet Policy (TPRC 2025), which will be held September 18–20 in Washington, DC.

Get to know Simit Shah, MSIM student

Simit Shah worked as a consultant for Deloitte in India before enrolling in the MSIM program to strengthen his analytical and business skills. Over the summer, he applied the knowledge gained from his iSchool coursework during an internship as a technology risk consultant at EY.

Pila awarded Ruth Fine Memorial Student Loan

MSLIS student Nathaniel Allen (Nat) Pila has been selected as the 2025 recipient of the Ruth Fine Memorial Student Loan, awarded annually by the District of Columbia Library Association (DCLA). The award will support Pila as he begins his studies in the iSchool at the University of Illinois. 

New grant to help Multiple Sclerosis patients manage depression

Associate Professor Jessie Chin has received a $215,000 grant from the National Multiple Sclerosis Society (NMSS grant RFA-2411-44091) for a two-year project to improve how people with multiple sclerosis (PwMS) manage depression. 

Internship Spotlight: National Endowment for the Humanities

PhD student Owen Monroe reflects on his internship with the National Endowment for the Humanities Office of Digital Humanities, held from May to December 2024. Last month, the NEH programs officer with whom Monroe worked during his internship discussed some of their work at the Digital Humanities conference in Lisbon, Portugal.
