Parulian defends dissertation

Nikolaus Parulian

Doctoral candidate Nikolaus Parulian successfully defended his dissertation, "A Conceptual Model for Transparent, Reusable, and Collaborative Data Cleaning," on June 29.

His committee included Professor Bertram Ludäscher (chair), Professor J. Stephen Downie, Associate Professor Jana Diesner, and Assistant Professor Nigel Bosch.

Abstract: Data cleaning is an essential component of data preparation in machine learning and other data science workflows. It is a time-consuming and error-prone task that can greatly affect the reliability of subsequent analyses. Tools must capture provenance information to ensure transparent and auditable data-cleaning processes. However, existing provenance models have limitations in tracing and querying changes at different levels of granularity. To address this, we proposed a new conceptual model that captures fine-grained retrospective provenance and extends it with prospective provenance to represent operations or workflows that change the datasets. This hybrid model allows powerful queries and supports advanced use cases like auditing data cleaning workflows. Additionally, we extended the model to present a conceptual model focusing on reusability and collaboration in data cleaning. It addresses scenarios where multiple users contribute to dataset changes and enables tracking of curator actions, identifying dependencies between cleaning operations, and facilitating collaboration. Through an experimental case study, we demonstrated the reusability of data-cleaning workflows, different users' contributions, and collaboration's effectiveness in improving data quality.


Related News

Petrella defends dissertation

Doctoral candidate Julia Burns Petrella successfully defended her dissertation, "Educating Pre-Service School Librarians about Race, Racism, and Whiteness," on December 4.

Julia Burns Petrella

Guo defends dissertation

Doctoral candidate Qiuyan Guo successfully defended her dissertation, "Exploring Chinese Celebrity Fans’ Online Information Behaviors and Understandings of Their Practices," on December 6.

Qiuyan Guo

Tilley featured in comic book

Associate Professor Carol Tilley had an unexpected citation in her favorite medium—comic books! Dav Pilkey, author and illustrator of a number of bestselling and award-winning children’s books, including the popular Captain Underpants series, depicts Tilley's research on psychiatrist Fredric Wertham in his newest comic, Cat Kid Comic Club Influencers.

Dav Pilkey's comic depicting Carol Tilley

BIG projects span the globe

This fall, students in the Business Intelligence Group (BIG), the student consultancy group associated with Associate Professor Yoo-Seong Song's Applied Business Research class (IS 514), worked on projects related to supply chain management, digital health, biotechnology, and higher education for companies located at the University of Illinois Research Park and overseas in East Asia and Africa. 

Internship Spotlight: State Farm

BSIS student Evan Chen discusses his summer internship at State Farm, where he developed an interest in computer vision.

Evan Chen