IS 532 Theory and Practice of Data Cleaning

Data cleaning (also: cleansing) is the process of assessing and improving data quality for later analysis and use, and is a crucial part of data curation and analysis. This course identifies data quality issues throughout the data lifecycle, and reviews specific techniques and approaches for checking and improving data quality. Techniques are drawn primarily from the database community, using schema-level and instance-level information, and from different scientific communities, which are developing practical tools for data pre-processing and cleaning.

Learning objectives

  • Understand how to detect and flag data quality problems.
  • Understand principles of data and information modeling.
  • Understand techniques that support automated data curation and cleaning.

Scheduled Offerings

  • Fall 2019

    • IS532A
      Thu, 5:00 pm - 7:00 pm
      Room 131
      On-Campus
      This is a hybrid course that meets with IS532AO.
      Instructor: Bertram Ludaescher
      CRN: 70340
      Length: 16 weeks
    • IS532AO
      Thu, 5:00 pm - 7:00 pm
      Online
      This is a hybrid course that meets with IS532A.
      Instructor: Bertram Ludaescher
      CRN: 70341
      Length: 16 weeks
  • Spring 2019

    • IS532A
      Thu, 5:00 pm - 7:00 pm
      Room 126
      On-Campus
      This is a hybrid course that meets with IS532AO and CS513A.
      Instructor: Bertram Ludaescher
      CRN: 67443
      Length: 16 weeks
    • IS532AO
      Thu, 5:00 pm - 7:00 pm
      Online
      This is a hybrid course that meets with IS532A and CS513A.
      Instructor: Bertram Ludaescher
      CRN: 67444
      Length: 16 weeks

Textbooks and Course Materials

Available from the Illini Union Bookstore (IUB).