PhD, Information Sciences, Illinois (in progress)
MS, Information Management, Illinois
I am interested in providing a transparent data processing model based on provenance information that can be easily queried and analyzed. Data cleaning and preparation are essential to any data science workflow. Decisions made while cleaning data, whether to improve its quality or to make it ready for use, can affect the results of the analysis. It is therefore important to make this process transparent so that it is easily auditable and accessible and, to some extent, reusable and replicable.