IS 477 Data Management, Curation & Reproducibility
This course addresses issues in Data Management, Curation & Reproducibility from a Data Science perspective. We discuss definitions of data science, then introduce the Data Science Life Cycle and use it as an intellectual foundation. Topics include Research Artifact Identification and Management, Metadata, Repositories, Economics of Artifact Preservation and Sustainability, and Data Management Plans. We use case studies to ground our discussions in both data sets and specific data science research. The course requires a final project that applies course knowledge to a data science experiment and creates a data management plan for that experiment. 3 undergraduate hours. 4 graduate hours. Prerequisite: IS 205 or STAT 207 or equivalent programming experience.