Professor and Center for Informatics Research in Science and Scholarship (CIRSS) Director Bertram Ludäscher and collaborators are presenting their joint work and tools for data quality, cleaning, and provenance at the 33rd Annual Biodiversity Information Standards conference, TDWG 2017, from October 1-6 in Ottawa, Canada. The annual conference provides a forum for developing standards and demonstrating new technologies and tools for biodiversity informatics. This year's theme is "Data Integration in a Big Data Universe: Associating Occurrences with Genes, Phenotypes, and Environments."
Three of the abstracts presented at TDWG 2017 are outcomes of the Kurator project, a collaboration between Illinois and the Museum of Comparative Zoology (MCZ) at Harvard University. Kurator is a suite of biodiversity data quality tools aimed at three groups of users: collection management specialists with little or no programming experience; database administrators and researchers with some scripting language experience; and developers.
Ludäscher will talk about "Using YesWorkflow hybrid queries to reveal data lineage from data curation activities," which is joint work with Qian Zhang, CIRSS postdoctoral researcher, and Timothy McPhillips, YesWorkflow architect and developer. Paul Morris, MCZ bioinformatics diversity manager, will talk about "Fitness-for-Use-Framework-aware Data Quality workflows in Kurator," and John Wieczorek, a programmer/analyst at the Museum of Vertebrate Zoology, University of California, Berkeley, will present "Darwin Cloud: Mapping real-world data to Darwin Core."