Knowledge Graphs and Semantic Computing Speaker Series: Michel Dumontier

Friday, October 13, 2023 9:00 - 10:00 AM

Dr. Michel Dumontier, Distinguished Professor of Data Science at Maastricht University, founder and executive director of the Institute of Data Science, and co-founder of the FAIR (Findable, Accessible, Interoperable and Reusable) data principles, will present "Towards Biomedical Neurosymbolic AI: From Knowledge Infrastructure to Explainable Predictions."

Access to previous talks can be found here.

Dr. Michel Dumontier is the Distinguished Professor of Data Science at Maastricht University, founder and executive director of the Institute of Data Science, and co-founder of the FAIR (Findable, Accessible, Interoperable and Reusable) data principles. His research explores socio-technological approaches for responsible discovery science, including collaborative multi-modal knowledge graphs, privacy-preserving distributed data mining, and AI methods for drug discovery and personalized medicine. His work is supported through the Dutch National Research Agenda, the Netherlands Organisation for Scientific Research, Horizon Europe, the European Open Science Cloud, the US National Institutes of Health, and a Marie-Curie Innovative Training Network. He is the editor-in-chief of the journal Data Science and is internationally recognized for his contributions to bioinformatics, biomedical informatics, and semantic technologies, including ontologies and linked data.

Abstract:
The increased availability of biomedical data, particularly in the public domain, offers the opportunity to better understand human health and to develop effective therapeutics for a wide range of unmet medical needs. However, data scientists remain stymied because data are hard to find and to productively reuse: data and their metadata i) are wholly inaccessible, ii) are in non-standard or incompatible representations, iii) do not conform to community standards, and iv) have unclear or highly restricted terms and conditions that preclude legitimate reuse. These limitations require a rethink of how data can be made machine- and AI-ready - the key motivation behind the FAIR Guiding Principles. Concurrently, while recent efforts have explored the use of deep learning to fuse disparate data into predictive models for a wide range of biomedical applications, these models often fail even when the correct answer is already known, and fail to explain individual predictions in terms that data scientists can appreciate. These limitations suggest that new methods to produce practical artificial intelligence are still needed.

In this talk, Dr. Dumontier will discuss his work in (1) building an integrative knowledge infrastructure to prepare FAIR and "AI-ready" data and services along with (2) neurosymbolic AI methods to improve the quality of predictions and to generate plausible explanations. Attention is given to standards, platforms, and methods to wrangle knowledge into simple but effective semantic and latent representations, and to make these available through standards-compliant and discoverable interfaces that can be used in model building, validation, and explanation. Our work, and that of others in the field, creates a baseline for building trustworthy and easy-to-deploy AI models in biomedicine.

Relevant Readings:
Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18

Fecho, K. et al. Progress toward a universal biomedical data translator. Clinical and Translational Science 15(8), 1838-1847 (2022). https://doi.org/10.1111/cts.13301

Celebi, R., Uyar, H., Yasar, E. et al. Evaluation of knowledge graph embedding approaches for drug-drug interaction prediction in realistic settings. BMC Bioinformatics 20, 726 (2019). https://doi.org/10.1186/s12859-019-3284-5

We continue the CIRSS speaker series in Fall 2023 with a focus on “Knowledge Graphs and Semantic Computing”. We will meet on Fridays, 9-10am Central Time, on Zoom. To join a session, go to the current week’s session and click the “access” link, which will lead you to a calendar entry. There, click the “PARTICIPATE online” button to join a session. Recordings of past talks can be found next to “access” if available. The event is open to the public, and everyone is welcome to attend! This series is hosted by the Center for Informatics Research in Science and Scholarship (CIRSS). If you have any questions, please contact Jana Diesner and Halil Kilicoglu.

If you are interested in this speaker series, please subscribe to our speaker series calendar: Google Calendar or Outlook Calendar.

This event is sponsored by the Center for Informatics Research in Science and Scholarship.