Modeling Molecular Complexity: Building a Novel Multidisciplinary Machine Learning Framework to Understand Molecular Synthesis and Signatures

Jaden J.A. Hastings, Aaron C. Bell, Timothy Gebhard, Jian Gong, Atılım Güneş Baydin, Matthew Fricke, Massimo Mascaro, Michael Phillips, Kimberly Warren-Rhodes and Nathalie A. Cabrol

Abstract

Analyzing and comparing the structure of every known molecule, let alone molecules not yet encountered, and predicting all possible synthesis pathways for building ever more complex molecules at the atomic scale remains a bottleneck across multiple disciplines. These disciplines span the fundamental and applied sciences, from organic synthesis of novel pharmaceuticals to the detection of biosignatures on distant planets. Fundamental to this effort are the identification and standardization of key features of complexity and the generation of datasets optimized for machine learning (ML) methods. Connecting molecules and their complexity measures within vast chemical synthesis and reaction networks is similarly promising. The Molecular Complexity Consortium (MCC), a working group of subject matter experts across academic, government, and commercial sectors, advances both applied and theoretical research in molecular complexity. We argue for three key shared objectives for unlocking the vast potential of ML-driven modeling of molecular complexity: the requisite standardization of features, the generation of well-curated training datasets, and the optimization of computation through ML method selection. Here we offer an overview of the field of molecular complexity, from methods of mathematical modeling to forming a notion of molecular signatures, and issue a call to action as we seek new avenues for collaboration in this emerging field.

Book Title
American Geophysical Union (AGU) Fall Meeting, December 12–16, 2022
Year
2022