Machine Learning for Catalysis
Harnessing the Data Revolution for Catalyst Discovery

CataLST Traineeship Model
To train students to harness data for catalyst discovery, we are building a scalable and transferable traineeship model called CataLST (pronounced "catalyst").
In this model, trainees will learn how to conduct research at the interface between chemical and data sciences. Specifically, they will work in collaborative teams to:
- Catalog the literature with data mining,
- Learn from this knowledge base using machine learning to uncover new insights,
- Search for new catalysts and reaction conditions with these fundamental insights complemented by computational chemistry models, and
- Test and validate catalytic performance experimentally.
Collaborative Research
Our research does not simply involve passing information over a fence from one area of expertise to another. Rather, the training is designed to support collaborative learning, with each trainee contributing ideas toward a common goal.
This means that chemists, engineers, and computer scientists all engage in dialog to catalog and interpret the literature and to find ways to extract key data automatically. They then learn from the resulting insights to uncover new relationships and patterns that guide the search for new catalysts to test experimentally.
The CataLST Model is Adaptable for Many Research Projects
Traditionally, machine learning has been applied to organic catalysts in the cheminformatics field, where researchers seek to predict how molecular structure affects activity.
Our approach combines machine learning with natural language processing to understand real catalytic systems under realistic conditions. Such an approach is needed because the number of reported catalysts for key reactions has grown beyond what can readily be gleaned from traditional literature searches, review articles, and tabulations of experimental and computational data.