NRT Research

Our training is designed to support collaborative, student-driven research for transformative advances in chemical and data sciences.

CataLST Traineeship Model

To train students to harness data for catalyst discovery, we are building a unique, scalable, and transferable traineeship model called CataLST (pronounced "catalyst").

In this model, trainees will learn how to conduct research at the interface between chemical and data sciences. Specifically, they will work in collaborative teams to:

Catalog the literature with data mining,
Learn from this knowledge base using machine learning to uncover new insights,
Search for new catalysts and reaction conditions with these fundamental insights complemented by computational chemistry models, and
Test and validate catalytic performance experimentally.

Harnessing the Data Revolution for Catalyst Discovery

Our CataLST training model seeks to bridge artificial intelligence (Catalog, Learn, Search) with human intelligence (Search and Test) to make extraordinary leaps in the field of catalysis.

ACS Editorial Explains More

How will this training work in practice?

We plan to start small. In Phase I, we will work with a limited subset of data from the literature or the laboratory to facilitate interdisciplinary training. In Phase II, the model will be expanded to include extensive datasets, yielding a more comprehensive framework to assist the target catalysis research area.

Once data collection agents are in place, we expect that new literature from multiple sources will be automatically mined and added to the database for knowledge discovery. In addition to practicing the skills learned in Phase I, students in Phase II will learn information retrieval and data visualization techniques.

We envision that the CataLST cycle, when fully developed and implemented, can be performed on any reaction system, thus providing benefits beyond this NRT.

Collaborative Research

Our research will not simply involve passing information over a fence from one area of expertise to another. Rather, the training is designed to support collaborative learning, with each trainee in the interdisciplinary group contributing ideas toward a common goal.

This means that chemists, engineers, and computer scientists will all engage in dialog to catalog and interpret the literature, find ways to automatically extract key data, and then learn from the insights to uncover new relationships and patterns in the search for new catalysts to test experimentally.

The CataLST Model is Adaptable for Many Research Projects

Traditionally, machine learning has been applied to organic catalysts in the cheminformatics field, where researchers seek to predict how molecular structure affects activity. The Internet of Catalysis program will initially target heterogeneous catalyst systems, which are significantly more complicated.

Recently, there has been some work on using machine learning algorithms to understand how calculated adsorption energies relate to catalytic activity on heterogeneous catalysts. However, adsorption energies can be computationally expensive to calculate and are only applicable to model surfaces.

Our approach, which combines machine learning with natural language processing, is needed to understand authentic systems under real conditions, because the number of reported catalysts for key reactions has grown beyond what can readily be gleaned from traditional literature searches, review articles, and tabulations of experimental and computational data.

The CataLST cycle will guide trainees as they collaborate within an interdisciplinary team, enhancing their research in a variety of areas.

Because of its adaptability, the CataLST training model is appropriate for many research projects. Students with an interest in interdisciplinary research are encouraged to apply for this exciting training opportunity.

Ready to Get Involved?

How to Apply