Abstract
Machine learning techniques have emerged as useful tools for identifying complex patterns and correlations in large datasets, for example relating the performance of a catalyst to its physicochemical properties. In the heterogeneous catalysis community, machine learning models have mostly been developed using data from high-throughput quantum chemistry calculations, and only a few case studies have resulted in experimentally validated catalyst improvements. This limited success may be due to the use of simplified catalyst structures in computational studies and the lack of comprehensive experimental datasets. In this Review, we bring together studies that integrate high-throughput approaches and machine learning to advance solid heterogeneous catalysis, drawing on both experimental and computational data. We systematically analyse trends in the field according to the descriptors used as model inputs and outputs; the materials, devices or reactions investigated; the dataset size; and the overall achievements. Furthermore, for models that report the unitless coefficient of determination (R²), we compare model performance across these categories.
References
Rothenberg, G. Catalysis. Concepts and Green Applications 127–187 (Wiley-VCH, 2008).
Tembhurne, S., Nandjou, F. & Haussener, S. A thermally synergistic photo-electrochemical hydrogen generator operating under concentrated solar irradiation. Nat. Energy 4, 399–407 (2019).
Steinfeld, A. Solar thermochemical production of hydrogen — a review. Sol. Energy 78, 603–615 (2005).
Fukushima, A. & Honda, K. Electrochemical photolysis of water at a semiconductor electrode. Nature 238, 37–38 (1972).
Taibi, E., Blanco, H., Miranda, R. & Carmo, M. Green hydrogen cost reduction: scaling up electrolysers to meet the 1.5 °C climate goal. International Renewable Energy Agency https://www.irena.org/publications/2020/Dec/Green-hydrogen-cost-reduction (2020).
Burwell, R. L. in Catalysis. Science and Technology (eds Anderson, J. R. & Boudart, M.) 1–87 (Springer, 1982).
Moulijn, J. A. & van Santen, R. A. in Contemporary Catalysis. Science, Technology, and Applications (eds Kamer, P. C. J., Vogt, D. & Thybaut, J. W.) 3–28 (Royal Society of Chemistry, 2017).
Baerlocher, C., McCusker, L. B. & Olson, D. H. Atlas of Zeolite Framework Types 6th edn (Elsevier, 2007).
Margeta, K. & Farkaš, A. in Zeolites - New Challenges (eds Margeta, K. & Farkaš, A.) Ch. 1 (IntechOpen, 2020).
Green, M. L. et al. Fulfilling the promise of the materials genome initiative with high-throughput experimental methodologies. Appl. Phys. Rev. 4, 011105 (2017).
Dar, Y. L. High-throughput experimentation: a powerful enabling technology for the chemicals and materials industry. Macromol. Rapid Commun. 25, 34–47 (2004).
Steinmann, S. N., Hermawan, A., Bin Jassar, M. & Seh, Z. W. Autonomous high-throughput computations in catalysis. Chem Catal. 2, 940–956 (2022).
Nørskov, J. K., Bligaard, T., Rossmeisl, J. & Christensen, C. H. Towards the computational design of solid catalysts. Nat. Chem. 1, 37–46 (2009).
Farrusseng, D. High-throughput heterogeneous catalysis. Surf. Sci. Rep. 63, 487–513 (2008).
Allen, C. L., Leitch, D. C., Anson, M. S. & Zajac, M. A. The power and accessibility of high-throughput methods for catalysis research. Nat. Catal. 2, 2–4 (2019).
Géron, A. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems 2nd edn (O’Reilly, 2019).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous distributed systems. OSDI’16: Proc. 12th USENIX conference on Operating Systems Design and Implementation 265–283 (OSDI, 2016).
Guido, S. & Müller, A. C. Introduction to Machine Learning with Python: a Guide for Data Scientists 1st edn (O’Reilly, 2016).
Royse, C., Wolter, S. & Greenberg, J. A. Emergence and distinction of classes in XRD data via machine learning. In Proc. SPIE 10999, Anomaly Detection and Imaging with X-Rays (ADIX) https://doi.org/10.1117/12.2519500 (SPIE, 2019).
Kalinin, S. V. et al. Machine learning in scanning transmission electron microscopy. Nat. Rev. Methods Primers 2, 11 (2022).
Carbone, M. R., Topsakal, M., Lu, D. & Yoo, S. Machine-learning X-ray absorption spectra to quantitative accuracy. Phys. Rev. Lett. 124, 156401 (2020).
Modarres, M. H. Neural network for nanoscience scanning electron microscope image recognition. Sci. Rep. 7, 13282 (2017).
Howarth, A., Ermanis, K. & Goodman, J. M. DP4-AI automated NMR data analysis: straight from spectrometer to structure. Chem. Sci. 11, 4351–4359 (2020).
Jensen, Z. et al. A machine learning approach to zeolite synthesis enabled by automatic literature data extraction. ACS Cent. Sci. 5, 892–899 (2019).
Huo, H. et al. Semi-supervised machine-learning classification of materials synthesis procedures. npj Comput. Mater. 5, 62 (2019).
Tang, B. et al. Machine learning-guided synthesis of advanced inorganic materials. Mater. Today 41, 72–80 (2020).
Shimizu, R., Kobayashi, S., Watanabe, Y., Ando, Y. & Hitosugi, T. Autonomous materials synthesis by machine learning and robotics. APL Mater. 8, 111110 (2020).
Shambhawi, Mohan, O., Choksi, T. S. & Lapkin, A. A. The design and optimization of heterogeneous catalysts using computational methods. Catal. Sci. Technol. 14, 515–532 (2024).
Günay, M. E. & Yıldırım, R. Recent advances in knowledge discovery for heterogeneous catalysis using machine learning. Catal. Rev. 63, 120–164 (2021).
McCullough, K., Williams, T., Mingle, K., Jamshidi, P. & Lauterbach, J. High-throughput experimentation meets artificial intelligence: a new pathway to catalyst discovery. Phys. Chem. Chem. Phys. 22, 11174–11196 (2020).
Goldsmith, B. R., Esterhuizen, J., Liu, J., Bartel, C. J. & Sutton, C. Machine learning for heterogeneous catalyst design and discovery. AIChE J. 64, 2311–2323 (2018).
Bahn, S. R. & Jacobsen, K. W. An object-oriented scripting interface to a legacy electronic structure code. Comput. Sci. Eng. 4, 56–66 (2002).
Ong, S. P. et al. Python materials genomics (pymatgen): a robust, open-source python library for materials analysis. Comput. Mater. Sci. 68, 314–319 (2013).
Jain, A. et al. FireWorks: a dynamic workflow system designed for high-throughput applications. Concurr. Comput. Pract. Exp. 27, 5037–5059 (2015).
Mölder, F. et al. Sustainable data analysis with Snakemake. F1000Research 10, 33 (2021).
Huber, S. P. et al. AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance. Sci. Data 7, 300 (2020).
Álvarez-Moreno, M. et al. Managing the computational chemistry big data problem: the ioChem-BD platform. J. Chem. Inf. Model. 55, 95–103 (2015).
Scheidgen, M. et al. NOMAD: a distributed web-based platform for managing materials science research data. J. Open Source Softw. 8, 5388 (2023).
Esters, M. et al. aflow.org: a web ecosystem of databases, software and tools. Comput. Mater. Sci. 216, 111808 (2023).
Saal, J. E., Kirklin, S., Aykol, M., Meredig, B. & Wolverton, C. Materials design and discovery with high-throughput density functional theory: the open quantum materials database (OQMD). JOM 65, 1501–1509 (2013).
Jain, A. et al. Commentary: the materials project: a materials genome approach to accelerating materials innovation. APL Mater. 1, 011002 (2013).
Bo, C., Maseras, F. & López, N. The role of computational results databases in accelerating the discovery of catalysts. Nat. Catal. 1, 809–810 (2018).
Tran, R. et al. The open catalyst 2022 (OC22) dataset and challenges for oxide electrocatalysts. ACS Catal. 13, 3066–3084 (2023).
Chanussot, L. et al. Open catalyst 2020 (OC20) dataset and community challenges. ACS Catal. 11, 6059–6072 (2021).
Tezak, C. et al. BEAST DB: grand-canonical database of electrocatalyst properties. J. Phys. Chem. C 128, 20165–20176 (2024).
Wilkinson, M. D. et al. The FAIR guiding principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016).
Alegre-Requena, J. V., Sowndarya, S., Alturaifi, T., Pérez-Soto, R. & Paton, R. AQME: automated quantum mechanical environments for researchers and educators. Wiley Interdiscip. Rev. Comput. Mol. Sci. 13, e1663 (2023).
Senocrate, A. et al. Parallel experiments in electrochemical CO2 reduction enabled by standardized analytics. Nat. Catal. 7, 742–752 (2024).
Jones, R. J. R. et al. Accelerated screening of gas diffusion electrodes for carbon dioxide reduction. Digit. Discov. 3, 1144–1149 (2024).
Chammingkwan, P., Terano, M. & Taniike, T. High-throughput synthesis of support materials for olefin polymerization catalyst. ACS Comb. Sci. 19, 331–342 (2017).
Nguyen, T. N. et al. High-throughput experimentation and catalyst informatics for oxidative coupling of methane. ACS Catal. 10, 921–932 (2020).
Barad, H.-N. et al. Combinatorial growth of multinary nanostructured thin functional films. Mater. Today 50, 89–99 (2021).
Batchelor, T. A. A. et al. Complex-solid-solution electrocatalyst discovery by computational prediction and high-throughput experimentation. Angew. Chem. Int. Ed. 60, 6932–6937 (2021).
Strotkötter, V. et al. Discovery of high-entropy oxide electrocatalysts: from thin-film material libraries to particles. Chem. Mater. 34, 10291–10303 (2022).
Zerdoumi, R. et al. Combinatorial screening of electronic and geometric effects in compositionally complex solid solutions toward a rational design of electrocatalysts. Adv. Energy Mater. 14, 2302177 (2024).
Yang, K. et al. Development of a high-throughput methodology for screening coking resistance of modified thin-film catalysts. ACS Comb. Sci. 14, 372–377 (2012).
Abed, J. et al. Open catalyst experiments 2024 (OCx24): bridging experiments and computational models. Preprint at https://doi.org/10.48550/arXiv.2411.11783 (2024).
Reddington, E. et al. Combinatorial electrochemistry: a highly parallel, optical screening method for discovery of better electrocatalysts. Science 280, 1735–1737 (1998).
Seley, D., Ayers, K. & Parkinson, B. A. Combinatorial search for improved metal oxide oxygen evolution electrocatalysts in acidic electrolytes. ACS Comb. Sci. 15, 82–89 (2013).
Katz, J. E., Gingrich, T. R., Santori, E. A. & Lewis, N. S. Combinatorial synthesis and high-throughput photopotential and photocurrent screening of mixed-metal oxides for photoelectrochemical water splitting. Energy Environ. Sci. 2, 103–112 (2009).
Stein, H. S. et al. Functional mapping reveals mechanistic clusters for OER catalysis across (Cu–Mn–Ta–Co–Sn–Fe)Ox composition and pH space. Mater. Horiz. 6, 1251–1258 (2019).
Gregoire, J. M. et al. Combined catalysis and optical screening for high throughput discovery of solar fuels catalysts. J. Electrochem. Soc. 160, F337 (2013).
Shinde, A. et al. High-throughput screening for acid-stable oxygen evolution electrocatalysts in the (Mn–Co–Ta–Sb)Ox composition space. Electrocatalysis 6, 229–236 (2015).
Rohr, B. et al. Benchmarking the acceleration of materials discovery by sequential learning. Chem. Sci. 11, 2696–2706 (2020).
Guevarra, D. et al. High throughput discovery of complex metal oxide electrocatalysts for the oxygen reduction reaction. Electrocatalysis 13, 1–10 (2022).
Woodhouse, M. & Parkinson, B. A. Combinatorial discovery and optimization of a complex oxide with water photoelectrolysis activity. Chem. Mater. 20, 2495–2502 (2008).
Kafizas, A. et al. Optimizing the activity of nanoneedle structured WO3 photoanodes for solar water splitting: direct synthesis via chemical vapor deposition. J. Phys. Chem. C 121, 5983–5993 (2017).
Woodhouse, M., Herman, G. S. & Parkinson, B. A. Combinatorial approach to identification of catalysts for the photoelectrolysis of water. Chem. Mater. 17, 4318–4324 (2005).
Zhou, L. et al. Quaternary oxide photoanode discovery improves the spectral response and photovoltage of copper vanadates. Matter 3, 1614–1630 (2020).
Greeley, J. Theoretical heterogeneous catalysis: scaling relationships and computational catalyst design. Annu. Rev. Chem. Biomol. Eng. 7, 605–635 (2016).
Medford, A. J. et al. From the Sabatier principle to a predictive theory of transition-metal heterogeneous catalysis. J. Catal. 328, 36–42 (2015).
Thornton, A. W., Winkler, D. A., Liu, M. S., Haranczyk, M. & Kennedy, D. F. Towards computational design of zeolite catalysts for CO2 reduction. RSC Adv. 5, 44361–44370 (2015).
Ma, X., Li, Z., Achenie, L. E. K. & Xin, H. Machine-learning-augmented chemisorption model for CO2 electroreduction catalyst screening. J. Phys. Chem. Lett. 6, 3528–3533 (2015).
Chen, Y., Huang, Y., Cheng, T. & Goddard, W. A. Identifying active sites for CO2 reduction on dealloyed gold surfaces by combining machine learning with multiscale simulations. J. Am. Chem. Soc. 141, 11651–11657 (2019).
Gu, G. H. et al. Practical deep-learning representation for fast heterogeneous catalyst screening. J. Phys. Chem. Lett. 11, 3185–3191 (2020).
Chen, A., Zhang, X., Chen, L., Yao, S. & Zhou, Z. A machine learning model on simple features for CO2 reduction electrocatalysts. J. Phys. Chem. C 124, 22471–22478 (2020).
Yohannes, A. G. et al. Combined high-throughput DFT and ML screening of transition metal nitrides for electrochemical CO2 reduction. ACS Catal. 13, 9007–9017 (2023).
Yang, Z., Gao, W. & Jiang, Q. A machine learning scheme for the catalytic activity of alloys with intrinsic descriptors. J. Mater. Chem. A 8, 17507–17515 (2020).
Mok, D. H. & Back, S. Atomic structure-free representation of active motifs for expedited catalyst discovery. J. Chem. Inf. Model. 61, 4514–4520 (2021).
Noh, J., Back, S., Kim, J. & Jung, Y. Active learning with non-ab initio input features toward efficient CO2 reduction catalysts. Chem. Sci. 9, 5152–5159 (2018).
Zhong, M. et al. Accelerated discovery of CO2 electrocatalysts using active machine learning. Nature 581, 178–183 (2020).
Pankajakshan, P. et al. Machine learning and statistical analysis for materials science: stability and transferability of fingerprint descriptors and chemical insights. Chem. Mater. 29, 4190–4201 (2017).
Tran, K. & Ulissi, Z. W. Active learning across intermetallics to guide discovery of electrocatalysts for CO2 reduction and H2 evolution. Nat. Catal. 1, 696–703 (2018).
Friederich, P., Häse, F., Proppe, J. & Aspuru-Guzik, A. Machine-learned potentials for next-generation matter simulations. Nat. Mater. 20, 750–761 (2021).
Kocer, E., Ko, T. W. & Behler, J. Neural network potentials: a concise overview of methods. Annu. Rev. Phys. Chem. 73, 163–186 (2022).
Artrith, N. & Kolpak, A. M. Understanding the composition and activity of electrocatalytic nanoalloys in aqueous solvents: a combination of DFT and accurate neural network potentials. Nano Lett. 14, 2670–2676 (2014).
Ulissi, Z. W. et al. Machine-learning methods enable exhaustive searches for active bimetallic facets and reveal active site motifs for CO2 reduction. ACS Catal. 7, 6600–6608 (2017).
Lunger, J. R. et al. Towards atom-level understanding of metal oxide catalysts for the oxygen evolution reaction with machine learning. npj Comput. Mater. 10, 80 (2024).
Li, Z., Achenie, L. E. K. & Xin, H. An adaptive machine learning strategy for accelerating discovery of perovskite electrocatalysts. ACS Catal. 10, 4377–4384 (2020).
Flores, R. A. et al. Active learning accelerated discovery of stable iridium oxide polymorphs for the oxygen evolution reaction. Chem. Mater. 32, 5854–5863 (2020).
Andersen, M. & Reuter, K. Adsorption enthalpies for catalysis modeling through machine-learned descriptors. Acc. Chem. Res. 54, 2741–2749 (2021).
Andersen, M., Levchenko, S. V., Scheffler, M. & Reuter, K. Beyond scaling relations for the description of catalytic materials. ACS Catal. 9, 2752–2759 (2019).
Abed, J. et al. Pourbaix machine learning framework identifies acidic water oxidation catalysts exhibiting suppressed ruthenium dissolution. J. Am. Chem. Soc. 146, 15740–15750 (2024).
Chen, L. et al. A universal machine learning framework for electrocatalyst innovation: a case study of discovering alloys for hydrogen evolution reaction. Adv. Func. Mater. 32, 2208418 (2022).
Zheng, J. et al. High-throughput screening of hydrogen evolution reaction catalysts in MXene materials. J. Phys. Chem. C 124, 13695–13705 (2020).
Abraham, B. M., Sinha, P., Halder, P. & Singh, J. K. Fusing a machine learning strategy with density functional theory to hasten the discovery of 2D MXene-based catalysts for hydrogen generation. J. Mater. Chem. A 11, 8091–8100 (2023).
Ge, L. et al. Predicted optimal bifunctional electrocatalysts for the hydrogen evolution reaction and the oxygen evolution reaction using chalcogenide heterostructures based on machine learning analysis of in silico quantum mechanics based high throughput screening. J. Phys. Chem. Lett. 11, 869–876 (2020).
Wexler, R. B., Martirez, J. M. P. & Rappe, A. M. Chemical pressure-driven enhancement of the hydrogen evolving activity of Ni2P from nonmetal surface doping interpreted via machine learning. J. Am. Chem. Soc. 140, 4678–4683 (2018).
Parker, A. J., Opletal, G. & Barnard, A. S. Classification of platinum nanoparticle catalysts using machine learning. J. Appl. Phys. 128, 014301 (2020).
Sun, B., Barron, H., Opletal, G. & Barnard, A. S. From process to properties: correlating synthesis conditions and structural disorder of platinum nanocatalysts. J. Phys. Chem. C 122, 28085–28093 (2018).
Rück, M., Garlyyev, B., Mayr, F., Bandarenka, A. S. & Gagliardi, A. Oxygen reduction activities of strained platinum core–shell electrocatalysts predicted by machine learning. J. Phys. Chem. Lett. 11, 1773–1780 (2020).
Chun, H. et al. First-principle-data-integrated machine-learning approach for high-throughput searching of ternary electrocatalyst toward oxygen reduction reaction. Chem Catal. 1, 855–869 (2021).
Kang, J. et al. First-principles database driven computational neural network approach to the discovery of active ternary nanocatalysts for oxygen reduction reaction. Phys. Chem. Chem. Phys. 20, 24539–24544 (2018).
Batchelor, T. A. A. et al. High-entropy alloys as a discovery platform for electrocatalysis. Joule 3, 834–845 (2019).
Svane, K. L. & Rossmeisl, J. Theoretical optimization of compositions of high-entropy oxides for the oxygen evolution reaction. Angew. Chem. Int. Ed. 61, e202201146 (2022).
Wan, X. et al. Machine-learning-assisted discovery of highly efficient high-entropy alloy catalysts for the oxygen reduction reaction. Patterns 3, 100553 (2022).
Xu, W., Diesen, E., He, T., Reuter, K. & Margraf, J. T. Discovering high entropy alloy electrocatalysts in vast composition spaces with multiobjective optimization. J. Am. Chem. Soc. 146, 7698–7707 (2024).
Pedersen, J. K. et al. Bayesian optimization of high-entropy alloy compositions for electrocatalytic oxygen reduction. Angew. Chem. Int. Ed. 60, 24144–24152 (2021).
Jinnouchi, R., Hirata, H. & Asahi, R. Extrapolating energetics on clusters and single-crystal surfaces to nanoparticles by machine-learning scheme. J. Phys. Chem. C 121, 26397–26405 (2017).
Jinnouchi, R. & Asahi, R. Predicting catalytic activity of nanoparticles by a DFT-aided machine-learning algorithm. J. Phys. Chem. Lett. 8, 4279–4283 (2017).
Hutchinson, M. L. et al. Overcoming data scarcity with transfer learning. Preprint at https://doi.org/10.48550/arXiv.1711.05099 (2017).
Zafari, M., Kumar, D., Umer, M. & Kim, K. S. Machine learning-based high throughput screening for nitrogen fixation on boron-doped single atom catalysts. J. Mater. Chem. A 8, 5209–5216 (2020).
Kim, M. et al. Artificial intelligence to accelerate the discovery of N2 electroreduction catalysts. Chem. Mater. 32, 709–720 (2020).
Shakouri, K., Behler, J., Meyer, J. & Kroes, G.-J. Accurate neural network description of surface phonons in reactive gas–surface dynamics: N2 + Ru(0001). J. Phys. Chem. Lett. 8, 2131–2136 (2017).
Boes, J. R. & Kitchin, J. R. Neural network predictions of oxygen interactions on a dynamic Pd surface. Mol. Simul. 43, 346–354 (2017).
Li, Z., Wang, S., Chin, W. S., Achenie, L. E. & Xin, H. High-throughput screening of bimetallic catalysts enabled by machine learning. J. Mater. Chem. A 5, 24131–24138 (2017).
Back, S. et al. Convolutional neural network of atomic surface structures to predict binding energies for high-throughput screening of catalysts. J. Phys. Chem. Lett. 10, 4401–4408 (2019).
Davran-Candan, T., Günay, M. E. & Yıldırım, R. Structure and activity relationship for CO and O2 adsorption over gold nanoparticles using density functional theory and artificial neural networks. J. Chem. Phys. 132, 174113 (2010).
Tomacruz, J. G. T., Pilario, K. E. S., Remolona, M. F. M., Padama, A. A. B. & Ocon, J. D. A machine learning-accelerated density functional theory (ML-DFT) approach for predicting atomic adsorption energies on monometallic transition metal surfaces for electrocatalyst screening. Chem. Eng. Trans. 94, 733–738 (2022).
Panapitiya, G. et al. Machine-learning prediction of CO adsorption in thiolated, Ag-alloyed Au nanoclusters. J. Am. Chem. Soc. 140, 17508–17514 (2018).
Pablo-García, S. et al. Fast evaluation of the adsorption energy of organic molecules on metals via graph neural networks. Nat. Comput. Sci. 3, 433–442 (2023).
Dasgupta, A., Gao, Y., Broderick, S. R., Pitman, E. B. & Rajan, K. Machine learning-aided identification of single atom alloy catalysts. J. Phys. Chem. C 124, 14158–14166 (2020).
Jung, H., Sauerland, L., Stocker, S., Reuter, K. & Margraf, J. T. Machine-learning driven global optimization of surface adsorbate geometries. npj Comput. Mater. 9, 114 (2023).
Ock, J., Badrinarayanan, S., Magar, R., Antony, A. & Farimani, A. B. Multimodal language and graph learning of adsorption configuration in catalysis. Nat. Mach. Intell. 6, 1501–1511 (2024).
Noh, J. & Chang, H. Data-driven prediction of configurational stability of molecule-adsorbed heterogeneous catalysts. J. Chem. Inf. Model. 63, 5981–5995 (2023).
Toyao, T. et al. Toward effective utilization of methane: machine learning prediction of adsorption energies on metal alloys. J. Phys. Chem. C 122, 8315–8326 (2018).
Singh, A. R., Rohr, B. A., Gauthier, J. A. & Nørskov, J. K. Predicting chemical reaction barriers with a machine learning model. Catal. Lett. 149, 2347–2354 (2019).
Takahashi, K. & Miyazato, I. Rapid estimation of activation energy in heterogeneous catalytic reactions via machine learning. J. Comput. Chem. 39, 2405–2408 (2018).
Bang, G. J., Gu, G. H., Noh, J. & Jung, Y. Activity trends of methane oxidation catalysts under emission conditions. ACS Catal. 12, 10255–10263 (2022).
Li, X.-T., Chen, L., Wei, G.-F., Shang, C. & Liu, Z.-P. Sharp increase in catalytic selectivity in acetylene semihydrogenation on Pd achieved by a machine learning simulation-guided experiment. ACS Catal. 10, 9694–9705 (2020).
Ulissi, Z. W., Singh, A. R., Tsai, C. & Nørskov, J. K. Automated discovery and construction of surface phase diagrams using machine learning. J. Phys. Chem. Lett. 7, 3931–3935 (2016).
Ulissi, Z. W., Medford, A. J., Bligaard, T. & Nørskov, J. K. To address surface reaction network complexity using scaling relations machine learning and DFT calculations. Nat. Commun. 8, 14621 (2017).
Gu, G. H. & Vlachos, D. G. Group additivity for thermochemical property estimation of lignin monomers on Pt(111). J. Phys. Chem. C 120, 19234–19241 (2016).
Natarajan, S. K. & Behler, J. Neural network molecular dynamics simulations of solid–liquid interfaces: water at low-index copper surfaces. Phys. Chem. Chem. Phys. 18, 28704–28725 (2016).
Artrith, N. & Kolpak, A. M. Grand canonical molecular dynamics simulations of Cu–Au nanoalloys in thermal equilibrium using reactive ANN potentials. Comput. Mater. Sci. 110, 20–28 (2015).
Lansford, J. L. & Vlachos, D. G. Infrared spectroscopy data- and physics-driven machine learning for characterizing surface microstructure of complex materials. Nat. Commun. 11, 1513 (2020).
Zhai, H. & Alexandrova, A. N. Ensemble-average representation of Pt clusters in conditions of catalysis accessed through GPU accelerated deep neural network fitting global optimization. J. Chem. Theory Comput. 12, 6213–6226 (2016).
Fernandez, M., Barron, H. & Barnard, A. S. Artificial neural network analysis of the catalytic efficiency of platinum nanoparticles. RSC Adv. 7, 48962–48971 (2017).
Su, Y.-Q. et al. Stability of heterogeneous single-atom catalysts: a scaling law mapping thermodynamics to kinetics. npj Comput. Mater. 6, 144 (2020).
Saadun, A. J. et al. Performance of metal-catalyzed hydrodebromination of dibromomethane analyzed by descriptors derived from statistical learning. ACS Catal. 10, 6129–6143 (2020).
Pablo-García, S. et al. Generalizing performance equations in heterogeneous catalysis from hybrid data and statistical learning. ACS Catal. 12, 1581–1594 (2022).
Corma, A., Serra, J., Serna, P. & Moliner, M. Integrating high-throughput characterization into combinatorial heterogeneous catalysis: unsupervised construction of quantitative structure/property relationship models. J. Catal. 232, 335–341 (2005).
Corma, A. et al. Optimisation of olefin epoxidation catalysts with the application of high-throughput and genetic algorithms assisted by artificial neural networks (softcomputing techniques). J. Catal. 229, 513–524 (2005).
Baumes, L. A., Serna, P. & Corma, A. Merging traditional and high-throughput approaches results in efficient design, synthesis and screening of catalysts for an industrial process. Appl. Catal. A Gen. 381, 197–208 (2010).
Baumes, L. A., Serra, J. M., Serna, P. & Corma, A. Support vector machines for predictive modeling in heterogeneous catalysis: a comprehensive introduction and overfitting investigation based on two real applications. J. Comb. Chem. 8, 583–596 (2006).
Serra, J. M., Chica, A. & Corma, A. Development of a low temperature light paraffin isomerization catalysts with improved resistance to water and sulphur by combinatorial methods. Appl. Catal. A Gen. 239, 35–42 (2003).
Holeňa, M. & Baerns, M. Feedforward neural networks in catalysis. Catal. Today 81, 485–494 (2003).
Klanner, C. et al. The development of descriptors for solids: teaching “catalytic intuition” to a computer. Angew. Chem. Int. Ed. 43, 5347–5349 (2004).
Artrith, N., Lin, Z. & Chen, J. G. Predicting the activity and selectivity of bimetallic metal catalysts for ethanol reforming using machine learning. ACS Catal. 10, 9438–9444 (2020).
Jayakumar, T. P., Suresh Babu, S. P., Nguyen, T. N., Le, S. D. & Taniike, T. Exploration of ethanol-to-butadiene catalysts by high-throughput experimentation and machine learning. Appl. Catal. A Gen. 666, 119427 (2023).
Hattori, T. & Kito, S. Neural network as a tool for catalyst development. Catal. Today 23, 347–355 (1995).
Madaan, N., Shiju, N. R. & Rothenberg, G. Predicting the performance of oxidation catalysts using descriptor models. Catal. Sci. Technol. 6, 125–133 (2016).
Arcotumapathy, V., Siahvashi, A. & Adesina, A. A. A new weighted optimal combination of ANNs for catalyst design and reactor operation: methane steam reforming studies. AIChE J. 58, 2412–2427 (2012).
Baysal, M., Günay, M. E. & Yıldırım, R. Decision tree analysis of past publications on catalytic steam reforming to develop heuristics for high performance: a statistical review. Int. J. Hydrog. Energy 42, 243–254 (2017).
Şener, A. N., Günay, M. E., Leba, A. & Yıldırım, R. Statistical review of dry reforming of methane literature using decision tree and artificial neural network analysis. Catal. Today 299, 289–302 (2018).
Hossain, M. A., Ayodele, B. V., Cheng, C. K. & Khan, M. R. Artificial neural network modeling of hydrogen-rich syngas production from methane dry reforming over novel Ni/CaFe2O4 catalysts. Int. J. Hydrog. Energy 41, 11119–11130 (2016).
Han, X. et al. Using data mining technology in screening potential additives to Ni/Al2O3 catalysts for methanation. Catal. Sci. Technol. 7, 6042–6049 (2017).
Zavyalova, U., Holena, M., Schlögl, R. & Baerns, M. Statistical analysis of past catalytic data on oxidative methane coupling for new insights into the composition of high-performance catalysts. ChemCatChem 3, 1935–1947 (2011).
Takahashi, K., Takahashi, L., Nguyen, T. N., Thakur, A. & Taniike, T. Multidimensional classification of catalysts in oxidative coupling of methane through machine learning and high-throughput data. J. Phys. Chem. Lett. 11, 6819–6826 (2020).
Taniike, T., Fujiwara, A., Nakanowatari, S., García-Escobar, F. & Takahashi, K. Automatic feature engineering for catalyst design using small data without prior knowledge of target catalysis. Commun. Chem. 7, 11 (2024).
Palkovits, S. A primer about machine learning in catalysis – a tutorial with code. ChemCatChem 12, 3995–4008 (2020).
Pirro, L. et al. Descriptor–property relationships in heterogeneous catalysis: exploiting synergies between statistics and fundamental kinetic modelling. Catal. Sci. Technol. 9, 3109–3125 (2019).
Schmack, R. et al. A meta-analysis of catalytic literature data reveals property-performance correlations for the OCM reaction. Nat. Commun. 10, 441 (2019).
Takahashi, K., Miyazato, I., Nishimura, S. & Ohyama, J. Unveiling hidden catalysts for the oxidative coupling of methane based on combining machine learning with literature data. ChemCatChem 10, 3223–3228 (2018).
Kondratenko, E. V., Schlüter, M., Baerns, M., Linke, D. & Holena, M. Developing catalytic materials for the oxidative coupling of methane through statistical analysis of literature data. Catal. Sci. Technol. 5, 1668–1677 (2015).
Odabaşı, Ç., Günay, M. E. & Yıldırım, R. Knowledge extraction for water gas shift reaction over noble metal catalysts from publications in the literature between 2002 and 2012. Int. J. Hydrog. Energy 39, 5733–5746 (2014).
Günay, M. E. & Yildirim, R. Modeling preferential CO oxidation over promoted Au/Al2O3 catalysts using decision trees and modular neural networks. Chem. Eng. Res. Des. 91, 874–882 (2013).
Günay, M. E. & Yildirim, R. Knowledge extraction from catalysis of the past: a case of selective CO oxidation over noble metal catalysts between 2000 and 2012. ChemCatChem 5, 1395–1406 (2013).
Günay, M. E. & Yildirim, R. Developing global reaction rate model for CO oxidation over Au catalysts from past data in literature using artificial neural networks. Appl. Catal. A Gen. 468, 395–402 (2013).
Günay, M. E. & Yildirim, R. Neural network analysis of selective CO oxidation over copper-based catalysts for knowledge extraction from published data in the literature. Ind. Eng. Chem. Res. 50, 12488–12500 (2011).
Smith, A., Keane, A., Dumesic, J. A., Huber, G. W. & Zavala, V. M. A machine learning framework for the analysis and prediction of catalytic activity from experimental data. Appl. Catal. B Environ. 263, 118257 (2020).
Li, J., Pan, L., Suvarna, M. & Wang, X. Machine learning aided supercritical water gasification for H2-rich syngas production with process optimization and catalyst screening. Chem. Eng. J. 426, 131285 (2021).
Baumes, L., Farrusseng, D., Lengliz, M. & Mirodatos, C. Using artificial neural networks to boost high-throughput discovery in heterogeneous catalysis. QSAR Comb. Sci. 23, 767–778 (2004).
Günay, M. E., Türker, L. & Tapan, N. A. Decision tree analysis for efficient CO2 utilization in electrochemical systems. J. CO2 Util. 28, 83–95 (2018).
Sun, Y., Yang, G., Wen, C., Zhang, L. & Sun, Z. Artificial neural networks with response surface methodology for optimization of selective CO2 hydrogenation using K-promoted iron catalyst in a microchannel reactor. J. CO2 Util. 24, 10–21 (2018).
Suvarna, M., Araújo, T. P. & Pérez-Ramírez, J. A generalized machine learning framework to predict the space-time yield of methanol from thermocatalytic CO2 hydrogenation. Appl. Catal. B Environ. 315, 121530 (2022).
Estahbanati, M. R. K., Feilizadeh, M. & Iliuta, M. C. Photocatalytic valorization of glycerol to hydrogen: optimization of operating parameters by artificial neural network. Appl. Catal. B Environ. 209, 483–492 (2017).
Leonard, K. C. & Bard, A. J. Pattern recognition correlating materials properties of the elements to their kinetics for the hydrogen evolution reaction. J. Am. Chem. Soc. 135, 15885–15889 (2013).
Can, E. & Yildirim, R. Data mining in photocatalytic water splitting over perovskites literature for higher hydrogen production. Appl. Catal. B Environ. 242, 267–283 (2019).
Hickman, R. J., Häse, F., Roch, L. M. & Aspuru-Guzik, A. Gemini: dynamic bias correction for autonomous experimentation and molecular simulation. Preprint at https://doi.org/10.48550/arXiv.2103.03391 (2021).
Jenewein, K. J. et al. Navigating the unknown with AI: multiobjective Bayesian optimization of non-noble acidic OER catalysts. J. Mater. Chem. A 12, 3072–3083 (2024).
Serra, J. M. & Vert, V. B. Quaternary mixture designs applied to the development of multi-element oxygen electrocatalysts based on the Ln0.58Sr0.4Fe0.8Co0.2O3−δ system (Ln = La1−x−y−zPrxSmyBaz): predictive modeling approaches. Catal. Today 159, 47–54 (2011).
Hong, W. T., Welsch, R. E. & Shao-Horn, Y. Descriptors of oxygen-evolution activity for oxides: a statistical evaluation. J. Phys. Chem. C 120, 78–86 (2016).
Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Proc. 33rd International Conference on Neural Information Processing Systems (eds Wallach, H. M. et al.) 8026–8037 (Curran, 2019).
Bezanson, J., Edelman, A., Karpinski, S. & Shah, V. B. Julia: a fresh approach to numerical computing. SIAM Rev. 59, 65–98 (2017).
Chen, T. & Guestrin, C. XGBoost: a scalable tree boosting system. KDD ’16: Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (ACM, 2016).
Wu, Y., Walsh, A. & Ganose, A. M. Race to the bottom: Bayesian optimisation for chemical problems. Digit. Discov. 3, 1086–1100 (2024).
Sui, F., Guo, R., Zhang, Z., Gu, G. X. & Lin, L. Deep reinforcement learning for digital materials design. ACS Mater. Lett. 3, 1433–1439 (2021).
Pizzuto, G. et al. Accelerating laboratory automation through robot skill learning for sample scraping. In 2024 IEEE 20th International Conference on Automation Science and Engineering (CASE) 2103–2110 (IEEE, 2024).
Lan, T., Wang, H. & An, Q. Enabling high throughput deep reinforcement learning with first principles to investigate catalytic reaction mechanisms. Nat. Commun. 15, 6281 (2024).
Lan, T. & An, Q. Discovering catalytic reaction networks using deep reinforcement learning from first-principles. J. Am. Chem. Soc. 143, 16804–16812 (2021).
Mamun, O., Winther, K. T., Boes, J. R. & Bligaard, T. A Bayesian framework for adsorption energy prediction on bimetallic alloy catalysts. npj Comput. Mater. 6, 177 (2020).
Farrusseng, D. et al. Design of discovery libraries for solids based on QSAR models. QSAR Comb. Sci. 24, 78–93 (2005).
Farrusseng, D., Clerc, F., Mirodatos, C. & Rakotomalala, R. Virtual screening of materials using neuro-genetic approach: concepts and implementation. Comput. Mater. Sci. 45, 52–59 (2009).
Tapan, N. A., Günay, M. E. & Yildirim, R. Constructing global models from past publications to improve design and operating conditions for direct alcohol fuel cells. Chem. Eng. Res. Des. 105, 162–170 (2016).
Alper Tapan, N., Yıldırım, R. & Günay, M. E. Analysis of past experimental data in literature to determine conditions for high performance in biodiesel production. Biofuels Bioprod. Biorefin. 10, 422–434 (2016).
Suvarna, M., Preikschas, P. & Pérez-Ramírez, J. Identifying descriptors for promoted rhodium-based catalysts for higher alcohol synthesis via machine learning. ACS Catal. 12, 15373–15385 (2022).
Bozal-Ginesta, C. et al. Performance prediction of high-entropy perovskites La0.8Sr0.2MnxCoyFezO3 with automated high-throughput characterization of combinatorial libraries and machine learning. Adv. Mater. 36, e2407372 (2024).
Szymanski, N. J. et al. An autonomous laboratory for the accelerated synthesis of novel materials. Nature 624, 86–91 (2023).
Cheetham, A. K. & Seshadri, R. Artificial intelligence driving materials discovery? Perspective on the article: scaling deep learning for materials discovery. Chem. Mater. 36, 3490–3495 (2024).
Leeman, J. Challenges in high-throughput inorganic materials prediction and autonomous synthesis. PRX Energy 3, 011002 (2024).
Chen, X., Singh, M. M. & Geyer, P. Utilizing domain knowledge: robust machine learning for building energy performance prediction with small, inconsistent datasets. Knowl.-Based Syst. 294, 111774 (2024).
Murdock, R. J., Kauwe, S. K., Wang, A. Y.-T. & Sparks, T. D. Is domain knowledge necessary for machine learning materials properties? Integr. Mater. Manuf. Innov. 9, 221–227 (2020).
Wang, L., He, T. & Ouyang, B. The impact of domain knowledge on universal machine learning models. Preprint at https://doi.org/10.26434/chemrxiv-2024-fmq8p (2024).
Veeramani, M., Doss, S. S., Narasimhan, S. & Bhatt, N. Semi-supervised machine learning approach for reaction stoichiometry and kinetic model identification using spectral data from flow reactors. React. Chem. Eng. 9, 355–368 (2024).
Kunz, M. R. et al. Data driven reaction mechanism estimation via transient kinetics and machine learning. Chem. Eng. J. 420, 129610 (2021).
Kollenz, P., Herten, D.-P. & Buckup, T. Unravelling the kinetic model of photochemical reactions via deep learning. J. Phys. Chem. B 124, 6358–6368 (2020).
Esterhuizen, J. A., Goldsmith, B. R. & Linic, S. Interpretable machine learning for knowledge generation in heterogeneous catalysis. Nat. Catal. 5, 175–184 (2022).
Xin, H., Mou, T., Pillai, H. S., Wang, S.-H. & Huang, Y. Interpretable machine learning for catalytic materials design toward sustainability. Acc. Mater. Res. 5, 22–34 (2024).
Fare, C., Fenner, P., Benatan, M., Varsi, A. & Pyzer-Knapp, E. O. A multi-fidelity machine learning approach to high throughput materials screening. npj Comput. Mater. 8, 257 (2022).
Goodlett, S. M., Turney, J. M. & Schaefer, H. F. III Comparison of multifidelity machine learning models for potential energy surfaces. J. Chem. Phys. 159, 044111 (2023).
Liu, X., De Breuck, P.-P., Wang, L. & Rignanese, G.-M. A simple denoising approach to exploit multi-fidelity data for machine learning materials properties. npj Comput. Mater. 8, 233 (2022).
Artrith, N. et al. Best practices in machine learning for chemistry. Nat. Chem. 13, 505–508 (2021).
Acknowledgements
C.B.-G. acknowledges funding from a Marie Skłodowska Curie Actions Postdoctoral Fellowship grant (101064374). C.C. and S.P.-G. acknowledge that this material is based upon work supported by the U.S. Department of Energy, Office of Science, Subaward by “University of Minnesota, Project title: Development of Machine Learning and Molecular Simulation Approaches to Accelerate the Discovery of Porous Materials for Energy-Relevant Applications” under Award Number DE-SC0023454. A.T. acknowledges support from the Generalitat de Catalunya (2021-SGR-00750, NANOEN). A.A.-G. thanks A. G. Frøseth for his generous support. A.A.-G. also acknowledges the generous support from the Acceleration Consortium, the Natural Resources Canada and the Canada 150 Research Chairs programme.
Author information
Contributions
C.B.-G., S.P.-G. and C.C. researched data for the article. C.B.-G. and S.P.-G. wrote the article. C.B.-G., S.P.-G., A.T. and A.A.-G. reviewed and edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Reviews Chemistry thanks Hongliang Xin and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Related links
2024 Nobel Prize in Chemistry: https://www.nobelprize.org/prizes/chemistry/2024/summary/
2024 Nobel Prize in Physics: https://www.nobelprize.org/prizes/physics/2024/summary/
Catalysis Hub: https://www.catalysis-hub.org/
Crystallography Open Database: http://www.crystallography.net/cod/
NIST databases: https://www.nist.gov/
NREL Materials Database: https://materials.nrel.gov/
OpenAI’s ChatGPT: https://chatgpt.com
Open Catalyst Project: https://opencatalystproject.org/
Pauling File – Inorganic Materials Database: https://paulingfile.com/
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Bozal-Ginesta, C., Pablo-García, S., Choi, C. et al. Developing machine learning for heterogeneous catalysis with experimental and computational data. Nat Rev Chem (2025). https://doi.org/10.1038/s41570-025-00740-4
DOI: https://doi.org/10.1038/s41570-025-00740-4