
  • Perspective

The future of machine learning for small-molecule drug discovery will be driven by data

Abstract

Many studies have prophesied that the integration of machine learning techniques into small-molecule therapeutics development will help to deliver a true leap forward in drug discovery. However, increasingly advanced algorithms and novel architectures have not always yielded substantial improvements in results. In this Perspective, we propose that a greater focus on the data for training and benchmarking these models is more likely to drive future improvement, and explore avenues for future research and strategies to address these data challenges.


Fig. 1: Benchmark performance with respect to publication date.


Data availability

Source data for Fig. 1 is available with this paper.


Acknowledgements

This work was supported by funding from the Engineering and Physical Sciences Research Council (EPSRC) (grant number EP/S024093/1).

Author information


Contributions

G.D., F.B. and C.M.D. conceived the overall structure of the paper. G.D. wrote the paper. F.B., C.M.D. and K.B. reviewed and edited the paper.

Corresponding author

Correspondence to Charlotte M. Deane.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks Diwakar Shukla and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Source data

Source Data Fig. 1

Collated papers and ML models for CASF-2016, USPTO-50k and MoleculeNet HIV benchmarks.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Durant, G., Boyles, F., Birchall, K. et al. The future of machine learning for small-molecule drug discovery will be driven by data. Nat Comput Sci (2024). https://doi.org/10.1038/s43588-024-00699-0


  • DOI: https://doi.org/10.1038/s43588-024-00699-0
