PERSPECTIVE

Protein structure prediction improves the quality of amino-acid sequence alignment

Arthur M. Lesk

Corresponding Author

Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, USA

Correspondence

Arthur M. Lesk, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, 16802, USA.

Email: aml25@psu.edu

Arun S. Konagurthu

Department of Data Science and Artificial Intelligence, Monash University, Clayton, Victoria, Australia

First published: 26 June 2022
Citations: 1

Abstract

The basic operation in the analysis of protein evolution is alignment: the specification of residue-residue correspondences. A structural alignment is a specification of residue-residue correspondences based on the atomic positions in the structures of two or more proteins. It is well known that structural alignments are more accurate, over a much wider range of divergence, than pairwise alignments based solely on sequences (for instance, those computed with the Needleman–Wunsch algorithm with affine gap penalties). Given the amino-acid sequences of two proteins, alignments based solely on the sequences fall into “daylight”, “twilight”, and “midnight” zones, across which the accuracy of the correspondences, and the ability to distinguish true homology from noise, progressively diminish. The success of AlphaFold2 in template-free modeling of three-dimensional structures from one-dimensional amino-acid sequence information implies that, given the amino-acid sequences of two or more proteins and in the absence of experimentally determined structures, reliable alignments, even for very highly diverged proteins, could in many cases be achieved by applying AlphaFold2 to the sequences and performing structural alignments of the models.
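
For readers unfamiliar with the sequence-only baseline mentioned in the abstract, the following is a minimal sketch of global pairwise alignment with affine gap penalties (the Needleman–Wunsch recurrence in its three-state form, usually attributed to Gotoh). It is an illustration only, not the authors' implementation: the function name `align_affine` and the scoring values (`match`, `mismatch`, `gap_open`, `gap_extend`) are assumptions chosen for readability, and practical work would normally use a substitution matrix such as BLOSUM62 rather than a flat match/mismatch score.

```python
from math import inf


def align_affine(a, b, match=2, mismatch=-1, gap_open=-5, gap_extend=-1):
    """Global alignment with affine gaps (illustrative scoring, integer scores).

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    Returns (score, aligned_a, aligned_b), with '-' marking gap positions.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] aligned to a gap;
    # Y: b[j-1] aligned to a gap.
    M = [[-inf] * (m + 1) for _ in range(n + 1)]
    X = [[-inf] * (m + 1) for _ in range(n + 1)]
    Y = [[-inf] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] + gap_open,
                          Y[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend)
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          X[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend)

    # Traceback: start in the best final state and, at each step, move to any
    # predecessor state whose value reproduces the current cell exactly.
    tables = {"M": M, "X": X, "Y": Y}
    state = max(tables, key=lambda k: tables[k][n][m])
    score = tables[state][n][m]
    i, j = n, m
    top, bottom = [], []
    while i > 0 or j > 0:
        cur = tables[state][i][j]
        if state == "M":
            step = match if a[i - 1] == b[j - 1] else mismatch
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
            costs = {"M": step, "X": step, "Y": step}
        elif state == "X":
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
            costs = {"M": gap_open, "X": gap_extend, "Y": gap_open}
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
            costs = {"M": gap_open, "X": gap_open, "Y": gap_extend}
        state = next(k for k in "MXY" if tables[k][i][j] + costs[k] == cur)
    return score, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    score, top, bottom = align_affine("ACGTACGT", "ACGT")
    print(score)
    print(top)
    print(bottom)
```

Because the affine penalty charges less for extending a gap than for opening a new one, the sketch favors a single long gap over several short ones. An alignment of this kind sees only residue identities, which is why its reliability falls off as sequences diverge; the structure-based route described in the abstract instead derives correspondences from atomic coordinates.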
