Transformers as Universal Turing Machines: Explaining Through Lag Systems

Shashank Jain

Transformers have revolutionized the field of natural language processing (NLP) and beyond, powering models like GPT, BERT, and T5. But beyond their practical applications, there’s a fascinating theoretical aspect that positions Transformers as more than just effective language models. Some research suggests that Transformers can be considered Universal Turing Machines — in other words, they have the computational power to perform any task that a Turing machine can.

This bold claim rests on connecting Transformers to Lag systems, simplified models of computation that are known to be Turing-complete. In this post, we'll explore:
1. What Lag systems are, with examples.
2. Why Lag systems are equivalent to Universal Turing Machines.
3. How Transformers can simulate Lag systems, thereby proving their Turing-completeness.

1. What Are Lag Systems?

At first glance, Lag systems might sound obscure, but they are actually elegant, simplified computational models that bear a strong resemblance to Turing machines. The main difference lies in how they operate.

How Lag Systems Work:
A Lag system manipulates a memory string of symbols using a small set of rewrite rules. Each rule inspects a fixed-length prefix (the first few symbols of the string); when a rule matches, the system appends that rule's output symbols to the end of the string and deletes the leading symbol, shifting the string forward by one position. The process then repeats on the new prefix…
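The step just described can be sketched in a few lines of Python. This is a minimal illustrative interpreter, not code from any particular paper; the rule set at the bottom is made up purely for demonstration, with prefixes of length 2.

```python
def lag_step(tape, rules, prefix_len=2):
    """Apply one lag-system step; return None when no rule matches (halt)."""
    prefix = tape[:prefix_len]
    if len(prefix) < prefix_len or prefix not in rules:
        return None  # no applicable rule: the system halts
    # Delete the leading symbol and append the rule's output to the end.
    return tape[1:] + rules[prefix]

def run_lag(tape, rules, max_steps=20):
    """Run the lag system, recording the tape after each step."""
    history = [tape]
    for _ in range(max_steps):
        nxt = lag_step(tape, rules)
        if nxt is None:
            break
        tape = nxt
        history.append(tape)
    return history

# Hypothetical rule set for illustration: each 2-symbol prefix
# determines the word appended at the end of the string.
rules = {"ab": "b", "bb": "a", "ba": "ab"}
trace = run_lag("abb", rules, max_steps=5)
```

Each entry of `trace` is one step shorter-or-longer depending on the rule applied; the run halts as soon as the current prefix has no matching rule.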
