created: 9 Dec 2012; modified: 13 Mar 2017; status: finished; confidence: highly likely; importance: 6
“Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.” – Greenspun’s Tenth Law
Turing-completeness (TC) is the property of a system being able to, under some simple representation of input & output, compute any computable function (that is, run any program).
TC, besides being foundational to computer science and to understanding many key issues (such as why a perfect antivirus program is impossible), is also weirdly common: one might think that such universality (a system being powerful enough to run any program) would be hard to achieve, but the opposite turns out to be true, and it is difficult to write a useful system which does not immediately tip over into TC. Given even a little control over input into something which transforms input to output, one can typically leverage that control into full-blown TC. This can be amusing, useful (although usually not), harmful, or extremely insecure & a cracker’s delight (see language-theoretic security, based on exploiting “weird machines”[1]). Surprising examples of this behavior remind us that TC lurks everywhere, and security is extremely difficult.
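As a minimal sketch of how little machinery universality needs, the Python below runs a 2-tag system: a string of symbols plus a lookup table, where each step deletes two symbols from the front and appends a fixed string to the back. The three rules used (a→bc, b→a, c→aaa) are a known small example that walks a Collatz-like sequence; they are an illustrative choice, not something taken from the text above. The names `run_tag_system` and `collatz_lengths` are made up for this sketch, and the step cap is an added assumption so that it always returns.

```python
from typing import Dict, List

# Production rules of a small 2-tag system (illustrative choice): starting
# from a word of n 'a's, the all-'a' words it visits follow the sequence
# n -> n/2 if n is even, (3n+1)/2 if n is odd.
COLLATZ_RULES: Dict[str, str] = {"a": "bc", "b": "a", "c": "aaa"}


def run_tag_system(word: str, rules: Dict[str, str],
                   max_steps: int = 10_000) -> List[str]:
    """Return every word visited; halt when the word is shorter than 2."""
    history = [word]
    for _ in range(max_steps):
        if len(word) < 2:
            break
        # Read the first symbol, delete the first two, append its production.
        word = word[2:] + rules[word[0]]
        history.append(word)
    return history


def collatz_lengths(n: int) -> List[int]:
    """Lengths of the all-'a' words visited when starting from n 'a's."""
    history = run_tag_system("a" * n, COLLATZ_RULES)
    return [len(w) for w in history if set(w) == {"a"}]


if __name__ == "__main__":
    print(collatz_lengths(3))  # prints [3, 5, 8, 4, 2, 1]
```

The whole "machine" is one slice and one dictionary lookup per step. Tag systems of this general shape are known to be universal, which suggests why any component that lets input control a "delete from the front, append to the back" loop deserves suspicion.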
“Too powerful” languages can also manifest as nasty DoS attacks; the fuzz tester afl found in OpenBSD’s roff that it could create an infinite loop by abusing some of the string substitution rules.
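The general failure mode can be sketched with a toy string expander. This is not roff’s syntax and not the input afl found: the `{name}` reference syntax, the `expand` function, and the `ExpansionLimit` exception are all hypothetical, chosen only to show how a self-referential definition never settles unless the expander enforces a budget.

```python
import re
from typing import Dict

# Hypothetical reference syntax for this sketch: {name} expands a defined string.
REFERENCE = re.compile(r"\{(\w+)\}")


class ExpansionLimit(Exception):
    """Raised when expansion does not settle within the allowed passes."""


def expand(text: str, strings: Dict[str, str], max_passes: int = 100) -> str:
    """Substitute {name} references repeatedly until the text stops changing."""
    for _ in range(max_passes):
        # Unknown names expand to the empty string.
        new_text = REFERENCE.sub(lambda m: strings.get(m.group(1), ""), text)
        if new_text == text:
            return text
        text = new_text
    # Without this cap, a definition that mentions itself would loop forever.
    raise ExpansionLimit(f"no fixed point after {max_passes} passes")


if __name__ == "__main__":
    defs = {"greeting": "hello {name}", "name": "world"}
    print(expand("{greeting}!", defs))
    try:
        expand("{loop}", {"loop": "x{loop}"})
    except ExpansionLimit as err:
        print("gave up:", err)
```

A pass budget like `max_passes` trades completeness for safety: some legitimate deeply-nested inputs would be rejected, but no input can hang the expander.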
“Surprising” examples are probably best considered as a subset of “discovered” or “found” esoteric programming languages (esolangs). So FRACTRAN, as extraordinarily minimalist as it is, does not count; nor would a deliberately-obfuscated language like Malbolge (where it took years to write a trivial program), because it was designed to be an esolang. Neither would Conway’s Game of Life count, because questions about whether it was TC appeared almost immediately upon publication, and so it turning out to be TC is not surprising; and given the complexity of packet-switching networks & routers, it is not necessarily too surprising if one can build a cellular automaton into them or encode logical circuits.

Many configuration or special-purpose languages or tools or complicated games turn out to violate the Rule of least power & be “accidentally Turing-complete”: MediaWiki templates, sed (any form of templates or compile-time computation is highly likely to be TC, since they often turn out to support a lambda calculus or a term-rewriting language), XSLT, Infinite Minesweeper, Dwarf Fortress[2], Starcraft, Minecraft, Ant, Transport Tycoon, C++ templates & Java generics, DNA computing, etc. But these are not surprising either: many games support scripting (ie. TC-ness) to make their development easier and enable fan modifications, so a game’s TC may be as simple as including syntax for calling out to a better-known language like Perl; or it may just be an obscure part of a standard format. (Most people these days are probably unaware that TrueType & many other fonts contain programs run on stack machines, whether PostScript-style or hinting bytecode, similar to DWARF debugging and ELF metadata, or that some music formats go beyond MIDI in providing scripting capabilities and must be interpreted to be displayed. Once one knows this, fonts being TC is no more surprising than TeX documents being TC, and this of course leads to many severe & fascinating font or media security vulnerabilities, such as the BLEND vulnerability or SNES & NES code exploiting Linux systems.)

Similarly, such feats as creating a small Turing machine using Legos would not count, since we already know that mechanical computers work. On the other hand, the vein of computer security research called “weird machines” is fertile ground for “that’s TC?” reactions. What is surprising may differ from person to person.
- Peano arithmetic: addition & multiplication on natural numbers is enough to be TC; in contrast, Presburger arithmetic removes multiplication and hence is not TC
- Wang tiles: multi-colored squares, whose placement is governed by the rule that adjacent colors must be the same
- CSS: was designed to be a declarative markup language for tweaking the visual appearance of HTML pages, but the declarations interact with each other just enough to allow an encoding of the cellular automaton Rule 110 (assumption: requires mechanical mouse clicks on the web browser to advance state; see also the Magic: The Gathering example)
- x86 shenanigans:
  - MMU: memory management units shuffle computer RAM around to make programming easier; if a program sets up its share of memory properly, it can execute arbitrary computations via MMU page-faults without ever running code itself, by turning the MMU faulting mechanism into a one-instruction set computer (see “The Page-Fault Weird Machine”, Bangert et al 2013)
  - “mov is Turing-complete”: the apparently innocuous x86 assembler instruction mov, which copies data between the CPU & RAM, can be used to implement a transport-triggered-architecture one-instruction set computer (and for bonus points, it can be done using xor too)
  - “x86 is Turing-complete with no registers”
- return-into-libc attacks: software libraries provide pre-packaged functions, each of which is intended to do one useful thing; a fully TC “language” can be cobbled out of just calls to these functions and nothing else, which enables evasion of security mechanisms since the attacker is not running any recognizable code of his own. See, among many others, “The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86)” & “On the Expressiveness of Return-into-libc Attacks”.
- Pokemon Yellow: “Pokemon Yellow Total Control Hack” outlines a memory-corruption exploit which allows one to write arbitrary Game Boy assembler programs by repeated in-game walking and item purchasing. (There are similar feats which have been developed by speedrun aficionados, but I tend to ignore most of them as they are “impure”: for example, one can turn the SNES Super Mario World into an arbitrary game like Snake or Pong, but the new programs need to be loaded up into extra hardware, so in my opinion it’s not really showing SMW to be unexpectedly TC and is different from the other examples. Similarly, one can go from Super Game Boy to SNES to arbitrary code like IRC. This distinction is debatable.)
- Magic: the Gathering: TC, but like the CSS one, only with an assumption (specifically, that players mechanically take any option they are given, but otherwise all actions are forced)
- Braid: TC
- musical notation: given instructions for transposing successive notes, musical notation becomes the esolang Choon
- heart cells: interact in a way allowing logic gates and hence TC (perhaps not too surprising since cellular automatons were biologically motivated)
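For readers unfamiliar with Rule 110, the cellular automaton which the CSS example above encodes, here is a minimal sketch of its update rule. It assumes a finite row in which cells beyond either edge are fixed at 0 (the universality result concerns an unbounded row with a suitable background pattern); `rule110_step` and `run` are illustrative names.

```python
from typing import List

RULE = 110  # binary 01101110: bit k is the next state for neighbourhood value k


def rule110_step(cells: List[int]) -> List[int]:
    """Compute the next row; cells outside the row are treated as 0."""
    padded = [0] + cells + [0]
    out = []
    for i in range(1, len(padded) - 1):
        # Encode (left, centre, right) as a 3-bit number, then look up its bit.
        k = (padded[i - 1] << 2) | (padded[i] << 1) | padded[i + 1]
        out.append((RULE >> k) & 1)
    return out


def run(cells: List[int], steps: int) -> List[List[int]]:
    """Return the starting row followed by `steps` successive rows."""
    rows = [cells]
    for _ in range(steps):
        rows.append(rule110_step(rows[-1]))
    return rows


if __name__ == "__main__":
    start = [0] * 7 + [1]
    for row in run(start, 4):
        print("".join("#" if c else "." for c in row))
```

Each new cell depends only on three neighbouring cells and an 8-entry table, which is roughly the amount of interaction that the CSS declarations need to provide for the encoding to work.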
See also
External links
- Discussion: HN
- Accidentally Quadratic
[1] An active area of research is into languages & systems carefully designed and proven to not be TC (eg. total functional programming). Why this effort to make a language in which many programs can’t be written? Because TC is intimately tied to Gödel’s incompleteness theorems & Rice’s theorem, allowing TC means that one is forfeiting all sorts of provability properties: in a non-TC language, one may be able to easily prove all sorts of useful things to know; for example, that programs terminate, that they are type-safe or not, that they can be easily converted into a logical theorem, that they consume a bounded amount of resources, that one implementation of a protocol is correct or equivalent to another implementation, that there is a lack of side-effects and a program can be transformed into a logically-equivalent but faster version (particularly important for declarative languages like SQL, where the query optimizer being able to transform queries is key to acceptable performance, but of course some SQL extensions make it TC anyway by allowing either a cyclic tag system to be encoded, the “model” DSL, or to call out to PL/SQL), etc.

Some of the literature on “weird machines”:

- “Exploit Programming: From Buffer Overflows to ‘Weird Machines’ and Theory of Computation”, Bratus et al 2011
- “The Halting Problems of Network Stack Insecurity”, Sassaman et al 2011
- “The Page-Fault Weird Machine: Lessons in Instruction-less Computation”, Bangert et al 2013
- “‘Weird Machines’ in ELF: A Spotlight on the Underappreciated Metadata”, Shapiro et al 2013
- “Interrupt-oriented Bugdoor Programming: A minimalist approach to bugdooring embedded systems firmware”, Tan et al 2014
- “The Weird Machines in Proof-Carrying Code”, Vanegue 2014
- “Framing Signals - A Return to Portable Shellcode”, Bosman & Bos 2014
[2] Dwarf Fortress provides clockwork mechanisms, so TC is unsurprising; but the water is implemented as a simple cellular automaton, so there might be more ways of getting TC in DF! The DF wiki currently lists 4 potential ways of creating logic gates: the fluids, the clockwork mechanisms, minecarts, and creature/animal logic gates involving doors+pressure-sensors.↩