Writing A Compiler In Go

It's the sequel to … a programming language

Writing A Compiler In Go is the sequel to Writing An Interpreter In Go. It starts right where the first one stopped, with a fully-working, fully-tested Monkey interpreter in hand, connecting both books seamlessly, ready to build a compiler and a virtual machine for Monkey.

In this book, we use the codebase (included in the book!) from the first part and extend it. We take the lexer, the parser, the AST, the REPL and the object system and use them to build a new, faster implementation of Monkey, right next to the tree-walking evaluator we built in the first book.

The approach is unchanged, too. Working, tested code is the focus: we build everything from scratch, take baby steps, write tests first, use no third-party libraries, and see and understand how all the pieces fit together.

It's a continuation in prose and in code.

Do you need to read the first part before this one? If you're okay with treating the code from the first book as a black box, then no. But that's not what these books are about; they're about opening up black boxes, looking inside and shining a light. You'll have the best understanding of where we're going in this book if you know where we started.

Learn how to write a compiler and a virtual machine

Our main goal in this book is to evolve Monkey. We change its architecture and turn it into a bytecode compiler and virtual machine.

We'll take the lexer, the parser, the AST and the object system we wrote in the first book and use them to build our own Monkey compiler and virtual machine … from scratch! We'll build them side-by-side so that we'll always have a running system we can steadily evolve.

What we end up with is not only much closer to the programming languages we use every day, giving us a better understanding of how they work, but also 3x faster than the tree-walking evaluator from the first book. And that's without explicitly aiming for performance.

Here's what we'll do:

We define our own bytecode instructions, specifying their operands and their encoding. Along the way, we also build a mini-disassembler for them.
We write a compiler that takes in a Monkey AST and turns it into bytecode by emitting instructions.
At the same time, we build a stack-based virtual machine that executes the bytecode in its main loop.
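
To make these three steps a little more concrete, here is a minimal sketch in Go. It is not code from the book: the opcode names `OpConstant` and `OpAdd`, the two-byte big-endian operand, and the helpers `Make` and `Run` are assumptions chosen for illustration. It shows how an instruction with an operand might be encoded into bytes and how a stack-based main loop might decode and execute it.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Opcode identifies one instruction. These names are hypothetical.
type Opcode byte

const (
	OpConstant Opcode = iota // operand: 2-byte index into the constant pool
	OpAdd                    // no operands: pops two values, pushes their sum
)

// operandWidths records how many bytes each operand of an opcode occupies.
var operandWidths = map[Opcode][]int{
	OpConstant: {2},
	OpAdd:      {},
}

// Make encodes an opcode followed by its operands in big-endian byte order.
// Only 2-byte operands are supported in this sketch.
func Make(op Opcode, operands ...int) []byte {
	ins := []byte{byte(op)}
	for i, width := range operandWidths[op] {
		if width == 2 {
			buf := make([]byte, 2)
			binary.BigEndian.PutUint16(buf, uint16(operands[i]))
			ins = append(ins, buf...)
		}
	}
	return ins
}

// Run decodes and executes instructions one by one on a stack and returns
// the value left on top of the stack.
func Run(ins []byte, constants []int) (int, error) {
	stack := make([]int, 0, 16)
	for ip := 0; ip < len(ins); {
		op := Opcode(ins[ip])
		ip++
		switch op {
		case OpConstant:
			if ip+2 > len(ins) {
				return 0, fmt.Errorf("truncated operand at offset %d", ip)
			}
			idx := int(binary.BigEndian.Uint16(ins[ip:]))
			ip += 2
			if idx >= len(constants) {
				return 0, fmt.Errorf("constant index %d out of range", idx)
			}
			stack = append(stack, constants[idx])
		case OpAdd:
			if len(stack) < 2 {
				return 0, fmt.Errorf("stack underflow")
			}
			right, left := stack[len(stack)-1], stack[len(stack)-2]
			stack = append(stack[:len(stack)-2], left+right)
		default:
			return 0, fmt.Errorf("unknown opcode %d", op)
		}
	}
	if len(stack) == 0 {
		return 0, fmt.Errorf("empty stack")
	}
	return stack[len(stack)-1], nil
}

func main() {
	// Hand-assembled bytecode for the expression 1 + 2.
	var program []byte
	program = append(program, Make(OpConstant, 0)...)
	program = append(program, Make(OpConstant, 1)...)
	program = append(program, Make(OpAdd)...)

	result, err := Run(program, []int{1, 2})
	fmt.Println(result, err) // prints "3 <nil>"
}
```

In this sketch the constants live outside the instruction stream and are referenced by index, which keeps every instruction small and fixed in width.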

We'll learn a lot about computers, how they work, what machine code and opcodes are, what the stack is and how to work with stack pointers and frame pointers, what it means to define a calling convention, and much more.

We also

build a symbol table and a constant pool
do stack arithmetic
generate jump instructions
build frames into our VM to execute functions with local bindings and arguments!
add built-in functions to the VM
get real closures working in the virtual machine and learn why closure-compilation is so tricky
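
As a rough illustration of the first item in this list, a symbol table can be sketched in Go as a chain of scopes. This is an assumption-laden sketch, not the book's implementation: the type names `Symbol` and `SymbolTable`, the scope labels, and the methods `Define` and `Resolve` are all hypothetical. The idea is that the compiler maps each name to a numeric index, so the virtual machine never has to look up identifiers by string.

```go
package main

import "fmt"

// Scope labels where a binding lives. The labels are hypothetical.
type Scope string

const (
	GlobalScope Scope = "GLOBAL"
	LocalScope  Scope = "LOCAL"
)

// Symbol is what the compiler knows about a name: its scope and slot index.
type Symbol struct {
	Name  string
	Scope Scope
	Index int
}

// SymbolTable maps names to symbols and links to an enclosing table.
type SymbolTable struct {
	outer *SymbolTable
	store map[string]Symbol
	count int // number of definitions made in this table
}

// NewSymbolTable creates a table; pass nil for the global table.
func NewSymbolTable(outer *SymbolTable) *SymbolTable {
	return &SymbolTable{outer: outer, store: make(map[string]Symbol)}
}

// Define binds a name in this table and assigns it the next free index.
func (s *SymbolTable) Define(name string) Symbol {
	scope := GlobalScope
	if s.outer != nil {
		scope = LocalScope
	}
	sym := Symbol{Name: name, Scope: scope, Index: s.count}
	s.store[name] = sym
	s.count++
	return sym
}

// Resolve looks a name up in this table first, then in enclosing tables.
func (s *SymbolTable) Resolve(name string) (Symbol, bool) {
	if sym, ok := s.store[name]; ok {
		return sym, true
	}
	if s.outer != nil {
		return s.outer.Resolve(name)
	}
	return Symbol{}, false
}

func main() {
	global := NewSymbolTable(nil)
	global.Define("a")

	// A function body gets its own table that encloses the global one.
	local := NewSymbolTable(global)
	local.Define("b")

	for _, name := range []string{"a", "b", "missing"} {
		sym, ok := local.Resolve(name)
		fmt.Printf("%s: %+v (found: %t)\n", name, sym, ok)
	}
}
```

A compiler built along these lines could emit a different instruction depending on the resolved scope, for example one to load a global by index and another to load a local relative to the current frame.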

The Monkey Programming Language

The implementation of Monkey we build in this book is markedly different from the tree-walking interpreter we built in Writing An Interpreter In Go, but Monkey stays the same.

At the end, Monkey will still look and work like this:

// Integers & arithmetic expressions
let version = 1 + (50 / 2) - (8 * 3);

// Strings
let name = "The Monkey programming language";

// Booleans
let isMonkeyFastNow = true;

// Arrays & Hashes
let people = [{"name": "Anna", "age": 24}, {"name": "Bob", "age": 99}];

// Functions
let getName = fn(person) { person["name"]; };
getName(people[0]); // => "Anna"
getName(people[1]); // => "Bob"

And it will still support recursive functions, conditionals, implicit and explicit returning of values:

let fibonacci = fn(x) {
  if (x == 0) {
    0
  } else {
    if (x == 1) {
      return 1;
    } else {
      fibonacci(x - 1) + fibonacci(x - 2);
    }
  }
};

The crown jewel of our new Monkey implementation is closures:

// `newAdder` returns a closure that makes use of the free variables `a` and `b`:
let newAdder = fn(a, b) {
    fn(c) { a + b + c };
};
// This constructs a new `adder` function:
let adder = newAdder(1, 2);

adder(8); // => 11

Yes, we'll compile all of that to bytecode and execute it in a stack-based virtual machine. That not only works, but is also tremendous fun to build.

What readers said about the first book:

"Compilers was the most surprisingly useful university course I ever took. Learning to write a parser and runtime for a toy language helps take away a lot of "magic" in various parts of computer science. I recommend any engineer who isn't familiar with lexers, parsers, and evaluators to read Thorsten's book."
Mitchell Hashimoto (@mitchellh)
Founder of HashiCorp

"Thorsten took a topic that is usually very dry and CS heavy and made it accessible and easy to understand. After reading this book I felt confident enough to write Plush, the templating language I’ve always wanted in Go! If you have yet to read Thorsten's book, I can't recommend it enough. Please go and buy it!"
Mark Bates (@markbates)
Creator of gobuffalo.io

"Great book. I loved it because everything is built by hand, so you get to think about all the details, and it does so in a gradual way, which is didactic. The implementation itself is also nice and simple 🙌"
Xavier Noria (@fxn)
Everlasting student, Rails Core Team, Ruby Hero, Freelance, Live lover

"This book demystifies and makes the topic of interpreters approachable and fun. Don't be surprised if you become a better Go programmer after working your way through it."
Johnny Boursiquot (@jboursiquot)
Go Programmer, @BaltimoreGolang Organizer, @GolangBridge Core Member

About the author

Hello there! My name is Thorsten Ball. I'm a writer and programmer living in Aschaffenburg, Germany. In my professional life as a software developer, I build web applications and platforms. I have deployed Ruby, JavaScript, Go and C code to production systems.

In my spare time, though, I like to do what I call recreational programming, where I dig deep into various topics and their codebases, taking them apart, recreating them in my own code, trying to peel away layers of what seems like magic, hoping to come out the other end with a better understanding of what it is that I do when I program.

For the last few years, the two topics that have kept my attention are systems programming and programming languages: interpreters, compilers, virtual machines, JIT compilers, assembly language – I can't get enough of it.

Writing an interpreter from scratch and writing a book about that has been one of the most wonderful and satisfying things I ever did as a programmer. It has been so much fun, in fact, that I couldn't stop doing it, which is why you're looking at the next book.

You can follow @thorstenball on Twitter to know what I'm up to. If you want to know more about me, visit my blog and website or take a look at my GitHub profile.