A guest post by Stephen Malina, my partner in crime on Mu.
Most programmers agree that we don't read enough code. The interviews in Peter Seibel's book, “Coders at Work”, highlight a comical contradiction: almost all the programmers interviewed by Seibel recommend that others read code for fun, but none of them routinely do so themselves. Seibel even asked Hal Abelson (of SICP fame) directly about this phenomenon:
Seibel, James Hague and others have all tried to justify why code reading is so uncommon, and they make good points. But perhaps the conversation is led astray by use of the word ‘read’. I wonder if Abelson and the others would have had more examples if Seibel had asked them what code they had learned about for fun. Perhaps the word ‘read’ put them in a passive frame of mind, causing them to filter out programs they'd hacked on?
We all read code already; it’s just that we usually read when we want to edit. And the comprehension that questions about reading are really concerned with comes from both reading and writing, interleaved in complex ways.
That hacking produces better comprehension than passive, linear reading fits with what we know about learning. Barbara Oakley, Herbert Simon, Cal Newport, and Anders Ericsson all describe how solid understanding emerges from active exploration, critical examination, repetition, and synthesis. Hacking beats passive reading on three out of four of these criteria:
- Active exploration: When you hack, you want to eventually produce a
change in the codebase. This desire guides your path through the code. When
you read passively you let the code’s linear flow guide you.
- Critical examination: When you hack, you evaluate existing code in light
of the change you want to make. Deciding what to use and remove keeps you from
accepting the existing system as canon. When you read linearly, you lack a
goal against which you can critically examine the existing code.
- Synthesis: To change the program as you desire, you synthesize existing
code with new code.
- Repetition: Neither hacking nor linear reading involves useful repetition,
unless you treat the change you want to make like a kata and mindfully
re-implement it multiple times.
Learning through hacking also leverages the natural structure of a codebase. Good books guide their readers through a series of questions and their answers, but codebases are inherently non-linear, like a map. You can ask an infinite number of questions of a map. How far is it from A to B? Which is the nearest town to C? But you can’t expect a map to tell you what questions to ask, and it makes no sense to read a map linearly from top to bottom, left to right.
Reframing reading as ‘navigation’ suggests that our conventional discussions of clean code and interfaces ignore the things that actually make unfamiliar code accessible to outsiders. Clean, solidified abstractions are like well-marked, easy-to-follow paths through a forest — very useful if they lead in the direction we need to go, but less useful when we want to blaze arbitrary new paths through the forest.
Instead, let's focus on guiding exploration, making it easier for readers to answer their own questions about codebases. I’m still figuring out how to do this; so far I have just a couple of preliminary ideas:
- Suggest features in your code that make good exercises for re-implementation.
Provide an initial Git commit without the feature, give readers hints where
necessary, and link to the actual change plus others’ attempts at
producing it.
- Rather than conceiving of documentation as something that explains
individual modules, focus on overviews of how the modules fit together (like
Fabien Sanglard's
for Git).
Afterword
Others have explored similar ideas from different perspectives:
- “How To Be A Hacker”:
Eric Raymond discusses what he calls “the incremental-hacking
cycle”, a process by which someone gradually expands their understanding
of a codebase by making bigger and bigger changes to it.
- “How to read a math textbook”:
David MacIver describes a problem- and theorem-driven approach for learning
math which you could adapt to reading programs.
- “The Benjamin Franklin method of reading programming books”:
James Koppel's take on Anders Ericsson.