The Bike Shed

A Generation Lost in the Bazaar

Quality happens only when someone is responsible for it.


Poul-Henning Kamp


Thirteen years ago, Eric Raymond's book The Cathedral and the Bazaar (O'Reilly Media, 2001) redefined our vocabulary and all but promised an end to the waterfall model and big software companies, thanks to the new grass-roots open source software development movement. I found the book thought provoking, but it did not convince me. On the other hand, being deeply involved in open source, I couldn't help but think that it would be nice if he was right.

The book I brought to the beach house this summer is also thought provoking, much more so than Raymond's (which it even mentions rather positively): Frederick P. Brooks's The Design of Design (Addison-Wesley Professional, 2010). As much as I find myself nodding in agreement and as much as I enjoy Brooks's command of language and subject matter, the book also makes me sad and disappointed.

Thirteen years ago also marks the apogee of the dot-com euphoria, where every teenager was a Web programmer and every college dropout had a Web startup. I had genuine fun trying to teach some of those greenhorns about the good old-fashioned tricks of the trade—test-restoring backups, scripting operating-system installs, version control, etc. Hindsight, of course, is 20/20 (i.e., events may have been less fun than you remember), and there is no escaping that the entire dot-com era was a disaster for IT/CS in general and for software quality and Unix in particular.

I have not seen any competent analysis of how much bigger the IT industry became during the dot-com years. My own estimate is that, counted in the kinds of jobs that would until then have been behind the locked steel doors of the IT department, our trade grew by two orders of magnitude or, if you prefer, to some 10,000 percent of its former size.

Getting hooked on computers is easy—almost anybody can make a program work, just as almost anybody can nail two pieces of wood together in a few tries. The trouble is that the market for two pieces of wood nailed together—inexpertly—is fairly small outside of the "proud grandfather" segment, and getting from there to a decent set of chairs or fitted cupboards takes talent, practice, and education. The extra 9,900 percent had neither practice nor education when they arrived in our trade, and before they ever had the chance to acquire it, the party was over and most of them were out of a job. I will charitably assume that those who managed to hang on were the most talented and most skilled, but even then there is no escaping that as IT professionals they mostly sucked because of their lack of ballast.

The bazaar meme advocated by Raymond, "Just hack it," as opposed to the carefully designed cathedrals of the pre-dot-com years, unfortunately did not die with the dot-com madness, and today Unix is rapidly sinking under its weight.

I updated my laptop. I have been running the development version of FreeBSD for 18 years straight now, and compiling even my Spartan work environment from source code takes a full day, because it involves trying to make sense and architecture out of Raymond's anarchistic software bazaar.

At the top level, the FreeBSD ports collection is an attempt to create a map of the bazaar that makes it easy for FreeBSD users to find what they need. In practice this map consists, right now, of 22,198 files that give a summary description of each stall in the bazaar—a couple of lines telling you roughly what that stall offers and where you can read more about it. Also included are 23,214 Makefiles that tell you what to do with the software you find in each stall. These Makefiles also try to inform you of the choices you should consider, which options to choose, and what would be sensible defaults for them. The map also conveniently comes with 24,400 patch files to smooth over the lack of craftsmanship of many of the wares offered, but, generally, it is lack of portability that creates a need for these patch files.

Finally, the map helpfully tells you that if you want to have www/firefox, you will first need to get devel/nspr, security/nss, databases/sqlite3, and so on. Once you look up those in the map and find their dependencies, and recursively look up their dependencies, you will have a shopping list of the 122 packages you will need before you can get to www/firefox.
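
The recursive lookup described above amounts to a depth-first walk over a dependency graph. Here is a minimal Python sketch of that walk, under loud assumptions: the DEPENDS table is a toy, only its www/firefox entry comes from the text, and every other edge and every example/* name is invented for illustration. The real ports tree derives this information from its Makefiles, not from anything resembling this function.

```python
from typing import Dict, List, Set

# Toy dependency map. Only the "www/firefox" entry is taken from the article;
# all other edges and the example/* ports are hypothetical placeholders.
DEPENDS: Dict[str, List[str]] = {
    "www/firefox": ["devel/nspr", "security/nss", "databases/sqlite3"],
    "security/nss": ["devel/nspr", "example/libA"],
    "databases/sqlite3": ["example/libB"],
    "example/libA": ["example/libB"],
    "devel/nspr": [],
    "example/libB": [],
}


def shopping_list(port: str, depends: Dict[str, List[str]]) -> List[str]:
    """Return every port needed before `port`, with dependencies listed first."""
    ordered: List[str] = []
    seen: Set[str] = set()

    def visit(name: str) -> None:
        if name in seen:  # already handled; also guards against cycles
            return
        seen.add(name)
        for dep in depends.get(name, []):
            visit(dep)
        ordered.append(name)  # appended only after all of its dependencies

    visit(port)
    ordered.remove(port)  # the target itself is not on its own shopping list
    return ordered


if __name__ == "__main__":
    for name in shopping_list("www/firefox", DEPENDS):
        print(name)
```

On the toy table the walk returns five ports, each one appearing after everything it depends on. The same kind of walk over the real map is what would yield the shopping list of 122.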

Modularity and code reuse are, of course, A Good Thing. Even in the most trivially simple case, however, the CS/IT dogma of code reuse is totally foreign in the bazaar: the software in the FreeBSD ports collection contains at least 1,342 copied and pasted cryptographic algorithms.

If that resistance/ignorance of code reuse had resulted in self-contained and independent packages of software, the price of the code duplication might actually have been a good tradeoff for ease of package management. But that was not the case: the packages form a tangled web of haphazard dependencies that results in much code duplication and waste.

Here is one example of an ironic piece of waste: Sam Leffler's graphics/libtiff is one of the 122 packages on the road to www/firefox, yet the resulting Firefox browser does not render TIFF images. For reasons I have not tried to uncover, 10 of the 122 packages need Perl and seven need Python; one of them, devel/glib20, needs both languages for reasons I cannot even imagine.

Further down the shopping list are repeated applications of the Peter Principle, the idea that in an organization where promotion is based on achievement, success, and merit, that organization's members will eventually be promoted beyond their level of ability. The principle is commonly phrased, "Employees tend to rise to their level of incompetence." Applying the principle to software, you will find that you need three different versions of the make program, a macroprocessor, an assembler, and many other interesting packages. At the bottom of the food chain, so to speak, is libtool, which tries to hide the fact that there is no standardized way to build a shared library in Unix. Instead of standardizing how to do that across all Unixen—something that would take just a single flag to the ld(1) command—the Peter Principle was applied and made it libtool's job instead. The Peter Principle is indeed strong in this case—the source code for devel/libtool weighs in at 414,740 lines. Half that line count is test cases, which in principle is commendable, but in practice it is just the Peter Principle at work: the tests elaborately explore the functionality of the complex solution for a problem that should not exist in the first place. Even more maddening is that 31,085 of those lines are in a single unreadably ugly shell script called configure. The idea is that the configure script performs approximately 200 automated tests, so that the user is not burdened with configuring libtool manually. This is a horribly bad idea, already much criticized back in the 1980s when it appeared, as it allows source code to pretend to be portable behind the veneer of the configure script, rather than actually having the quality of portability to begin with. It is a travesty that the configure idea survived.

The 1980s saw very different Unix implementations: Cray-1s with their 24-bit pointers, Amdahl UTS mainframe Unix, a multitude of more or less competently executed SysV+BSD mashups from the minicomputer makers, the almost—but not quite—Unix shims from vendors such as Data General, and even the genuine Unix clone Coherent from the paint company Mark Williams.

The configure scripts back then were written by hand and did things like figure out if this was most like a BSD- or a SysV-style Unix, and then copied one or the other Makefile and maybe also a .h file into place. Later the configure scripts became more ambitious, and as an almost predictable application of the Peter Principle, rather than standardize Unix to eliminate the need for them, somebody wrote a program, autoconf, to write the configure scripts.

Today's Unix/Posix-like operating systems, even including IBM's z/OS mainframe version, are identical when seen with 1980 eyes; yet the 31,085 lines of configure for libtool still check if <sys/stat.h> and <stdlib.h> exist, even though the Unixen that lacked them had neither sufficient memory to execute libtool nor disks big enough for its 16-MB source code.

How did that happen?

Well, autoconf, for reasons that have never made sense, was written in the obscure M4 macro language, which means that the actual tests look like this:

## Whether `make' supports order-only prerequisites.
AC_CACHE_CHECK([whether ${MAKE-make} supports order-only prerequisites],
  [lt_cv_make_order_only],
  [mkdir conftest.dir
   cd conftest.dir
   touch b
   touch a
cat >confmk << 'END'
a: b | c
a b c:
       touch $[]@
END
  touch c
  if ${MAKE-make} -s -q -f confmk >/dev/null 2>&1; then
    lt_cv_make_order_only=yes
  else
    lt_cv_make_order_only=no
  fi
  cd ..
  rm -rf conftest.dir
])
if test $lt_cv_make_order_only = yes; then
  ORDER='|'
else
  ORDER=''
fi
AC_SUBST([ORDER])

Needless to say, this is more than most programmers would ever want to put up with, even if they had the skill, so the input files for autoconf happen by copy and paste, often hiding behind increasingly bloated standard macros covering "standard tests" such as those mentioned earlier, which look for compatibility problems not seen in the past 20 years.

This is probably also why libtool's configure probes no fewer than 26 different names for the Fortran compiler my system does not have, and then spends another 26 tests to find out if each of these nonexistent Fortran compilers supports the -g option.
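
For contrast, probing for a compiler does not have to cost 52 tests. The following Python sketch is not how autoconf works; it only illustrates stopping at the first hit and skipping the follow-up question when nothing is found. The candidate names are an illustrative subset picked for this example, and find_compiler and plan_checks are hypothetical helpers.

```python
import shutil
from typing import Iterable, List, Optional

# Illustrative subset; this is NOT the list that libtool's configure walks.
CANDIDATES = ["gfortran", "f95", "f90", "f77"]


def find_compiler(names: Iterable[str]) -> Optional[str]:
    """Return the path of the first candidate found on PATH, or None."""
    for name in names:
        path = shutil.which(name)  # None when the name is not on PATH
        if path is not None:
            return path
    return None


def plan_checks(names: Iterable[str]) -> List[str]:
    """List the follow-up checks worth running; empty if no compiler exists."""
    compiler = find_compiler(names)
    if compiler is None:
        return []  # no compiler, so there is nothing to ask about -g
    return [f"does {compiler} accept -g?"]


if __name__ == "__main__":
    print(plan_checks(CANDIDATES))
```

On a system without any of the candidates, plan_checks returns an empty list, so the second round of 26 questions is never asked.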

That is the sorry reality of the bazaar Raymond praised in his book: a pile of old festering hacks, endlessly copied and pasted by a clueless generation of IT "professionals" who wouldn't recognize sound IT architecture if you hit them over the head with it. It is hard to believe today, but under this embarrassing mess lie the ruins of the beautiful cathedral of Unix, deservedly famous for its simplicity of design, its economy of features, and its elegance of execution. (Sic transit gloria mundi, etc.)

One of Brooks's many excellent points is that quality happens only if somebody has the responsibility for it, and that "somebody" can be no more than one single person—with an exception for a dynamic duo. I am surprised that Brooks does not cite Unix as an example of this claim, since we can pinpoint with almost surgical precision the moment that Unix started to fragment: in the early 1990s when AT&T spun off Unix to commercialize it, thereby robbing it of its architects.

More than once in recent years, others have reached the same conclusion as Brooks. Some have tried to impose a kind of sanity, or even to lay down the law formally in the form of technical standards, hoping to bring order and structure to the bazaar. So far they have all failed spectacularly, because the generation of lost dot-com wunderkinder in the bazaar has never seen a cathedral and therefore cannot even imagine why you would want one in the first place, much less what it should look like. It is a sad irony, indeed, that those who most need to read it may find The Design of Design entirely incomprehensible. But to anyone who has ever wondered whether using m4 macros to configure autoconf to write a shell script to look for 26 Fortran compilers in order to build a Web browser was a bit of a detour, Brooks offers well-reasoned hope that there can be a better way.

Poul-Henning Kamp (phk@FreeBSD.org) has programmed computers for 26 years and is the inspiration behind bikeshed.org. His software has been widely adopted as under-the-hood building blocks in both open source and commercial products. His most recent project is the Varnish HTTP accelerator, which is used to speed up large Web sites such as Facebook.

© 2012 ACM 1542-7730/12/0800 $10.00

Originally published in Queue vol. 10, no. 8
Comments

(newest first)

Don Hopkins | Sun, 03 Dec 2017 17:33:39 UTC

Gnu Autoconf is the epitome of FUBAR Bazaar.


ToneLoc | Wed, 28 Sep 2016 11:18:26 UTC

Andrew Punch comments: "If the author doesn't like the FreeBSD way there are plenty of other Linux and FreeBSD distributions."

Something tells me the author knows this, thanks for the laugh.

Another good laugh from "whatever": "Oh, you wrote some crappy code to speed up Facebook?"

Great article, funny comments. Glad to be here.


Anton Gerasimov | Wed, 10 Aug 2016 13:09:21 UTC

Sounds legit, but we've seen Plan9 and Inferno, which were built in a cathedral fashion by first-class architects (some of them also from the Unix team). I'm not discussing architectural decisions, but in terms of popularity both were failures. It seems that while the designs of GNU/Linux and FreeBSD systems (taken as a whole) are somewhat suboptimal, they're just "good enough", i.e. nobody has managed to build something so much better that the benefits of better architecture outweigh the benefits of bazaar organization.


peter | Tue, 09 Aug 2016 22:03:49 UTC

I think Poul was having a bad day choosing the colour of his bike shed and had a bit of rant.


Tim | Tue, 09 Aug 2016 02:24:17 UTC

Fred Brooks is that author that everybody says they read and look up to, but I have not anywhere in the world found a software team that's willing to put into practice what he preached, so many decades ago.

I think it would be a fantastic recruiting tool. It'd work on me. "Have you read Fred Brooks? Do you wish your software team were as well run as a surgical team? Come work with us! We actually do all that stuff that everybody else ignores."


Andrew Punch | Tue, 09 Aug 2016 00:42:10 UTC

If only there was some kind of software that would manage packages - we could call it a package manager!

If the author doesn't like the FreeBSD way there are plenty of other Linux and FreeBSD distributions.


sciolist | Wed, 10 Jun 2015 17:40:02 UTC

Democracy is based on the assumption that a million men are wiser than one man. How's that again? I missed something.

Autocracy is based on the assumption that one man is wiser than a million men. Let's play that over again, too. Who decides?

-RAH

So I guess it comes down to which side of this argument you take. Where your bias lies, in other words. I think the key through the horns of the dilemma is being able to communicate about the things you care about; being able to admit you might, possibly, have been wrong, or even worse, right about something; being able to make the singular contribution that only you can make in the context of what might be best for the group you are working for or with. The assumption here, of course, is that everyone is always striving to be the best person that they can be. Constantly striving and discerning how they can make the best out of the situation for everyone.

But we're lazy. And greedy. And willful. And deceitful. And noble. And brilliant. And humble. And industrious. Welcome to the Human Condition. We keep trying to define normative behaviors ("should" or "ought" things) in terms of an either/or model. Maybe we should re-think that. Maybe it should be an "and" model. In this case it doesn't matter much since this is basically a blog post. But at the very least, let's be civil about it, whatever the personal bias you have on the matter. Ad hominem attacks just don't help anyone and tend to move focus away from the point at hand and into the cults of personality and rhetoric. So STFU 'whatever'.


whatever | Mon, 12 Jan 2015 01:40:08 UTC

Igor: He's not going to write a line of code to address it, because he's just a bitch--not a problem solver.

Yet arrogant tight ass pretends that he's superior to others while lamenting that the world isn't perfect. There are few things worse than snobby coders, which is what you are. You're nothing special. You haven't contributed anything that matters. Oh, you wrote some crappy code to speed up Facebook? Wow, you're really important. You're as replaceable as the people you're criticizing, and you're just as much of a code monkey as they are.

Coding is just like anything else humans do. It's a big mess. You can't clean it up without tearing it down, and nobody wants to lose 20 years of functionality that's been built because some crybaby can't handle reality. People like you are a dime a dozen and tend not to be the people who actually get things done.

Deal with it.


Igor | Sat, 10 Jan 2015 00:50:35 UTC

The author might be completely right (I don't have as many years of experience, haven't seen the UNIX in the 80's or early 90's), but I think that the solution is to fix code instead of complaining. Hacks are not going anywhere. autoconf is bad? libtool is bad? Start a project to rewrite it, dedicate some FreeBSD resources to make it better, or hell - rewrite it yourself. I would love to help, but I am a hack at writing code (IT professional with scripting).

What is definitely right is that software in general is more and more bloated and eats more and more resources. The aforementioned elegance is gone. I attribute a lot of this however to using higher and higher languages, not caring about resources.


gbonehead | Wed, 31 Dec 2014 18:49:47 UTC

Ah, the UNIX-HATERS Handbook is being reincarnated! :)

http://www.amazon.com/UNIX-Haters-Handbook-UNIX-Haters--line/dp/1568842031

