Docker is the dangerous gamble which we will regret
(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com
There is perhaps one good argument for using Docker. It is hidden by the many bad arguments for using Docker. I’m going to try to explain why so much Docker rhetoric is stupid, and then look at what reason might be good.
Every time I criticize Docker I get angry responses. When I wrote “Why would anyone choose Docker over fat binaries?” six months ago, I saw some very intelligent responses on Hacker News, but also some angry ones. So in writing this current essay, I am trying to answer some of the criticism I expect to get in the future.
But I guess I am lucky because so far I have not gotten a reaction as angry as what The HFT Guy had to face when he talked about his own failed attempt to use Docker in production at the financial firm where he works:
I received a quite insulting email from a guy who is clearly in the amateur league to say that “any idiot can run Docker on Ubuntu” then proceed to give a list of software packages and advanced system tweaks that are mandatory to run Docker on Ubuntu, that allegedly “anyone could have found in 5 seconds with Google“.
On the Internet, that kind of anger is normal. I don’t know why, but it is. Many developers get angry when they hear someone criticize a technology which they favor. That anger gets in the way of their ability to figure out the long-term reality of the situation.
Docker promises portability and security and resource management and orchestration. Therefore the question that rational people should want to answer is “Is Docker the best way to gain portability and security and resource management and orchestration?”
I’m going to respond to some of the responses I got. Some of these are easy to dismiss. There is one argument that is not easy to dismiss. I’ll save that for the end.
One person wrote:
Because choosing Docker requires boiling fewer oceans, and whether those oceans should or should not be boiled has no bearing on whether I can afford to boil them right now.
Okay, but compared to what? Having your devops person write some scripts to standardize the build, the deployment, the orchestration, and the resource use? The criticism seems to imply “I don’t want the devops person to do this, because the result will be ad-hoc, and I want something standardized.”
Docker wins because developers and managers see it as offering something less custom, less chaotic, less ad-hoc, and more standardized. Or at least, having the potential to do so. The reality of Docker has been an incredible mess so far (see Docker in production: a history of failure). But many are willing to argue that all of the problems will soon be resolved, and Docker will emerge as the stable, consistent standard for containerization. This is a very large gamble. Nearly every company that’s taken this gamble so far has ended up burned, but companies keep taking this gamble on the assumption it is going to pay off big at some point soon.
Every company that I have worked with, over the last two years, was either using Docker or was drawing up plans to soon use Docker. They are implicitly paying a very high price to have a standardized solution, rather than an ad-hoc build/deploy/orchestrate script. I personally have not yet seen a case where this was the economically rational choice, so either companies are implicitly hoping this will pay off in the long-run, or they are being irrational.
I use the word “implicitly” because I’ve yet to hear a tech manager verbalize this gamble explicitly. Most people who defend Docker talk about how it offers portability or security or orchestration or configuration. Docker can give us portability or security or orchestration or configuration, but at a cost of considerable complexity. Writing an ad-hoc script would be easier in most cases.
The best articles about Docker emphasize the trade-offs that one makes by choosing to use it:
It’s best to think of Docker as an advanced optimization. Yes, it is extremely cool and powerful, but it adds significantly to the complexity of your systems and should only be used in mission critical systems if you are an expert system administrator that understands all the essential points of how to use it safely in production.
At the moment, you need more systems expertise to use Docker, not less. Nearly every article you’ll read on Docker will show you the extremely simple use-cases and will ignore the complexities of using Docker on multi-host production systems. This gives a false impression of what it takes to actually use Docker in production.
In the world of computer programming, we have the saying “Premature optimization is the root of all evil.” Yet most of my clients this year have insisted “We must Dockerize everything, right from the start.” Rather than build a working system, and then put it in production, and then maybe see if Docker offers an advantage over simpler tools, the push has been to standardize the development and deployment around Docker.
A common conversation:
Me: “We don’t need Docker this early in the project.”
Them: “This app requires Nginx and Postgres and Redis and a bunch of environment variables. How are you going to set all that up without Docker?”
Me: “I can write a bash script. Or “make”. Or any other kind of installer, of which there are dozens. We have all been doing this for many years.”
Them: “That’s insane. The bash script might break, and you’ll have to spend time debugging it, and it won’t necessarily work the same on your machine, compared to mine. With Docker, we write the build script and then it’s guaranteed to work the same everywhere, on your machine as well as mine.”
Like all sales pitches, this is seductive because it leads with the most attractive feature of Docker. As a development tool, Docker can seem less messy and more consistent than other approaches. It’s the second phase of Docker use, when people try to use it in production, where life becomes painful.
Your dev team might have one developer who owns a Windows machine, another who runs a Mac, another who has installed Ubuntu, and another who has installed RedHat. Perhaps you, the team lead, have no control over what machines they run. Docker can seem like a way to be sure they all have the same development environment. (However, when you consider the hoops you have to jump through to use Docker from a Windows machine, anyone who tells you that Docker simplifies development on a Windows machine is clearly joking with you.)
But when you go to production, you will have complete control over what machines you run in production. If you want to standardize on CentOS, you can. You can have thousands of CentOS servers, and you can use an old technology, such as Puppet, to be sure those servers are identical. The argument for Docker is therefore weaker for production. But apparently having used Docker for development, developers feel it is natural to also use it in production. Yet this is a tough transition.
I can cite a few examples, regarding the problems with Docker, but after a certain point the examples are boring. There are roughly a gazillion blog posts where people have written about the flaws of Docker. Anyone who fails to see the problems with Docker is being willfully blind, and this essay will not change their mind. Rather, they will ignore this essay, or if they read it, they will say, “The Docker eco-system is rapidly maturing and by next year it is going to be really solid and production ready.” They have said this every year for the last 5 years. At some point it will probably be true. But it is a dangerous gamble.
Despite all the problems with Docker, it does seem to be winning — every company I work with seems eager to convert to Docker. Why is that? As near as I can tell, the main thing is standardization.
Again, from the Hacker News responses to my previous essay, “friend-monoid” wrote this defense of Docker:
We have a whole lot of HTTP services in a range of languages. Managing them all with [uber] binaries would be a chore – the author would have to give me a way to set port and listen address, and I have to keep track of every way to set a port. With a net namespace and clever iptables routing, docker can do that for me.
notyourday wrote the response that I wish I’d written:
Of course if you had the same kind of rules written and followed in any other method, you would arrive at the exactly the same place. In fact, you probably would arrive at a better place because you would stop thinking that your application works because of some clever namespace and iptables thing.
anilakar wrote this response to notyourday:
I think that the main point was that docker skills are transferable, i.e. you can expect a new hire to be productive in less time. Too many companies still have in-house build/deploy systems that are probably great for their purpose but don’t offer valuable experience that would be usable outside that company.
And as near as I can tell, this is 100% why Docker is winning. Forget all the nonsense you read about Docker making deployment or security or orchestration easier. It doesn’t. But it is emerging as a standard, something a person can learn at one company and then take to another company. It isn’t messy and ad-hoc the way a custom bash script would be. And that is the real argument in favor of Docker. Whether it can live up to that promise is the gamble.
At the risk of being almost petty, I should point out that these arguments confuse containers with Docker. And I think many pro-Docker people deliberately confuse the issue. Even if containers are a great idea, Docker is driven forward by a specific company which has specific problems. Again from The HFT Guy:
Docker has no business model and no way to monetize. It’s fair to say that they are releasing to all platforms (Mac/Windows) and integrating all kind of features (Swarm) as a desperate move to 1) not let any competitor have any distinctive feature 2) get everyone to use docker and docker tools 3) lock customers completely in their ecosystem 4) publish a ton of news, articles and releases in the process, increasing hype 5) justify their valuation.
It is extremely tough to execute an expansion both horizontally and vertically to multiple products and markets. (Ignoring whether that is an appropriate or sustainable business decision, which is a different aspect).
In the meantime, the competitors, namely Amazon, Microsoft, Google, Pivotal and RedHat all compete in various ways and make more money on containers than Docker does, while CoreOS is working an OS (CoreOS) and competing containerization technology (Rocket).
So even if you believe containers are a fantastic idea because they make everyone’s setup consistent, Docker itself remains a dangerous gamble.
But okay, let’s treat Docker and containers as somewhat the same thing for now. Of the criticisms that were thrown at my earlier essay, which criticisms were valid?
One mistake I made in that earlier essay was using the phrase “fat binary.” That led to a lot of confusion. After a few hours I added this disclaimer:
In this essay, I use the phrase “fat binary” to refer to a binary that has included all of its dependencies. I am not using it to refer to the whole 32 bit versus 64 bit transition. If I was only writing about the world of Java and the JVM, I would have used the word “uberjar” but I avoided that word because I also wanted to praise the Go language and its eco-system.
I wish I’d used the phrase “uber binary” which might be a little bit better, though it is a biased phrase, as it shows how much I’ve worked in the JVM world. But it’s the best I can think of, so I’ll use “uber binary” in this essay.
I wish developers were more willing to consider the possibility that their favorite programming language may not be ideal for a world of distributed computing in the cloud. Apparently I’m shouting into the wind on this issue. Developers feel strongly that the world needs to adapt to their PHP code, or their Ruby code, or their Python code. It’s never their code that needs to adapt to a changing world.
If you are starting a new project today, and you expect it to grow large enough that you will have to worry about scale, or you simply want it to be highly available, you have the option to use a modern language that has been designed for cloud computing. There are many new languages that have some wonderful features. The only two that I have experience with are Go and Clojure. I don’t have much experience with Rust or Scala or Kotlin, so I cannot say much about them. Maybe they are wonderful (in my previous essay, many readers seemed to think I was insulting the languages that I did not mention. I don’t mean to insult these languages, but I can only praise the languages that I’ve had some exposure to). Everything I’ve seen and read about Scala’s Akka framework makes me think there are a lot of good ideas there. I have not used it, but it seems smart and modern.
Responding to my earlier essay, btown wrote:
Anyone who thinks that all modern web applications are made in Golang or on the JVM is in a pretty weird echo chamber.
Again, it’s great that there are so many languages out there, but I don’t know all of them, and I can only meaningfully praise the ones I’ve had some experience with. But I can also meaningfully criticize the older languages that I’ve worked with, and that includes many years working with PHP, then Ruby, and more recently Python. They arose from an old paradigm, which is not suited to a world of microservices and distributed computing in the cloud. I wrote about this in “Docker protects a programming paradigm that we should get rid of”. Nobody seems to be listening to this point right now. I’m reminded of the mania for Object Oriented Programming which peaked around the year 2000. At that time, it was almost impossible to speak out against that paradigm. The tech industry considers itself open minded, but in fact it is full of movements which gather momentum, then shut down all competing conversations, for a few years, then recede, and then it becomes acceptable for all of us to poke fun at how excessive some of the arguments were. In 2000 the excesses were XML and Object Oriented Programming. Nowadays it is Docker.
I have used Clojure a lot. To write an app and create an uberjar which binds up the dependencies seems like a very wise step. And I know how to set up a system such as Jenkins, so my Clojure builds are automated. And I know that some companies have gone to incredible extremes, in terms of building apps that have no outside dependencies, including the astonishing step of bundling the database inside of the uberjar. Consider “60,000% growth in 7 months using Clojure and AWS“:
This led to Decision Two. Because the data set is small, we can “bake in” the entire content database into a version of our software. Yep, you read that right. We build our software with an embedded instance of Solr and we take the normalized, cleansed, non-relational database of hotel inventory, and jam that in as well, when we package up the application for deployment.
We earn several benefits from this unorthodox choice. First, we eliminate a significant point of failure – a mismatch between code and data. Any version of software is absolutely, positively known to work, even fetched off of disk years later, regardless of what godawful changes have been made to our content database in the meantime. Deployment and configuration management for differing environments becomes trivial.
Second, we achieve horizontal shared-nothing scalabilty in our user-facing layer. That’s kinda huge. Really huge.
If you can get this kind of massive scaling without having to introduce new technologies (such as Docker) then you should do so. Solve your problems in the simplest way you can. If the switch away from Ruby/Python/Perl to a newer language and eco-system allows you to achieve massive scale with fewer technologies and fewer moving parts, then you absolutely have a professional obligation to do so. Again, the ideal is to achieve your goals in the simplest way possible, and most times this means using the smallest number of technologies. Inspired by Rich Hickey, I would contrast “simple” versus “easy”. Using an old language that you already know is easy, whereas learning a new language that allows you to reduce the total number of technologies in your system is hard but simple. “Simple” here means that your system ends up being simpler than it would be otherwise; that is, it has less code or less configuration or a smaller number of technologies in use.
I know, from previous essays, that as soon as I mention Jenkins, some people will suggest that my mindset is out of date, but Sometimes Boring Is Better:
The nice thing about boringness (so constrained) is that the capabilities of these things are well understood. But more importantly, their failure modes are well understood. […] But for shiny new technology the magnitude of unknown unknowns is significantly larger, and this is important.
In other words, software that has been around for a decade is well understood and has fewer unknowns. Fewer unknowns mean less operational overhead, which is a good thing.
I’ve learned that many developers have strong biases, and when they read essays like this they tend to be looking for an excuse to dismiss the whole essay. So if I mention Jenkins or Ansible or Go or Clojure or Kotlin or Akka or any other tech, and they know of a flaw with any of those technologies, they go “This guy is stupid, so I can ignore this essay.” I don’t know of any way to reach those people, other than putting in these disclaimers, and even these disclaimers probably won’t convince those who really don’t want to be convinced.
Regarding my earlier essay, tytso wrote:
And the statement that the Go language is the pioneer for fat binaries is, well, just wrong. People were using static binaries (with, in some cases, built-in data resources) to solve that problem for literally decades. MacOS, for one.
I apologize for any confusion, but the idea I am trying to communicate is a binary file that contains all dependencies, plus all necessary configuration, plus any resource that you can possibly put in it, if putting that resource in simplifies the overall system. The concept is somewhat broader than simply linking static libraries. Modern continuous integration systems can be configured to be sure that each binary is given specific configuration information, which might be unique to that particular instance of the binary, so even if you need to have a thousand instances of the same app, you can build a thousand variants with slight variations of the configuration. You can do this using slightly older build systems, such as Jenkins, which are well understood and which are boring in all of the good ways. You should be careful about jumping up to the level of complexity of Docker. (And please don’t obsess over the fact that I used Jenkins in this example. If you prefer to use Travis, TeamCity, or Bamboo then use those. Use any of these. I am often stunned at developers’ willingness to dismiss a whole essay because they didn’t like one technology whose name is used merely as an example.)
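To make the idea of per-instance configuration baked into the binary concrete, here is a minimal sketch in Go. It relies on the Go linker's `-X` flag, which sets the value of a string variable at build time. The variable names (`instanceName`, `listenAddr`) and the example values are hypothetical, chosen only for illustration; this is one possible way to do it, not a description of any particular company's build.

```go
package main

import "fmt"

// Defaults used for a plain local build. A CI job (Jenkins or any other)
// could override them per instance at build time, for example:
//
//	go build -ldflags "-X main.instanceName=worker-17 -X main.listenAddr=:9017" -o app-worker-17
//
// The names and values here are hypothetical.
var (
	instanceName = "dev"
	listenAddr   = ":8080"
)

// describe reports the configuration that was compiled into this binary.
func describe() string {
	return fmt.Sprintf("instance=%s listen=%s", instanceName, listenAddr)
}

func main() {
	fmt.Println(describe())
}
```

A build job could loop over a list of instance names and run the `go build` command once per name, producing one self-describing binary per instance. Whether a thousand variants are worth it, compared with reading configuration at startup, depends on the system.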
A few people said this:
Docker protects against the danger of vendor lock-in on the part of the cloud providers
This is bunk. Any devops tool that standardizes deployment protects you from vendor lock-in. And there is an abundance of such tools.
friend-monoid wrote:
If my colleagues don’t have to understand how do deploy applications properly, their work is simplified greatly. If I can just take their work, put it in a container, no matter the language or style, my work is greatly simplified. I don’t have to worry about how to handle crashes, infinite loops, or other bad code they write.
notyourday’s response sums up my own attitude:
Of course you do, you just moved the logic into the “orchestration” and “management” layer. You still need to write the code to correctly handle it. Throwing K8S at it is putting lipstick on a pig. It is still a pig.
More so, if you are at a small company, and there are only three developers, then you will have to deal with each other’s code, which might be code that crashes, but if you are at a large company, where the devops team is separate from the programming team, then this is an issue that devops has been dealing with for many years, typically using health checks of some kind. And the health checks still need to be written, and this is something that Docker has not standardized. You, the developer of the app, need to create some endpoint that can send a 200 response that the devops team can test regularly. It is bogus to mention Docker in this context, since it contributes nothing. Many devops teams have scripts that test if an app is alive, and if it seems to be non-responsive, then the app is killed and restarted.
Lazare wrote:
Let’s say I’m working on an existing code base that has been built in the old-style scripting paradigm using a scripting language like Ruby, PHP, or (god help us) node.js.
…I can just about see how we can package up all our existing code into docker containers, sprinkle some magic orchestration all over the top, and ship that.
I can also see, as per the article, an argument that we’d be much better off with [uber] binaries. But here’s the thing: You can dockerise a PHP app. How am I meant to make a [uber] binary out of one? And if your answer is “rewrite your entire codebase in golang”, then you clearly don’t understand the question; we don’t have the resources to do that, we wouldn’t want to spend them on a big bang rewrite even if we did, and in any case, we don’t really like golang.
In this example a company has had a PHP app for a long time, and now it needs to Dockerize that app. Why is this? What has changed such that the app needs to be Dockerized now? This feels like an artificially constrained example. How did the app work before you Dockerized it? What was the problem with the old approach, that you feel that Docker will solve?
Is the end goal orchestration of this app with other (perhaps newer) apps? Docker plus orchestration generally means Docker plus Kubernetes. (Or you could pursue a more unusual choice, using Mesos or Nomad or some other alternative, rather than Kubernetes.) This is a very complex setup, and a company should think long and hard before committing to this path. Read “Is K8s Too Complicated?” If your app is small, then a complete re-write might be a good option, and if your app is large, ask yourself if there is a simpler path forward for your company, such as writing some Chef or Ansible scripts.
I once worked with a massive monolithic app which had been written in PHP, using the Symfony framework. It suffered terrible performance issues. We began to very gradually pull it apart, keeping the Symfony monolith for the HTML template rendering, but pulling out the real performance blocks and re-writing them in more performant languages. I wrote about this in An architecture of small apps. At that time, the devops crew was using Puppet plus some custom scripts to handle deployments. And that was enough for us. And that had the wonderful benefit of using boring, stable technologies.
Remember, you only have a finite amount of time. Whatever time you spend Dockerizing your PHP code is time you are not modernizing your app in other ways. Be sure that investment is worth it.
A curious fact is that the apps I’ve seen where orchestration is needed are not the ones people bring up in examples when discussing Docker. I’ve seen long running data analysis scripts that are run on Spark, and then Nomad was used for orchestration. I’m aware of a massive system where data (a terabyte a day) is added to Kafka, then it goes to Apache Storm and then to ElasticSearch. That system has a complex set of health checks and monitoring but for the bulk of the work, Storm itself is the orchestration tool. Web apps need to be dealing with massive amounts of data before they need massive orchestration. Twitter deals with massive amounts of data, and uses Aurora for orchestration. Are you Twitter scale? If you are using Docker and Kubernetes for an ordinary website, then please stop. There are simpler ways to run a website. We’ve been building websites for 25 years now, and we didn’t need Docker.
mwcampbell wrote:
“It seems sad that so much effort should be made to keep old technologies going”
I strongly disagree with this part. To make progress as a technological civilization, without constantly wasting time reinventing things, we need to keep old technologies working. So, if Docker keeps that Rails app from 2007 running, that’s great. And maybe we should still develop new apps in Rails, Django, PHP, and the like. It’s good to use mature platforms and tools, even if they’re not fashionable.
That word “fashionable” brings me to something that really rubs me the wrong way about this piece, and our field in general. Can we stop being so fashion-driven? It’s tempting to conflate technology with pop culture, to assume that anything developed during the reign of grunge music, for example, must not be good now. But good technology isn’t like popular music; something that was a good idea and well executed in 1993 is probably still good today.
This is a wildly ironic comment. Apparently sober realists think we should Dockerize everything, whereas crazy people like me think we should use older devops tools, combined with newer languages. My choice is driven by “fashion” whereas their love of Docker is driven by a desire “to make progress as a technological civilization.”
We should use older, boring technologies as long as they can still do their job well without inflicting additional costs because of the paradigm they bind us to. However, when there are significant changes in the way technology works, we should ask ourselves whether there are some technologies that are no longer the correct choice for the new circumstances. In particular, the shift to cloud computing, and the rise of microservices that run in the cloud, should force us to rethink which technologies we choose to use. The guiding rule should be “What is the simplest way to do what we need to do?” If the older technology gets the job done, and is the simpler approach, then it should be preferred. But if there is a new technology that allows us to simplify our systems, then we should use the new technology.
chmike wrote:
Containers are not only a solution for dependencies. It’s also protection boundary.
neilwilson replied:
It’s just a process with a fancy chroot. Don’t believe all the docker hype. Sensible admins have been doing something similar for years. We just didn’t have a massive PR budget
I’ve nothing to add to that.
Above, I asked “Why not use an uber binary that has no outside dependencies?” You could respond, “That is exactly what Docker is! It is a big fat binary that contains everything the app needs! And it isn’t language dependent! You can use it with Ruby, PHP, Python, Java, Haskell, everything!”
All of which is true, though I would recommend that a company look for chances where it can consolidate the number of technologies that it uses, and perhaps use modern languages and eco-systems that can do this kind of uber binary natively.
A great many people argue for Docker with the assumption that they have no power to affect the technologies in use at their company. The assumption is that the company is automatically going to use a heterogeneous mix of technologies, including some old ones that are not well suited to distributed computing in the cloud. And so Docker is the bandaid that hides the penalty that the company pays for not using a language and eco-system that is suited for distributed computing in the cloud.
This kind of passiveness can destroy a company in the long-run. I don’t favor chasing the latest fashions in tech, but I do favor a constant reassessment of what is best for the company, with a view to how the overall landscape of computing is changing. Passive acceptance of legacy apps that become pain points for the company will slow the company over time, and when that day finally arrives when the legacy app can no longer be kept alive, the re-write will be more dangerous for the company, because it will have to be a complete re-write. It is better if a company looks for ways to pull apart pieces of legacy apps and modernize them. Indeed, one of the most important aspects of microservices is that it allows the piecemeal, incremental modernization of old apps. I wrote about this in “The Coral Reef Pattern of incremental improvement”.
Above, I mentioned an analytics firm, where data (a terabyte a day) is added to Kafka, then it goes to Apache Storm and then to ElasticSearch. This firm was strongly committed to Python for a long time. As they ran into performance issues, they looked to use Python concurrency systems, such as Tornado, to build massive concurrency into their system. They gave this project to a very intelligent engineer, previously from Intel, and they gave him 3 months to build the new system. Utter failure was the result. They could not get the performance they needed, and even Tornado failed to give them the level of fine-grained concurrency and parallelism they were looking for. Finally, they confronted the idea that they could not use Python for everything. They are now examining Go and Elixir as languages that might give them what they need. (I believe there is a bit of sadness at this company — they had been idealists regarding Python, true Pythonistas.)
I approve of this reappraisal, but I think it should happen constantly.
This is the strongest argument for Docker (written by pmoriarty):
You could make pretty much the exact same set of complaints against all those configuration management tools (ansible/chef/salt/cfengine/puppet). They’re all a huge mess of spaghetti and hackery that works when they work and can be a nightmare otherwise.
All these tools need at least a couple of decades more to mature.
That is true.
Do you need a recipe for running WordPress? Ansible has that, but so does Docker. Likewise, Docker has what you need for running Drupal, or MySQL, or hundreds of other tasks. Docker has somewhat caught up, in terms of offering default setups for common devops needs.
If Chef or Ansible were more mature then the argument for Docker would be much weaker. I knew of a startup, in 2013, that was focused on building a framework for Chef (their framework aimed to be sort of the Ruby On Rails of devops). They ran into some problems, and also they got so much lucrative devops work that they ended up getting distracted. But even if they fail, something similar to what they were working on might one day succeed. Such a framework would have the advantage of relying directly on OS features, without doing as much as Docker to mask the OS.
Both Chef and Ansible promised that there would soon be thousands of scripts for all of the common devops tasks, for every possible type of machine. They have failed to fully deliver on this promise. As late as 2005 it still seemed normal that each company would have a devops person who would write custom scripts for all of the devops tasks in the company. By 2010 it was clear that there should be a central store of recipes for common tasks, much more specific than what had been offered by the old Perl CPAN libraries, and focused especially on the issues of consistency (to help with portability) and security and resource management and orchestration. And Chef took off, and then a little later Ansible took off, and Docker was only a few years behind. And many developers felt that Docker finally delivered some of the things that Chef and Ansible had promised, though for a long time Docker could only deliver for local development. As late as 2015, trying to use Docker in production was suicide. And even the companies that tried to use Docker in production in 2016 ran into an inferno of pain. But clearly, over the last year, things have gotten a bit more stable.
Docker strikes me as a direction that one day will be seen as a mistake. The strongest arguments for it are that it might be a standard if it can mature, and it offers a bandaid for many of the other failures that the tech industry is currently suffering from. Those are bad reasons to love Docker.
I suspect that 5 years from now, looking back, it’ll be clear that there was a less complex way to standardize devops for distributed computing in the cloud. But for now, Docker is winning everywhere.
Source
March 27, 2018 8:49 am