Thursday, May 1, 2014

KVM and Docker LXC Benchmarking with OpenStack

Foreword

Linux containers (LXCs) are rapidly becoming the new "unit of deployment", changing how we develop, package, deploy and manage applications at all scales (from test / dev through production-ready service environments). This application life cycle transformation also brings fluidity to use cases that are frictional in a traditional hypervisor Virtual Machine (VM) environment. For example, developing applications in virtual environments and then seamlessly "migrating" to bare metal for production. Not only do containers simplify the workflow and life cycle of application development / deployment, they also provide performance and density benefits which cannot be overlooked.

At the forefront of Linux Container tooling we have docker -- an LXC framework / runtime which abstracts out various aspects of the underlying realization by providing a pluggable architecture supporting various storage types, LXC engines / providers, etc. In addition to making LXC dead easy and fun, docker also brings a set of capabilities to the table which make containers more productive, including: automated builds (make files for LXC images), versioning support, a fully featured REST API + CLI, the notion of image repositories and more.
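
To make that concrete, here is a rough taste of the CLI workflow (image and container names are purely illustrative, and exact flags vary slightly across early docker releases):

    docker pull ubuntu                            # fetch a prebuilt image from the public index
    docker run -d --name demo ubuntu sleep 3600   # start a container in roughly a second
    docker ps                                     # list running containers
    docker stop demo                              # stop it again (sends SIGTERM -- see the parting thoughts below)
    docker commit demo myrepo/ubuntu-snapshot     # capture the container as a new versioned image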

Before going any further, let me reiterate some of the major benefits of Linux Containers from a docker perspective: 

  • Fast
    • Runtime performance at near bare metal speeds (typically 97+ percent of bare metal -- a few ticks shaved off for the bean counters).
    • Management operations (boot, stop, start, reboot, etc.) in seconds or milliseconds.
  • Agile
    • VM-like agility -- it's still "virtualization".
    • Seamlessly move between virtual and bare metal environments permitting new development workflows which reduce costs (e.g. develop on VMs and move to bare metal in the "click of a button" for production).
  • Flexible
    • Containerize a "system" (OS less the kernel).
    • Containerize "application(s)".
  • Lightweight
    • Just enough Operating System (JeOS); include only what you need, reducing image and container bloat.
    • Minimal per-container penalty, which equates to greater density and hence greater returns on existing assets -- imagine packing 100s or 1000s of containers on a single host node.
  • Inexpensive
    • Open source -- free -- lower TCO.
    • Supported with out-of-the-box modern Linux kernels.
  • Ecosystem
    • Growing in popularity -- just check out the Google Trends for docker or LXC.
    • Vibrant community and numerous 3rd party applications (1000s of prebuilt images on docker index and 100s of open source apps on github or other public sources).
  • Cloudy
    • Various Cloud management frameworks provide support for creating and managing Linux Containers -- including OpenStack, my personal favorite.

Using google-fu (searching), it's fairly easy to find existing information describing the docker LXC workflow, portability, etc. However, to date I haven't seen many data points illustrating the Cloudy operational benefits of LXC or the density potential gained by using an LXC technology vs. a traditional VM. As a result I decided to do some Cloudy benchmarking using OpenStack with docker LXC and KVM -- the topic of this post and a recent presentation I posted to slideshare.




OpenStack benchmarking with docker LXC

As luck would have it, my favorite Cloud framework, OpenStack, provides some level of integration with docker LXC. Specifically, there has been a nova virt driver for docker LXC (which includes a glance translator to support docker based images) since the Havana time-frame, and now in Icehouse we have heat integration via a plugin for docker. The diagrams below depict the high level architecture of these components in OpenStack (courtesy of docker presentations), followed by a rough sketch of the configuration involved.


docker nova virt driver and heat plugin for OpenStack
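
For context, wiring the docker driver into a compute node in that time-frame looked roughly like the following. The option values differ between the in-tree Havana driver and the later stackforge nova-docker driver, so treat the exact names as illustrative rather than authoritative:

    # nova.conf on the compute node: swap the libvirt/KVM driver for the docker driver.
    #   compute_driver = docker.DockerDriver                    (Havana, in-tree)
    #   compute_driver = novadocker.virt.docker.DockerDriver    (later nova-docker)
    #
    # glance-api.conf: let glance accept docker-format images.
    #   container_formats = ami,ari,aki,bare,ovf,docker

    # Push a docker image into glance, then boot it through nova as usual.
    docker pull ubuntu
    docker save ubuntu | glance image-create --name ubuntu \
        --container-format docker --disk-format raw --is-public True
    nova boot --image ubuntu --flavor m1.small lxc-demo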


Rather than taking a generic approach to the Cloudy benchmarks, I set out with a specific high level use case in mind -- "As an OpenStack Cloud user, I want an Ubuntu based VM with MySQL" -- and sought to answer the question "why would I choose docker LXC vs. KVM?". Honing the approach in this fashion allowed me to focus on a more constrained set of parameters and, in particular, to use a MySQL enabled VM / Container for the tests.

To drive the Cloudy tests from an OpenStack perspective, the OpenStack rally project fits the bill perfectly. Rally allowed me to drive the OpenStack operations in a consistent, automated fashion while collecting operational times from a user point of view. Thus, rally was used to drive 3 sets of tests (a sketch of the invocations follows the list):
  • Serial packing of 15 VMs on a single compute node (KVM vs. docker LXC).
  • Serial soft reboot of VMs on a single compute node (KVM vs docker LXC).
  • VM boot and snapshot the VM to image (KVM vs docker LXC).
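
For readers who haven't used rally: each run is described by a small task file naming a scenario (for example one of rally's NovaServers scenarios) plus the iteration count, and is kicked off from the CLI. The task file names below are hypothetical placeholders for the three tests above:

    # Kick off each benchmark; rally records per-operation timings for every iteration.
    rally task start boot-pack-15-servers.json
    rally task start soft-reboot-servers.json
    rally task start boot-and-snapshot.json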

While rally was used to drive the tests through OpenStack and collect Cloudy operational times, additional metrics were needed from a compute node perspective to gain insight into the resource usage during the Cloudy operations. I used dstat to collect various system resource metrics at 1 second intervals (in CSV format) while the Cloudy benchmarks were running. These metrics were then imported into a spreadsheet where they could be analyzed and graphed. The result was a baseline set of data reflecting the average operational times through OpenStack as well as compute node resource metrics collected during those tests.
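
The collection itself can be as simple as the following invocation (the exact column set used for the published numbers may have differed slightly):

    # Sample CPU, memory, load, disk and network once per second, writing CSV rows
    # to a file that can later be pulled into a spreadsheet.
    dstat --time --cpu --mem --load --disk --net --output compute-node-metrics.csv 1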

While I don't want to regurgitate all the results here (see the presentation for complete details), I do want to call out a few high level findings based on the captured data:
  • From an OpenStack Cloudy operational time perspective (boot, reboot, delete, snapshot, etc.), docker LXC outperformed KVM by factors ranging from 1.09x (delete) to 49x (reboot).
  • Based on the compute node resource usage metrics during the serial VM packing test:
    • Docker LXC CPU growth is approximately 26x lower than KVM. On the surface this indicates a 26x density potential increase from a CPU point of view using docker LXC vs a traditional hypervisor.
    • Docker LXC memory growth is approximately 3x lower than KVM. On the surface this indicates a 3x density potential increase from a memory point of view using docker LXC vs a traditional hypervisor.

The presentation in full can be found here:




I've also made the raw data results available below. Feel free to use them however you see fit. Note that I didn't have the bandwidth to format / label all the data clearly, so if you have questions let me know and I'll do my best to describe / update.

benchmark raw data

Note that although the performance and density factors look promising using the docker integration in OpenStack, these components are still under heavy development and thus lack full feature parity. I believe we will see these gaps closed moving forward, so I would encourage you to put on your python hat and help out if you can.



Finally, I will be discussing these results, as well as providing a quick overview on realizing Linux containers, at some upcoming conferences:


If you attend either of these conferences I would encourage you to stop by my sessions and chat containers. 



Parting thoughts for docker image creators

As noted in the benchmark results, docker uses a SIGTERM signal to stop your container's processes. By default bash ignores SIGTERM, so if your container's CMD is bootstrapped through a bash script (or invocation) you will experience longer than necessary stop times for your container unless you explicitly trap / handle SIGTERM. I would encourage docker image developers to be aware of this and to build their images with SIGTERM handling in place to ensure optimal container stop times.
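
A minimal sketch of what SIGTERM-aware bootstrapping can look like (illustrative only, not the exact image used for these benchmarks; mysqld_safe is simply a stand-in for whatever your CMD launches):

    #!/bin/bash
    # Start the workload in the background so bash sticks around to relay signals.
    mysqld_safe &
    child=$!

    # bash ignores SIGTERM by default; forward it (and SIGINT) to the child instead.
    trap 'kill -TERM "$child"' TERM INT

    # The first wait returns when the trap fires; the second reaps the child so the
    # container exits promptly when "docker stop" sends SIGTERM.
    wait "$child"
    trap - TERM INT
    wait "$child"

Alternatively, end your bootstrap script with exec so the workload replaces bash entirely and receives SIGTERM directly.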

4 comments:

  1. Told you so - https://openvz.org/Performance

  2. Your load avg graphs thrash any credibility you might have had.

    Reply: Unfortunately your comment does not provide enough details for me to act upon. Should you care to elaborate on your concern, I'd be happy to revisit the data, rerun the tests, etc. to double check any potential errors in the data.

      Moreover, you can find the raw data collected from the system using dstat linked in the blog post above.

  3. FYI -- comments about your posting @ https://news.ycombinator.com/item?id=7696011
