The focus of our ZEIT Day Keynote this year was the new capabilities of the Now cloud platform. In particular, we emphasized Serverless Docker Deployments.
Today, we are announcing their availability as a public beta, which features:
A 10x-20x improvement in cold boot performance, based on data from 1.5M+ deployments
This translates into a sub-second cold boot (full round trip) for most workloads
A new slot configuration property that defines the resource allocation in terms of CPU and memory, defaulting to c.125-m512 (0.125 of a vCPU and 512MB of memory); a configuration sketch follows this list
This enables fitting your application into the most appropriate set of constraints, paving the way to special CPU features, GPU cores, etc.
Strictly specified tunable limits
Maximum execution time (defaulting to 5 minutes, with a maximum of 30 minutes)
A shutdown timeout after the last request (defaulting to 1 minute, max 5)
Maximum request concurrency before automatically scaling (defaulting to 10)
Support for HTTP/2 and WebSocket connections to the deployments
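To make these settings concrete, here is a minimal sketch of what the relevant portion of now.json could look like. The slot value mirrors the default described above, and the type key follows the existing now.json convention for Docker deployments; placing slot at the top level and the names of the limit keys are illustrative assumptions rather than the final documented schema:

{
  "type": "docker",
  "slot": "c.125-m512",
  "limits": {
    "maxConcurrentReqs": 10,
    "maxExecutionTime": "5m",
    "shutdownTimeout": "1m"
  }
}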
Let's go a bit deeper into the power of this technology. The next example takes an image of a program written in Go straight from the public Docker registry.
This demo highlights:
The usage of an unmodified Dockerfile from the public Docker registry (a minimal Dockerfile sketch follows this list)
A different programming language and runtime: Go
Transient statefulness, as evidenced by our ability to inspect the filesystem; after 5 minutes (the default duration), the state is recycled
A sub-500ms cold boot round trip. Go exhibits better startup performance, even though this is a larger application (usually 400ms-500ms for this example)
The service responds to HTTP requests to serve the initial HTML, and then upgrades to a WebSocket connection to exchange the PTY data
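For reference, a Go service like the one in this demo needs nothing beyond an ordinary Dockerfile. The following is a hypothetical, minimal multi-stage sketch, not the exact image used in the demo; the port and paths are placeholders:

# hypothetical sketch; not the demo's actual Dockerfile
FROM golang:alpine AS build
WORKDIR /app
COPY . .
RUN go build -o /server .

# keep the final image small by shipping only the compiled binary
FROM alpine
COPY --from=build /server /server
EXPOSE 3000
CMD ["/server"]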
This infrastructure works remarkably well in combination with Global Now. In other words, it takes one flag to deploy serverlessly to all our global locations.
This is equivalent to the rest of the examples, but we scaled to all regions right from the get-go by running now --regions all.
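Concretely, the deployment above differs from the earlier ones only in the extra flag passed to now; the single-region variant below is an assumption for illustration, reusing the sfo identifier from the scaling examples that follow:

# deploy from the project directory and serve from all regions right away
now --regions all

# assumption: a single region identifier (e.g. sfo) can be passed the same way
now --regions sfo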
It's also possible to scale after you have already deployed, by running:
# scale to sfo
now scale rust-http-microservice-v2.now.sh sfo

# scale to all regions
now scale rust-http-microservice-v2.now.sh all

# disable everywhere
now scale rust-http-microservice-v2.now.sh 0
To underline this system's ability to automatically scale with the parameters and within the boundaries you define, here is an example that stress tests a deployment with wrk, a load-testing tool.
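The essence of such a test is a single wrk invocation pointed at the deployment URL; the thread count, connection count, and duration below are arbitrary placeholders:

# 12 threads, 400 concurrent connections, sustained for 30 seconds
wrk -t12 -c400 -d30s https://rust-http-microservice-v2.now.sh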
This is, in our opinion, the most important defining characteristic of Serverless Deployments. However, it's not the only one, as we will see next.
We selected these demos in particular to underline a very important point: we think Serverless can be a very general computing model, one that does not require new protocols or new APIs, and that can support every programming language and framework without large rewrites.
Here are three of the underlying ideas behind this new architecture.
Serverless enables engineers to focus on code rather than managing servers, VMs, registries, clusters, load balancers, availability zones, and so on.
This, in turn, allows you to define your engineering workflow solely around source control and its associated tools (like pull requests). Our recent GitHub integration, therefore, makes it possible to deploy a Docker container in the cloud just by creating a Dockerfile.
It is not sufficient to simply ignore or forget that the infrastructure is there. The execution model must make manual intervention, inspection, replication, and monitoring- or alert-based uptime assurance completely unnecessary, which takes us to our next two points.
A very common category of software failure occurs when programs get into states that the developers didn't anticipate, usually after many cycles of operation.
In other words, programs can fail unexpectedly from accumulating state over a long lifespan of operation. Perhaps the most common example of this is a memory leak: the unanticipated growth of irreclaimable memory that ultimately results in a faulty application.
Serverless means never having to "try turning it off and back on again"
Serverless models completely remove this category of issues, ensuring that no request goes unserviced during the recycling, upgrading or scaling of an application, even when it encounters runtime errors.
Your deployment instances are constantly recycling and rotating. Because of the request-driven nature of scheduling execution, combined with limits such as maximum execution length, you avoid many common operational errors completely.
Perhaps the most important or appealing aspect of the serverless paradigm is the promise of automatic scalability.
In its most basic form, a function automatically scales with a 1:1 mapping of requests to resource allocations. A request comes in, and a new function is provisioned or an existing one is reused.
We have taken this a step further by allowing you to customize the concurrency your process can handle.
This new infrastructure is already available to Docker deployments made in the free tier, and to paying subscriptions that opt into the feature via now.json.
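As a minimal sketch, and assuming the opt-in is exposed under a features key, the relevant portion of now.json would look roughly like this (treat the exact key names as provisional):

{
  "features": {
    "cloud": "v2"
  }
}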
During the beta, a paid subscription is required to go over the maximum of 3 concurrent deployment instances. Current rates apply and are subject to change.
This beta incorporates the lessons and experiences of a massively distributed and diverse user base that has completed millions of deployments over the past two years.
To get started, we suggest you take a look at the comprehensive list of examples we put together for this release.
Over the coming weeks, we will share more in-depth articles and documentation about our new offering.
Your feedback is crucial during this period. Please let us know how well it works for you.