Cloud CDN

Google Cloud CDN (Content Delivery Network) uses Google's globally distributed edge caches to cache HTTP(S) Load Balanced content close to your users. Caching content at the edges of Google's network provides faster delivery of content to your users while reducing the load on your servers.

Before you begin

Cloud CDN works as part of an HTTP(S) load balancing configuration. You must use Compute Engine HTTP(S) load balancing on your instances to use Cloud CDN.

How Cloud CDN works

When a user requests content from your site, the request passes through a network location at the edge of Google's network, usually far closer to the user than your instances. The first time content is requested, the edge cache cannot fulfill the request and forwards it to your instances. Your instances respond to the edge cache, which immediately forwards the content to the user and stores it for future requests. Subsequent requests for the same content that pass through the same edge cache are answered directly by the cache, which shortens the round-trip time and spares your instances the overhead of processing the request.

Basic Edge Cache

Once you enable Cloud CDN, caching happens automatically for all cacheable content.

Cloud CDN is off by default. You activate it with a single command on a load balancing backend service. After that, you control which content is cached through normal web server configuration, that is, through the HTTP response headers your servers send.

Caching is reactive in that an object is stored in a particular cache if a request goes through that cache and if the object is cacheable. An object stored in one cache does not automatically replicate into other caches, and you cannot pre-load caches except by causing the individual caches to respond to requests.
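The reactive, per-cache behavior described above can be pictured with a toy model. The sketch below is illustrative only: the EdgeCache class, the origin function, and the cache names are hypothetical and say nothing about how Cloud CDN is actually implemented. It also assumes that every response is cacheable.

```python
class EdgeCache:
    """Toy model of a single edge cache (hypothetical, for illustration)."""

    def __init__(self, name, origin):
        self.name = name
        self.origin = origin  # callable: path -> response body
        self.store = {}       # path -> cached body

    def get(self, path):
        if path in self.store:
            return self.store[path], "HIT"
        body = self.origin(path)  # miss: forward the request to the instances
        self.store[path] = body   # keep a copy for later requests via this cache
        return body, "MISS"


origin_calls = []


def origin(path):
    """Stand-in for your backend instances; records every request it receives."""
    origin_calls.append(path)
    return f"content of {path}"


tokyo = EdgeCache("tokyo", origin)
paris = EdgeCache("paris", origin)

print(tokyo.get("/logo.png")[1])  # MISS: first request through this cache
print(tokyo.get("/logo.png")[1])  # HIT: same cache, same object
print(paris.get("/logo.png")[1])  # MISS: a different cache starts empty
print(len(origin_calls))          # 2: one origin request per cache
```

In this model, the only way to warm the second cache is to send a request through it, which mirrors the statement that objects do not replicate between caches.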

Which responses get cached

You don't have to specify which responses are cached in which locations. Any response that meets the following requirements will be cached when the response passes through a particular cache. Some of these requirements are specified by RFC 7234, and others are particular to Cloud CDN. A response can only be stored in the caches if all of the following are true:

  • It was served by a backend service with caching enabled.
  • It was a response to a GET request.
  • The status code was 200, 203, 300, 301, 302, 307, or 410.
  • It has a Cache-Control: public directive.
  • It has a Cache-Control: s-maxage, Cache-Control: max-age, or Expires header. (If more than one is present, Cache-Control: s-maxage takes precedence over Cache-Control: max-age, and Cache-Control: max-age takes precedence over Expires.)
  • It has a valid Date header that doesn’t specify a future time.
  • It has either a Content-Length header or a Transfer-Encoding header.

Additionally, there are checks that will block caching of responses. A response will not be cached if any of the following are true:

  • It has a Set-Cookie header.
  • Its body exceeds 4 MB.
  • It has a Vary header with a value other than Accept, Accept-Encoding, or Origin.
  • It has a Cache-Control: no-store, no-cache, or private directive.
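The two lists above can be read as a single predicate. The following Python sketch is one possible encoding of those rules for reasoning about your own responses. The function names and the header-dictionary input are assumptions made for illustration, and the real caches may apply checks that are not listed here.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

CACHEABLE_STATUSES = {200, 203, 300, 301, 302, 307, 410}
ALLOWED_VARY = {"accept", "accept-encoding", "origin"}
MAX_BODY_BYTES = 4 * 1024 * 1024  # 4 MB (4,194,304 bytes)


def parse_cache_control(value):
    """Map lower-cased Cache-Control directive names to their values (or None)."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def is_cacheable(method, status, headers, body_len, cdn_enabled=True, now=None):
    """Return True if the response satisfies every rule in the two lists."""
    h = {k.lower(): v for k, v in headers.items()}
    cc = parse_cache_control(h.get("cache-control", ""))
    now = now or datetime.now(timezone.utc)

    # The Date header must be present, must parse, and must not be in the future.
    try:
        date = parsedate_to_datetime(h["date"])
    except (KeyError, TypeError, ValueError):
        return False
    if date.tzinfo is None:
        date = date.replace(tzinfo=timezone.utc)

    vary = {v.strip().lower() for v in h.get("vary", "").split(",") if v.strip()}

    required = (
        cdn_enabled
        and method == "GET"
        and status in CACHEABLE_STATUSES
        and "public" in cc
        and ("s-maxage" in cc or "max-age" in cc or "expires" in h)
        and date <= now
        and ("content-length" in h or "transfer-encoding" in h)
    )
    blocked = (
        "set-cookie" in h
        or body_len > MAX_BODY_BYTES
        or not vary <= ALLOWED_VARY
        or any(d in cc for d in ("no-store", "no-cache", "private"))
    )
    return required and not blocked


response_headers = {
    "Cache-Control": "public, max-age=3600",
    "Date": "Mon, 01 Jan 2024 00:00:00 GMT",
    "Content-Length": "1024",
}
now = datetime(2024, 1, 1, 0, 5, tzinfo=timezone.utc)
print(is_cacheable("GET", 200, response_headers, 1024, now=now))  # True
print(is_cacheable("GET", 200, {**response_headers, "Set-Cookie": "id=1"},
                   1024, now=now))                                # False
```

Passing now explicitly keeps the Date check deterministic; in normal use it defaults to the current UTC time.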

Because of the nature of caches, it is impossible to predict whether a particular request will be served out of cache. You can, however, expect that popular requests for cacheable content will be served from cache a high percentage of the time, yielding significantly reduced latencies and reduced load on your instances.

Serving responses from caches

If a request URL matches the URL of a cached response, the cache serves that response only if it is still fresh, as determined by its HTTP headers. Any Cache-Control: s-maxage, Cache-Control: max-age, or Expires headers in the response are checked in accordance with RFC 7234 to determine whether the response is too old. If it is fresh enough, the response is served from cache; otherwise, the request is sent through your load balancer to one of your instances. A response that has none of these headers is never cached in the first place.
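The precedence among the three freshness headers can be sketched as follows. This is a simplified reading of the RFC 7234 freshness calculation, not the caches' actual code: the function names are hypothetical, the age of the stored response is assumed to be supplied by the caller, and a malformed value is treated as already stale.

```python
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Map lower-cased Cache-Control directive names to their values (or None)."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(headers):
    """Return the freshness lifetime in seconds, or None if no header sets one."""
    h = {k.lower(): v for k, v in headers.items()}
    cc = parse_cache_control(h.get("cache-control", ""))

    # s-maxage takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        if name in cc:
            arg = cc[name] or ""
            # Assumption: a malformed value means "already stale".
            return int(arg) if arg.isascii() and arg.isdigit() else 0

    # Expires is consulted only when neither directive is present.
    if "expires" in h and "date" in h:
        try:
            expires = parsedate_to_datetime(h["expires"])
            date = parsedate_to_datetime(h["date"])
            return max(0, int((expires - date).total_seconds()))
        except (TypeError, ValueError):
            return 0
    return None


def is_fresh(headers, age_seconds):
    """A stored response is fresh while its age is below its freshness lifetime."""
    lifetime = freshness_lifetime(headers)
    return lifetime is not None and age_seconds < lifetime


both = {"Cache-Control": "public, max-age=3600, s-maxage=60"}
print(freshness_lifetime(both))  # 60: s-maxage wins over max-age
print(is_fresh(both, 30))        # True
print(is_fresh(both, 120))       # False

dated = {
    "Date": "Mon, 01 Jan 2024 00:00:00 GMT",
    "Expires": "Mon, 01 Jan 2024 00:10:00 GMT",
}
print(freshness_lifetime(dated))  # 600: Expires minus Date
```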

Enabling Cloud CDN for a backend service

Enabling Cloud CDN while creating a backend service

Cloud CDN is disabled by default, so to create a backend service without caching, use the normal command without the --enable-cdn flag.

If you do not yet have your HTTP(S) load balancer configured, enable Cloud CDN when creating your backend service with the following command.

gcloud alpha compute backend-services create NAME --enable-cdn \
    --http-health-checks example-health-check

Enable Cloud CDN for a pre-existing backend service

If you already have your HTTP(S) load balancer configured, you can enable caching with the following command.

gcloud alpha compute backend-services update NAME --enable-cdn

Disabling Cloud CDN for a backend service

The following command disables Cloud CDN for a backend service. Disabling Cloud CDN caching does not invalidate or purge the cache. If you turn Cloud CDN off and back on again, most or all of your cached content might still be cached. To forcibly prevent content from being used by the caches, you must invalidate that content.

gcloud alpha compute backend-services update NAME --no-enable-cdn

Invalidating cached content for a URL map

Cached objects are invalidated using the load balancer's URL map. If you have more than one URL map pointing to the same backend service, you will need to invalidate the objects for each URL map.

If an object is not accessed for some period, it might be removed to make room for more recent content. If an object is not accessed for 30 days, it will be deleted regardless of space. You can force an object, or set of objects, to be ignored by the cache by submitting an invalidation request. (The cache may still use an invalidated object, but only if your backend service first validates the object by sending a 304 Not Modified response.)

Invalidation requests are rate limited. For Alpha, only one request per URL map per minute is allowed.

An invalidation request might specify a single file, or it might specify an entire directory structure. Either one counts as only one request.
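If you script invalidations, you may want to space them out on the client side instead of running into the rate limit. The sketch below assumes the limit of one request per URL map per minute stated for Alpha. The InvalidationThrottle class is hypothetical and only tracks timing; it does not call gcloud or any API.

```python
import time


class InvalidationThrottle:
    """Tracks when the next invalidation may be sent for each URL map."""

    def __init__(self, min_interval=60.0, clock=time.monotonic):
        self.min_interval = min_interval  # seconds between requests (assumed: 60)
        self.clock = clock                # injectable so it can be tested
        self.last_sent = {}               # URL map name -> time of last request

    def seconds_until_allowed(self, url_map):
        """Return how long to wait before sending another request for url_map."""
        last = self.last_sent.get(url_map)
        if last is None:
            return 0.0
        return max(0.0, self.min_interval - (self.clock() - last))

    def record(self, url_map):
        """Call this right after an invalidation request has been submitted."""
        self.last_sent[url_map] = self.clock()


# Demonstration with a fake clock so the example runs instantly.
fake_now = [0.0]
throttle = InvalidationThrottle(clock=lambda: fake_now[0])

throttle.record("myUrlMap1")
fake_now[0] = 45.0
print(throttle.seconds_until_allowed("myUrlMap1"))  # 15.0
print(throttle.seconds_until_allowed("myUrlMap2"))  # 0.0
```

Because a wildcard path counts as a single request, batching related objects under one prefix is usually a better use of the quota than invalidating files one by one.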

PATH specifies the set of paths to invalidate. PATH must start with / and cannot include ? or #. PATH must not include a * except as the final character, immediately following a /. If PATH ends with /*, the preceding string is treated as a prefix, and all objects whose paths begin with it are invalidated. Otherwise, only URLs whose path matches PATH exactly are invalidated.

PATH (either a path to a specific object or a wildcard path ending in /*) is matched against the URL path, that is, everything after the hostname and before any ? or # that might be present.

gcloud alpha compute url-maps invalidate-cache URL_MAP --path PATH
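To check a PATH value before submitting it, the format rules can be expressed as a small validator. The sketch below is a reading of the rules on this page, including the 1024-character limit listed under Error codes. The function names are hypothetical, and the prefix for a wildcard is assumed to include the trailing /, so that /images/* covers /images/foo.jpg but not /images.php.

```python
def validate_path(path):
    """Raise ValueError if path breaks the documented PATH format rules."""
    if not path.startswith("/"):
        raise ValueError("PATH must start with /")
    if "?" in path or "#" in path:
        raise ValueError("PATH must not contain ? or #")
    if len(path) > 1024:
        raise ValueError("PATH must not be longer than 1024 characters")
    if "*" in path and not (path.count("*") == 1 and path.endswith("/*")):
        raise ValueError("* is allowed only once, as the final character after a /")
    return path


def path_matches(path, url_path):
    """Return True if url_path would fall under the invalidation PATH."""
    validate_path(path)
    if path.endswith("/*"):
        # Assumption: the prefix keeps its trailing slash.
        return url_path.startswith(path[:-1])
    return url_path == path


print(path_matches("/images/foo.jpg", "/images/foo.jpg"))  # True
print(path_matches("/images/*", "/images/foo.jpg"))        # True
print(path_matches("/images/*", "/images.php"))            # False
print(path_matches("/*", "/anything/at/all"))              # True
```

A path that passes validation is not guaranteed to match any cached object; the check covers format only.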

Example

If a file located at https://example.com/images/foo.jpg, served by a URL map named myUrlMap1, has been cached and needs to be invalidated, you can use one of several commands, depending on whether you want to affect just that file or a wider scope.

To invalidate just one file:

gcloud alpha compute url-maps invalidate-cache myUrlMap1 --path "/images/foo.jpg"

To invalidate the whole directory:

gcloud alpha compute url-maps invalidate-cache myUrlMap1 --path "/images/*"

To invalidate everything:

gcloud alpha compute url-maps invalidate-cache myUrlMap1 --path "/*"

To invalidate objects that have ? in the URL:

If you have URLs that contain ?, you cannot selectively invalidate objects based on what follows the ?. For example, if you have two images, example.com/images.php?image=fred.png and example.com/images.php?image=barney.png, you cannot invalidate only fred.png. To invalidate all images served by images.php, run the following:

gcloud alpha compute url-maps invalidate-cache myUrlMap1 --path "/images.php"
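When you start from full URLs, the PATH to pass is the part between the hostname and any ? or #. One way to derive it is with Python's standard urllib.parse module, as in the sketch below. The helper name invalidation_path is hypothetical, and the URL must include its scheme (https://) for the hostname to be recognized.

```python
from urllib.parse import urlsplit


def invalidation_path(url):
    """Return the URL path component, which excludes the query and fragment."""
    return urlsplit(url).path or "/"


print(invalidation_path("https://example.com/images.php?image=fred.png"))
# /images.php
print(invalidation_path("https://example.com/images.php?image=barney.png"))
# /images.php
print(invalidation_path("https://example.com/images/foo.jpg"))
# /images/foo.jpg
```

Both query-string URLs reduce to the same path, which is why invalidating /images.php affects every image served by that script.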

Invalidate only what you must, because invalidating too much might cause a large spike of requests that were being served by the caches to suddenly hit your instances.

Because Cloud CDN is a distributed system, it might report that an invalidation has completed even though some small number of caches have not yet processed the invalidation request. This situation is extremely rare and will correct itself automatically.

Pricing

For Alpha, existing HTTP(S) Load Balancing pricing applies to Cloud CDN. Pricing might change for GA.

Known issues

The following known issues and limitations affect the Alpha release of Cloud CDN:

  • Content is cached at only a small subset of Google Points of Presence. Full edge caching will be available for GA.
  • Responses with bodies larger than 4 MB (4,194,304 bytes) are not cached.
  • Cache invalidations are rate limited to one invalidation per URL map per minute.

Error codes

Cache Invalidation Errors

  • Error code: Invalid value for field 'resource.path'
    Notes: The path value had an invalid format. Paths must begin with a /, must not contain a ? or #, and must have only a single *, which must be a final character after a /. Paths must not be longer than 1024 characters. This error only addresses the format of the path. A path that is of valid format, but which doesn't exist, is still treated as valid.
  • Error code: Rate Limit Exceeded
    Notes: The caching system restricts the number of requests it will process at a time. For Alpha, only one invalidation request per minute is allowed. However, each request can be a path that contains any number of objects.

Data location settings of other Cloud Platform Services

Please note that using Cloud CDN means data may be stored at serving locations outside of the region or zone of your origin server. This is normal and how HTTP caching works on the Internet. Under the Service Specific Terms of the Google Cloud Platform Terms of Service, the Data Location Setting that is available for certain Cloud Platform Services will not apply to Core Customer Data for the respective Cloud Platform Service when used with other Google products and services (in this case the Cloud CDN service). If you do not want this outcome, please do not use the Cloud CDN service.