Delivering compressed content through Fastly
Much of the data delivered by Fastly to end users is highly compressible, especially text-based formats like HTML, JavaScript, and CSS. Compressing this kind of data can yield significant performance improvements for end users and reduce delivery costs.
Fastly automatically optimizes requests to cache compressed responses more efficiently, and supports compressing data at the edge using both the GZip and Brotli algorithms.
Optimizing Accept-Encoding
When Fastly receives a request, the `Accept-Encoding` header sent by the client tells Fastly whether the client can accept compressed data. When a response is generated by an origin server or by Fastly, the `Vary` header tells Fastly whether to store separate variations of the response based on characteristics of the request. In practical terms, this means that if a request carries an `Accept-Encoding: gzip, br` header, and your origin server returns a response containing a `Vary: Accept-Encoding` header, Fastly will reuse that response only for future requests whose `Accept-Encoding` header has a value of `gzip, br`.
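For example, the exchange below (URL and header values are illustrative) produces a cached variant that Fastly will reuse only for requests whose `Accept-Encoding` header matches, because of the `Vary` header in the response:

```http
GET /styles.css HTTP/1.1
Host: www.example.com
Accept-Encoding: gzip, br

HTTP/1.1 200 OK
Content-Type: text/css
Content-Encoding: br
Vary: Accept-Encoding
```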
However, many values of `Accept-Encoding` are semantically equivalent. If a request carrying `Accept-Encoding: gzip, br` results in a Brotli-compressed response being saved in our edge cache, a subsequent request that carries `Accept-Encoding: gzip, br, deflate` should also be able to use that response. Fastly therefore automatically normalizes this header value to reduce the number of permutations, so that if the server delivers a compressed response that we can cache, we can reuse that response for as many users as possible.
Compression at origin
If your origin server is capable of delivering compressed responses, and performs content negotiation correctly (respecting the value of `Accept-Encoding` and correctly adding a `Vary` header to the response), Fastly will ensure that compressed responses are only delivered to clients that support them, and that we use cached responses as much as possible.
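For instance, a correctly negotiating origin would return response headers like these (illustrative) depending on whether the client advertised support for compression:

```
Client sends: Accept-Encoding: gzip

  HTTP/1.1 200 OK
  Content-Type: text/html; charset=utf-8
  Content-Encoding: gzip
  Vary: Accept-Encoding

Client sends no Accept-Encoding header

  HTTP/1.1 200 OK
  Content-Type: text/html; charset=utf-8
  Vary: Accept-Encoding
```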
Compression at the edge
Fastly can compress data for you on our edge servers. Static compression is done on responses when they are received from an origin server (before caching or post-processing), while dynamic compression is done just before responses are delivered to the client.
Type | Platforms | Where it happens | How you enable it | Options | Billing |
---|---|---|---|---|---|
Static | VCL only | Pre-cache, when responses are received from origin | API, UI or VCL code | File type, compression type | Compressed size |
Dynamic | VCL & Compute | Post-cache, when responses are leaving Fastly bound for the end user | HTTP Header | None | Uncompressed size |
Static compression
Fastly can take uncompressed response data from your server and compress it at the edge before inserting the compressed object into our cache. This is called static compression. The resulting cached object can then be used to serve future requests that have a compatible `Accept-Encoding` header, without having to perform the compression again.
Static compression is available only in VCL services, and can use the GZip or Brotli algorithms. It can be enabled via the web interface, API, or in VCL code using the `beresp.gzip` and `beresp.brotli` variables in the `vcl_fetch` subroutine.
If you want to perform static compression using your own VCL code, ensure you also add a suitable `Vary` header to the response, and only compress formats that are not already compressed (media formats like images, audio and video are typically already compressed and will not benefit from GZip or Brotli). The following sketch shows one possible implementation; the list of content types is illustrative and should be adjusted to match what your origin serves:
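```vcl
# Placed in vcl_fetch (for example as a VCL snippet). beresp.gzip and
# beresp.brotli trigger static compression before the object is cached.
if (beresp.http.Content-Type ~ "^(text/html|text/css|text/plain|text/javascript|application/javascript|application/json|application/xml|image/svg\+xml)\s*($|;)"
    && !beresp.http.Content-Encoding) {

  # Prefer Brotli where the client supports it, otherwise fall back to GZip.
  if (req.http.Accept-Encoding ~ "br") {
    set beresp.brotli = true;
  } elsif (req.http.Accept-Encoding ~ "gzip") {
    set beresp.gzip = true;
  }

  # Make sure compressed and uncompressed variants are cached separately.
  if (!beresp.http.Vary) {
    set beresp.http.Vary = "Accept-Encoding";
  } elsif (beresp.http.Vary !~ "Accept-Encoding") {
    set beresp.http.Vary = beresp.http.Vary ", Accept-Encoding";
  }
}
```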
Static compression is not supported on the Compute platform, and is not compatible with Edge Side Includes. If your Fastly service is subject to metered delivery charges, responses compressed statically are billed based on the size of the data delivered to the client, i.e. the compressed size.
Dynamic compression
Static compression is the most efficient way to compress data at the edge, especially for cacheable responses. However, if static compression isn't suitable for your use case, you can also compress responses individually as they are leaving the Fastly platform, which is called dynamic compression.
Native support for compression is available to both VCL and Compute services, is compatible with Edge Side Includes, and is enabled by adding the `X-Compress-Hint` header to the outgoing response:
```rust
// Rust shown here; the same header can also be set from Fastly VCL, JavaScript, or Go.
response.set_header("x-compress-hint", "on");
```
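In a VCL service, a minimal sketch would be to set the same header on the client response in `vcl_deliver`, for example as a VCL snippet:

```vcl
# vcl_deliver: ask Fastly to dynamically compress this response on its way to
# the client, subject to the client's Accept-Encoding header.
set resp.http.X-Compress-Hint = "on";
```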
The `X-Compress-Hint` header enables dynamic compression, but whether the response is actually compressed, and which algorithm is used, depends on the value of the `Accept-Encoding` header on the associated request and on whether the response is already compressed. With dynamic compression these decisions are automatic and not configurable.
Dynamic compression happens after the size of the response is measured for billing purposes, so if your Fastly service is subject to metered delivery charges, responses compressed dynamically are billed based on the size before the compression takes place.
Decompression at the edge
If your origin server is serving compressed responses, you may want to decompress these responses at the edge in order to parse, transform or otherwise act on their contents. This is available in Compute services, both as a platform-level primitive and in-process using features of many supported languages.
In VCL services, it is not possible to decompress compressed origin responses. If a client does not support receiving compressed responses, and therefore does not send an `Accept-Encoding` header with their request, your origin server must be able to serve an uncompressed response to Fastly.
Auto decompress
Setting the auto decompress flag on requests instructs Fastly to automatically decompress any compressed responses before presenting them to your Compute program. You should only do this if you are planning to operate on the response body, and you should consider combining it with dynamic compression so that the response is recompressed before delivery to the client.
```rust
// Rust shown here; an equivalent is available in Go.
// ContentEncodings is available via fastly_sys.
request.set_auto_decompress_response(ContentEncodings::default());
```
Time spent decompressing response bodies using this mechanism does not count toward the billable CPU time for your Compute program.
If you do not intend to parse the response body in your Compute program, it's generally better to leave automatic decompression disabled, so that compressed content from origin can remain compressed as it passes through Fastly.
Decompress in process
Some languages provide straightforward mechanisms for decompressing response streams inside your Compute program:
```javascript
// JavaScript shown here; an equivalent approach is available in Go.
async function app(event) {
  const backendResponse = await fetch(event.request, { backend: "origin_0" });
  let resp = new Response(
    backendResponse.body.pipeThrough(new DecompressionStream("gzip")),
    backendResponse
  );
  resp.headers.delete("content-encoding");
  return resp;
}

addEventListener("fetch", (event) => event.respondWith(app(event)));
```
Time spent decompressing response bodies using this mechanism contributes to the billable CPU time for your Compute program.