Cache Key and Vary Header
Every CDN cache lookup begins with a single question: does the edge already have a response for this request? The answer depends entirely on how the cache key is constructed. Two requests with the same key share a cached response; two requests with different keys live in separate entries, even when the underlying bytes would be identical. The Vary header is the HTTP-standard mechanism for adding header values to that key without modifying the URL. Together they determine how many variants of each piece of content the edge has to store — and therefore your cache hit rate.
What is in the default cache key
Out of the box, most CDNs build the cache key from three components of the incoming request:
- Scheme: http or https. Two requests for the same path on different schemes are stored separately.
- Host: the Host header value (or the SNI hostname on HTTPS). Useful when one CDN configuration serves multiple domains.
- Path + query string: the full URI after the host, including every query parameter in the order it appeared in the URL.
That means https://example.com/products?id=42 and https://example.com/products?id=42&src=email are two separate cache entries, even if your application ignores src. This is the most common reason new CDN deployments see low hit rates: tracking parameters fragment what should be a single cached resource into thousands of unique keys.
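As a rough illustration of the default behavior described above, the following sketch builds a key from scheme, host, and path plus query string. The function name `default_cache_key` is hypothetical, and real CDNs hash and store keys in their own vendor-specific formats; this only shows why the two URLs land in separate entries.

```python
from urllib.parse import urlsplit


def default_cache_key(url: str) -> str:
    """Join scheme, host, path, and the query string exactly as it was sent."""
    parts = urlsplit(url)
    key = f"{parts.scheme}://{parts.netloc}{parts.path}"
    if parts.query:
        # The query string is kept verbatim, so every extra parameter changes the key.
        key += f"?{parts.query}"
    return key


plain = default_cache_key("https://example.com/products?id=42")
tagged = default_cache_key("https://example.com/products?id=42&src=email")
print(plain == tagged)  # False: the src parameter creates a second cache entry
```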
Normalization: what the CDN does before hashing
Before the cache key is hashed into an entry, the edge typically applies normalization rules. The exact rules differ per CDN but most include:
| Normalization | Effect | Why |
|---|---|---|
| Lowercase host | Example.com = example.com | DNS is case-insensitive; the cache should be too |
| Path case sensitivity (preserved) | /Image.jpg ≠ /image.jpg | Most filesystems and many web servers are case-sensitive |
| Query parameter sort | ?a=1&b=2 = ?b=2&a=1 (optional) | Avoids fragmentation from reordered params (off by default on most CDNs) |
| Trailing slash | /path often ≠ /path/ | Trailing slash is semantically significant in HTTP; not normalized by default |
| Default port stripping | :443 stripped from HTTPS host | Default ports are not part of the canonical URL |
Knowing which normalizations are on by default, and which require explicit configuration, can be the difference between, say, a 70% hit rate and a 98% hit rate on the same workload.
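The normalization rules in the table can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: `normalize_url` and its `sort_query` flag are hypothetical names, and query sorting is off by default here to mirror the table.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str, sort_query: bool = False) -> str:
    """Apply host lowercasing, default-port stripping, and optional query sorting."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # urlsplit returns the hostname already lowercased
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(parts.scheme):
        host = f"{host}:{parts.port}"  # keep only non-default ports
    query = parts.query
    if sort_query and query:
        query = urlencode(sorted(parse_qsl(query, keep_blank_values=True)))
    # Path case and trailing slash are deliberately left untouched.
    key = f"{parts.scheme}://{host}{parts.path}"
    return f"{key}?{query}" if query else key


print(normalize_url("https://Example.com:443/Image.jpg"))  # https://example.com/Image.jpg
```

With `sort_query=True`, `?b=2&a=1` and `?a=1&b=2` collapse to the same key; with the default they stay separate.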
The Vary header: per-header variants
The Vary response header lists request headers that, if their value changes, should produce a different cached variant. The most common example is content encoding:
Cache-Control: public, max-age=86400
Vary: Accept-Encoding
With this response, the cache stores one variant per distinct Accept-Encoding value it sees from clients. A browser sending Accept-Encoding: gzip, br gets the brotli-compressed copy; a curl client sending no Accept-Encoding gets the uncompressed copy; both are stored as separate cache entries under the same URL.
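One way to picture the variant storage described above is a toy in-memory cache. The `VaryCache` class below is a hypothetical sketch: it assumes request headers arrive as a dict with lowercase names, and it compares header values as raw strings, which is the simplest possible behavior rather than what any particular CDN does.

```python
class VaryCache:
    """Toy cache that stores one variant per combination of Vary header values."""

    def __init__(self):
        self._vary = {}      # url -> tuple of lowercase header names listed in Vary
        self._variants = {}  # (url, tuple of header values) -> cached body

    def _variant_key(self, url, request_headers):
        names = self._vary.get(url, ())
        # A header that is absent from the request counts as the empty string.
        values = tuple(request_headers.get(name, "") for name in names)
        return (url, values)

    def store(self, url, request_headers, vary, body):
        names = tuple(sorted(h.strip().lower() for h in vary.split(",") if h.strip()))
        self._vary[url] = names
        self._variants[self._variant_key(url, request_headers)] = body

    def lookup(self, url, request_headers):
        """Return the cached body for this variant, or None on a miss."""
        return self._variants.get(self._variant_key(url, request_headers))


cache = VaryCache()
url = "https://example.com/app.js"
cache.store(url, {"accept-encoding": "gzip, br"}, "Accept-Encoding", b"brotli bytes")
cache.store(url, {}, "Accept-Encoding", b"uncompressed bytes")
```

After these two stores, a lookup with `{"accept-encoding": "gzip, br"}` returns the brotli copy, a lookup with no Accept-Encoding returns the uncompressed copy, and a lookup with `{"accept-encoding": "gzip"}` is a miss because that exact value has not been seen.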
Safe Vary values
A few headers are nearly always safe to vary on because they have a small, bounded set of possible values:
- Accept-Encoding — typically 2–4 variants (identity, gzip, br, zstd).
- Accept for content negotiation between, e.g., HTML and JSON — small set if you control the clients.
- Accept-Language — bounded if you serve a fixed list of locales; dangerous otherwise because browsers send long quality-weighted lists.
Dangerous Vary values
Some headers look reasonable to vary on but explode the variant count:
| Header | Distinct values seen in practice | Result |
|---|---|---|
| User-Agent | Tens of thousands per day on a busy site | Hit rate collapses to near zero |
| Cookie | One per logged-in user (or worse, per session) | Effectively un-cacheable |
| Referer | One per source URL | Same: cache fragments per inbound link |
| Accept-Language (raw) | Hundreds of quality-weighted strings | Severe fragmentation |
The rule of thumb: only vary on headers with a small, enumerable set of values. If the header is essentially unique per client, you have made the response uncacheable.
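To make the "small, enumerable set" rule concrete for Accept-Language, the sketch below collapses the raw header into one of a fixed list of locales. The locale list, the default, and the function name `locale_bucket` are assumptions for illustration; matching is done on the primary language subtag only, which is cruder than full language-range matching.

```python
SUPPORTED_LOCALES = ("en", "de", "fr")  # hypothetical list of locales the site serves
DEFAULT_LOCALE = "en"


def locale_bucket(accept_language: str) -> str:
    """Map a raw Accept-Language value to one supported locale."""
    candidates = []
    for item in accept_language.split(","):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat an unparsable weight as "not acceptable"
        candidates.append((quality, tag))
    # Highest quality first; sorted() is stable, so ties keep header order.
    for quality, tag in sorted(candidates, key=lambda c: -c[0]):
        primary = tag.split("-")[0]
        if quality > 0 and primary in SUPPORTED_LOCALES:
            return primary
    return DEFAULT_LOCALE


print(locale_bucket("de-CH,de;q=0.9,en;q=0.8"))  # de
```

Varying on the returned bucket instead of the raw header caps the variant count at the number of supported locales.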
Custom cache keys: beyond Vary
Vary is a portable HTTP mechanism, but it has limits. It can only key on whole request headers: it cannot single out one cookie from the Cookie header, one query parameter, or a derived value such as a device class. Most CDNs let you override the default cache key with a custom expression. Typical capabilities:
- Include specific cookies: vary on session_locale but not on the entire cookie jar.
- Include specific query parameters: key on id but ignore utm_source, utm_medium, fbclid, gclid.
- Include a normalized device type: map User-Agent into mobile/tablet/desktop and vary on that bucket.
- Include the client's country: derived from IP geolocation, useful for legal content variants.
- Strip the host: share a cache across multiple domains pointing at the same content.
Custom keys are CDN-specific syntax but the underlying concept is identical across vendors: you are computing a deterministic string from the request, and the edge stores one entry per unique string.
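The "deterministic string from the request" idea can be sketched as follows. Everything here is illustrative: `custom_cache_key`, `device_bucket`, and the `KEYED_PARAMS` allowlist are hypothetical names, the User-Agent heuristic is deliberately crude, and the country code is assumed to be supplied by the CDN's own geolocation rather than computed in this code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

KEYED_PARAMS = {"id"}  # hypothetical allowlist of parameters that change the response


def device_bucket(user_agent: str) -> str:
    """Crude substring heuristic mapping a User-Agent to one of three buckets."""
    ua = user_agent.lower()
    if "ipad" in ua or "tablet" in ua:
        return "tablet"
    if "mobi" in ua or "iphone" in ua or "android" in ua:
        return "mobile"
    return "desktop"


def custom_cache_key(url: str, user_agent: str, country: str) -> str:
    """Combine URL parts, allowlisted parameters, device bucket, and country."""
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in KEYED_PARAMS
    )
    fields = [
        parts.scheme,
        parts.hostname or "",
        parts.path,
        urlencode(kept),
        device_bucket(user_agent),
        country.upper(),
    ]
    return "|".join(fields)


key = custom_cache_key(
    "https://example.com/products?id=42&utm_source=mail",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) Mobile/15E148",
    "de",
)
print(key)  # https|example.com|/products|id=42|mobile|DE
```

Two phone visitors from the same country with different tracking parameters produce the same string and therefore share one entry.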
The Vary trap: cache poisoning
If the origin sets a Vary header that depends on a user-controlled value, an attacker may be able to influence what the cache stores under a particular key. Concrete example: a site sets Vary: X-Forwarded-Host and reflects that header into the response. An attacker sends a request with a malicious X-Forwarded-Host value, the response is cached under that variant, and a later user with the same header receives the poisoned content.
The defenses are straightforward but easy to forget: do not Vary on headers the origin reflects into responses, do not Vary on headers the CDN does not strip from client requests, and audit Vary values periodically to confirm they only list headers under your control.
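The periodic audit mentioned above can be as simple as comparing each response's Vary header against an allowlist. This is a minimal sketch: the `ALLOWED_VARY` set and the function name `audit_vary` are assumptions, and the allowlist should reflect whatever headers your own setup actually controls.

```python
ALLOWED_VARY = {"accept-encoding", "accept", "accept-language"}  # hypothetical allowlist


def audit_vary(vary_header: str) -> list[str]:
    """Return the header names in a Vary value that are not on the allowlist."""
    names = [h.strip().lower() for h in vary_header.split(",") if h.strip()]
    return [name for name in names if name not in ALLOWED_VARY]


print(audit_vary("Accept-Encoding, X-Forwarded-Host"))  # ['x-forwarded-host']
```

An empty list means every listed header is one you have already reviewed; anything else is a candidate for removal or for stripping at the edge.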
Cookies and the cache key
Cookies are the single biggest source of cache-key bugs on dynamic sites. Most CDNs default to one of two extremes (the first two rows below); the third row is behavior you usually have to configure yourself:
| Behavior | Effect | Used by |
|---|---|---|
| Any Cookie header bypasses cache entirely | Safe but no caching for any authenticated user | Some legacy CDN setups, Varnish defaults |
| Cookies stripped from cache key | Everyone gets the same cached response regardless of cookie | Modern CDN defaults for static content |
| Specific cookies included in cache key | Variant per session/locale/theme cookie value | What you actually want for personalized-but-cacheable content |
The third option is almost always the right one for any non-trivial site. Configure the CDN to include only the cookies that genuinely affect output and ignore the rest.
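A sketch of the third option: parse the Cookie header and keep only the cookies that affect output. The `KEYED_COOKIES` tuple and the function names are hypothetical, with session_locale borrowed from the earlier example; the parser is intentionally simple and does not handle quoted values.

```python
KEYED_COOKIES = ("session_locale",)  # hypothetical: the only cookies that change output


def parse_cookies(cookie_header: str) -> dict[str, str]:
    """Split a Cookie request header into a name -> value mapping."""
    cookies = {}
    for pair in cookie_header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


def cookie_key_part(cookie_header: str) -> str:
    """Return only the allowlisted cookies, in a fixed order, for the cache key."""
    cookies = parse_cookies(cookie_header)
    return ";".join(f"{name}={cookies[name]}" for name in KEYED_COOKIES if name in cookies)


print(cookie_key_part("sid=abc123; session_locale=de; _ga=GA1.2.3"))  # session_locale=de
```

Because the session identifier and analytics cookie are ignored, every visitor with the same locale cookie maps to the same variant.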
Query string handling patterns
Most production CDN deployments fall into one of three query-string strategies:
- Include all (default). Every query parameter is part of the cache key. Safest behavior but worst hit rate when tracking parameters are common.
- Ignore all. Strip the entire query string from the cache key. Maximum hit rate, but breaks any URL where the query parameter actually changes the response.
- Allowlist or blocklist. Include specific parameters in the key (allowlist) or exclude specific known-noise parameters (blocklist). Best balance for production.
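A reasonable default blocklist: utm_source, utm_medium, utm_campaign, utm_term, utm_content, fbclid, gclid, msclkid, mc_cid, mc_eid, _ga, ref, source. On most sites none of these affect the response and all of them fragment the cache, but confirm that your application does not read generic names such as ref or source before excluding them.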
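The blocklist strategy can be sketched with the standard library. The function name `strip_tracking_params` is hypothetical, and the set below simply copies the parameter names listed above; a real CDN would express the same rule in its own configuration syntax.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BLOCKLIST = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "fbclid", "gclid", "msclkid", "mc_cid", "mc_eid", "_ga", "ref", "source",
}


def strip_tracking_params(url: str) -> str:
    """Drop blocklisted query parameters and keep the rest in their original order."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in BLOCKLIST
    ]
    # The fragment is dropped; it is never sent to the server anyway.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(strip_tracking_params("https://example.com/products?id=42&gclid=abc"))
# https://example.com/products?id=42
```

An allowlist is the same loop with the condition inverted: keep a parameter only if its name is in the set.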
Debugging cache-key problems
When hit rate is lower than expected, cache-key fragmentation is one of the first causes to rule out. The diagnostic process:
- Make the same request twice for the same logical resource and compare the Cache-Status or CDN-specific response header. The second response should report a HIT.
- If the second request is a MISS, inspect what differs: query string, cookies, Accept-Encoding, Accept-Language, User-Agent.
- Pull a sample of recent edge logs and group by cache key. If your top URL has thousands of distinct keys per day, the key includes something it shouldn't.
- Check Vary on the response. If it includes anything beyond Accept-Encoding, justify each value.
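The log-grouping step can be sketched as below. The record shape is an assumption: it presumes you have already extracted (path, cache key) pairs from your edge logs, whose actual field names vary by CDN. The function name `distinct_keys_per_path` is hypothetical.

```python
from collections import defaultdict


def distinct_keys_per_path(records):
    """Count distinct cache keys per path, most fragmented path first.

    records: iterable of (path, cache_key) tuples taken from edge logs.
    """
    keys_by_path = defaultdict(set)
    for path, cache_key in records:
        keys_by_path[path].add(cache_key)
    counts = [(path, len(keys)) for path, keys in keys_by_path.items()]
    return sorted(counts, key=lambda item: -item[1])


sample = [
    ("/image.jpg", "/image.jpg?utm_source=newsletter"),
    ("/image.jpg", "/image.jpg?utm_source=twitter"),
    ("/image.jpg", "/image.jpg"),
    ("/app.js", "/app.js"),
    ("/app.js", "/app.js"),
]
print(distinct_keys_per_path(sample))  # [('/image.jpg', 3), ('/app.js', 1)]
```

A path that should be a single resource but shows many distinct keys points directly at the parameter or header that needs to leave the key.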
For deeper analysis, see cache hit ratio explained and CDN logs and observability.
Frequently Asked Questions
What is a cache key?
A cache key is the string a CDN uses to look up whether it already has a cached response for an incoming request. By default it is built from the HTTP scheme, host, and full request path (including query string). Two requests that produce the same cache key are treated as the same content; two requests with different cache keys are stored separately even if the underlying response would be byte-identical.
What does the Vary header do?
Vary tells caches that the response depends on the value of one or more request headers. A response with Vary: Accept-Encoding is cached separately for each distinct Accept-Encoding value the cache has seen. Caches store one variant per unique combination of the headers listed in Vary. Without Vary, a single cached response is returned regardless of which header values the next request carries.
Why do query parameters destroy cache hit rate?
Because by default the entire query string is part of the cache key. URLs like /image.jpg?utm_source=newsletter and /image.jpg?utm_source=twitter produce two different cache entries even though they return the same bytes. The fix is to configure the CDN to either ignore specific query parameters (UTM tags, fbclid, gclid) or to only include an allowlist of parameters that genuinely change the response.
Is Vary: User-Agent ever a good idea?
Almost never. User-Agent strings are essentially unique per device/browser version — varying on User-Agent fragments the cache into thousands of variants and drops hit rate to near zero. If you genuinely need to serve different content per device class (mobile vs desktop), use a CDN feature that normalizes User-Agent into a small set of bucket values (mobile, tablet, desktop) and varies on that derived value instead.
What is the difference between Vary and a custom cache key?
Vary is part of the HTTP standard — it lives in response headers and instructs every cache (browser, CDN, proxy) to store one variant per header combination. A custom cache key is a CDN-side configuration that changes how the edge derives its lookup string before checking storage. Vary is portable and universal; custom cache keys are CDN-specific but more flexible (you can include specific cookies, normalize headers, or strip parameters).
Related Guides
Cache-Control Headers
The HTTP headers that drive every CDN's caching decisions.
Cache Hit Ratio Explained
How hit rate is measured and what drives it up or down.
ETag and Conditional Requests
Validators that let caches revalidate without re-downloading.
CDN Logs and Observability
Reading edge logs to diagnose cache-key fragmentation.