Cache Hit Ratio Explained
Cache hit ratio is the single most important operational metric for any CDN deployment. It captures, in one number, how often the edge served a user without contacting the origin — and therefore how much latency was saved, how much origin bandwidth was avoided, and how much CPU and database load the origin escaped. Most teams quote a single number, but there are at least two distinct things that get called "hit ratio" and they often differ by 10 or 20 percentage points. Understanding which one matters for what decision is the first step toward actually moving the number.
Request hit rate vs byte hit rate
The two canonical definitions:
| Metric | Numerator | Denominator | What it measures |
|---|---|---|---|
| Request hit ratio | Requests served from cache | Total requests | User-perceived latency improvement |
| Byte hit ratio | Bytes served from cache | Total bytes served | Origin bandwidth savings |
They diverge when hits and misses differ in average response size, most often because the misses are the larger objects. A site with a 95% request hit rate where the 5% of requests that miss are 100 MB video files might have a 60% byte hit rate. The same site looks great by one metric and mediocre by the other.
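As a rough illustration of the two definitions in the table, the sketch below computes both ratios from a list of log records. The `LogRecord` type, its field names, and the traffic mix (8 MB cached responses, 100 MB uncached videos) are assumptions made for the example, not a real CDN log format.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    cache_status: str  # "HIT" or "MISS" (hypothetical field name)
    bytes_sent: int    # response size in bytes


def request_hit_ratio(records):
    """Fraction of requests served from cache."""
    if not records:
        return 0.0
    hits = sum(1 for r in records if r.cache_status == "HIT")
    return hits / len(records)


def byte_hit_ratio(records):
    """Fraction of delivered bytes served from cache."""
    total = sum(r.bytes_sent for r in records)
    if total == 0:
        return 0.0
    hit_bytes = sum(r.bytes_sent for r in records if r.cache_status == "HIT")
    return hit_bytes / total


MB = 1_000_000
# Illustrative traffic: 95 cached responses of 8 MB, 5 uncached 100 MB videos.
records = [LogRecord("HIT", 8 * MB)] * 95 + [LogRecord("MISS", 100 * MB)] * 5

print(f"request hit ratio: {request_hit_ratio(records):.1%}")  # 95.0%
print(f"byte hit ratio:    {byte_hit_ratio(records):.1%}")     # 60.3%
```

With this made-up mix, the five large misses account for roughly 40% of all bytes even though they are only 5% of requests.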
Pick the right one for the decision:
- For latency / TTFB: request hit rate.
- For egress costs from origin: byte hit rate.
- For overall efficiency: report both.
What the denominator excludes
Hit ratio is normally computed only over requests the CDN considered cacheable, and CDNs differ in how they treat two categories:
- Requests with no Cache-Control or with no-store: counted as PASS, not as MISS. PASS is excluded from the denominator on some CDNs, included on others.
- Non-GET methods: POST, PUT, DELETE are not cacheable by default and are excluded entirely.
If your CDN reports an 80% hit rate but you suspect heavy uncacheable traffic, ask whether PASS traffic is in the denominator. The same workload can produce 80% or 95% hit rate depending on the convention.
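The effect of the denominator convention can be seen with a small calculation. This is a sketch with invented counts; the status labels `HIT`, `MISS`, and `PASS` follow the text above, and the `hit_ratio` helper is hypothetical.

```python
from collections import Counter


def hit_ratio(status_counts, include_pass):
    """Hit ratio with PASS traffic either inside or outside the denominator."""
    hits = status_counts.get("HIT", 0)
    denominator = hits + status_counts.get("MISS", 0)
    if include_pass:
        denominator += status_counts.get("PASS", 0)
    return hits / denominator if denominator else 0.0


# Invented workload: 950 requests, 150 of them uncacheable pass-through.
counts = Counter({"HIT": 760, "MISS": 40, "PASS": 150})

print(f"PASS in denominator:  {hit_ratio(counts, include_pass=True):.0%}")   # 80%
print(f"PASS excluded:        {hit_ratio(counts, include_pass=False):.0%}")  # 95%
```

Nothing about the workload changes between the two lines; only the reporting convention does.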
Why hit ratios are bimodal
Real-world CDN traffic almost never sits in the middle of the hit-rate distribution. A deployment tends to land either at 90% or above, or somewhere around 30-50% and below. The bimodality comes from a few thresholds:
- Are responses cacheable at all? If most responses lack Cache-Control, the CDN passes through everything and hit ratio is near 0.
- Is the cache key fragmented? If tracking parameters or cookies are part of the key, hit ratio per URL is near 0 even when responses are cacheable.
- Is the working set small relative to cache size? If the popular content fits in cache, hit ratio is high. If you have millions of long-tail URLs and a small cache, evictions cap hit ratio.
Each of these is a binary failure mode. Once all three are resolved, hit ratio jumps to the high range almost immediately.
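The working-set threshold in particular can be demonstrated with a toy simulation. The sketch below replays a request sequence against an LRU cache; LRU is an assumption here (real CDNs use a variety of eviction policies), and the round-robin access pattern is a deliberately worst-case choice that makes the cliff sharper than skewed real traffic would.

```python
from collections import OrderedDict


def simulate_lru_hit_ratio(urls, capacity):
    """Replay a request sequence against an LRU cache and return the hit ratio."""
    cache = OrderedDict()
    hits = 0
    for url in urls:
        if url in cache:
            hits += 1
            cache.move_to_end(url)  # mark as most recently used
        else:
            cache[url] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(urls) if urls else 0.0


# 100 distinct URLs requested round-robin for 50 passes.
working_set = [f"/asset/{i}" for i in range(100)]
requests = working_set * 50

fits = simulate_lru_hit_ratio(requests, capacity=100)
too_small = simulate_lru_hit_ratio(requests, capacity=90)

print(f"cache holds working set: {fits:.0%}")       # 98%
print(f"cache 10% too small:     {too_small:.0%}")  # 0%
```

When the cache holds the whole working set, only the first pass misses. When it is slightly too small, each URL is evicted just before it is requested again, so the hit ratio collapses instead of degrading gradually.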
The cache hierarchy and where hits happen
On a tiered CDN, "hit" can mean different things depending on which layer served the response. The hierarchy from a user's perspective:
- Browser cache hit — no network at all. Not counted in CDN metrics.
- Edge POP cache hit — served by the POP nearest the user. Lowest possible network latency. This is what most CDNs report as a hit.
- Tier-1 / shield cache hit — the edge POP missed, asked the regional shield, the shield had it. Slightly higher latency than edge hit; still much faster than origin. Often reported separately or rolled into "hit."
- Origin fetch (miss) — the request reached origin. The slowest path.
For accurate analysis, distinguish edge hits from shield hits. A 90% overall hit rate that's 70% edge + 20% shield is fundamentally different from 90% edge. The edge-only number drives latency; the combined number drives origin offload.
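A minimal sketch of that breakdown, assuming you can obtain per-layer request counts from your CDN's logs or analytics (the function and key names below are hypothetical):

```python
def tier_breakdown(edge_hits, shield_hits, origin_fetches):
    """Split a combined hit rate into edge, shield, and origin-offload ratios."""
    total = edge_hits + shield_hits + origin_fetches
    if total == 0:
        return {"edge_hit_ratio": 0.0, "shield_hit_ratio": 0.0, "origin_offload": 0.0}
    return {
        "edge_hit_ratio": edge_hits / total,      # drives user latency
        "shield_hit_ratio": shield_hits / total,
        "origin_offload": (edge_hits + shield_hits) / total,  # drives origin load
    }


tiered = tier_breakdown(edge_hits=700, shield_hits=200, origin_fetches=100)
edge_only = tier_breakdown(edge_hits=900, shield_hits=0, origin_fetches=100)

print(tiered)
print(edge_only)
```

Both deployments offload 90% of requests from origin, but in the first one only 70% of users get the edge-latency path.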
Levers that move hit ratio
The top tactical changes, in approximate order of impact:
| Change | Typical hit-rate gain | Risk |
|---|---|---|
| Strip UTM/fbclid/gclid from cache key | +10-30 percentage points | Very low (these params normally don't affect output) |
| Add Cache-Control to cacheable responses | +10-40 pp | Low (verify content is truly cacheable) |
| Configure cookies to be ignored on static paths | +5-20 pp | Low for static assets |
| Enable tiered caching / origin shielding | +3-10 pp (overall, counting shield hits) | None |
| Lengthen max-age for static assets | +2-8 pp | Low if assets are versioned |
| Enable stale-while-revalidate | +1-5 pp (depends on counting) | Low for non-critical data |
| Normalize Accept-Encoding / Vary | +1-5 pp | Verify content negotiation still works |
Most deployments under 70% hit rate have the first two issues. Fix those before pursuing optimizations that recover single-digit percentages.
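The first lever, stripping tracking parameters from the cache key, might look like the following sketch. It uses only the Python standard library; the parameter list and the `normalize_cache_key` helper are illustrative assumptions, and in practice this logic usually lives in the CDN's cache-key configuration or an edge function rather than in application code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters; extend it to match your own traffic.
TRACKING_PARAMS = {"fbclid", "gclid"}
TRACKING_PREFIXES = ("utm_",)


def normalize_cache_key(url):
    """Return the URL with tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in TRACKING_PARAMS and not name.startswith(TRACKING_PREFIXES)
    ]
    # The fragment is dropped; browsers do not send it to the server anyway.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(normalize_cache_key("https://example.com/p?id=7&utm_source=news&fbclid=abc"))
# https://example.com/p?id=7
print(normalize_cache_key("https://example.com/p?utm_campaign=x&gclid=1"))
# https://example.com/p
```

Every variant of a shared link then maps to the same key, so one cached object serves all of them.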
The diminishing-returns curve
Moving from 50% to 70% hit rate is straightforward and high-impact. Moving from 90% to 95% requires precision tuning. Moving from 95% to 98% might not be possible without architectural changes (more cache memory at the edge, prefetching, longer TTLs).
The right target depends on traffic composition. For a static-asset-heavy site, 95%+ is the bar. For a SaaS application with many per-user views, 60% might be the realistic ceiling because most requests are inherently uncacheable. Don't chase a number that the workload cannot support.
Measuring hit ratio correctly
Common measurement mistakes:
- Looking at a global average that hides per-URL behavior. One catastrophically uncacheable URL can drag down the average. Group by path prefix or content type.
- Measuring at the wrong time scale. Cache fills take time; a deploy that invalidates cache may show 5 minutes of low hit rate as the cache rewarms. Smooth over windows large enough to be meaningful (1+ hours).
- Treating cold-POP traffic as steady-state. A POP that just opened has a cold cache and a low hit rate until warm. CDN-wide averages mask this.
- Comparing across content types as if they should hit equally. APIs, HTML, images, video, and JS bundles have different inherent ceilings. Compare like to like.
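To avoid the global-average trap from the first item, hit ratio can be grouped by path prefix. The sketch below assumes log records reduced to `(path, cache_status)` tuples; that shape and the sample traffic are invented for illustration.

```python
from collections import defaultdict


def hit_ratio_by_prefix(records, depth=1):
    """Group (path, cache_status) records by leading path segments."""
    totals = defaultdict(lambda: [0, 0])  # prefix -> [hits, requests]
    for path, status in records:
        segments = [s for s in path.split("/") if s]
        prefix = "/" + "/".join(segments[:depth])
        totals[prefix][1] += 1
        if status == "HIT":
            totals[prefix][0] += 1
    return {prefix: hits / count for prefix, (hits, count) in totals.items()}


records = (
    [("/static/app.js", "HIT")] * 98
    + [("/static/app.js", "MISS")] * 2
    + [("/api/search", "MISS")] * 100
)

overall = sum(1 for _, status in records if status == "HIT") / len(records)
per_prefix = hit_ratio_by_prefix(records)

print(f"overall: {overall:.0%}")  # 49%
for prefix, ratio in per_prefix.items():
    print(f"{prefix}: {ratio:.0%}")  # /static: 98%, /api: 0%
```

The 49% overall figure suggests a broken cache, while the per-prefix view shows static assets performing well and one uncacheable API path accounting for all of the shortfall.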
Hit ratio and cost
The financial impact of moving hit rate, holding traffic constant:
- Origin egress cost scales with miss bytes. Doubling miss bytes from 5% to 10% of traffic doubles origin egress charges.
- CDN egress cost is the same regardless of hit ratio — the CDN charges you for bytes delivered to users, not bytes fetched from origin. So hit ratio improvements don't reduce CDN bills directly; they reduce origin bills.
- Origin compute cost falls with hit ratio if your origin spins up workers per request. Cache hits never reach origin compute.
The right comparison: cost of one origin fetch (egress + compute + DB) vs cost of one CDN edge hit. The ratio is usually 50–500×. Even modest hit-rate improvements pay for the engineering effort to get them.
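The egress relationships above can be written as a small cost model. The per-GB prices and traffic volume below are placeholders chosen for the example, not real provider rates, and the model ignores compute and database cost.

```python
def origin_egress_cost(total_gb, byte_hit_ratio, origin_price_per_gb):
    """Origin egress is paid only on bytes the CDN had to fetch (misses)."""
    miss_gb = total_gb * (1.0 - byte_hit_ratio)
    return miss_gb * origin_price_per_gb


def cdn_egress_cost(total_gb, cdn_price_per_gb):
    """CDN egress is paid on all delivered bytes, independent of hit ratio."""
    return total_gb * cdn_price_per_gb


TOTAL_GB = 100_000    # hypothetical monthly delivery volume
ORIGIN_PRICE = 0.09   # hypothetical $/GB
CDN_PRICE = 0.05      # hypothetical $/GB

for ratio in (0.95, 0.90):
    origin = origin_egress_cost(TOTAL_GB, ratio, ORIGIN_PRICE)
    cdn = cdn_egress_cost(TOTAL_GB, CDN_PRICE)
    print(f"byte hit ratio {ratio:.0%}: origin ${origin:,.0f}, CDN ${cdn:,.0f}")
```

Under these assumed prices, dropping from 95% to 90% byte hit ratio doubles the origin line from about $450 to about $900, while the CDN line stays at $5,000.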
Diagnostic checklist when hit ratio is low
- What percentage of responses set Cache-Control with a positive max-age? If less than 80%, that is the bottleneck.
- Are query strings included in the cache key? Sample a high-traffic URL and check distinct keys per day.
- Does the response include a Set-Cookie header on what should be cacheable content? Some CDNs bypass cache automatically on Set-Cookie responses.
- Does Vary list more than Accept-Encoding? If so, justify each additional header.
- Is the CDN's tiered cache or origin shield enabled?
- What is the cache eviction rate vs the cache fill rate? High evictions mean the working set is larger than cache.
Each answered question either fixes the problem or rules out a cause. Within an afternoon of focused diagnosis, most teams can identify which of the three failure modes they are hitting and ship a targeted fix.
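Several of the header checks in this list can be automated against sampled responses. The sketch below takes a plain dict of response headers and returns a list of findings; the function name and message wording are hypothetical, and it deliberately covers only the Cache-Control, Set-Cookie, and Vary items.

```python
import re


def cacheability_findings(headers):
    """Return a list of warnings for one response's headers (name -> value)."""
    h = {name.lower(): value for name, value in headers.items()}
    findings = []

    cache_control = h.get("cache-control", "").lower()
    directives = [d.strip() for d in cache_control.split(",") if d.strip()]
    ages = [
        int(value)
        for value in re.findall(r"(?:s-maxage|max-age)\s*=\s*(\d+)", cache_control)
    ]
    if "no-store" in directives:
        findings.append("Cache-Control: no-store prevents caching")
    elif not any(age > 0 for age in ages):
        findings.append("no positive max-age or s-maxage in Cache-Control")

    if "set-cookie" in h:
        findings.append("Set-Cookie present; some CDNs bypass cache on it")

    vary = [v.strip().lower() for v in h.get("vary", "").split(",") if v.strip()]
    extra = [v for v in vary if v != "accept-encoding"]
    if extra:
        findings.append("Vary lists extra headers: " + ", ".join(extra))

    return findings


good = {"Cache-Control": "public, max-age=3600", "Vary": "Accept-Encoding"}
bad = {
    "Cache-Control": "max-age=0",
    "Set-Cookie": "sid=1",
    "Vary": "Accept-Encoding, User-Agent",
}

print(cacheability_findings(good))  # []
for finding in cacheability_findings(bad):
    print("-", finding)
```

Running this over a sample of high-traffic URLs gives a quick count of how many responses fail each check, which maps directly onto the first, third, and fourth questions above.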
Frequently Asked Questions
What is cache hit ratio?
Cache hit ratio is the percentage of requests (or bytes) that the CDN edge served from its cache without contacting the origin. A 90% request hit ratio means 9 out of 10 requests were satisfied at the edge; only 1 out of 10 required an origin fetch. Higher is better — it reduces user latency, origin load, and bandwidth costs.
What is the difference between request hit rate and byte hit rate?
Request hit rate counts each request once regardless of response size. Byte hit rate weighs each request by the size of the response. Byte hit rate is what determines origin bandwidth savings; request hit rate is what determines latency improvement. They can differ significantly: a site might have 95% request hit rate but only 70% byte hit rate if its rare cache misses are very large files.
What is a good cache hit ratio?
For purely static content (versioned assets, images), 95%+ is achievable and expected. For a mix of static and dynamic content, 80-90% is realistic. For workloads that are mostly cacheable, a ratio below 60% almost always indicates a configuration problem: usually cache-key fragmentation from query parameters, missing Cache-Control headers on cacheable responses, or unintended cookies bypassing the cache. Applications dominated by per-user responses can legitimately sit lower.
How can I improve cache hit ratio?
The biggest wins typically come from (1) stripping tracking query parameters from the cache key, (2) setting explicit Cache-Control headers on responses that lack them so the CDN caches them, (3) configuring cookies to be ignored unless they genuinely affect the response, (4) using stale-while-revalidate so background refreshes count as hits to users, and (5) enabling tiered caching so cold-cache POPs hit a regional shield instead of origin.
Does cache hit ratio account for revalidation 304s?
Different CDNs count this differently. Some count a 304 revalidation as a hit because the cached body was served to the user; others count it as a miss because origin was contacted. Check your CDN's documentation. Either way, a 304 is much cheaper than a full origin fetch, so the distinction matters less than overall traffic to origin.