CDN WAF, DDoS, and Bot Mitigation
Modern CDNs are not just caches — they are the security perimeter for the websites they front. Sitting between users and origin, every CDN POP can inspect, score, and filter each request before deciding whether to forward it. This guide explains the three security layers every major CDN provides: the WAF that blocks known attack patterns per request, the DDoS protection that absorbs floods of traffic, and the bot management that distinguishes real users from automated tools. Each has different mechanisms, different failure modes, and different costs.
The three security layers
| Layer | What it stops | Mechanism |
|---|---|---|
| WAF (Web Application Firewall) | Application-layer attacks: SQLi, XSS, command injection | Pattern matching + ML on request content |
| DDoS protection | Volumetric attacks and connection floods | Anycast distribution + scrubbing + rate limits |
| Bot management | Automated traffic: scrapers, credential stuffing, fake signups | Fingerprinting + behavioral analysis + challenges |
These layers overlap and interact. A credential-stuffing attack is both a bot problem and a DDoS-level traffic spike. A SQL injection from a scraper is both a WAF event and a bot event. CDNs typically expose them as separate products with separate dashboards, but the underlying decisions feed into each other.
WAF: blocking known attacks per request
A Web Application Firewall inspects HTTP requests for known attack patterns. The patterns come from:
- OWASP Core Rule Set (CRS). The open-source standard. Free, broadly available, well-understood.
- Provider-managed rules. Cloudflare Managed Rules, AWS Managed Rules, Akamai Kona Site Defender, Imperva Cloud WAF. Curated and maintained by the CDN's security team.
- Custom rules. Customer-defined rules for specific patterns the application cares about.
What WAFs typically catch:
- SQL injection. Patterns like `' OR 1=1--`, `UNION SELECT`, encoded SQL in query strings.
- Cross-site scripting (XSS). `<script>` tags, JavaScript event handlers, eval-style attacks in request data.
- Command injection. Shell metacharacters, pipe chains, attempts to break out of sanitized input.
- Path traversal. `../../` sequences, Unicode-encoded traversal, attempts to access files above the web root.
- Remote file inclusion. URL parameters with file:// or http:// schemes.
- Known vulnerability signatures. Specific patterns for Log4j, Spring4Shell, Apache Struts, recent CVEs.
WAF modes: monitor vs block
WAF deployment is usually staged:
- Monitor mode (log only). Rules fire and log but don't block. Used to find false positives before enabling enforcement.
- Challenge mode. Suspicious requests get a JavaScript challenge before being allowed.
- Block mode. Matching requests are rejected with a 403 or custom error page.
Going straight to block mode reliably breaks legitimate traffic — there are always false positives in any rule set. Monitor mode for 1-2 weeks reveals which rules need to be tuned or disabled for your application before enforcement.
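The three modes differ only in what happens after a rule matches. The sketch below shows that decision in Python; the `Mode` enum, the `decide` function, and the choice to log every match in all modes are assumptions made for illustration, not any provider's API.

```python
from enum import Enum


class Mode(Enum):
    MONITOR = "monitor"
    CHALLENGE = "challenge"
    BLOCK = "block"


def decide(mode: Mode, rule_matched: bool, log: list[str], rule_id: str) -> str:
    """Return the action for one request: 'allow', 'challenge', or 'block'."""
    if not rule_matched:
        return "allow"
    log.append(rule_id)  # assumption: matches are logged in every mode
    if mode is Mode.MONITOR:
        return "allow"  # log only, never interfere with traffic
    if mode is Mode.CHALLENGE:
        return "challenge"  # e.g. serve a JavaScript challenge
    return "block"  # e.g. respond with a 403
```

Running a rule in `Mode.MONITOR` first fills the log with the requests it would have blocked, which is the data needed to spot false positives before switching the same rule to `Mode.BLOCK`.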
False positives: the WAF's biggest operational problem
Common false positives:
- Programming forums hit XSS rules because users post code samples.
- Search-heavy applications hit SQL injection rules because users search for technical strings.
- Backend APIs hit "suspicious header" rules because internal tooling uses unusual user agents.
- Health-check tools hit rate-limit rules.
Resolution: review WAF event logs daily during rollout, add exceptions for the legitimate patterns, tune rule sensitivity. Production WAFs typically run with a mix of strict rules and exceptions tailored to the specific application.
DDoS protection: absorbing the flood
DDoS attacks split into two categories:
Volumetric attacks (Layer 3/4)
The attacker generates raw traffic — UDP floods, SYN floods, amplification attacks — designed to saturate the target's network capacity. Attack sizes in 2026 routinely hit 1-5 Tbps; record attacks have exceeded 10 Tbps.
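CDNs absorb volumetric attacks via anycast. The same IP is announced from every POP, and BGP routes each attack source to a nearby POP, which spreads the attack across the network. A 5 Tbps attack against an anycast network with 200+ POPs averages about 25 Gbps per POP if spread evenly, which is within the routine capacity of each POP. In practice the spread follows where the attack sources are, so some POPs carry more than the average. The CDN's scrubbing infrastructure drops attack packets and forwards only legitimate traffic to origin.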
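The per-POP figure is simple division, shown here as a small Python helper. The function name is illustrative, and the calculation assumes a perfectly even split across POPs, which real BGP routing does not guarantee.

```python
def per_pop_gbps(attack_tbps: float, pop_count: int) -> float:
    """Average attack load per POP in Gbps, assuming a perfectly even spread."""
    return attack_tbps * 1000 / pop_count


print(per_pop_gbps(5, 200))  # 25.0
```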
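This is the fundamental reason origins should never be exposed to the public internet when behind a CDN: an attacker who learns the origin IP can send traffic to it directly and bypass the CDN's scrubbing entirely. Origin IPs must be kept secret (DNS records that resolve directly to the origin IP are the biggest leak source), and origin firewalls should allow only the CDN's IP ranges.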
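An origin-side allowlist check can be sketched with Python's standard `ipaddress` module. The ranges below are placeholder documentation blocks, not any CDN's real ranges; a real deployment would load the ranges its CDN publishes and would usually enforce them in the firewall rather than in application code.

```python
import ipaddress

# Placeholder documentation ranges standing in for a CDN's published ranges.
CDN_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_from_cdn(source_ip: str) -> bool:
    """True if the connecting address falls inside an allowed CDN range."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in CDN_RANGES)
```

Connections for which `is_from_cdn` returns `False` would be dropped before they reach the application.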
Application-layer attacks (Layer 7)
Each request looks legitimate. The attack uses volume — millions of requests per minute — to overwhelm application servers or specific expensive endpoints. Examples:
- Login attempts at a sustained 10K/sec attempting credential stuffing.
- Cache-busting attacks: requests with unique query strings to force cache misses.
- Slowloris: many connections that send partial requests slowly to exhaust server connection pools.
- Expensive-endpoint floods: continuous requests to search, recommendation, or other CPU-heavy endpoints.
Layer 7 attacks cannot be filtered by traffic volume alone — every request looks like a real user request. Mitigations require deeper analysis:
- Rate limiting per IP or fingerprint. Allow 100 requests/minute per source; block above that.
- JavaScript challenges. Require the client to execute JavaScript proof-of-work before getting access. Bots that don't run JavaScript fail.
- Behavioral analysis. Compare request patterns against learned normal behavior; flag anomalies.
- Geographic blocking. If an attack comes overwhelmingly from a country your users don't come from, block by geo.
- Cookie-based session continuity. Real users persist cookies; bots usually don't. Sessions without cookies get scrutinized harder.
Bot mitigation: distinguishing real users from automated tools
Not all bots are bad — Googlebot, Bingbot, monitoring services, and legitimate API consumers all benefit from being identified rather than blocked. Bot management distinguishes:
- Good bots. Verified search engines, monitoring services, social media unfurlers. Allow through.
- Neutral bots. Aggregators, analytics, archivers. Allow with rate limits.
- Malicious bots. Scrapers, credential stuffers, account creation bots, click fraud. Block or challenge.
Bot detection techniques
Modern bot management combines many signals:
TLS fingerprinting (JA3, JA4)
The TLS handshake reveals the client library. Different browsers, tools, and library versions produce different cipher suite ordering, extension ordering, and elliptic curve preferences. A JA3 or JA4 hash of these values identifies the client type with reasonable confidence: curl looks different from Chrome, which looks different from Python requests. Bots using common libraries with default TLS settings are easily identified by their fingerprint even without HTTP-layer signals.
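As an illustration, the older JA3 format is simple enough to sketch: five ClientHello fields are written as decimal values, joined with dashes inside a field and commas between fields, and the resulting string is hashed with MD5. The Python below assumes the ClientHello has already been parsed into integer lists (parsing is out of scope), and the function names are illustrative. JA4 uses a different, more structured format that is not reproduced here.

```python
import hashlib


def is_grease(value: int) -> bool:
    """GREASE placeholders (0x0a0a, 0x1a1a, ... 0xfafa) are skipped by JA3."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3(version, ciphers, extensions, curves, point_formats) -> tuple[str, str]:
    """Return (ja3_string, md5_hex) for already-parsed ClientHello fields."""
    fields = [
        [version],
        [v for v in ciphers if not is_grease(v)],
        [v for v in extensions if not is_grease(v)],
        [v for v in curves if not is_grease(v)],
        list(point_formats),
    ]
    ja3_string = ",".join("-".join(str(v) for v in field) for field in fields)
    digest = hashlib.md5(ja3_string.encode(), usedforsecurity=False).hexdigest()
    return ja3_string, digest
```

Because the values are kept in the order the client sent them, two clients offering the same cipher suites in a different order produce different fingerprints, which is what separates one TLS library from another.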
HTTP/2 fingerprinting
HTTP/2 settings, stream priorities, and header compression details differ between browsers and libraries. Headless Chromium running in a bot setup may use slightly different defaults than real Chrome, producing detectable fingerprints.
JavaScript challenges
The server returns a page with JavaScript that must compute a small proof-of-work or pass a behavioral check. Real browsers complete it transparently in a few hundred ms. HTTP libraries (Python requests, curl, basic scrapers) don't run JS at all and fail. Headless browsers run JS but can be detected via subtle properties.
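The proof-of-work variant can be sketched as a hash puzzle: the server issues a random challenge, the client searches for a nonce whose hash has enough leading zero bits, and the server verifies the answer with a single hash. In a real deployment the solving step runs as JavaScript in the browser; the Python below only illustrates the logic, and the function names and the SHA-256 construction are assumptions, not any vendor's scheme.

```python
import hashlib
import secrets


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def issue_challenge() -> str:
    """Server side: a fresh random challenge for each visitor."""
    return secrets.token_hex(16)


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force search, about 2**difficulty hashes on average."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce
```

The asymmetry is the point: verification costs one hash, while solving costs thousands, so the work is negligible for one real visitor but adds up for a client making millions of requests.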
Behavioral analysis
Real users move mice, scroll, hover, focus inputs. Bots don't unless they go to deliberate effort to simulate. Analytics that compare on-page behavior against a learned distribution of real users identify automated visitors even when they pass other checks.
Reputation databases
IPs known to host bots, residential proxies, datacenter ranges, and known credential-stuffing infrastructure are tagged in CDN reputation databases. Requests from these IPs get extra scrutiny.
Bot product names
- Cloudflare Bot Management + Turnstile (CAPTCHA alternative).
- Akamai Bot Manager.
- AWS WAF Bot Control.
- DataDome.
- Imperva Advanced Bot Protection.
- HUMAN (formerly White Ops).
Pricing: usually a multiplier on base CDN cost; full bot management can double or triple a CDN bill at scale.
Rate limiting at the edge
Rate limiting is the simplest and often most effective bot/DDoS mitigation. Configure rules like:
- Max 100 requests/minute per IP to /login.
- Max 10 password resets/hour per email.
- Max 5 account signups/hour per IP.
- Max 1000 API requests/minute per API key.
Implementation patterns:
- Per-IP rate limiting. Simple, effective for casual attackers. Bypassed by attackers using residential proxies.
- Per-fingerprint rate limiting. Better — proxies don't change the fingerprint. But fingerprints can be spoofed.
- Per-user rate limiting. Once authenticated, by user ID. Effective for account abuse.
- Adaptive rate limiting. Lower the threshold when attack signals are detected; raise when conditions return to normal.
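All four patterns can share one mechanism and differ only in the key they count against. The following is a minimal in-memory sliding-window sketch in Python; the class name is illustrative, and a real edge deployment would need shared state across POPs rather than a local dictionary.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` events per `window_seconds` for each key."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self._events = defaultdict(deque)  # key -> timestamps of accepted events

    def allow(self, key: str, now: float) -> bool:
        events = self._events[key]
        # Drop timestamps that have aged out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False
        events.append(now)
        return True
```

A login rule would be `SlidingWindowLimiter(100, 60)` called with the client IP as the key; passing a TLS fingerprint or user ID instead gives per-fingerprint or per-user limiting. Lowering `limit` on an existing limiter while attack signals are present is one simple way to approximate the adaptive pattern.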
Combining the layers
A typical CDN security stack processes each request in order:
1. IP reputation check. Known-bad IPs are dropped immediately.
2. DDoS profile. Is overall traffic in an attack pattern? Apply more aggressive controls if yes.
3. Bot fingerprint scoring. Calculate a "bot score" 0-100 for the request.
4. WAF rules. Pattern match for attack signatures.
5. Rate limit check. Has this client exceeded its allowance?
6. Geographic / IP allowlist check. If configured, restrict to allowed regions.
7. If all checks pass: forward to origin (or serve from cache).
Each layer can block, challenge, or score the request. The cumulative score determines the action.
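The ordered checks can be sketched as a single decision function. Everything specific in the Python below is an assumption made for illustration: the field names, the thresholds, the weight given to a WAF match, the sample IP and country lists, and the convention that a higher bot score means more bot-like (some vendors use the opposite direction).

```python
from dataclasses import dataclass, field

BAD_IPS = {"203.0.113.7"}         # hypothetical reputation list
ALLOWED_COUNTRIES = {"US", "DE"}  # hypothetical geo allowlist


@dataclass
class Request:
    ip: str
    country: str
    bot_score: int  # 0-100; assumed here: higher means more bot-like
    waf_matches: list[str] = field(default_factory=list)
    over_rate_limit: bool = False


def evaluate(req: Request, under_attack: bool = False) -> str:
    """Return 'block', 'challenge', or 'forward' for one request."""
    # 1. IP reputation: known-bad sources are dropped immediately.
    if req.ip in BAD_IPS:
        return "block"
    # 2. DDoS profile: challenge more aggressively while under attack.
    challenge_threshold = 30 if under_attack else 60
    # 3 and 4. Bot score and WAF matches feed one cumulative risk score.
    risk = req.bot_score + 50 * len(req.waf_matches)
    # 5. Rate limit.
    if req.over_rate_limit:
        return "block"
    # 6. Geographic allowlist.
    if req.country not in ALLOWED_COUNTRIES:
        return "block"
    # 7. The cumulative score picks the final action.
    if risk >= 100:
        return "block"
    if risk >= challenge_threshold:
        return "challenge"
    return "forward"
```

With these assumed thresholds, a request with a bot score of 40 is forwarded under normal conditions but challenged when `under_attack` is set, which shows how the DDoS profile changes the outcome of the later checks.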
Cost vs benefit
Edge security is generally a high-ROI feature. The alternative is each application origin running its own protection — possible but operationally expensive and less effective (no anycast absorption, less aggregated threat intelligence).
Pricing landscape:
- Free/basic WAF + DDoS. Cloudflare's free tier includes both. AWS Shield Standard is automatic. Suitable for most low-risk applications.
- Pro / Business tiers. $20-200/month. Includes managed rule sets, bot scoring, rate limiting, custom rules.
- Enterprise. Custom pricing. Includes full bot management, dedicated DDoS scrubbing, 24/7 SOC support. Typically $50K+/year.
Frequently Asked Questions
What is the difference between a WAF and DDoS protection?
WAF (Web Application Firewall) inspects individual HTTP requests for known attack patterns — SQL injection, XSS, command injection, path traversal — and blocks them. It operates per-request, on application semantics. DDoS protection focuses on attack volume — millions of requests per second from many sources designed to overwhelm capacity. WAF stops malicious individual requests; DDoS protection absorbs floods that may consist of either malicious or merely overwhelming requests. CDNs typically offer both as separate but integrated features.
How does a CDN absorb a DDoS attack?
Through anycast and capacity. The CDN's anycast network distributes attack traffic across all POPs; each POP sees only the fraction of the attack that routes to it via BGP. With 200+ POPs each having multi-hundred-Gbps capacity, the aggregate absorbing capacity is in the tens of Tbps. Origin servers, by contrast, typically sit in one location with single-digit Gbps capacity. The CDN scrubs the attack at the edge and forwards only legitimate traffic to origin. This is the main security benefit anycast provides.
What is a Layer 7 DDoS attack?
Layer 7 (application-layer) DDoS attacks look like normal HTTP requests but at overwhelming rate. They might request expensive endpoints (search, login, password reset), bypass cache via unique query strings, or perform a "slowloris" attack with many slow-trickling requests. Unlike volumetric attacks, they cannot be filtered by raw traffic patterns — every request looks like a real user. Mitigation requires application-aware analysis: rate limiting per IP / fingerprint, JavaScript challenges, behavioral analysis.
How does a CDN detect bots vs real users?
By combining multiple signals: TLS fingerprinting (a JA3 or JA4 hash of the TLS handshake reveals automated tooling), HTTP/2 fingerprinting (browser request patterns differ from those of libraries), TCP fingerprinting (window size and options reveal the OS), behavioral analysis (mouse movement, scroll patterns, request timing), and machine learning models trained on known good and bad traffic. JavaScript challenges add a step that headless browsers can pass but raw HTTP libraries cannot. No single signal is decisive; combined they achieve 95%+ accuracy without affecting real users.
Do I need a WAF if my application is well-written?
Yes, for defense in depth. A WAF catches: (1) attacks against vulnerable dependencies you may not have patched yet (zero-days, Log4j-style), (2) misconfigurations that bypass application checks (proxy header bypass, parser confusion), (3) credential stuffing and brute-force attacks at scale, (4) malicious bots that probe for weaknesses. Well-written application code prevents the direct vulnerabilities; a WAF catches the long tail of issues you didn't anticipate. The combined cost is small compared to the cost of a single missed incident.