What Is a Load Balancer?

A load balancer distributes incoming traffic across multiple servers, which prevents bottlenecks, absorbs individual server failures, and keeps sites available as traffic grows.

The Problem Load Balancers Solve

A single server has finite capacity. Under low traffic it handles requests quickly. As traffic grows, the queue of pending requests grows, response times increase, and eventually the server stops responding. The naive solution, a bigger server, works up to a point, but vertical scaling has a hardware ceiling, and one machine remains a single point of failure however large it is.

A load balancer solves this by distributing incoming requests across a pool of servers. From the outside, clients see one IP address and one service. Inside, each request goes to whichever backend the balancing algorithm selects. Adding a server to the pool increases capacity; removing one (or having one fail) causes the load balancer to redistribute traffic across the remaining servers, usually with little or no disruption visible to clients.

Layer 4 vs Layer 7 Load Balancing

Load balancers operate at different levels of the network stack, and the level determines what they can see and how they route.

Feature	Layer 4 (Transport)	Layer 7 (Application)
Routes based on	IP address, TCP/UDP port	HTTP headers, URL, cookies, body
Protocol awareness	TCP/UDP only	HTTP, HTTPS, WebSocket, gRPC
TLS termination	Usually no (passes encrypted traffic through)	Yes
Content-based routing	No	Yes (URL path, host header)
Performance overhead	Very low	Higher (parses each request)
Typical use	High-throughput, non-HTTP services	Web applications, APIs

An L4 load balancer sees a TCP connection arriving on port 443 and forwards it to a backend — it cannot read the HTTP request inside. An L7 load balancer terminates the TLS connection, reads the HTTP request, and can route /api/* to one backend pool and /static/* to another, or route requests with a specific cookie to a specific server.
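
The path-based routing described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer implements it: the pool names, the `ROUTES` table, and the `choose_pool` function are all hypothetical, and the longest-prefix rule is an assumption chosen to make overlapping prefixes unambiguous.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str  # URL path prefix to match
    pool: str    # name of the backend pool that serves it


# Hypothetical routing table; the pool names are illustrative only.
ROUTES = [
    Route("/api/", "api-pool"),
    Route("/static/", "static-pool"),
]
DEFAULT_POOL = "web-pool"  # assumed fallback when no prefix matches


def choose_pool(path: str) -> str:
    """Return the pool whose prefix is the longest match for the path."""
    matches = [route for route in ROUTES if path.startswith(route.prefix)]
    if not matches:
        return DEFAULT_POOL
    return max(matches, key=lambda route: len(route.prefix)).pool
```

An L4 load balancer could not make this decision, because the path only becomes readable after the TLS connection is terminated and the HTTP request is parsed.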

Load Balancing Algorithms

Algorithm	How It Works	Best For
Round Robin	Requests cycle through backends in order	Uniform request cost, equal-capacity servers
Weighted Round Robin	Servers with higher weight receive proportionally more requests	Backends with different capacity
Least Connections	Each request goes to the backend with fewest active connections	Variable request duration
Least Response Time	Routes to the backend with lowest current latency	Latency-sensitive applications
IP Hash	Client IP is hashed to consistently select the same backend	Stateful apps requiring session affinity
Random	Backend chosen randomly per request	Simple even distribution
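
To make the first three rows of the table concrete, here is a minimal Python sketch of round robin, weighted round robin, and least connections. The function names and backend names are hypothetical. The weighted version is deliberately naive: it simply repeats each backend according to its weight, whereas production balancers often interleave the picks more smoothly.

```python
import itertools
from typing import Dict, Iterator, List


def round_robin(backends: List[str]) -> Iterator[str]:
    """Cycle through the backends in order, indefinitely."""
    return itertools.cycle(backends)


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Naive weighted round robin: each backend appears 'weight' times per cycle."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(active: Dict[str, int]) -> str:
    """Pick the backend with the fewest active connections.

    Ties go to whichever backend appears first in the mapping.
    """
    return min(active, key=lambda name: active[name])


# Usage with hypothetical backends
rr = round_robin(["a", "b", "c"])
first_four = [next(rr) for _ in range(4)]  # a, b, c, then back to a

wrr = weighted_round_robin({"big": 2, "small": 1})
first_six = [next(wrr) for _ in range(6)]  # "big" gets two of every three picks

least_busy = least_connections({"a": 5, "b": 2, "c": 7})  # "b"
```

Round robin needs no knowledge of backend state, which is why it suits uniform workloads. Least connections needs the balancer to track open connections per backend, which is what lets it cope with requests of very different durations.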

Health Checks: Removing Failed Servers

A load balancer continuously monitors the health of its backend pool. Health checks are periodic probes — typically an HTTP GET to a /health endpoint, a TCP connection attempt, or a ping — that verify each backend is alive and responding. If a backend fails a configurable number of consecutive checks, the load balancer marks it unhealthy and stops routing traffic to it. When the backend recovers and passes health checks again, it is returned to the pool.

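This automatic detection and exclusion of failed backends is what makes load-balanced architectures resilient. A single server failure need not cause downtime: once the failed backend misses enough checks, the load balancer takes it out of rotation and the remaining servers absorb its share of traffic. Requests sent to the backend before the failure is detected may still fail, so shorter check intervals narrow that window.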
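
The mark-unhealthy and return-to-pool behavior described above amounts to a small state machine per backend. The sketch below shows one way to track it in Python. The class name, field names, and default thresholds are assumptions for illustration, and requiring several consecutive successes before a backend returns is a design choice made here, not something every load balancer does. The probe itself (an HTTP GET, a TCP connect) is left out; only its pass/fail result is recorded.

```python
from dataclasses import dataclass


@dataclass
class BackendHealth:
    fail_threshold: int = 3  # assumed: consecutive failures before removal
    rise_threshold: int = 2  # assumed: consecutive passes before return
    healthy: bool = True
    _fails: int = 0          # current run of consecutive failed probes
    _passes: int = 0         # current run of consecutive passed probes

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the backend's current status."""
        if probe_ok:
            self._fails = 0
            self._passes += 1
            if not self.healthy and self._passes >= self.rise_threshold:
                self.healthy = True   # back in the pool
        else:
            self._passes = 0
            self._fails += 1
            if self.healthy and self._fails >= self.fail_threshold:
                self.healthy = False  # removed from rotation
        return self.healthy
```

Counting consecutive results rather than reacting to a single probe keeps one dropped packet from ejecting a healthy server, and keeps a flapping server from bouncing in and out of the pool.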

Sticky Sessions: When Distribution Conflicts with State

Round-robin and least-connections algorithms distribute requests across all backends, which means two requests from the same user may go to different servers. For applications that store session data locally — in memory on the server rather than in a shared database — this causes users to lose their session.

Sticky sessions (session persistence) solve this by tagging requests with a cookie or by hashing the client IP, then always routing tagged requests to the same backend. This restores session consistency at the cost of perfectly even distribution. The better long-term solution is to move session state into a shared store (Redis, Memcached, a database) so any backend can serve any request.
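
Both tagging approaches mentioned above can be combined in a short Python sketch: honor a cookie if the client already has one, otherwise fall back to hashing the client IP. The cookie name, backend names, and function names are hypothetical. Storing the backend name directly in the cookie is a simplification for readability; a real deployment would more likely use an opaque or signed value.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

BACKENDS = ["app-1", "app-2", "app-3"]  # hypothetical backend names
COOKIE_NAME = "lb_backend"              # hypothetical cookie name


def hash_ip(client_ip: str, backends: List[str]) -> str:
    """Map a client IP to a backend deterministically."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


def pick_backend(
    cookies: Dict[str, str], client_ip: str, backends: List[str]
) -> Tuple[str, Optional[str]]:
    """Return (backend, cookie value to set, or None if already pinned)."""
    pinned = cookies.get(COOKIE_NAME)
    if pinned in backends:
        return pinned, None  # existing, still-valid pin
    chosen = hash_ip(client_ip, backends)
    return chosen, chosen    # new or stale client: pin it with a cookie
```

Because `hash_ip` uses a simple modulo, changing the number of backends remaps most clients to a different server. Consistent hashing reduces that churn, which is one more reason the shared session store is the sturdier option.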

Load Balancer vs Reverse Proxy

In practice the two terms are often used interchangeably, but they have different emphases. A load balancer focuses on distributing traffic and ensuring availability across a pool of backends. A reverse proxy is a broader concept — it includes load balancing but also TLS termination, caching, request transformation, compression, and security filtering.

Software such as Nginx and HAProxy does both. A dedicated hardware load balancer appliance may do only traffic distribution. Cloud load balancers (AWS ALB/NLB, Google Cloud Load Balancing) typically combine both roles into one managed service.

Frequently Asked Questions

What is a load balancer?

A component that distributes incoming requests across a pool of backend servers to prevent overload, improve response times, and ensure availability when individual servers fail.

What is the difference between L4 and L7 load balancing?

L4 routes based on IP and port — it cannot read request content. L7 routes based on HTTP headers, URLs, and cookies — it can make content-aware decisions and handles TLS termination. L7 is more flexible; L4 is faster and works for any TCP/UDP traffic.

What is round-robin load balancing?

Each request goes to the next server in a circular list. Simple and even by request count, but not by server load — weighted round-robin assigns more requests to higher-capacity backends.

Does a load balancer hide the server's IP?

In the typical setup, yes. Clients connect to the load balancer's IP, and the backend servers sit on a private network where clients cannot reach them directly. This protects the backends and allows them to be changed without any client-side reconfiguration.

What are sticky sessions?

A feature that routes all requests from a given client to the same backend, using a cookie or IP hash. Needed for apps that store session state locally. The better solution is shared session storage so any backend can serve any client.

What is the difference between a load balancer and a reverse proxy?

A load balancer's core job is distributing traffic. A reverse proxy is broader — it adds TLS termination, caching, compression, and security filtering on top of distribution. Most software implementations (Nginx, HAProxy) do both.
