What Is a Load Balancer?

A load balancer distributes incoming traffic across multiple servers, preventing bottlenecks, absorbing failures, and keeping sites available as traffic grows.

The Problem Load Balancers Solve

A single server has finite capacity. Under low traffic it handles requests quickly. As traffic grows, the queue of pending requests grows, response times increase, and eventually the server starts timing out or rejecting requests. The naive solution, a bigger server, works up to a point, but even the largest available machine has a ceiling, and a single machine is also a single point of failure.

A load balancer solves this by distributing incoming requests across a pool of servers. From the outside, clients see one IP address and one service. Inside, their requests go to whichever backend has capacity. Adding a server to the pool increases capacity; removing one (or having one fail) causes the load balancer to redistribute traffic, usually with little or no disruption visible to clients.

Layer 4 vs Layer 7 Load Balancing

Load balancers operate at different levels of the network stack, and the level determines what they can see and how they route.

| Feature | Layer 4 (Transport) | Layer 7 (Application) |
|---|---|---|
| Routes based on | IP address, TCP/UDP port | HTTP headers, URL, cookies, body |
| Protocol awareness | TCP/UDP only | HTTP, HTTPS, WebSocket, gRPC |
| TLS termination | No (passes through) | Yes |
| Content-based routing | No | Yes (URL path, host header) |
| Performance overhead | Very low | Higher (full packet inspection) |
| Typical use | High-throughput, non-HTTP services | Web applications, APIs |

An L4 load balancer sees a TCP connection arriving on port 443 and forwards it to a backend — it cannot read the HTTP request inside. An L7 load balancer terminates the TLS connection, reads the HTTP request, and can route /api/* to one backend pool and /static/* to another, or route requests with a specific cookie to a specific server.
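
As a rough illustration of the content-based routing described above, here is a minimal Python sketch of longest-prefix path matching. The pool names (`api-pool`, `static-pool`, `default-pool`) and the `Route` and `choose_pool` names are hypothetical, and real L7 load balancers express this in their own configuration syntax rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str  # URL path prefix to match, e.g. "/api/"
    pool: str    # hypothetical name of the backend pool


# Hypothetical routing table for illustration only.
ROUTES = [
    Route("/api/", "api-pool"),
    Route("/static/", "static-pool"),
    Route("/", "default-pool"),
]


def choose_pool(path: str, routes: list[Route] = ROUTES) -> str:
    """Return the pool whose prefix is the longest match for the request path."""
    matches = [r for r in routes if path.startswith(r.prefix)]
    if not matches:
        raise LookupError(f"no route for {path!r}")
    # Longest prefix wins, so "/api/" beats the catch-all "/".
    return max(matches, key=lambda r: len(r.prefix)).pool


if __name__ == "__main__":
    for path in ("/api/users", "/static/app.css", "/index.html"):
        print(path, "->", choose_pool(path))
```

An L4 load balancer could not make this decision, because the path is only visible after the TLS connection is terminated and the HTTP request is parsed.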

Load Balancing Algorithms

| Algorithm | How It Works | Best For |
|---|---|---|
| Round Robin | Requests cycle through backends in order | Uniform request cost, equal-capacity servers |
| Weighted Round Robin | Servers with higher weight receive proportionally more requests | Backends with different capacity |
| Least Connections | Each request goes to the backend with fewest active connections | Variable request duration |
| Least Response Time | Routes to the backend with lowest current latency | Latency-sensitive applications |
| IP Hash | Client IP is hashed to consistently select the same backend | Stateful apps requiring session affinity |
| Random | Backend chosen randomly per request | Simple even distribution |
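
Several of these algorithms fit in a few lines each. The Python sketch below is illustrative only: the class and function names are invented for this example, the weighted variant uses a simple expanded list rather than the smoother interleaving some production balancers use, and the IP hash uses plain modulo hashing, which remaps many clients whenever the pool size changes.

```python
import hashlib
import itertools


class RoundRobin:
    """Cycle through backends in order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        return next(self._cycle)


def weighted_round_robin(weights):
    """Return an endless iterator that yields each backend `weight` times per cycle."""
    expanded = [b for b, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


class LeastConnections:
    """Pick the backend with the fewest active connections."""

    def __init__(self, backends):
        self.active = {b: 0 for b in backends}

    def pick(self):
        # min() returns the first minimum, so ties go to the earliest backend.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when a request finishes so the count tracks open connections.
        self.active[backend] -= 1


def ip_hash(client_ip, backends):
    """Map a client IP to the same backend while the pool is unchanged."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


if __name__ == "__main__":
    rr = RoundRobin(["a", "b", "c"])
    print([rr.pick() for _ in range(4)])
    print(ip_hash("203.0.113.7", ["a", "b", "c"]))
```

Least Connections needs the `release` call because the balancer has to know when a request ends; Round Robin and IP Hash keep no per-request state at all.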

Health Checks: Removing Failed Servers

A load balancer continuously monitors the health of its backend pool. Health checks are periodic probes — typically an HTTP GET to a /health endpoint, a TCP connection attempt, or a ping — that verify each backend is alive and responding. If a backend fails a configurable number of consecutive checks, the load balancer marks it unhealthy and stops routing traffic to it. When the backend recovers and passes health checks again, it is returned to the pool.
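
The consecutive-failure rule can be sketched as a small state machine. In the Python sketch below, `HealthTracker`, `fail_threshold`, and `rise_threshold` are hypothetical names; the requirement of several consecutive passes before a backend returns to the pool is an assumption modeled on common practice, and `tcp_probe` covers only the TCP connection style of check.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class HealthTracker:
    """Track consecutive probe results for one backend."""

    def __init__(self, fail_threshold=3, rise_threshold=2):
        self.fail_threshold = fail_threshold  # consecutive failures before removal
        self.rise_threshold = rise_threshold  # consecutive passes before return (assumed)
        self.healthy = True
        self._fails = 0
        self._passes = 0

    def record(self, probe_ok):
        """Record one probe result and return the current health state."""
        if probe_ok:
            self._fails = 0
            self._passes += 1
            if not self.healthy and self._passes >= self.rise_threshold:
                self.healthy = True
        else:
            self._passes = 0
            self._fails += 1
            if self.healthy and self._fails >= self.fail_threshold:
                self.healthy = False
        return self.healthy
```

A balancer loop would call something like `tracker.record(tcp_probe(host, port))` on a fixed interval for each backend and skip any backend whose tracker reports unhealthy. Counting consecutive results rather than single probes keeps one dropped packet from ejecting a working server.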

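This automatic detection and exclusion of failed backends is what makes load-balanced architectures resilient. A single server failure need not cause downtime: once the failed checks are detected, the load balancer removes the backend and the remaining servers absorb its share of traffic, provided they have spare capacity. Requests sent to the failed server before it is marked unhealthy may still fail.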

Sticky Sessions: When Distribution Conflicts with State

Round-robin and least-connections algorithms distribute requests across all backends, which means two requests from the same user may go to different servers. For applications that store session data locally — in memory on the server rather than in a shared database — this causes users to lose their session.

Sticky sessions (session persistence) solve this by tagging requests with a cookie or by hashing the client IP, then always routing tagged requests to the same backend. This restores session consistency at the cost of perfectly even distribution. The better long-term solution is to move session state into a shared store (Redis, Memcached, a database) so any backend can serve any request.
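
A cookie-based version of this can be sketched as follows. The cookie name `lb_backend` and the `StickyBalancer` class are hypothetical, and the sketch stores the backend name directly in the cookie for readability; a real deployment would more likely use an opaque or signed value.

```python
import itertools

COOKIE_NAME = "lb_backend"  # hypothetical cookie name


class StickyBalancer:
    """Round robin for new clients, cookie pinning for returning ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._cycle = itertools.cycle(self.backends)

    def route(self, cookies):
        """Return (backend, cookies_to_set) for one request."""
        pinned = cookies.get(COOKIE_NAME)
        if pinned in self.backends:
            # Returning client whose backend is still in the pool.
            return pinned, {}
        # New client, or the pinned backend is gone: pick one and tag the client.
        backend = next(self._cycle)
        return backend, {COOKIE_NAME: backend}


if __name__ == "__main__":
    lb = StickyBalancer(["app-1", "app-2"])
    backend, to_set = lb.route({})
    print(backend, to_set)
    print(lb.route(to_set))
```

Note what happens when a pinned backend leaves the pool: the client is reassigned and any session state held only on the old server is lost, which is the main argument for a shared session store.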

Load Balancer vs Reverse Proxy

In practice the two terms are often used interchangeably, but they have different emphases. A load balancer focuses on distributing traffic and ensuring availability across a pool of backends. A reverse proxy is a broader concept — it includes load balancing but also TLS termination, caching, request transformation, compression, and security filtering.

Software such as Nginx and HAProxy does both. A dedicated hardware load balancer appliance may do only traffic distribution. Cloud load balancers (AWS ALB/NLB, GCP Load Balancing) typically combine both roles into one managed service.

Frequently Asked Questions

What is a load balancer?

A component that distributes incoming requests across a pool of backend servers to prevent overload, improve response times, and ensure availability when individual servers fail.

What is the difference between L4 and L7 load balancing?

L4 routes based on IP and port — it cannot read request content. L7 routes based on HTTP headers, URLs, and cookies — it can make content-aware decisions and handles TLS termination. L7 is more flexible; L4 is faster and works for any TCP/UDP traffic.

What is round-robin load balancing?

Each request goes to the next server in a circular list. Simple and even by request count, but not by server load — weighted round-robin assigns more requests to higher-capacity backends.

Does a load balancer hide the server's IP?

In a typical deployment, yes. Clients connect to the load balancer's IP, while backend servers stay on a private network that clients cannot reach directly. This protects backends and allows them to be changed without any client-side reconfiguration.

What are sticky sessions?

A feature that routes all requests from a given client to the same backend, using a cookie or IP hash. Needed for apps that store session state locally. The better solution is shared session storage so any backend can serve any client.

What is the difference between a load balancer and a reverse proxy?

A load balancer's core job is distributing traffic. A reverse proxy is broader — it adds TLS termination, caching, compression, and security filtering on top of distribution. Most software implementations (Nginx, HAProxy) do both.
