Multi-Region Network Design

Going from single-region to multi-region cloud is a step change in architectural complexity, not a configuration tweak. Every cross-region call has higher latency. Every replicated byte costs money. Every failover has to be tested. The benefits — user-perceived latency and regional resilience — are real but earned by getting many design decisions right. The patterns below are the durable ones, independent of which cloud provider you use.

The two reasons to go multi-region

Driver | What it solves | What it does not solve
Latency | Users in distant regions get fast responses from a nearby copy | Doesn't help if your data is single-region and every request has to round-trip there
Resilience | Survive a region-level outage | Doesn't help against bad code deploys, data corruption, or account-wide issues

If neither applies, single-region with multiple AZs is the right answer. AZ-level redundancy handles most failure modes; the operational cost is a fraction of multi-region.

One VPC per region

Each region gets its own VPC with its own non-overlapping CIDR. Each VPC has its own subnets, route tables, NAT gateways, and managed-service endpoints. Cross-region connectivity is via VPC peering, transit gateways, or the provider's global backbone, depending on cloud and scale.

A reasonable CIDR scheme:

  • us-east-1: 10.10.0.0/16
  • us-west-2: 10.20.0.0/16
  • eu-west-1: 10.30.0.0/16
  • ap-southeast-1: 10.40.0.0/16

Symmetric layouts simplify Terraform/Pulumi and make routing rules predictable.
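A plan like this is easy to check mechanically before it reaches infrastructure code. The sketch below uses Python's standard `ipaddress` module to confirm that no two regional blocks overlap and to carve one block into equal subnets. The function names and the /20 subnet size are illustrative assumptions, not part of any provider's tooling.

```python
import ipaddress
from itertools import combinations

# Region-to-CIDR plan copied from the scheme above.
REGION_CIDRS = {
    "us-east-1": "10.10.0.0/16",
    "us-west-2": "10.20.0.0/16",
    "eu-west-1": "10.30.0.0/16",
    "ap-southeast-1": "10.40.0.0/16",
}


def find_overlaps(plan):
    """Return every pair of regions whose CIDR blocks overlap."""
    networks = {region: ipaddress.ip_network(cidr) for region, cidr in plan.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


def carve_subnets(cidr, new_prefix=20):
    """Split a regional block into equal subnets (the /20 size is an assumption)."""
    return list(ipaddress.ip_network(cidr).subnets(new_prefix=new_prefix))


if __name__ == "__main__":
    print("overlapping pairs:", find_overlaps(REGION_CIDRS))
    for subnet in carve_subnets(REGION_CIDRS["us-east-1"])[:3]:
        print("us-east-1 subnet:", subnet)
```

Running a check like this in CI catches an overlapping block before it makes peering between two regions impossible to route.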

How regions connect

Mechanism | Use case | Limitations
Inter-region VPC peering | Two specific VPCs in different regions | Doesn't transit; full mesh becomes N×(N-1)/2 connections
Transit gateway peering | Hub-and-spoke across regions | Cloud-specific (AWS); cost per attachment
Global backbone | Service-to-service over provider's network | Specific services (e.g., AWS Global Accelerator)
VPN between regions | Encrypted overlay; portable across clouds | Bandwidth limited; managed-service overhead

For up to ~5 regions, peering is straightforward. For more, transit gateways or hub-and-spoke designs become essential.
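The growth in connection count is simple to tabulate. The sketch below compares a full mesh with a hub-and-spoke layout; the hub-and-spoke figure assumes a single hub region, which is one possible topology rather than a rule.

```python
def full_mesh_links(regions: int) -> int:
    """Peering connections needed so every region reaches every other directly."""
    return regions * (regions - 1) // 2


def hub_and_spoke_links(regions: int) -> int:
    """Connections when one region acts as the hub (single-hub assumption)."""
    return max(regions - 1, 0)


if __name__ == "__main__":
    for n in range(2, 9):
        print(
            f"{n} regions: full mesh={full_mesh_links(n)}, "
            f"hub-and-spoke={hub_and_spoke_links(n)}"
        )
```

At five regions a full mesh already needs ten peering connections, each with its own route-table entries on both sides, which is roughly where manual management stops being comfortable.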

Data replication is the hard part

Cross-region data replication is where most multi-region designs fail. The choices:

  • Synchronous replication. Every write waits for confirmation from all regions. Strong consistency, but per-write latency is bounded below by the RTT to the farthest region, typically 50-200 ms. Latency-sensitive workloads with high write volume usually cannot tolerate this.
  • Asynchronous replication. Writes commit locally and propagate to other regions later. Fast, but consistency is eventual; a failed region can lose recent writes.
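  • Quorum-based. Writes need confirmation from a majority of regions, so commit latency follows the nearest majority rather than the slowest region. Balances consistency and latency at the cost of complexity. Multi-leader designs are a separate option: every region accepts writes locally and conflicts are reconciled afterwards.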
  • Region-pinned data. User data lives in only one region; queries from elsewhere route there. Simplest but reintroduces latency for those queries.

The right model depends on the workload. Read-heavy applications often pick async replication with regional read replicas. Write-heavy applications with strong consistency requirements often stay single-region or accept the latency cost of synchronous.
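To make the latency trade-off concrete, the sketch below computes a lower bound on commit latency for a write under three of the models above. It is a simplified model, not a benchmark: local commit time is ignored, the quorum case assumes the writing region counts as one acknowledgement, and the RTT figures in the example are made-up illustrative values.

```python
def write_latency_floor_ms(mode: str, remote_rtts_ms: list[float]) -> float:
    """Lower bound on commit latency for a write issued in one region.

    remote_rtts_ms holds the round-trip time from the writing region to each
    other region. Local commit time is ignored, so this is a floor only.
    """
    if mode not in ("sync", "quorum", "async"):
        raise ValueError(f"unknown mode: {mode}")
    rtts = sorted(remote_rtts_ms)
    if mode == "async" or not rtts:
        return 0.0  # commits locally; replication happens afterwards
    if mode == "sync":
        return float(rtts[-1])  # must wait for the farthest region
    # Quorum: a majority of all regions must acknowledge the write.
    total_regions = len(rtts) + 1
    majority = total_regions // 2 + 1
    remote_acks_needed = majority - 1  # the local region acknowledges itself
    return float(rtts[remote_acks_needed - 1])


if __name__ == "__main__":
    # Hypothetical RTTs from the writing region to three other regions.
    example_rtts = [65.0, 80.0, 220.0]
    for mode in ("sync", "quorum", "async"):
        print(mode, write_latency_floor_ms(mode, example_rtts), "ms")
```

With one distant region in the set, synchronous replication pays that region's full round trip on every write, while a quorum only waits for the nearer regions.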

User routing

How users reach the right region:

  • DNS-based. Route 53, Cloud DNS, and Azure Traffic Manager return different IPs based on resolver location. Simple, but geographic accuracy is coarse and failover is only as fast as the record TTL allows.
  • Anycast / global load balancer. A single IP advertised from every region. BGP routes the user to the topologically closest region. Used by global front-doors (Cloudflare, AWS Global Accelerator, GCP Cloud Load Balancing).
  • Client-side selection. The application picks the region based on app-level knowledge (user account home region, time of day, latency probes).
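Client-side selection can be sketched as a small decision function. The version below picks a region from latency probe samples and prefers the account's home region when it is not much slower. The function name, the 50 ms tolerance, and the probe data format are all assumptions made for illustration; how the probes are collected is left out.

```python
from statistics import median


def pick_region(probes_ms, home_region=None, max_penalty_ms=50.0):
    """Choose a region from latency probes.

    probes_ms maps region name -> list of latency samples in milliseconds.
    An empty list means every probe to that region failed.
    """
    medians = {region: median(samples) for region, samples in probes_ms.items() if samples}
    if not medians:
        raise ValueError("no reachable region")
    fastest = min(medians, key=medians.get)
    # Prefer the home region when it is reachable and within the tolerance.
    if home_region in medians and medians[home_region] - medians[fastest] <= max_penalty_ms:
        return home_region
    return fastest


if __name__ == "__main__":
    probes = {"us-east-1": [20, 25, 22], "eu-west-1": [90, 95, 88]}
    print(pick_region(probes))
    print(pick_region(probes, home_region="eu-west-1", max_penalty_ms=100))
```

Using the median rather than the minimum keeps a single lucky probe from steering the choice.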

Active-active vs active-passive

Property | Active-Active | Active-Passive
All regions serve traffic | Yes | Only primary
Failover speed | Seconds (DNS or anycast reconvergence) | Minutes (must promote standby)
Capacity utilization | High (no idle standby) | Low (standby idle most of the time)
Complexity | High (data consistency, conflict resolution) | Lower (single source of truth)
Cost | Higher (full capacity in each region) | Lower (standby can be smaller)

Active-active is the right pattern when failover speed matters and when conflict resolution is tractable. Active-passive is the right pattern when data consistency is non-negotiable and minutes of failover is acceptable.

The cost dimension

Cross-region traffic charges are the multi-region tax:

  • Inter-region data transfer: typically $0.02-$0.09 per GB depending on regions.
  • Cross-region replication: every replicated byte is billed.
  • Inter-region API calls: cumulative for high-volume internal traffic.

For a service replicating 10 TB/day across regions, that's $200-$900/day in transfer charges alone, before any compute or storage. Optimize aggressively: compress, deduplicate, and only replicate what's necessary.
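The arithmetic behind that range is worth keeping in a small calculator so it can be rerun as volumes change. The sketch below assumes decimal units (1 TB = 1000 GB), a flat per-GB rate, and optional parameters for the number of destination regions and the fraction of bytes left after compression. These parameters are illustrative; actual billing rules vary by provider and region pair.

```python
def daily_transfer_cost(tb_per_day, usd_per_gb, destinations=1, compressed_fraction=1.0):
    """Estimate daily inter-region transfer charges in USD.

    compressed_fraction is the share of bytes still sent after compression
    and deduplication (1.0 means no savings).
    """
    gb_per_day = tb_per_day * 1000  # decimal TB, matching the figures above
    return gb_per_day * compressed_fraction * usd_per_gb * destinations


if __name__ == "__main__":
    low = daily_transfer_cost(10, 0.02)
    high = daily_transfer_cost(10, 0.09)
    print(f"10 TB/day to one region: ${low:,.0f} to ${high:,.0f} per day")
    halved = daily_transfer_cost(10, 0.09, compressed_fraction=0.5)
    print(f"same volume at 2:1 compression, top rate: ${halved:,.0f} per day")
```

The same function shows why fan-out matters: replicating to three other regions instead of one triples the line item at the same rate.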

Testing failover

Failover that has never been tested probably doesn't work. Discipline:

  • Schedule regular game days. Disable the primary region in a non-prod environment quarterly.
  • Automate failover. Manual procedures break under stress.
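  • Measure RTO and RPO. The Recovery Time Objective is how quickly you must be back up; the Recovery Point Objective is how much data you can afford to lose. Record the actual recovery time and data loss in every test and compare them against both targets.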
  • Test failback. Returning to the primary after recovery is often more disruptive than failing over.
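The measurement step above can be sketched as a small record type filled in during each game day. The field names and the way timestamps are gathered are assumptions; the point is only that recovery time and data loss are computed from recorded events and compared against the stated objectives.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class GameDayResult:
    primary_disabled_at: datetime       # when the primary region was taken down
    traffic_restored_at: datetime       # when the standby began serving healthy traffic
    last_replicated_write_at: datetime  # newest write found in the standby afterwards


def recovery_time(result: GameDayResult) -> timedelta:
    """Observed downtime, to compare against the RTO."""
    return result.traffic_restored_at - result.primary_disabled_at


def data_loss_window(result: GameDayResult) -> timedelta:
    """Span of writes that never reached the standby, to compare against the RPO."""
    return max(result.primary_disabled_at - result.last_replicated_write_at, timedelta(0))


def meets_objectives(result: GameDayResult, rto: timedelta, rpo: timedelta) -> bool:
    return recovery_time(result) <= rto and data_loss_window(result) <= rpo


if __name__ == "__main__":
    run = GameDayResult(
        primary_disabled_at=datetime(2024, 1, 15, 12, 0, 0),
        traffic_restored_at=datetime(2024, 1, 15, 12, 4, 30),
        last_replicated_write_at=datetime(2024, 1, 15, 11, 59, 40),
    )
    print("recovery time:", recovery_time(run))
    print("data loss window:", data_loss_window(run))
    print("within targets:", meets_objectives(run, timedelta(minutes=5), timedelta(seconds=30)))
```

Keeping these records across game days shows whether recovery is improving or drifting as the system changes.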

The minimum viable multi-region

For most teams starting out: two regions, active-passive, async replication, DNS-based routing. This pattern provides regional resilience without the consistency challenges of active-active. Once operational maturity is in place, you can graduate to more sophisticated designs.
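The routing half of this starter pattern reduces to one decision: which region's address should DNS return right now. Managed DNS services make this decision with their own health checks; the sketch below only illustrates the logic, and the function name, the three-failure threshold, and the boolean health history are assumptions.

```python
def active_region(primary_health, primary, standby, failure_threshold=3):
    """Return the region DNS should point at.

    primary_health is a list of health-check results for the primary region,
    oldest first (True = healthy). The standby is chosen only after
    failure_threshold consecutive failures, to avoid flapping on one bad probe.
    """
    recent = primary_health[-failure_threshold:]
    primary_down = len(recent) == failure_threshold and not any(recent)
    return standby if primary_down else primary


if __name__ == "__main__":
    history = [True, True, False, False, False]
    print(active_region(history, "us-east-1", "us-west-2"))
```

This sketch switches back to the primary on the first healthy check. Since failback is often the more disruptive direction, a real setup would usually require a longer healthy streak or a manual approval before returning.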

Frequently Asked Questions

Why deploy across multiple regions?

Two reasons. Latency: serving users from the geographically closest region cuts round-trip time substantially. Resilience: a region-level outage doesn't take down the whole service if other regions can absorb traffic. The cost is dramatically higher complexity — data replication, traffic routing, failover orchestration, and cross-region data transfer fees.

How do regions connect to each other?

Across the cloud provider's global backbone via cross-region peering or transit gateway peering. Traffic stays on the provider's own network rather than traversing the public internet. The bandwidth and latency between regions vary: some pairs are well-connected with direct fiber, others route through intermediate POPs.

What is active-active vs active-passive?

Active-active means multiple regions simultaneously serve user traffic. Failover is automatic — DNS or anycast routes around a failing region. Active-passive means one region is primary and others are standby; on failure, traffic shifts to the standby. Active-active is more complex (data consistency, traffic routing) but provides faster failover and uses standby capacity for real traffic.

How do I route users to the right region?

Three approaches. DNS geo-routing returns different IPs based on the client's geographic location. Anycast routes via BGP to the topologically closest region — used by CDNs and some global load balancers. Application-layer routing (e.g., a global front-door service) makes the routing decision after receiving the connection, with access to richer signals.

What is the biggest cost surprise in multi-region?

Inter-region data transfer. Replicating data between regions, cross-region API calls, and database replication can produce surprisingly large bills. Cloud providers typically charge per GB for data leaving a region; per-GB rates may be lower for traffic within the provider's network than to the internet, but at multi-TB-per-day volumes the line item still surprises.
