What Is an SLA

"99.9% uptime" sounds impressive until you do the math. It allows about 8.8 hours of downtime per year, and the provider owes you nothing as long as it stays under that. The SLA is a contract that defines what counts as up, how outages are measured, what the provider owes when they happen, and what the customer must do to claim it. Reading the SLA carefully, not just the headline number, is what separates a real guarantee from marketing.

The uptime nines

| Uptime | Downtime per year | Downtime per month | Downtime per week |
|---|---|---|---|
| 99% | 3.65 days | 7.3 hours | 1.68 hours |
| 99.9% | 8.77 hours | 43.8 minutes | 10.1 minutes |
| 99.99% | 52.6 minutes | 4.38 minutes | 1.01 minutes |
| 99.999% | 5.26 minutes | 26.3 seconds | 6.05 seconds |

The jump between each tier is a 10x reduction in allowed downtime. 99.9% (three nines) is the typical baseline for business internet. 99.99% (four nines) is enterprise-grade with significant infrastructure investment. Five nines requires redundancy at every level — not just provider commitment.
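
The figures in the table follow from simple arithmetic. Here is a minimal sketch, assuming an average year of 365.25 days and a month of one twelfth of that; the function name `allowed_downtime` is illustrative and not taken from any library.

```python
from datetime import timedelta

# Assumption: an average year of 365.25 days; a month is one twelfth of it.
PERIODS = {
    "year": timedelta(days=365.25),
    "month": timedelta(days=365.25 / 12),
    "week": timedelta(days=7),
}


def allowed_downtime(uptime_percent: float, period: str) -> timedelta:
    """Return the downtime an uptime target permits over the given period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return PERIODS[period] * (1 - uptime_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime(target, "month").total_seconds() / 60
        print(f"{target}% uptime allows {minutes:.2f} minutes per month")
```

A contract that defines a month as 30 days, or measures per billing cycle, will give slightly different numbers, so check which period definition your SLA uses.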

The core SLA commitments

  • Availability / Uptime. The percentage of time the service is up, measured per month or per year. Some contracts express it instead as "minutes of outage" or "incidents per period."
  • Mean Time to Repair (MTTR). The average time from incident report to service restored. Watch for "respond" vs "repair" — "respond" only commits to acknowledgment, not fix.
  • Bandwidth / Throughput. For internet services, the speed actually delivered. Often expressed as a committed information rate (CIR).
  • Latency / Jitter / Packet loss. Network performance thresholds, with credits if exceeded over a defined measurement window.
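
The gap between "respond" and "repair" is easy to see with numbers. The sketch below uses made-up incident records (the timestamps and the `mean_hours` helper are hypothetical) to compute both averages from the same incidents.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: (reported, acknowledged, restored).
incidents = [
    (datetime(2024, 3, 2, 9, 0), datetime(2024, 3, 2, 9, 10),
     datetime(2024, 3, 2, 12, 0)),
    (datetime(2024, 3, 15, 22, 30), datetime(2024, 3, 15, 22, 50),
     datetime(2024, 3, 16, 4, 30)),
]


def mean_hours(records, start_index, end_index):
    """Average elapsed hours between two timestamps across all incidents."""
    return mean(
        (rec[end_index] - rec[start_index]).total_seconds() / 3600
        for rec in records
    )


# Report to acknowledgment: what a "respond" commitment covers.
mean_time_to_respond = mean_hours(incidents, 0, 1)
# Report to service restored: what a "repair" commitment covers.
mean_time_to_repair = mean_hours(incidents, 0, 2)

print(f"Mean time to respond: {mean_time_to_respond:.2f} h")
print(f"Mean time to repair:  {mean_time_to_repair:.2f} h")
```

With these sample records, a provider could meet a one-hour "respond" commitment comfortably while the service stays down for hours.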

What gets excluded

Every SLA has carve-outs. Common ones:

  • Scheduled maintenance. Doesn't count as downtime. May be defined narrowly ("notified 7 days in advance, between midnight and 6 AM, no more than 4 hours per month") or broadly ("at provider's discretion").
  • Force majeure. Natural disasters, government actions, war.
  • Customer-caused issues. Your firewall misconfig, your power outage, your wiring.
  • Third-party network issues. If the problem is between the provider's network and the rest of the internet but not in the provider's segment, they may not count it.
  • Force majeure of upstream providers. If the provider buys transit from a single carrier and that carrier has a fiber cut, you may have no recourse.

The exclusions list is sometimes longer than the commitments. Read both.
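
To see how much exclusions can move the number, here is a rough sketch that computes availability twice: once counting only outages the contract recognizes, and once counting everything the customer experienced. The outage list and the `availability_percent` helper are hypothetical.

```python
def availability_percent(period_minutes, outages):
    """Availability over a period, counting only non-excluded outage minutes.

    `outages` is a list of (minutes, excluded) pairs, where `excluded` marks
    outages the contract carves out (maintenance, force majeure, and so on).
    """
    counted = sum(minutes for minutes, excluded in outages if not excluded)
    return 100 * (1 - counted / period_minutes)


# Hypothetical 30-day month: 43,200 minutes.
MONTH_MINUTES = 43_200
outages = [
    (30, False),   # unplanned outage, counts against the SLA
    (240, True),   # scheduled maintenance window, excluded
    (90, True),    # upstream fiber cut treated as force majeure, excluded
]

# What the contract measures.
sla_view = availability_percent(MONTH_MINUTES, outages)
# What the customer actually experienced: nothing excluded.
customer_view = availability_percent(
    MONTH_MINUTES, [(minutes, False) for minutes, _ in outages]
)

print(f"SLA view:      {sla_view:.2f}%")
print(f"Customer view: {customer_view:.2f}%")
```

In this example the provider meets a 99.9% commitment on paper while the customer saw six hours of downtime in the month.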

How outages are measured

Several measurement approaches exist, listed here from most customer-favorable to most provider-favorable:

  1. Continuous synthetic monitoring. Third-party probes test the service constantly; logged downtime is whatever the probes detect.
  2. Customer-reported incidents. Outage time starts when the customer opens a ticket and ends when the provider declares it resolved.
  3. Provider-side telemetry. The provider reports their view of the service. May not match customer experience.
  4. Network-only measurement. The provider's network edge is up; what happens past that is not their concern.

SLAs vary widely on which measurement is authoritative. Customer-reported with documentation requirements is common and somewhat customer-favorable; provider-side telemetry is sometimes the only basis and is provider-favorable.
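
The choice of measurement can change the recorded downtime for the same outage. The sketch below compares a probe-based count with a ticket-based count, assuming one probe per minute; the probe results, the timestamps, and both helper functions are made up for illustration.

```python
from datetime import datetime, timedelta

PROBE_INTERVAL = timedelta(minutes=1)  # assumed probe cadence


def probe_downtime(results):
    """Downtime seen by synthetic probes: failed checks times the interval."""
    failed = sum(1 for ok in results if not ok)
    return failed * PROBE_INTERVAL


def ticket_downtime(opened, resolved):
    """Downtime under customer-reported measurement: ticket open to resolved."""
    return resolved - opened


# Hypothetical outage: probes fail for 45 consecutive minutes.
results = [True] * 10 + [False] * 45 + [True] * 5

# The customer notices late and opens a ticket 25 minutes into the outage.
opened = datetime(2024, 3, 2, 2, 25)
resolved = datetime(2024, 3, 2, 2, 45)

print("Probe-based downtime: ", probe_downtime(results))
print("Ticket-based downtime:", ticket_downtime(opened, resolved))
```

Under ticket-based measurement, every minute before the ticket is opened goes uncounted, which is why opening a ticket promptly matters.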

The credit structure

A typical tiered credit schedule for monthly downtime, with credits expressed as a percentage of the monthly recurring charge (MRC):

| Downtime (hours) | Service credit (% of MRC) |
|---|---|
| 0-1 | 0% |
| 1-4 | 5-10% |
| 4-8 | 15-25% |
| 8-24 | 25-50% |
| 24+ | 50-100% |
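
A schedule like this is a simple threshold lookup. The sketch below rests on two assumptions the table itself does not settle: it applies the low end of each credit range, and it places a boundary value (exactly 4 hours, say) in the higher tier. A real contract will state both.

```python
# Tiers as (minimum downtime hours, credit % of MRC), highest threshold first.
# Assumptions: the low end of each credit range applies, and a boundary value
# falls into the higher tier.
CREDIT_TIERS = [
    (24, 50),
    (8, 25),
    (4, 15),
    (1, 5),
    (0, 0),
]


def credit_percent(downtime_hours: float) -> int:
    """Return the credit percentage for a month's total downtime."""
    if downtime_hours < 0:
        raise ValueError("downtime_hours cannot be negative")
    for threshold, percent in CREDIT_TIERS:
        if downtime_hours >= threshold:
            return percent
    return 0


def credit_amount(downtime_hours: float, mrc: float) -> float:
    """Return the credit in currency units for a given monthly charge."""
    return mrc * credit_percent(downtime_hours) / 100


# Hypothetical example: 6 hours of downtime on a 1,200 per month circuit.
print(credit_amount(6, 1200.0))
```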

"100% credit" sounds generous — until you remember it's 100% of one month's bill, which may be a fraction of the revenue lost during the outage. The credit is rarely a complete financial remedy.

Cap on credits

Almost every SLA caps total credits at 100% of the monthly bill, even if outages exceed what would warrant more. Some cap lower (50%). The credit is the maximum consequence — not the actual damages — and almost all SLAs disclaim consequential damages explicitly. Lost revenue, lost customers, lost data are your problem, not the provider's.
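
The effect of the cap is easiest to see next to the actual loss. This sketch assumes credits accrue per incident and are summed within a billing month, which is one common structure but not the only one; the outage credits, the monthly charge, and the loss estimate are all hypothetical.

```python
def capped_credit(credit_percents, mrc, cap_percent=100):
    """Sum the credits earned in one billing month and apply the cap."""
    total_percent = min(sum(credit_percents), cap_percent)
    return mrc * total_percent / 100


# Hypothetical month: three qualifying outages earning 50%, 50%, and 25%.
mrc = 1_200.0
earned = [50, 50, 25]

credit = capped_credit(earned, mrc)                      # capped at 100%
credit_low_cap = capped_credit(earned, mrc, cap_percent=50)

# Hypothetical revenue lost during the outages.
estimated_loss = 40_000.0
uncovered = estimated_loss - credit

print(f"Credit with 100% cap: {credit:.2f}")
print(f"Credit with 50% cap:  {credit_low_cap:.2f}")
print(f"Loss not covered:     {uncovered:.2f}")
```

Running the comparison with your own monthly charge and a realistic loss estimate shows how much of an outage you would absorb yourself.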

The escalation path

Beyond credits, a useful SLA defines:

  • Who to contact at what severity (NOC for any issue; account manager for SLA disputes; executive escalation for unresolved disputes).
  • Response time commitments by severity.
  • Notification triggers (you get a status update every X hours during an active P1 incident).

The communication discipline is often more valuable than the credit. Knowing what's happening during an outage lets you make better decisions about workarounds and customer comms.

Asking for credits

Credits are usually not automatic. The customer must:

  1. Document the outage with start/end times and any tickets opened.
  2. Submit a credit request within the contractual window (often 30 days).
  3. Cite the specific SLA clause.
  4. Follow up if not honored.
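
Because a missed filing window usually forfeits the credit, it helps to track the deadline explicitly. The sketch below assumes the window runs from the day the outage ended; some contracts count from the end of the billing month instead, so treat the start date as an assumption to verify. Both function names are illustrative.

```python
from datetime import date, timedelta


def claim_deadline(outage_end: date, window_days: int = 30) -> date:
    """Last day to file, assuming the window starts when the outage ended."""
    return outage_end + timedelta(days=window_days)


def can_still_claim(outage_end: date, today: date, window_days: int = 30) -> bool:
    """True while today is on or before the filing deadline."""
    return today <= claim_deadline(outage_end, window_days)


# Hypothetical outage that ended on 15 March 2024.
outage_end = date(2024, 3, 15)
print("File by:", claim_deadline(outage_end))
print("Still claimable on 20 April?",
      can_still_claim(outage_end, date(2024, 4, 20)))
```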

For repeated SLA misses, escalate to your account manager and consider invoking termination-for-cause clauses. In some agreements, persistent SLA failures are grounds for terminating the contract without penalty.

SLA vs SLO vs SLI

  • SLA (Service Level Agreement) — the contractual commitment to customers with financial consequences.
  • SLO (Service Level Objective) — an internal target the team aims for (often stricter than the SLA).
  • SLI (Service Level Indicator) — the actual measured value (uptime in the last month).

Customers care about the SLA. Operators care internally about SLOs set tighter than the SLA, so there is headroom before a contractual breach.
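
The relationship among the three terms can be sketched as a comparison of one measured value against two targets. The 99.95% SLO below is an assumed internal target, and the function names are illustrative.

```python
def measure_sli(total_minutes: float, downtime_minutes: float) -> float:
    """SLI: the availability actually measured, as a percentage."""
    return 100 * (1 - downtime_minutes / total_minutes)


def classify(sli: float, slo: float, sla: float) -> str:
    """Compare a measured SLI against the internal SLO and contractual SLA."""
    if sli < sla:
        return "SLA breached: credits may be owed"
    if sli < slo:
        return "SLO missed: still inside the SLA, but headroom is being used"
    return "healthy: SLO met"


SLA_TARGET = 99.9    # contractual commitment
SLO_TARGET = 99.95   # assumed internal target, stricter than the SLA

# Hypothetical 30-day month (43,200 minutes) with 30 minutes of downtime.
sli = measure_sli(43_200, 30)
status = classify(sli, SLO_TARGET, SLA_TARGET)
print(f"SLI {sli:.3f}% -> {status}")
```

The middle state is the useful one for operators: it signals trouble while there is still room to act before the contract is breached.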

Frequently Asked Questions

What is an SLA?

A service level agreement is a contract clause that commits the service provider to specific performance levels — typically uptime, response time, and resolution time — with defined consequences (usually service credits) if those levels are missed. It is the financial backstop for what the provider promises to deliver.

What does 99.9% uptime actually mean?

It means the service may be unavailable up to 0.1% of the time without violating the SLA. Over a month that's about 44 minutes; over a year, about 8.8 hours. 99.99% (four nines) is about 4.4 minutes per month; 99.999% (five nines) is about 26 seconds per month. The number of nines matters: the difference between three and four nines is an order of magnitude in allowed downtime.

What is MTTR in an SLA?

Mean Time To Repair (or sometimes Mean Time To Respond, which is different and weaker) is the committed average time from incident report to service restored. SLAs for dedicated internet access (DIA) commonly commit to 4-8 hours MTTR. Hosting and cloud SLAs are often shorter; consumer services often have no MTTR commitment at all.

What is a service credit?

A partial refund of monthly fees applied when the provider misses an SLA commitment. Common schedules tie credit percentage to severity — for example, 10% credit for 1-4 hours of monthly downtime, 25% for 4-8 hours, 50% for 8+ hours. The credit is the entire financial consequence for the provider; if your actual outage cost exceeds the credit, you absorb the difference.

Are SLA credits worth anything in practice?

They are worth what you ask for. Credits are typically not issued automatically — the customer must request them with documentation of the outage. Providers may dispute or delay. For high-value customers, credits can be substantial; for typical business plans, credits are often a small fraction of actual outage cost. The SLA's value is partially the credit and partially the documented commitment that gives you leverage in negotiation or escalation.
