S3 Protocol Explained

S3 started as Amazon's storage service in 2006 and gradually became the de facto object-storage API. Most cloud providers, open-source object stores, and enterprise storage systems now speak the S3 protocol. Understanding it means understanding object storage: the semantics, the operations, the consistency model, and the operational patterns that flow from "data lives in a flat namespace addressed by keys, not in a tree of directories."

The S3 data model

Bucket — a named container for objects. Globally unique across all AWS accounts (so bucket names like "data" are taken; use organization-prefixed names).
Object — a piece of data plus metadata, stored under a key.
Key — the object's identifier within the bucket. Looks like a path (e.g., users/alice/photo.jpg) but is actually a flat string.
Region — the geographic region where the bucket and its objects live.
Storage class — the tier the object is stored in (Standard, IA, Glacier, etc.).

The flat namespace

Objects live in a flat namespace within a bucket. The forward slashes in keys are just characters; they don't create real directories. Listing objects by prefix (e.g., "users/alice/") emulates directory-like browsing, but there are no actual directories, no "mkdir," no rename operations.

Consequences:

"Renaming" a file is actually a copy + delete.
"Moving" a "folder" requires re-keying every object underneath.
"Listing a directory" is a prefix scan, potentially over millions of keys.
No file locks; no append.
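
The consequences above can be sketched with a plain dictionary standing in for a bucket. This is a toy model rather than an S3 client: `list_prefix` and `rename` are hypothetical helpers, and the `delimiter` argument mirrors the delimiter parameter that ListObjectsV2 accepts to roll deeper keys up into "folder" prefixes.

```python
def list_prefix(keys, prefix="", delimiter="/"):
    """Emulate a delimiter-based listing over a flat set of keys.

    Returns (objects, common_prefixes): keys directly "inside" the prefix,
    and the rolled-up "subfolder" prefixes one level below it.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Everything up to the next delimiter is reported once as a "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


def rename(bucket, old_key, new_key):
    """'Rename' in a flat namespace: copy to the new key, then delete the old one."""
    bucket[new_key] = bucket[old_key]   # stands in for CopyObject
    del bucket[old_key]                 # stands in for DeleteObject


bucket = {
    "users/alice/photo.jpg": b"...",
    "users/alice/docs/cv.pdf": b"...",
    "users/bob/photo.jpg": b"...",
}
objects, folders = list_prefix(bucket, "users/alice/")
print(objects)   # ['users/alice/photo.jpg']
print(folders)   # ['users/alice/docs/']
```

Note that the listing still has to scan every key that shares the prefix, which is why "listing a directory" gets slow on very large buckets.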

The core operations

Operation	HTTP	Purpose
PutObject	PUT /bucket/key	Write or replace an object
GetObject	GET /bucket/key	Read an object (full or range)
DeleteObject	DELETE /bucket/key	Remove an object
HeadObject	HEAD /bucket/key	Get metadata without the body
ListObjectsV2	GET /bucket?list-type=2&prefix=...	Enumerate objects by prefix
CopyObject	PUT with x-amz-copy-source header	Server-side copy
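
As a rough illustration of the table, the sketch below maps each operation to an unsigned HTTP method, URL, and headers. It assumes path-style addressing and a made-up endpoint (`https://s3.example.com`); `build_request` and its parameters are hypothetical names, and a real request would also need a Signature Version 4 signature.

```python
from urllib.parse import quote, urlencode

ENDPOINT = "https://s3.example.com"  # hypothetical path-style endpoint


def object_url(bucket, key):
    # Keys are flat strings: slashes are kept, other special characters are encoded.
    return f"{ENDPOINT}/{bucket}/{quote(key, safe='/')}"


def build_request(operation, bucket, key="", **params):
    """Return (method, url, headers) for a few core operations (unsigned)."""
    if operation == "PutObject":
        return "PUT", object_url(bucket, key), {}
    if operation == "GetObject":
        headers = {}
        if "byte_range" in params:            # optional (first, last) byte offsets
            first, last = params["byte_range"]
            headers["Range"] = f"bytes={first}-{last}"
        return "GET", object_url(bucket, key), headers
    if operation == "HeadObject":
        return "HEAD", object_url(bucket, key), {}
    if operation == "DeleteObject":
        return "DELETE", object_url(bucket, key), {}
    if operation == "ListObjectsV2":
        query = urlencode({"list-type": "2", "prefix": params.get("prefix", "")})
        return "GET", f"{ENDPOINT}/{bucket}?{query}", {}
    if operation == "CopyObject":
        source = quote(f"/{params['source_bucket']}/{params['source_key']}", safe="/")
        return "PUT", object_url(bucket, key), {"x-amz-copy-source": source}
    raise ValueError(f"unsupported operation: {operation}")


print(build_request("GetObject", "media", "users/alice/photo.jpg", byte_range=(0, 1023)))
```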

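Authentication uses AWS Signature Version 4. The client derives a signing key from its secret access key and signs the request details together with a timestamp. Replay attacks are bounded by the timestamp, because S3 rejects requests whose timestamp is too far from the server's clock (15 minutes on AWS), and any modification to the signed parts of the request invalidates the signature.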
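
A minimal sketch of the signing step, using only the standard library. The key-derivation chain and the string-to-sign layout follow Signature Version 4; the secret, the date, and the shortened canonical request strings are placeholders, and a real client would build the full canonical request (method, path, query, headers, payload hash) or leave all of this to an SDK.

```python
import hashlib
import hmac


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def signing_key(secret_key, date, region, service="s3"):
    """Derive the SigV4 signing key, scoped to one day, region, and service."""
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date)  # date: YYYYMMDD
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


def sign(secret_key, amz_date, region, canonical_request):
    """Sign a canonical request string; amz_date looks like 20240101T120000Z."""
    date = amz_date[:8]
    scope = f"{date}/{region}/s3/aws4_request"
    request_hash = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()
    # The timestamp is part of the signed string, so it cannot be altered later.
    string_to_sign = "\n".join(["AWS4-HMAC-SHA256", amz_date, scope, request_hash])
    key = signing_key(secret_key, date, region)
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


SECRET = "EXAMPLE-SECRET-KEY"  # placeholder, not a real credential
# Shortened stand-ins for full canonical requests:
original = sign(SECRET, "20240101T120000Z", "us-east-1", "GET\n/bucket/key")
tampered = sign(SECRET, "20240101T120000Z", "us-east-1", "GET\n/bucket/other")
print(original != tampered)  # True
```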

Multipart upload

Large objects are uploaded in parts:

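CreateMultipartUpload (sometimes called "initiate") — get an upload ID for this object.
UploadPart — for each part (typically 5-100 MB; on AWS every part except the last must be at least 5 MiB, with at most 10,000 parts per upload), upload with its part number and the upload ID. Parts can upload in parallel.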
CompleteMultipartUpload — provide the list of part numbers and their ETags; S3 assembles the parts into the final object.
(or) AbortMultipartUpload — give up; uploaded parts are discarded.

Benefits: parallel upload bandwidth, resumability (retry only failed parts), support for objects up to 5 TB. Required for any upload above 5 GB (single-PUT limit).
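
The client-side part of this flow is mostly arithmetic: deciding where each part starts and ends. Below is a small sketch, assuming the AWS limits of a 5 MiB minimum part size (except for the last part) and 10,000 parts per upload; `plan_parts` and the 16 MiB default are illustrative choices, not SDK behavior.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB      # every part except the last must be at least 5 MiB
MAX_PARTS = 10_000           # at most 10,000 parts per upload


def plan_parts(object_size, part_size=16 * MIB):
    """Return a list of (part_number, offset, length) covering the object."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    # Grow the part size if the object would otherwise need too many parts.
    while -(-object_size // part_size) > MAX_PARTS:   # ceiling division
        part_size *= 2
    parts = []
    offset = 0
    part_number = 1          # part numbers start at 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts


parts = plan_parts(40 * MIB)
print(len(parts))   # 3
print(parts[-1])    # (3, 33554432, 8388608)
```

Each tuple can be handed to a separate worker for UploadPart; a failed part is retried on its own, and the collected part numbers and ETags go into CompleteMultipartUpload.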

Presigned URLs

A presigned URL embeds an AWS request signature in the URL itself, granting time-limited access without exposing the credentials. The owner signs a GET or PUT request with their key, sets an expiration, and shares the URL. Anyone with the URL can perform that specific operation until expiration.

Use cases:

Web upload: server generates presigned PUT URL; browser uploads directly to S3.
Private download: server generates presigned GET URL for a paid asset; user downloads.
Mobile app: server provides presigned URLs so the app uploads without app-side credentials.
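
To make the mechanism concrete, here is a standard-library sketch of how a presigned GET URL is assembled with Signature Version 4 query parameters. It assumes AWS's virtual-hosted endpoint format and uses placeholder credentials; `presign_get` is a hypothetical helper, and in practice an SDK call (for example boto3's `generate_presigned_url`) does this work.

```python
import hashlib
import hmac
from urllib.parse import quote


def _hmac(key, msg):
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presign_get(access_key, secret_key, region, bucket, key, amz_date, expires):
    """Build a SigV4 presigned GET URL; amz_date looks like 20240101T120000Z."""
    if not 1 <= expires <= 604800:
        raise ValueError("SigV4 presigned URLs last at most 7 days")
    host = f"{bucket}.s3.{region}.amazonaws.com"
    date = amz_date[:8]
    scope = f"{date}/{region}/s3/aws4_request"
    path = "/" + quote(key, safe="/")
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",  # access key ID, never the secret
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),                # lifetime in seconds
        "X-Amz-SignedHeaders": "host",
    }
    query = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in sorted(params.items())
    )
    # Canonical request: method, path, query, headers, signed headers, payload hash.
    canonical_request = "\n".join(
        ["GET", path, query, f"host:{host}", "", "host", "UNSIGNED-PAYLOAD"]
    )
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])
    k = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    for part in (region, "s3", "aws4_request"):
        k = _hmac(k, part)
    signature = hmac.new(k, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"https://{host}{path}?{query}&X-Amz-Signature={signature}"


url = presign_get("AKIDEXAMPLE", "EXAMPLE-SECRET-KEY", "us-east-1",
                  "my-bucket", "users/alice/photo.jpg", "20240101T120000Z", 900)
print(url)
```

The URL carries the access key ID, the expiry, and the signature, but not the secret key, which is why it can be handed to a browser or mobile app.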

Object versioning

With versioning enabled on a bucket, every PUT creates a new version of the object. The latest version is returned by default GETs; older versions can be addressed by version ID. A DELETE without a version ID adds a "delete marker" that hides the object without erasing the version history; deleting a specific version ID removes that version permanently.

Combined with object lock (immutability for a retention period), versioning provides defense against accidental deletion and ransomware.
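
The version and delete-marker behavior can be modeled in a few lines. This is a toy in-memory sketch, not an S3 client: `VersionedBucket` is a hypothetical class, and real version IDs are opaque strings rather than the `v1`, `v2` counters used here.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket."""

    def __init__(self):
        self._versions = {}                 # key -> list of (version_id, body or None)
        self._ids = itertools.count(1)

    def put(self, key, body):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key):
        # A plain DELETE adds a delete marker (body None); earlier versions are kept.
        return self.put(key, None)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            if not versions or versions[-1][1] is None:
                raise KeyError(key)         # missing, or hidden by a delete marker
            return versions[-1][1]
        for vid, body in versions:
            if vid == version_id and body is not None:
                return body
        raise KeyError((key, version_id))


bucket = VersionedBucket()
v1 = bucket.put("report.csv", b"draft")
v2 = bucket.put("report.csv", b"final")
bucket.delete("report.csv")
print(bucket.get("report.csv", version_id=v1))   # b'draft'
```

After the delete, a plain `get("report.csv")` fails, yet both earlier versions remain retrievable by ID, which is the property that protects against accidental deletion.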

Consistency model

Since December 2020, S3 provides strong read-after-write consistency:

A successful PUT is immediately visible on subsequent GETs of the same key.
A LIST after a PUT is immediately consistent.
Update or delete a single key, and subsequent operations on that key see the new state.

Before 2020, S3 had eventual consistency for overwrites and deletes. Many old design patterns assume eventual consistency and add complexity that's no longer needed.

Other S3-compatible systems have varying consistency models. MinIO is strongly consistent; some others are eventual. Check before relying on consistency assumptions.

Storage classes

Each object lives in a storage class:

Class	Use case
S3 Standard	Default; frequent access; lowest latency
S3 Intelligent-Tiering	Auto-tiers based on access patterns
S3 Standard-IA / One Zone-IA	Infrequent access; lower storage cost; retrieval fee
S3 Glacier Instant Retrieval	Archive with millisecond retrieval; higher retrieval fees
S3 Glacier Flexible Retrieval	Archive; minutes to hours to retrieve
S3 Glacier Deep Archive	Coldest archive; hours to retrieve; 180-day minimum

See hot vs cold storage.

The ecosystem

Beyond AWS S3 itself, the protocol is implemented by:

MinIO — open-source self-hosted S3-compatible storage.
Cloudflare R2 — no egress fees, S3-compatible API.
Backblaze B2 — cheap object storage with S3 compatibility.
Wasabi — S3-compatible with simpler pricing.
Google Cloud Storage — has its own API but also supports S3-compatible interop mode.
Ceph RADOS Gateway — S3 API on top of Ceph cluster storage.
NetApp StorageGRID, Dell ECS, others — enterprise object stores with S3 API.

The S3 API isn't formally standardized (it's controlled by AWS), but it has become a de facto interface. Applications written against the AWS S3 SDK typically work against any of these with minimal endpoint changes, although support for less common features varies by implementation.
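
In SDK terms, "minimal endpoint changes" usually means overriding a single client setting. The sketch below only builds the keyword arguments so it runs without any SDK installed; `s3_client_kwargs` is a hypothetical helper, the local MinIO address is an assumed example, and the commented lines show how the result would be passed to boto3.

```python
def s3_client_kwargs(endpoint_url=None, region_name="us-east-1"):
    """Keyword arguments for boto3.client(); only the endpoint differs per provider."""
    kwargs = {"service_name": "s3", "region_name": region_name}
    if endpoint_url is not None:
        kwargs["endpoint_url"] = endpoint_url   # omit to talk to AWS S3 itself
    return kwargs


aws = s3_client_kwargs()
minio = s3_client_kwargs("http://localhost:9000")   # assumed local MinIO address

# With boto3 installed, the same application code would run against either:
#   import boto3
#   client = boto3.client(**minio)
#   client.put_object(Bucket="my-bucket", Key="hello.txt", Body=b"hi")
print(minio)
```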

Common operational patterns

Static site hosting. Put HTML/CSS/JS in a bucket; serve via website endpoint or a CDN.
Log archiving. Apps stream logs to S3; lifecycle rules move them to cold storage over time.
Backup target. Backup tools write to S3; lifecycle and versioning handle retention.
Data lake. Analytics tools (Athena, Spark, DuckDB) query files directly from S3.
Media library. User uploads via presigned URLs; CDN serves from S3 origin.
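
For the log-archiving pattern, a lifecycle rule is just a small configuration document. The sketch below uses the field names of the S3 lifecycle configuration as exposed by boto3's `put_bucket_lifecycle_configuration`; the rule ID, the `logs/` prefix, and the day counts are made-up examples. `storage_class_at` is a hypothetical helper that only approximates how the rule plays out, since S3 applies lifecycle actions asynchronously.

```python
lifecycle = {
    "Rules": [
        {
            "ID": "archive-logs",                  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},         # hypothetical key prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},           # delete after one year
        }
    ]
}


def storage_class_at(age_days, rule):
    """Storage class an object would be in at a given age, or None once expired."""
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = lifecycle["Rules"][0]
for age in (1, 45, 200, 400):
    print(age, storage_class_at(age, rule))
```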

Frequently Asked Questions

What is S3?

Simple Storage Service — Amazon's object storage service, and by extension the HTTP API it speaks. The API has been widely cloned: nearly every object storage system (MinIO, Backblaze B2, Cloudflare R2, Wasabi, Ceph RGW) speaks the S3 protocol. The protocol is the de facto standard for object storage.

What is the difference between objects and files?

Files live in directories with operations like open, seek, write, rename. Objects are addressed by a key string within a flat namespace (a bucket); operations are PUT (create or replace), GET (read), DELETE (remove), HEAD (metadata only), LIST (enumerate). Objects are typically immutable: modifying one means writing a new object with the same key, replacing the old. There are no partial writes or in-place updates, though range GETs let clients read parts of an object.

What is a multipart upload?

S3's mechanism for uploading large objects. The client splits the object into parts (typically 5-100 MB each), uploads the parts (optionally in parallel), then completes the upload by listing the part numbers and their ETags. Benefits: parallel upload bandwidth, resumability (retry only failed parts), and support for objects up to 5 TB. Because a single PUT is capped at 5 GB, multipart is required for anything larger.

What is a presigned URL?

A URL that grants temporary, scoped access to an S3 object without the recipient needing AWS credentials. The owner signs a request with their credentials and embeds the signature in the URL; the URL works for the configured duration. Used to let users upload directly to S3 or download private objects without exposing the storage's full credentials.

Is S3 strongly consistent?

Yes, since 2020. S3 provides strong read-after-write consistency for all operations — a successful PUT is immediately visible on subsequent GETs. Historically S3 had eventual consistency for some operations, which made certain application patterns risky. The new consistency model simplified S3 usage significantly. Other S3-compatible services have their own consistency models — check before assuming.
