Deduplication Explained
Deduplication promises huge storage savings — 10x or 20x for the right workload — by storing identical data only once and replacing duplicates with pointers. The catch is the bookkeeping. The system needs to track every unique block, look up hashes on every write, and keep the lookup structure in memory. Get the workload right and dedupe is magic; get it wrong and it's a performance disaster that doesn't save much space.
The basic mechanism
- Incoming data is split into chunks (fixed or variable size).
- Each chunk is hashed (SHA-256 typically; some systems use a weaker, faster hash and byte-compare candidate matches to verify them).
- The hash is looked up in the dedupe table.
- If found, the new write just adds a reference to the existing block. No new storage consumed.
- If not found, the block is written and the table updated.
On read, the system follows the reference back to the actual block. Multiple files sharing chunks all point to the same physical data.
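The write and read paths above can be sketched in a few lines of Python. This is a toy in-memory model, not how any particular filesystem implements it: the `DedupeStore` class, its method names, and the 4 KB `CHUNK_SIZE` are assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # fixed-size chunks; an assumed value for illustration


class DedupeStore:
    """Toy in-memory block store with a hash-indexed dedupe table."""

    def __init__(self):
        self.blocks = {}    # digest -> chunk bytes (the "physical" storage)
        self.refcount = {}  # digest -> number of references to that chunk

    def write(self, data: bytes) -> list[bytes]:
        """Store data and return the list of chunk digests that describes it."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).digest()
            if digest in self.blocks:
                self.refcount[digest] += 1    # duplicate: add a reference only
            else:
                self.blocks[digest] = chunk   # new chunk: store it once
                self.refcount[digest] = 1
            recipe.append(digest)
        return recipe

    def read(self, recipe: list[bytes]) -> bytes:
        """Follow each reference back to the stored chunk."""
        return b"".join(self.blocks[d] for d in recipe)

    def physical_bytes(self) -> int:
        return sum(len(c) for c in self.blocks.values())


store = DedupeStore()
file_a = b"A" * 8192 + b"B" * 4096
file_b = b"A" * 4096 + b"C" * 4096
recipe_a = store.write(file_a)
recipe_b = store.write(file_b)
print(len(file_a) + len(file_b), "logical bytes,", store.physical_bytes(), "physical bytes")
```

Here `file_a` and `file_b` add up to 20 KB of logical data but share the all-`A` chunk, so the store holds only three unique 4 KB chunks.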
Inline vs post-process
| Mode | When dedupe happens | Tradeoff |
|---|---|---|
| Inline | During writes | Saves bandwidth and storage immediately; adds write latency |
| Post-process | Scanning after writes complete | Faster writes; needs space for duplicates until scan runs |
ZFS dedupe is inline. Some commercial systems (NetApp A-SIS) are post-process. Choose based on whether write-path latency or temporary space is the tighter constraint.
Fixed vs variable block
Fixed-block dedupe divides data into chunks of a constant size (e.g., 4 KB). Easy to implement and fast. Catches identical sequences only when they're aligned to block boundaries.
Variable-block dedupe uses content-defined chunking (rolling hash detects natural boundaries in the data). Catches duplicates even when files have insertions or deletions that shift the alignment of subsequent data. Higher dedupe ratios but more CPU cost.
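Example: insert one byte at the beginning of a 1 GB file. Fixed-block dedupe sees no duplicates, because every block boundary now falls one byte later and every block's contents have changed. Variable-block dedupe re-synchronizes at the first content-defined boundary after the insertion and catches all the unchanged content from there on.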
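The one-byte-insert example can be reproduced at small scale. The sketch below uses a Gear-style rolling hash as the content-defined chunker; that is one common choice, not necessarily what any given product uses. The table construction, the 8-bit boundary mask (roughly 256-byte average chunks), and all function names are assumptions for illustration.

```python
import hashlib

# One pseudo-random 64-bit value per byte value. Derived from SHA-256 only so
# the example is deterministic; this table is an assumption, not a standard.
GEAR = [int.from_bytes(hashlib.sha256(bytes([b])).digest()[:8], "big") for b in range(256)]
MASK = (1 << 8) - 1  # cut when the low 8 bits are zero: ~256-byte average chunks


def cdc_chunks(data: bytes) -> list[bytes]:
    """Split data at positions chosen by a rolling hash of the recent bytes."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFFFFFFFFFF
        if h & MASK == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def fixed_chunks(data: bytes, size: int = 256) -> list[bytes]:
    return [data[i:i + size] for i in range(0, len(data), size)]


def shared_fraction(old: list[bytes], new: list[bytes]) -> float:
    """Fraction of the new chunks that already exist among the old chunks."""
    old_set = set(old)
    return sum(1 for c in new if c in old_set) / len(new)


# 64 KiB of deterministic, non-repeating test data.
data = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(2048))
edited = b"X" + data  # one byte inserted at the front

fixed_hit = shared_fraction(fixed_chunks(data), fixed_chunks(edited))
cdc_hit = shared_fraction(cdc_chunks(data), cdc_chunks(edited))
print(f"fixed-block:     {fixed_hit:.0%} of chunks already stored")
print(f"content-defined: {cdc_hit:.0%} of chunks already stored")
```

With fixed chunks, none of the edited file's chunks match the original. With content-defined chunks, only the chunks touching the first few bytes differ, because each cut point depends on the last few bytes of content rather than on the absolute offset.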
The dedupe table
The dedupe table maps each unique block's hash to its on-disk location and reference count. For ZFS:
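- Approximately 320 bytes of table entry per unique block. The total table size therefore scales with the number of unique blocks, which depends on recordsize.
- For 4 KB recordsize: about 268 million blocks per TB, so roughly 80 GB of table per 1 TB of unique data.
- For larger recordsizes (128 KB): about 2.5 GB of table per TB, at the cost of coarser dedupe granularity.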
The table must be efficiently accessible on every write. If it doesn't fit in RAM, the system reads table entries from disk on every block write — destroying performance.
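The table sizes above follow from simple arithmetic: unique blocks times bytes per entry. A small sketch, taking the approximate 320-byte entry size quoted above as given and using binary units; the function name is made up for this example.

```python
ENTRY_BYTES = 320  # approximate per-entry figure quoted above
GIB = 2**30
TIB = 2**40


def ddt_size_bytes(unique_bytes: int, recordsize: int, entry_bytes: int = ENTRY_BYTES) -> int:
    """Estimate dedupe table size: one entry per unique block."""
    unique_blocks = -(-unique_bytes // recordsize)  # ceiling division
    return unique_blocks * entry_bytes


for rs_kib in (4, 16, 128):
    size = ddt_size_bytes(TIB, rs_kib * 1024)
    print(f"{rs_kib:>4} KiB recordsize: {size / GIB:.1f} GiB of table per TiB of unique data")
```

Under these assumptions the estimate is 80 GiB, 20 GiB, and 2.5 GiB of table per TiB for 4 KiB, 16 KiB, and 128 KiB records respectively. Real pools hold a mix of block sizes, so treat the result as an order-of-magnitude guide.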
RAM requirements
The conventional ZFS dedupe RAM rule of thumb:
| Workload | RAM per TB stored |
|---|---|
| Backup target (low write rate) | 1-2 GB |
| VM datastore (moderate write rate) | 3-5 GB |
| Write-heavy general purpose | 5-10 GB |
A 100 TB pool with dedupe wants 100-500 GB of RAM for backup or VM workloads, and up to 1 TB if it is write-heavy. For homelabs and SMB NAS deployments, this is often prohibitive. Many people enable dedupe, run into performance issues months later, and have to migrate to a non-dedupe pool to recover.
When dedupe wins
- Backup repositories. Multiple full backups of the same systems have 90%+ overlap. Dedupe ratios of 10-50x are common.
- VM image storage. 50 VMs running the same OS share most of the OS files. 5-10x dedupe is typical.
- Build artifacts. Sequential CI builds share most files.
- Email archives. Reply-all chains include the original message in every copy.
When dedupe doesn't help
- Already-compressed data. Video, JPEGs, and encrypted blobs look pseudo-random, so distinct files almost never share blocks.
- Encrypted data with per-block keys. Each block looks unique.
- Unique data. Log files with high entropy, sensor readings, scientific data.
- Small datasets. Dedupe overhead exceeds savings.
For these workloads, dedupe wastes RAM and slows writes for no benefit. Disable it.
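One way to check which camp a dataset falls into before enabling dedupe is to hash a representative sample and count unique blocks. The sketch below does that for in-memory byte strings using fixed 4 KB blocks; `dedupe_ratio` and the two synthetic datasets are hypothetical stand-ins, and a real check would read sample files from disk.

```python
import hashlib
from typing import Iterable


def dedupe_ratio(blobs: Iterable[bytes], block_size: int = 4096) -> float:
    """Logical blocks divided by unique blocks (1.0 means nothing to dedupe)."""
    total = 0
    unique = set()
    for blob in blobs:
        for i in range(0, len(blob), block_size):
            total += 1
            unique.add(hashlib.sha256(blob[i:i + block_size]).digest())
    return total / len(unique) if unique else 1.0


# Ten identical "full backups" of a 100-block system image.
image = b"".join(bytes([i]) * 4096 for i in range(100))
backups = [image] * 10

# 100 blocks of hash-derived noise standing in for compressed or encrypted data.
noise = [hashlib.sha256(bytes([i])).digest() * 128 for i in range(100)]

print(f"backups: {dedupe_ratio(backups):.1f}x")
print(f"noise:   {dedupe_ratio(noise):.1f}x")
```

The backup set comes out at 10.0x and the noise at 1.0x. A measured ratio close to 1 on real sample data is a strong hint that dedupe will cost RAM without saving space.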
Compression vs dedupe
Compression and dedupe attack the same problem (storage savings) from different angles:
- Compression — encodes data efficiently. Works on any data with patterns; modest savings (5-50%).
- Dedupe — removes duplicates. Works only when data actually repeats; huge savings when it does.
For most workloads, compression is the better default — small CPU cost, modest but reliable savings, no RAM tax. ZFS LZ4 compression is essentially free and recommended for almost everything. Dedupe is the targeted tool for workloads that specifically have duplicate content.
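The two techniques respond to different kinds of redundancy, which a small experiment makes visible. This sketch uses `zlib` from the Python standard library as a stand-in for LZ4 (an assumption made for portability; ratios will differ from LZ4's) and a naive 4 KB fixed-block dedupe count. Both datasets are synthetic.

```python
import hashlib
import zlib

BLOCK = 4096


def dedupe_ratio(data: bytes, block_size: int = BLOCK) -> float:
    """Logical blocks divided by unique blocks, using fixed-size blocks."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return len(blocks) / len(set(blocks))


def compress_ratio(data: bytes) -> float:
    """Original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data))


# Patterned but never-repeating text: every line carries a unique counter.
log_text = "".join(f"sensor={i} value={i * 7 % 100}\n" for i in range(20000)).encode()

# An incompressible 64 KiB blob stored four times (think: four copies of one JPEG).
blob = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(2048))
copies = blob * 4

for name, data in (("log text", log_text), ("4 copies of blob", copies)):
    print(f"{name:>17}: compression {compress_ratio(data):.2f}x, dedupe {dedupe_ratio(data):.2f}x")
```

The log text compresses well but has no identical blocks, so dedupe finds nothing. The copied blob is the opposite case: dedupe stores it once, while the compressor gains nothing because each copy is pseudo-random and the repeats sit farther apart than its 32 KB window.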
Application-layer dedupe
Some systems do dedupe at the application layer instead of the storage layer. Examples:
- Restic, borg, duplicacy: backup tools with content-defined chunking. Dedupe is built into the backup repository format, so the storage backend (S3, local disk) sees only pre-deduplicated chunks.
- Git: tree-and-blob storage with content-addressed dedupe at the object level.
- Container registries: image layers are content-addressed, so a layer shared by many images is stored once.
Application-layer dedupe has the advantage that the application knows its data — variable boundaries, what's worth deduping, what's already-compressed. Storage-layer dedupe has to be content-agnostic.
Hashing and collisions
Dedupe identifies duplicates by hash comparison. SHA-256 produces a 256-bit digest, so the chance of two different blocks colliding by accident is negligible at any realistic pool size. Weaker hashes (xxHash, FNV) are sometimes used for speed but require a byte-comparison to verify each match before deduplicating.
If a system relies solely on a fast non-cryptographic hash with no verification, a hash collision makes it treat a new block as a duplicate of a different existing block. The new data is discarded, and the file later reads back the other block's contents: silent corruption. Modern dedupe implementations use cryptographic hashes or hash + verify.
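The difference between trusting the hash and hash + verify is easy to demonstrate with a deliberately bad hash. In this sketch `weak_hash` is an 8-bit checksum chosen so that collisions are trivial to produce; it and the `VerifyingStore` class are illustrative inventions, not any real system's design.

```python
def weak_hash(block: bytes) -> int:
    """Deliberately terrible 8-bit checksum so collisions are easy to produce."""
    return sum(block) % 256


class VerifyingStore:
    def __init__(self, verify: bool):
        self.verify = verify
        self.buckets = {}  # weak hash -> list of distinct blocks with that hash

    def write(self, block: bytes) -> tuple[int, int]:
        """Store a block and return (hash, index) as its reference."""
        h = weak_hash(block)
        candidates = self.buckets.setdefault(h, [])
        if candidates and not self.verify:
            return (h, 0)          # trust the hash: assume it is a duplicate
        for idx, existing in enumerate(candidates):
            if existing == block:  # byte-compare confirms a true duplicate
                return (h, idx)
        candidates.append(block)   # same hash, different bytes: store separately
        return (h, len(candidates) - 1)

    def read(self, ref: tuple[int, int]) -> bytes:
        h, idx = ref
        return self.buckets[h][idx]


first, second = b"\x01\x02", b"\x02\x01"  # different data, same weak hash

unsafe = VerifyingStore(verify=False)
unsafe.write(first)
unsafe_ref = unsafe.write(second)

safe = VerifyingStore(verify=True)
safe.write(first)
safe_ref = safe.write(second)

print("without verify:", unsafe.read(unsafe_ref) == second)  # False: wrong data returned
print("with verify:   ", safe.read(safe_ref) == second)      # True
```

Without verification the second write is silently mapped to the first block. With the byte-compare, the collision costs one extra comparison and the data stays intact.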
Operational watch points
- Disabling dedupe doesn't reclaim space from already-deduplicated data. You need to copy data to a non-dedupe destination and delete the original.
- Dedupe + snapshots interact subtly. Snapshots reference deduped blocks; deletions can produce surprising space accounting.
- Performance degrades over time as the dedupe table grows. Plan for the table size at peak fill.
- Backup restore performance depends on dedupe — reading a deduped backup requires following many references. Random-access restores can be slow.
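The space-accounting surprise comes from reference counting: deleting one holder of a block frees nothing while another holder remains. The toy model below treats a snapshot as just one more reference holder. That is a simplification for illustration, not ZFS's actual accounting, and `RefCountedStore` is a hypothetical name.

```python
import hashlib


class RefCountedStore:
    def __init__(self):
        self.blocks = {}  # digest -> [block bytes, reference count]

    def add(self, block: bytes) -> bytes:
        """Add one reference to a block, storing it if it is new."""
        digest = hashlib.sha256(block).digest()
        entry = self.blocks.setdefault(digest, [block, 0])
        entry[1] += 1
        return digest

    def release(self, digest: bytes) -> int:
        """Drop one reference; return the bytes actually freed (0 if still shared)."""
        entry = self.blocks[digest]
        entry[1] -= 1
        if entry[1] == 0:
            del self.blocks[digest]
            return len(entry[0])
        return 0


store = RefCountedStore()
block = b"x" * 4096
live_ref = store.add(block)  # reference held by the live file
snap_ref = store.add(block)  # reference held by a snapshot (modeled as a second holder)

freed_by_delete = store.release(live_ref)    # deleting the file frees nothing
freed_by_snapshot = store.release(snap_ref)  # space returns only with the last reference
print(freed_by_delete, freed_by_snapshot)
```

In this model the file delete reports 0 bytes freed and the final release reports 4096, which is the pattern behind "I deleted the data but the pool did not shrink".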
Frequently Asked Questions
What is deduplication?
A storage technique that identifies identical chunks of data and stores each unique chunk exactly once, with multiple references pointing to it. When the same block appears in many files, it consumes only one block of physical storage. Used in backup systems, VM-heavy storage, and any environment with significant data duplication.
What is the difference between inline and post-process deduplication?
Inline dedupe runs as data is written — every block is hashed and checked against the dedupe table before being stored. Post-process dedupe writes first, then scans the storage afterward to find and merge duplicates. Inline saves write bandwidth but adds latency to writes; post-process is slower to free space but has lower write-path overhead.
Why is ZFS deduplication so RAM-hungry?
Because every write has to look up a hash in the dedupe table, which holds an entry (hash, on-disk location, reference count) for each unique block, and lookups are only fast when the table fits in RAM. Rule of thumb: 1-5 GB of RAM per TB of deduplicated data for backup and VM workloads, more for write-heavy pools. A 100 TB pool with dedupe enabled wants 100-500 GB of RAM just for the dedupe table. Insufficient RAM forces table reads from disk, killing performance.
When does deduplication actually pay off?
Backup repositories (multiple versions of the same files), VM image stores (most VMs share similar OS files), document repositories with many similar versions. Doesn't pay off on already-compressed data (video, JPEGs, encrypted blobs) or unique data (raw sensor logs, randomly-generated data). Test with a small representative workload before committing.
What is the difference between fixed-block and variable-block dedupe?
Fixed-block divides data into chunks of a fixed size (typically 4 KB-128 KB) and hashes each. Simple and fast. Variable-block uses content-defined chunking (rolling hash detects natural boundaries), which catches duplicates even when files have small insertions or deletions. Variable-block achieves higher dedupe ratios but with more CPU cost.