Database Dump Compression — 5× Smaller Than gzip'd pg_dump

The math at 1 TB of dumps / month

A typical mid-size Postgres / MySQL fleet accumulates roughly 1 TB of retained daily dumps per month. Here's what 12 months looks like:

Backup target	Storage cost	Compression assumption	Annual cost (12 mo, 1 TB/mo)
AWS RDS automated snapshots	$0.095 / GB-month¹	RDS internal compression (~2×)	~$7,500 / yr
S3 + `gzip -9` pg_dump	$0.023 / GB-month²	gzip (~3–4× on plain SQL)	~$1,700 / yr
S3 + smallest.zip dump codec	$0.023 / GB-month	columnar zstd (~4–25× on plain SQL, per the benchmarks below)	~$520 / yr
smallest.zip archive store + dump codec	$0.005 / GB-month + $0.02 / GB processed	columnar zstd + cold archive tier	~$365 / yr

Annual cost assumes 1 TB of dumps added per month, accumulating over 12 months at the listed rate and the average of the compression range. The RDS row is sized at 1 TB of logical dump per month; the actual EBS snapshot price varies with churn, but for retention-month accounting the per-GB-month rate is the right comparison.
¹ AWS RDS for Postgres backup pricing, us-east-1. ² S3 Standard, us-east-1. Ratios verified; see benchmarks.
The archive store row is ~20× cheaper than RDS automated snapshots and ~5× cheaper than S3 + gzipped pg_dump.
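The multipliers quoted above follow directly from the Annual cost column. A quick arithmetic check, taking the table's rounded dollar figures as given (the dictionary keys are illustrative names, not product identifiers):

```python
# Annual costs copied from the table above (USD per year, rounded as listed).
annual_cost = {
    "rds_snapshots": 7500,
    "s3_gzip": 1700,
    "s3_dump_codec": 520,
    "archive_store": 365,
}

baseline = annual_cost["archive_store"]

# How many times more each option costs than the archive store row.
savings = {name: cost / baseline for name, cost in annual_cost.items()}

# Share of the RDS snapshot bill that goes away with the archive store row.
cut_vs_rds = 1 - baseline / annual_cost["rds_snapshots"]

for name, factor in savings.items():
    print(f"{name}: {factor:.1f}x the archive-store cost")
print(f"reduction vs RDS snapshots: {cut_vs_rds:.0%}")
```

The inputs are rounded, so the resulting factors are approximate in the same way the "~" figures are.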

How it works

1. Dump your DB

Run whatever you already run: pg_dump --format=plain, mysqldump, or mongodump. We handle the resulting .sql / .dump / .bson.

2. Upload / pipe

Stream the upload over HTTPS, or pipe straight into our CLI: pg_dump mydb | smallzip --engine postgres > mydb.dbz. The engine is auto-detected if you don't pin it.

3. Store at up to 1/5 the gzip'd size

Output goes to your S3 bucket, your GCS bucket, or our cold archive at $0.005/GB-mo. Restore is one command: decompress → psql -f.
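If you drive backups from a script rather than a shell, the pipe and restore steps above can be sketched as follows. This is a minimal sketch, not an official client: the smallzip invocations are the ones quoted on this page, while run_pipe, compress_dump, restore_dump and the database / file names are hypothetical helpers made up for illustration.

```python
import subprocess


def run_pipe(producer_cmd, consumer_cmd, out_path=None):
    """Run `producer | consumer`, optionally redirecting the consumer's stdout to a file."""
    out = open(out_path, "wb") if out_path else None
    try:
        producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
        consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout, stdout=out)
        # Drop our handle on the pipe so the producer gets SIGPIPE if the consumer dies.
        producer.stdout.close()
        codes = (producer.wait(), consumer.wait())
    finally:
        if out is not None:
            out.close()
    if any(codes):
        raise RuntimeError(f"pipeline failed with exit codes {codes}")


def compress_dump(dbname, out_path):
    # Equivalent to: pg_dump --format=plain mydb | smallzip --engine postgres > mydb.dbz
    run_pipe(["pg_dump", "--format=plain", dbname],
             ["smallzip", "--engine", "postgres"], out_path)


def restore_dump(dbz_path, dbname):
    # Equivalent to: smallzip decompress mydb.dbz | psql mydb
    run_pipe(["smallzip", "decompress", dbz_path], ["psql", dbname])
```

Streaming through a pipe means the uncompressed dump never has to exist on disk, and checking both exit codes catches a failed pg_dump that would otherwise leave a truncated but valid-looking archive.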

See it on your own dump

Drop a real pg_dump or mysqldump file. Watch it shrink. Re-download the compressed blob.

No signup. 100 MB cap per file on the trial. Larger files via API.

Supported engines

PostgreSQL — full

pg_dump --format=plain output. The CREATE TABLE schema is preserved, and COPY blocks are encoded column by column according to type (int / uuid / numeric / timestamp / json / text). INSERT-only dumps are also supported.
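To make "encoded columnarly by type" concrete, here is a minimal sketch of the first step: turning one COPY block from a plain pg_dump into per-column value lists and guessing a type for each. It is an illustration only, not our parser. It assumes unquoted identifiers, leaves COPY backslash escapes untouched, and the function names and the three-type guesser are invented for the example.

```python
import re

COPY_HEADER = re.compile(r"^COPY (\S+) \((.*)\) FROM stdin;$")
UUID = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")
INTEGER = re.compile(r"-?\d+")


def copy_block_to_columns(lines):
    """Transpose one COPY ... FROM stdin block into (table, {column: [values]})."""
    match = COPY_HEADER.match(lines[0])
    if match is None:
        raise ValueError("not a COPY block header")
    table = match.group(1)
    names = [name.strip() for name in match.group(2).split(",")]
    columns = {name: [] for name in names}
    for line in lines[1:]:
        if line == "\\.":  # end-of-data marker written by pg_dump
            break
        for name, field in zip(names, line.split("\t")):
            # \N is the COPY text-format spelling of NULL.
            columns[name].append(None if field == "\\N" else field)
    return table, columns


def guess_type(values):
    """Very rough type guess from the non-NULL values of one column."""
    present = [v for v in values if v is not None]
    if present and all(INTEGER.fullmatch(v) for v in present):
        return "int"
    if present and all(UUID.fullmatch(v) for v in present):
        return "uuid"
    return "text"


block = [
    "COPY public.users (id, email, token) FROM stdin;",
    "1\talice@example.com\t3f2b8c1e-9d4a-4c7b-8e21-5a6f0b9c1d2e",
    "2\t\\N\ta1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
    "\\.",
]
table, columns = copy_block_to_columns(block)
for name, values in columns.items():
    print(table, name, guess_type(values), values)
```

Once values sit side by side with others of the same type, each column can be handed to an encoder suited to it instead of being compressed as undifferentiated text.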

MySQL — full

mysqldump default output, including batched INSERT INTO ... VALUES (...), (...) form. Per-table global string dictionary for high-cardinality varchar columns.

MongoDB BSON — full

mongodump BSON output. Schema-inferred columnar encoding of repeated documents, zstd-22 per field.

Oracle — text

expdp SQL exports and Oracle SQL Developer text exports. INSERT and DDL preserved; binary .dmp data files not yet supported.

MSSQL — text

SQL Server Management Studio text exports and bcp-format INSERT scripts. Native .bak backup files not yet supported.

Auto-detect

Drop the file; we sniff the "-- PostgreSQL database dump" and "-- MySQL dump" header comments and BSON magic bytes. You can also pin the engine via ?engine=postgres on the API.
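A minimal sketch of this kind of sniffing, assuming only the first few kilobytes of the file are inspected. The two header strings are the ones named above. The BSON branch is a stand-in heuristic chosen for this example (a plausible little-endian length prefix and a trailing NUL byte), not a description of the checks the service actually runs, and sniff_engine is a hypothetical name.

```python
import struct

MAX_BSON_DOC = 16 * 1024 * 1024  # MongoDB's per-document size limit (16 MiB)


def sniff_engine(head: bytes) -> str:
    """Guess the dump engine from the first bytes of a file."""
    text = head.decode("utf-8", errors="replace")
    if "-- PostgreSQL database dump" in text:
        return "postgres"
    if "-- MySQL dump" in text:
        return "mysql"
    if len(head) >= 5:
        # A BSON document starts with its own total length as a little-endian int32.
        (doc_len,) = struct.unpack_from("<i", head)
        if 5 <= doc_len <= MAX_BSON_DOC:
            # If the whole first document is in view, it must end with a NUL byte.
            if doc_len > len(head) or head[doc_len - 1] == 0:
                return "mongodb"
    return "unknown"


print(sniff_engine(b"--\n-- PostgreSQL database dump\n--\n"))
print(sniff_engine(b"-- MySQL dump 10.13  Distrib 8.0.33, for Linux (x86_64)\n"))
print(sniff_engine(b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"))  # BSON for {"a": 1}
```

Text checks run first because four printable ASCII bytes read as an int32 are always far above the 16 MiB bound, so ordinary SQL text falls through to "unknown" rather than being mistaken for BSON.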

Important: semantically lossless, not byte-exact

Our codec re-emits the dump file from a columnar encoding rather than shuffling bytes. The output is semantically equivalent to the input — same schema, same row data, same row counts, identical post-restore database state — but text-different: whitespace inside SET pragmas may be normalised, and ordering of independent statements (indices, constraints) may differ.

If your compliance pipeline hashes the raw dump file, that hash will not match after a round-trip. The supported verification workflow is: decompress → restore to a transient Postgres / MySQL → re-dump (pg_dump or mysqldump) with stable flags → hash that. Because the post-restore DB state is identical, that hash matches the hash you get by restoring the original dump and re-dumping it the same way.
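The verification workflow above can be sketched for Postgres like this. It assumes two scratch databases already exist: one restored from the original dump and one restored from the decompressed .dbz. The flag set stands in for "stable flags" and is an assumption; substitute whatever your pipeline standardises on. All function names are hypothetical.

```python
import hashlib
import subprocess


def sha256_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large dumps never sit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def redump_hash(dbname, flags=("--format=plain", "--no-owner", "--no-privileges")):
    """SHA-256 of a fresh pg_dump of `dbname`, taken with a fixed flag set."""
    proc = subprocess.Popen(["pg_dump", *flags, dbname], stdout=subprocess.PIPE)
    digest = sha256_stream(proc.stdout)
    if proc.wait() != 0:
        raise RuntimeError(f"pg_dump of {dbname} failed")
    return digest


def roundtrip_matches(reference_db, roundtrip_db):
    """True when both scratch databases re-dump to byte-identical output."""
    return redump_hash(reference_db) == redump_hash(roundtrip_db)
```

Use the same pg_dump binary and flags for both sides; the comparison is only meaningful when the re-dump step is identical for the reference and the round-tripped database.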

A strict-lossless mode (preserves raw bytes alongside the columnar payload) is on the roadmap for hash-on-file customers. Talk to us if you need it now.

Benchmarks

All cases below round-trip with matching table set and matching row counts. Full report: codec-audit/db-dumps.

Dump	Original	gzip -9	smallest.zip	vs gzip	Reduction
pg_dump — narrow users table (1k rows)	53.4 KB	11.8 KB	2.0 KB	5.9×	−96%
pg_dump — INSERT-only narrow (200 rows)	19.5 KB	1.5 KB	1.0 KB	1.5×	−95%
mysqldump — batched INSERT (5k rows)	317.8 KB	57.4 KB	21.0 KB	2.7×	−93%
pg_dump — 5-table shop schema (6.8k rows)	136.7 KB	47.5 KB	28.7 KB	1.65×	−79%
pg_dump — wide mixed types (200k rows, uuid+json+numeric+ts)	21.4 MB	8.0 MB	5.2 MB	1.56×	−76%

"5× smaller than gzip" is the narrow-table COPY case (5.9× in the table above). INSERT-only dumps and wider mixed-type schemas land at 1.5–2.7×. Either way you beat gzip -9 pg_dump.sql at no extra cost.
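For readers who want to check how the last two columns are defined, this short sketch recomputes them from the sizes in the table: "vs gzip" is gzip size divided by smallest.zip size, and "Reduction" is one minus smallest.zip size over original size. Because it starts from the rounded sizes shown, a recomputed ratio can differ slightly in the last digit from the listed one.

```python
# (label, original, gzip -9, smallest.zip), sizes in the units shown in the table.
# Ratios are unitless, so mixing KB rows and the MB row is harmless.
rows = [
    ("pg_dump narrow users table", 53.4, 11.8, 2.0),
    ("pg_dump INSERT-only narrow", 19.5, 1.5, 1.0),
    ("mysqldump batched INSERT", 317.8, 57.4, 21.0),
    ("pg_dump 5-table shop schema", 136.7, 47.5, 28.7),
    ("pg_dump wide mixed types", 21.4, 8.0, 5.2),
]


def vs_gzip(gzip_size, codec_size):
    """How many times smaller the codec output is than gzip -9 output."""
    return gzip_size / codec_size


def reduction(original_size, codec_size):
    """Fraction of the original dump size that was removed."""
    return 1 - codec_size / original_size


for label, original, gz, ours in rows:
    print(f"{label}: {vs_gzip(gz, ours):.2f}x vs gzip, "
          f"-{reduction(original, ours):.0%} vs original")
```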

Frequently asked questions

Which engines are supported?

Postgres (plain pg_dump), MySQL (mysqldump), MongoDB (BSON from mongodump). Oracle and MSSQL text exports also work; Oracle .dmp binary and MSSQL .bak backup files are not yet supported — use the engine's own dump-to-SQL tool first.

How do I restore?

One command: smallzip decompress mydb.dbz | psql mydb. The decompressed file is plain .sql / .bson, semantically equivalent to what your dump tool produced (see the note above on byte-exactness). No proprietary loader.

Cross-version compatibility?

The codec preserves the text of every DDL statement (only the order of independent statements may change), so a Postgres 13 dump compressed by smallest.zip restores cleanly under Postgres 14/15/16 with the same semantics as restoring the raw pg_dump output. We don't translate dialects.

Is it encrypted?

TLS 1.3 in transit. At-rest encryption (AES-256) on our archive store. Bring your own KMS key on enterprise. The compressed blob itself is not encrypted by the codec — wrap with age / gpg / S3 SSE if you need it.

How is this different from gzipping pg_dump?

Gzip compresses bytes; we compress columns. After parsing the COPY block we know "this column is all integers", "this column is uuid", "this column is a JSONB blob", and apply a type-specific encoder (varint, scaled-int, per-column zstd dictionary). On narrow tables that's worth 5–6× over gzip; on wide mixed-type tables it's 1.5–2×.
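Two of the encoders named above are easy to show in miniature. The sketch below is a generic illustration of varint and scaled-int encoding, not our on-disk format: it uses unsigned LEB128 for the varint, handles only non-negative integers and finite decimals, and all function names are made up for the example. The per-column zstd dictionary is not shown.

```python
from decimal import Decimal


def encode_varint(n: int) -> bytes:
    """Unsigned LEB128: 7 payload bits per byte, high bit set on all but the last."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_int_column(values):
    """Concatenate varints for a column of non-negative integers given as text."""
    return b"".join(encode_varint(int(v)) for v in values)


def encode_scaled_numeric(values):
    """Turn decimals such as '12.50' into (scale, integers), e.g. (2, [1250])."""
    decimals = [Decimal(v) for v in values]
    scale = max([0] + [-d.as_tuple().exponent for d in decimals])
    factor = Decimal(10) ** scale
    return scale, [int(d * factor) for d in decimals]


ids = [str(i) for i in range(1, 1001)]
as_text = len("\n".join(ids).encode())
as_varint = len(encode_int_column(ids))
print(f"ids 1..1000: {as_text} bytes as text, {as_varint} bytes as varints")
print(encode_scaled_numeric(["12.50", "3.07", "100"]))
```

The point is that a typed encoding is already smaller than the text before any general-purpose compressor runs, and its output is more regular for that compressor to work on.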

On-prem?

Yes — enterprise tier ships a CLI binary you run in your VPC. Compressed output stays in your bucket. Contact sales.

What about format=custom or format=directory?

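pg_dump --format=custom is already compressed internally (gzip by default; lz4 and zstd can be selected from PostgreSQL 16 onward), so running our codec on top buys you ~5%, which is not worth it. --format=directory output is likewise compressed by default. Use --format=plain (pg_dump's default) and pipe through smallest.zip instead.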

Pricing?

$0.02 per GB of input processed, $0.005 per GB-month of compressed storage on our archive tier. No per-table fees, no per-restore fees. See pricing.
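The two rates combine linearly, so a bill estimate is a few lines of arithmetic. A minimal sketch using the rates quoted above: the compressed size is something you measure on your own dumps, and the retention helper assumes nothing is deleted during the period. Function and parameter names are illustrative.

```python
PROCESSING_RATE = 0.02  # USD per GB of input processed
STORAGE_RATE = 0.005    # USD per GB-month of compressed storage


def gb_months_retained(monthly_stored_gb, months):
    """GB-months billed when the same amount is added every month and never deleted."""
    # Month 1 holds 1x, month 2 holds 2x, ... so the total is a triangular number.
    return monthly_stored_gb * months * (months + 1) / 2


def archive_bill(input_gb, stored_gb_months):
    """Return (processing, storage, total) in USD."""
    processing = input_gb * PROCESSING_RATE
    storage = stored_gb_months * STORAGE_RATE
    return processing, storage, processing + storage


# Processing fee alone for 12 months of 1 TB (1,000 GB) dumps per month.
processing, _, _ = archive_bill(input_gb=12 * 1000, stored_gb_months=0)
print(f"processing for 12 TB of input: ${processing:.2f}")
```

Processing is charged on uncompressed input while storage is charged on compressed output, which is why the two quantities are passed separately.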

SLA?

99.9% uptime on standard, 99.99% on enterprise with multi-region replication. Credits applied automatically if missed.

What if smallest.zip disappears?

The decoder is a single static binary. Enterprise customers get a perpetual source-available license to it. Your .dbz files are never trapped.

Cut your DB backup bill by up to 95%

Drop a real pg_dump or mysqldump and watch it shrink. No signup, no credit card.
