Genozip vs 4BIN: Which Compresses FASTQ Better?
If you're compressing FASTQ files, you've probably come across Genozip — an open-source genomic data compressor that supports FASTQ, BAM, VCF, and other formats. It's a solid tool with good community adoption.
We built 4BIN with a different approach. Here's how they compare on real sequencing data.
Benchmark Setup
We tested both compressors on three whole-genome sequencing datasets from the DDBJ Sequence Read Archive. All compression is fully lossless — decompressed output is bit-identical to the original input.
| Dataset | Raw Size | Genozip (% of raw) | 4BIN (% of raw) | Winner (size ratio) |
|---|---|---|---|---|
| DRR000798 | 100% | ~7.1% | 4.56% | 4BIN (1.56x) |
| DRR000801 | 100% | ~7.3% | 4.77% | 4BIN (1.53x) |
| DRR000802 | 100% | ~6.9% | 4.47% | 4BIN (1.54x) |
4BIN compresses these datasets to between 4.47% and 4.77% of their original size, producing files roughly 1.5x smaller than Genozip's on all three.
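The ratios in the Winner column follow directly from the two percentage columns. Here is a minimal sketch of that arithmetic, using only the figures from the table above (the Genozip values are the approximate ones marked with "~"); the dictionary layout and function name are illustrative, not part of either tool.

```python
# Compressed size as a percentage of raw size, copied from the benchmark table.
# Genozip values are approximate (marked "~" in the table).
results = {
    "DRR000798": {"genozip": 7.1, "4bin": 4.56},
    "DRR000801": {"genozip": 7.3, "4bin": 4.77},
    "DRR000802": {"genozip": 6.9, "4bin": 4.47},
}


def size_ratio(larger_pct: float, smaller_pct: float) -> float:
    """How many times larger one compressed output is than the other."""
    return larger_pct / smaller_pct


# Ratio of Genozip output size to 4BIN output size, rounded as in the table.
ratios = {
    name: round(size_ratio(r["genozip"], r["4bin"]), 2)
    for name, r in results.items()
}

for name, ratio in ratios.items():
    print(f"{name}: Genozip output is {ratio:.2f}x the size of 4BIN output")
```

Because the Genozip inputs are rounded to one decimal place, the second decimal of each ratio should be read as indicative rather than exact.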
How Does This Translate to Storage Costs?
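For a 1 PB raw FASTQ archive stored on AWS S3 at $0.023/GB/month, taking gzip at 25% of raw size and using the approximate Genozip and 4BIN ratios from the benchmark: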
| Compressor | Compressed Size | Annual S3 Cost |
|---|---|---|
| gzip | 250 TB | $69,000 |
| Genozip | ~71 TB | $19,596 |
| 4BIN | 45 TB | $12,420 |
Switching from Genozip to 4BIN saves an additional $7,176/year per petabyte — and both are lossless.
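The cost figures above are simple to reproduce or adapt to your own archive size. The sketch below assumes a flat rate of $0.023/GB/month with no tiering, decimal units (1 TB = 1,000 GB), and the compressed sizes from the table; the function and variable names are illustrative.

```python
PRICE_PER_GB_MONTH = 0.023  # USD, the S3 rate quoted above (flat, no tiering assumed)
GB_PER_TB = 1000            # decimal units, which the table's figures imply
MONTHS_PER_YEAR = 12


def annual_cost_usd(compressed_tb: float) -> float:
    """Yearly storage cost for a given compressed size in TB."""
    return compressed_tb * GB_PER_TB * PRICE_PER_GB_MONTH * MONTHS_PER_YEAR


# Compressed sizes for a 1 PB raw archive, taken from the table above.
compressed_tb = {"gzip": 250, "Genozip": 71, "4BIN": 45}

costs = {name: round(annual_cost_usd(tb)) for name, tb in compressed_tb.items()}
savings = costs["Genozip"] - costs["4BIN"]

for name, cost in costs.items():
    print(f"{name}: ${cost:,}/year")
print(f"Genozip -> 4BIN saving: ${savings:,}/year per petabyte")
```

Running this reproduces the table: $69,000, $19,596, and $12,420 per year, with a $7,176 difference between Genozip and 4BIN. Changing `compressed_tb` or the rate lets you plug in your own archive size or storage class.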
Feature Comparison
| Feature | Genozip | 4BIN |
|---|---|---|
| FASTQ compressed size (% of raw) | ~7% | ~4.5% |
| Lossless | Yes | Yes |
| BAM support | Yes | Yes |
| VCF support | Yes | Yes |
| API access | No (CLI only) | Yes |
| Cloud-native | No | Yes |
| Open source | Yes | No |
Genozip is open source and runs locally. 4BIN is available as a cloud API, making it easier to integrate into existing bioinformatics pipelines without local installation.
When to Use Each
Choose Genozip if you need a free, open-source tool for local compression and can accept files roughly 1.5x larger.
Choose 4BIN if you're optimizing for minimum storage cost, need API integration, or are compressing at scale where the 1.5x difference compounds into significant savings.
Try 4BIN
Get free API access to test 4BIN on your own FASTQ data. Sign up here or read the API docs.