Blog

Research, benchmarks, and updates from the Smallest.zip team.

Mar 05, 2026 5 min read

Beyond FASTQ: 4BIN Compression for VCF, BAM, POD5, and Spatial Genomics

We expanded our genomics compression beyond FASTQ to four new file types. VCF achieves 385x reduction, BAM saves 84.7%, POD5 hits 2.69 bps, and spatial transcriptomics compresses to 4.73% of original size.

Compression Genomics DNA
Mar 04, 2026 3 min read

4BIN Now Compresses scRNA, RNA-seq, cfDNA, and WES — With Sub-12s Decompression

4BIN compression expanded beyond amplicon FASTQ to four new sequencing types. scRNA hits 5.87% of original size with 3.1s decompression. Every type decompresses in under 12 seconds.

Compression Genomics DNA
Mar 03, 2026 3 min read

4BIN vs PetaGene: Beating the Industry Standard in FASTQ Compression

Our 4BIN encoder compresses FASTQ files to 4.5% of original size — 1.1–1.2x better than PetaGene — while preserving 4-level quality scores. All amplicon files tested beat PetaGene.

Compression Genomics DNA
Feb 25, 2026 2 min read

Compressing Neural Network Weights: 40% Smaller Safetensors

We achieved 40% compression on BF16 safetensor model weights, cutting egress costs for terabyte-scale models by thousands of dollars per month.

Compression AI Machine Learning
Feb 14, 2026 3 min read

Introducing LIS: Backend Large Image Storage That Cuts Cloud Costs by 5,000x

Our new Large Image Storage system compresses a 1 PB image corpus to ~190 GB, reducing S3 costs from $23,000/mo to $4.37/mo while returning images in under 200ms.

Compression Image Storage
Feb 10, 2026 2 min read

Genozip vs 4BIN: FASTQ Compression Benchmark 2026

Head-to-head comparison of Genozip and 4BIN for FASTQ compression. 4BIN achieves 4.5% of original size vs Genozip's ~7%. Benchmarked on real DDBJ sequencing data.

Compression Genomics FASTQ
Feb 08, 2026 2 min read

28GB Windows Logs Down to 9.8MB — 94.4% Smaller Than xz

We compressed a 28GB Windows log file with 114.6 million lines down to just 9.8MB — 94.4% smaller than xz -9 at maximum compression.

Compression Benchmarks Windows
Feb 08, 2026 3 min read

70% Smaller JPEGs — Still Standard JPEG, Works Everywhere

Our JPEG compressor reduces images by up to 70% while outputting standard JPEG files that work in every browser, phone, and app. Fast mode runs in 29ms per image.

Compression Benchmarks JPEG
Feb 07, 2026 2 min read

97.6% Smaller Than xz on 11.4GB Windows Security Event Logs

We compressed an 11.4GB Windows security event log (JSONL) down to just 10MB — 97.6% smaller than xz -9. Structured JSON logs are where Smallest.zip truly shines.

Compression Benchmarks Windows
Feb 07, 2026 2 min read

99.4% Compression on ZooKeeper Logs — 82% Smaller Than xz

We compressed a 10.4MB ZooKeeper log file down to just 63KB — 81.6% smaller than xz -9 at maximum compression.

Compression Benchmarks ZooKeeper
Feb 07, 2026 2 min read

HDFS Logs: From 31% to 64% Smaller Than xz — Our V4 Breakthrough

Our V4 token detection system doubled the compression advantage on 1.5GB HDFS logs — now 63.6% smaller than xz -9, and 6x faster.

Compression Benchmarks HDFS
Feb 07, 2026 2 min read

96% Compression on 1.5GB HDFS Logs — 31% Smaller Than xz

We benchmarked Smallest.zip on a massive 1.5GB HDFS log file with 11 million lines. Result — 63MB output, 30.6% smaller than xz -9.

Compression Benchmarks HDFS
Feb 07, 2026 2 min read

93% Smaller Than xz on 26GB Windows CBS Logs

We compressed a 26.1GB Windows Component-Based Servicing log file down to 26MB — 93.2% smaller than xz -9 at maximum compression.

Compression Benchmarks Windows
Feb 07, 2026 2 min read

Fast Mode: Same Compression Ratios, Massively Faster

Our new fast-mode optimizer compresses log files in under a second — while maintaining or exceeding all previous compression ratios. Here's the full benchmark.

Compression Benchmarks Performance
Feb 06, 2026 2 min read

98.3% Compression on Linux Kernel Logs — 65% Smaller Than xz

We tested Smallest.zip against gzip, bzip2, zstd, and xz on a 2.3MB Linux kernel/syslog file. Our encoder compressed it to just 40KB — 65.4% smaller than xz -9.

Compression Benchmarks Linux
Feb 06, 2026 2 min read

99.1% Compression on Apache Logs — 75% Smaller Than xz

We tested Smallest.zip on a 4.9MB Apache log file with 56K lines of mixed error, notice, and access logs. Result — 46KB output, 74.6% smaller than xz -9.

Compression Benchmarks Apache
Feb 05, 2026 3 min read

Reduce AWS Genomics Storage Costs by 95% with FASTQ Compression

How to cut your AWS S3 genomics storage bill by 95%. 4BIN compresses FASTQ files to 4.5% of original size, compared with 25% for gzip. Real cost calculations included.

Compression Genomics AWS
Feb 05, 2026 2 min read

Crushing Log Files: 98.4% Compression on 70MB SSH Logs

We benchmarked our Smallest.zip encoder against gzip, xz, bzip2, and zstd on a real-world 70MB SSH syslog file. The output was 67% smaller than xz -9.

Compression Benchmarks Log Files
Feb 01, 2026 2 min read

PetaGene Alternative: 4BIN Compresses FASTQ 1.15x Better

Looking for a PetaGene alternative? 4BIN compresses FASTQ to 4.5% of original size vs PetaGene's 5.3%, 1.15x better on real sequencing data. Cloud API, no local install required.

Compression PetaGene FASTQ
Jan 20, 2026 3 min read

How to Compress FASTQ Files for S3 Archival

Step-by-step guide to compressing FASTQ files for long-term S3 storage. Compare gzip, Genozip, PetaGene, and 4BIN compression ratios and costs.

Compression FASTQ S3
Jan 15, 2026 3 min read

FASTQ vs BAM: Which Is Cheaper to Store?

Comparing FASTQ and BAM file sizes, compression ratios, and cloud storage costs. How to minimize genomics storage spend across both formats.

Compression FASTQ BAM
Jan 05, 2026 2 min read

Geospatial Compression: GeoJSON, Shapefile, LiDAR and GeoTIFF Below 10%

We hit our geospatial compression targets — GeoJSON at 9.85%, Shapefile at 9.82%, LiDAR at 6.99%, and GeoTIFF DEM at 23% — all lossless with bbox query support.

Compression Geospatial GeoJSON