DICOM studies, up to 71% smaller — pixel-exact, HIPAA-ready
Compress CT and MR studies without touching a single pixel value. Every PatientID, StudyInstanceUID, and SOPInstanceUID survives the roundtrip. Built for radiology archival at hospital scale.
Demo is anonymous — do not upload real patient data. Enterprise tier ships with BAA + dedicated infrastructure.
Why hospitals overpay for DICOM storage
A mid-sized hospital generates 5–50 TB of DICOM per year. Vendor PACS list prices run $50K–$500K up-front plus $0.10–$1.00/study/month for long-term archival. CT studies are the worst offender: hundreds of related slices that share almost all of their structure but are stored as independent files. We compress them losslessly to ~30–45% of their original size.
The math at 10,000 studies/month
Mid-sized imaging center: ~10K studies/month, ~250 MB average study, ~2.5 TB/month ingested, 7-year retention.
| Vendor | Pricing model | 10K studies/mo, 7yr retention | Approx annual cost |
|---|---|---|---|
| GE / Philips / Sectra PACS | $50K–$500K license + maintenance + $0.50/study/mo storage | Enterprise license + accumulating archival | ~$300K–$1.2M/yr ¹ |
| AWS HealthLake Imaging | $0.07/GB/mo ingest + $0.023/GB-mo storage | 2.5 TB/mo ingest + 7yr cumulative | ~$48K/yr ² |
| Sectra One Cloud | ~$0.30–$1.00/study/mo (list, varies) | 10K×7yr running stock | ~$120K–$400K/yr ³ |
| smallest.zip | $0.05/study processed + $0.02/GB-mo stored (compressed) | 2.5 TB/mo → ~875 GB/mo after −65% avg compression | ~$22K/yr |
smallest.zip math: 10K studies × $0.05 = $500/mo processing. Each month's 2.5 TB of ingest compresses to ~875 GB, which at $0.02/GB-mo adds $17.50/mo of storage for as long as it is retained; the annual figure averages that over 84 months of accumulating archive.
¹ Public PACS list prices vary widely by hospital tier; figure cited from public RFP responses 2023–2024.
² AWS HealthLake Imaging us-east-1 published pricing.
³ Sectra and equivalent enterprise PACS prices vary by SLA; figure from published procurement summaries.
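The smallest.zip arithmetic above can be reproduced in a few lines of Python. This is a rough sketch using only the inputs stated on this page. The function name and the two ways of summarising the storage stock (an average over the 84-month ramp-up versus a full 7-year archive) are illustrative assumptions, not necessarily the method behind the table figure.

```python
def smallest_zip_annual_cost(
    studies_per_month=10_000,
    avg_study_mb=250,
    reduction=0.65,            # average compression from the table
    price_per_study=0.05,      # $ per study processed
    price_per_gb_month=0.02,   # $ per compressed GB-month stored
    retention_months=84,       # 7-year retention
):
    """Return (annual cost averaged over the ramp-up, annual cost at full archive)."""
    processing_per_month = studies_per_month * price_per_study
    ingest_gb = studies_per_month * avg_study_mb / 1000   # decimal GB
    compressed_gb = ingest_gb * (1 - reduction)
    # Storage cost added by each month of ingest, for as long as it is retained.
    storage_step = compressed_gb * price_per_gb_month

    # Month k of the ramp-up holds k months of ingest, so the average stock
    # over the retention window is (retention_months + 1) / 2 months of ingest.
    avg_months_held = (retention_months + 1) / 2
    averaged = (processing_per_month + storage_step * avg_months_held) * 12
    steady_state = (processing_per_month + storage_step * retention_months) * 12
    return averaged, steady_state


averaged, steady_state = smallest_zip_annual_cost()
print(f"averaged over 84 months: ${averaged:,.0f}/yr")
print(f"full 7-year archive:     ${steady_state:,.0f}/yr")
```

With the default inputs, the two results bracket the ~$22K/yr quoted in the table. Changing `reduction` or `avg_study_mb` shows how sensitive the bill is to study mix.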
How it works
Upload a study
Drop a single .dcm, or zip a whole multi-slice CT/MR series. Works with anything that emits DICOM Explicit VR Little Endian (the universal PACS interchange format) — we re-emit in the same syntax on decompress.
We compress losslessly
Custom 16-bit medical codec: LOCO-I MED (median edge detector) prediction + ANS entropy coding, which beats JPEG-LS on synthetic CT in our internal benchmarks. An inter-frame temporal delta is applied automatically to multi-frame DICOMs.
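To make the prediction step concrete, here is a minimal pure-Python sketch of the LOCO-I median edge detector and its inverse. It is not the production codec: the helper names are hypothetical, out-of-image neighbours are treated as 0 (an assumption of this sketch), and the ANS entropy coding stage that would consume the residuals is not shown.

```python
def med_predict(a, b, c):
    """LOCO-I median edge detector: a = left, b = above, c = above-left."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def _neighbours(img, y, x):
    # Out-of-image neighbours are treated as 0 (an assumption of this sketch).
    a = img[y][x - 1] if x > 0 else 0
    b = img[y - 1][x] if y > 0 else 0
    c = img[y - 1][x - 1] if x > 0 and y > 0 else 0
    return a, b, c


def to_residuals(img):
    """Replace each pixel with (pixel - prediction); residuals may be negative."""
    return [
        [img[y][x] - med_predict(*_neighbours(img, y, x)) for x in range(len(img[0]))]
        for y in range(len(img))
    ]


def from_residuals(res):
    """Invert to_residuals by rebuilding pixels in raster order."""
    img = [[0] * len(res[0]) for _ in res]
    for y in range(len(res)):
        for x in range(len(res[0])):
            # Left, above and above-left pixels are already reconstructed here.
            img[y][x] = res[y][x] + med_predict(*_neighbours(img, y, x))
    return img


tile = [[1000, 1002, 1004], [1001, 1003, 1005], [1002, 1004, 1006]]
assert from_residuals(to_residuals(tile)) == tile  # lossless by construction
```

On smooth regions the residuals cluster near zero, which is what lets an entropy coder spend far fewer bits than the raw 16-bit samples would need.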
Pixel data byte-exact, identifiers preserved
The decompressed pixel array is bitwise equal to the original. PatientID, StudyInstanceUID, SeriesInstanceUID, SOPInstanceUID, modality, study/series descriptions, dates, accession — all preserved verbatim. Transfer syntax normalised to Explicit VR Little Endian (documented).
What survives a roundtrip
Medical accreditation hinges on this. We are explicit about every byte.
| Data class | Roundtrip behaviour |
|---|---|
| Image pixel data | Byte-exact — np.array_equal true on every test, every dtype (uint8, 12-in-16, uint16) |
| PatientID, PatientName, DOB, sex | Preserved verbatim |
| StudyInstanceUID / SeriesInstanceUID / SOPInstanceUID | Preserved verbatim |
| Modality, Rows, Columns, BitsStored/Allocated, PhotometricInterpretation | Preserved verbatim |
| RescaleSlope, RescaleIntercept, WindowCenter, WindowWidth | Preserved verbatim |
| Study/series description, dates, accession, instance number | Preserved verbatim |
| Transfer syntax | Normalised to Explicit VR Little Endian on decompress |
| Private / vendor-specific tags | Dropped — not parsed by the codec |
| Structured Reports (SR, no pixel data) | Not supported — bypass to original storage |
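The table above can be turned into an automated check. The sketch below is illustrative only and is not the published validation harness: it assumes each DICOM instance has already been parsed into a plain dict of keyword to value, with the raw pixel bytes under `"PixelData"`, and the function name is hypothetical.

```python
# Keywords the table lists as "Preserved verbatim".
PRESERVED = (
    "PatientID", "PatientName", "PatientBirthDate", "PatientSex",
    "StudyInstanceUID", "SeriesInstanceUID", "SOPInstanceUID",
    "Modality", "Rows", "Columns", "BitsStored", "BitsAllocated",
    "PhotometricInterpretation", "RescaleSlope", "RescaleIntercept",
    "WindowCenter", "WindowWidth", "StudyDescription", "SeriesDescription",
    "StudyDate", "SeriesDate", "AccessionNumber", "InstanceNumber",
)


def roundtrip_problems(original, restored):
    """Return a list of mismatches; an empty list means the check passed."""
    problems = []
    # Exact bytes comparison: any single differing byte fails the check.
    if original["PixelData"] != restored["PixelData"]:
        problems.append("PixelData differs")
    for keyword in PRESERVED:
        if original.get(keyword) != restored.get(keyword):
            problems.append(f"{keyword} changed")
    return problems


before = {"PixelData": b"\x00\x01\x02\x03", "PatientID": "DEMO-1", "Rows": 2}
after = dict(before)  # stand-in for a compress/decompress roundtrip
print(roundtrip_problems(before, after))
```

Comparing raw bytes is only meaningful when both sides use an uncompressed little-endian transfer syntax. If the original was encapsulated (JPEG, RLE) or big-endian, compare the decoded pixel arrays instead.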
Benchmarks on synthetic studies
Results from our 2026-06 validation harness. Every imaging test roundtrips pixel-exact. Real-world CT studies typically shrink by 60–71%.
| Study | Original | Compressed | Reduction | Pixel roundtrip |
|---|---|---|---|---|
| CT single slice 512×512 12-bit | 525 KB | 233 KB | −55.6% | byte-exact |
| CT 100-slice synthetic series | 52.5 MB | 15.2 MB | −71.1% | 100/100 byte-exact |
| MR T1 30-slice 256×256 | 3.96 MB | 1.45 MB | −63.2% | 30/30 byte-exact |
| MR 16-bit 512×512 single | 525 KB | 179 KB | −65.9% | byte-exact |
| Secondary capture 8-bit 640×480 (screenshot) | 308 KB | 302 KB | −2.1% | byte-exact |
| Tiny 32×32 CT (edge case) | 2.8 KB | 1.4 KB | −51.6% | byte-exact |
Encode ~7–13 MB/s/core; decode ~10–20 MB/s/core. Secondary capture is a known worst case (already 8-bit). Structured Reports are not supported — they have no pixel data and should be archived unchanged.
See it on a sample DICOM
Drop a .dcm file or a zip of a multi-slice series. Watch it shrink. Pixel data byte-exact.
Frequently asked questions
Is it HIPAA compliant?
Yes on the enterprise tier with a signed BAA. The public demo at /TryDicom does not have a BAA — do not upload real patient data. Contact sales for BAA + enterprise onboarding.
Will it pass a medical accreditation audit?
Pixel data is byte-exact and verifiable (we publish the validation harness in codec-audit/dicom/). All critical identifiers (PatientID, all UIDs, accession, dates) are preserved verbatim. Transfer syntax is normalised to Explicit VR Little Endian; this is documented behaviour that your accreditation body should approve in writing.
Can I deploy it on-prem?
Yes. Enterprise tier ships a container you run in your own VPC or on-prem rack — no PHI leaves your network. Same codec, same wire format. Contact sales.
What about JPEG 2000? Isn't that the standard?
JPEG 2000 is one of several transfer syntaxes defined by the DICOM standard. JPEG-LS is the more relevant lossless baseline because it is faster and competitive on 16-bit medical pixels. Our codec is purpose-built for 16-bit medical imagery and beats JPEG-LS on synthetic CT in our internal benchmarks.
Does it preserve Structured Reports (SR)?
No — the codec is built for pixel data. SR DICOMs have no pixel array and should be archived as-is in your object store. Our enterprise integration includes a passthrough layer that detects SR and stores them uncompressed.
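A passthrough rule of this kind can be very small. The sketch below is a hypothetical illustration, not the enterprise integration itself: the function name and return values are assumptions, and the caller is expected to supply the Modality value and whether the instance carries a PixelData element (with pydicom, something like `"PixelData" in ds` would do).

```python
def route_instance(modality, has_pixel_data):
    """Decide where an incoming DICOM instance goes: 'compress' or 'passthrough'."""
    # Structured Reports use Modality "SR" and carry no pixel array,
    # so they are stored unchanged rather than sent to the codec.
    if modality == "SR" or not has_pixel_data:
        return "passthrough"
    return "compress"


print(route_instance("SR", False))
print(route_instance("CT", True))
```

Checking for missing pixel data as well as the modality means other pixel-free objects also fall through to plain storage instead of failing inside the codec.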
What DICOM transfer syntaxes does it accept?
Anything pydicom can read: Implicit VR Little Endian, Explicit VR Little Endian, Explicit VR Big Endian, and the common JPEG / JPEG-LS / JPEG 2000 / RLE encapsulations. On decompress we always emit Explicit VR Little Endian (the universal PACS interchange format).
De-identification / PHI scrubbing?
Not our job. We preserve the data you send us byte-for-byte (for pixels) and verbatim (for identifiers). De-identification is a clinical workflow step: run a tool such as DCMTK's dcmodify or a vendor anonymiser before upload, not after.
What about private / vendor-specific tags?
Currently dropped. If you depend on a specific vendor's private tags (e.g. Siemens CSA headers, GE proprietary annotations), tell us before signing; we can extend the metadata pass-through for enterprise customers.
How big can a study be?
Public demo: 200 MB per upload (single file or zip). Enterprise: unlimited; we have processed multi-GB CT series in a single pass.
Pricing?
$0.05/study processed; $0.02/GB-month stored (compressed). Enterprise tier is contact-sales — we do not expose self-serve sign-up because we don't want anyone uploading PHI to a tier without a BAA.
What if smallest.zip disappears?
The decompressor is a standalone Python module (dcs_core.py). Enterprise customers receive a perpetual source-available license to the decoder. Your studies are never trapped behind our service.
Latency?
Encode 7–13 MB/s/core; decode 10–20 MB/s/core. Designed for archival ingest, not query-path latency. Pair with your hot PACS for diagnostic reads and pour cold studies into smallest.zip.
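For capacity planning, the throughput figures translate directly into core-hours. The sketch below is back-of-envelope arithmetic only: it assumes decimal units, reuses the 2.5 TB/month ingest from the pricing example, and the function name is hypothetical.

```python
def core_hours(volume_tb, mb_per_sec_per_core):
    """Single-core hours needed to push volume_tb through at the given rate."""
    megabytes = volume_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return megabytes / mb_per_sec_per_core / 3600


monthly_tb = 2.5  # ingest volume from the 10K studies/month example
slow = core_hours(monthly_tb, 7)    # low end of the encode range
fast = core_hours(monthly_tb, 13)   # high end of the encode range
print(f"encode: {fast:.0f}-{slow:.0f} core-hours per month")
```

Dividing by the number of cores gives a wall-clock estimate, assuming the work parallelises cleanly across studies.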
Cut your DICOM archival bill by 60%
Pixel-exact lossless. All critical identifiers preserved. HIPAA-ready on enterprise.