What is Lossy Compression?
You're probably familiar with lossy compression from everyday life—JPEG images and MP3 audio files sacrifice some quality to dramatically reduce file size. The same principle applies to numerical data in scientific computing and machine learning.
Lossy compression achieves higher compression ratios than lossless methods (like ZIP or GZIP) by allowing small, controlled errors in the reconstructed data. The key question is: how much error is acceptable?
Key Insight: In many ML and scientific applications, floating-point values contain more precision than is actually meaningful. Compressing away this "noise" can save 10-100x storage/bandwidth with negligible impact on results.
Error-Bounded Lossy Compressors (EBLCs)
Error-Bounded Lossy Compressors give you precise control over the maximum error introduced during compression. Unlike general lossy compression where quality is often subjective, EBLCs provide mathematical guarantees.
Types of Error Bounds
- Absolute Error Bound — The difference between original and decompressed values is at most ε: |original - decompressed| ≤ ε
- Relative Error Bound — The error is proportional to the original value: |original - decompressed| / |original| ≤ ε
- Point-wise Relative (PW_REL) — A relative bound applied value by value, with special handling so it stays meaningful for values near zero
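To make the bound types concrete, here is a small self-contained sketch (the function names are ours, not from any compression library) that checks whether a reconstruction satisfies an absolute or a relative bound:

```python
import numpy as np

def abs_bound_satisfied(original, decompressed, eps):
    # absolute bound: |original - decompressed| <= eps everywhere
    return bool(np.all(np.abs(original - decompressed) <= eps))

def rel_bound_satisfied(original, decompressed, eps):
    # relative bound: |original - decompressed| <= eps * |original|
    # (undefined at exact zeros, so we skip them here)
    nonzero = original != 0
    err = np.abs(original - decompressed)[nonzero]
    return bool(np.all(err <= eps * np.abs(original[nonzero])))

rng = np.random.default_rng(0)
orig = rng.standard_normal(10_000)
eps = 1e-3

# Simulate an absolute-error-bounded compressor: perturb each value
# by strictly less than eps
recon = orig + rng.uniform(-0.9 * eps, 0.9 * eps, orig.shape)

print(abs_bound_satisfied(orig, recon, eps))  # True
print(rel_bound_satisfied(orig, recon, eps))  # False: near-zero values violate it
```

The second check failing is exactly why PW_REL bounds exist: an absolute bound says nothing about relative accuracy for values near zero.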
[Figure: EBLC compression pipeline with guaranteed error bounds. 32-bit floats compressed with error bound ε become ~10-50x smaller, with per-value error ≤ ε.]
The Four Major EBLCs
Let's explore the most widely used error-bounded lossy compressors and understand how each one works:
SZ2
Prediction-Based Compression
SZ2 predicts each value from its neighbors and stores either a quantized prediction error or, when prediction fails, the original value. It adapts its predictor on the fly, making it excellent for data with local patterns.
ZFP
Block-Based Transform Coding
ZFP divides data into fixed-size blocks (4×4×4 for 3D), applies an orthogonal transform, and uses embedded coding to progressively encode coefficients. Great for random-access decompression.
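The random-access property comes directly from this blocking: each 4×4×4 block is coded independently, so one block can be located and decoded without touching the rest. A minimal sketch of the block indexing (index arithmetic only, not the actual zfp transform or bit stream):

```python
import numpy as np

def get_block(data, bi, bj, bk, size=4):
    """Return the (bi, bj, bk)-th size^3 block of a 3D array."""
    i, j, k = bi * size, bj * size, bk * size
    return data[i:i + size, j:j + size, k:k + size]

# A 16x16x16 field decomposes into 4x4x4 = 64 independent blocks
field = np.arange(16 ** 3, dtype=np.float32).reshape(16, 16, 16)
block = get_block(field, 2, 1, 3)
print(block.shape)  # (4, 4, 4)
```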
TTHRESH
Tensor Decomposition
TTHRESH uses Tucker decomposition to compress multidimensional arrays. It's particularly effective for visual data and tensors with strong correlations across dimensions.
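The decomposition idea can be sketched with a truncated higher-order SVD in plain NumPy (illustrative only; the real TTHRESH coder adds bit-plane coding of the core coefficients on top of this):

```python
import numpy as np

def truncated_hosvd(tensor, ranks):
    """Project each mode of `tensor` onto its leading singular vectors."""
    core = tensor
    factors = []
    for mode, r in enumerate(ranks):
        # unfold the current core along `mode` and keep the top-r
        # left singular vectors of that unfolding
        unfolded = np.moveaxis(core, mode, 0).reshape(core.shape[mode], -1)
        u = np.linalg.svd(unfolded, full_matrices=False)[0][:, :r]
        factors.append(u)
        # project the core onto the kept subspace along this mode
        core = np.moveaxis(np.tensordot(u.T, core, axes=([1], [mode])), 0, mode)
    return core, factors

def reconstruct(core, factors):
    out = core
    for mode, u in enumerate(factors):
        out = np.moveaxis(np.tensordot(u, out, axes=([1], [mode])), 0, mode)
    return out

# A tensor with strong cross-dimensional correlation: multilinear rank (2, 2, 2)
rng = np.random.default_rng(1)
a = rng.standard_normal((8, 2))
b = rng.standard_normal((8, 2))
c = rng.standard_normal((8, 2))
t = np.einsum('ir,jr,kr->ijk', a, b, c)

# 8 core entries + 48 factor entries instead of 512 raw values
core, factors = truncated_hosvd(t, (2, 2, 2))
approx = reconstruct(core, factors)
print(core.shape, np.max(np.abs(t - approx)) < 1e-8)
```

For an exactly low-rank tensor the truncation is lossless; for real data the discarded singular vectors carry the error, which is why TTHRESH shines when correlations across dimensions are strong.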
FPZIP
Fast Floating-Point Compression
FPZIP uses a simple prediction scheme combined with fast entropy coding. It prioritizes speed over compression ratio, making it ideal for streaming applications.
How SZ2 Works: A Closer Look
Let's dive deeper into SZ2, the most commonly used EBLC in federated learning research:
Step 1: Prediction
SZ2 uses multiple predictors (Lorenzo, linear regression, etc.) and selects the best one for each data point:
// Lorenzo predictor for 3D data
predicted[i][j][k] = data[i-1][j][k]
                   + data[i][j-1][k]
                   + data[i][j][k-1]
                   - data[i-1][j-1][k]
                   - data[i-1][j][k-1]
                   - data[i][j-1][k-1]
                   + data[i-1][j-1][k-1]
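The same predictor as a runnable sketch (plain NumPy; SZ2's real implementation is in C and combines it with regression predictors). Note that it reproduces locally linear data exactly, which is why prediction errors on smooth fields are tiny:

```python
import numpy as np

def lorenzo_predict_3d(data, i, j, k):
    """Predict data[i, j, k] from its seven already-decoded neighbors."""
    return (data[i-1, j, k] + data[i, j-1, k] + data[i, j, k-1]
            - data[i-1, j-1, k] - data[i-1, j, k-1] - data[i, j-1, k-1]
            + data[i-1, j-1, k-1])

# On a linear field the prediction is exact, so the stored error is zero
ii, jj, kk = np.meshgrid(np.arange(8), np.arange(8), np.arange(8), indexing='ij')
field = 2.0 * ii + 3.0 * jj - 1.0 * kk
print(lorenzo_predict_3d(field, 4, 4, 4), field[4, 4, 4])  # 16.0 16.0
```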
Step 2: Quantization
The prediction error is mapped to an integer code using bins of width 2 * error_bound; if the code falls outside the quantizer's range, the value is flagged as unpredictable and stored directly instead:

error = actual - predicted
code  = round(error / (2 * error_bound))
if |code| < num_bins / 2:
    store code    // reconstructed as predicted + code * 2 * error_bound
else:
    store original value    // unpredictable point
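A minimal round trip for this linear-scaling quantization (a sketch; `quantize` and `num_bins` are illustrative names, and real SZ2 makes the bin count configurable). The bin width of 2 * error_bound guarantees the reconstruction lands within the bound:

```python
def quantize(error, eps, num_bins=65536):
    code = int(round(error / (2 * eps)))
    if abs(code) < num_bins // 2:
        return code      # predictable: store this small integer
    return None          # unpredictable: caller stores the original value

def dequantize(code, eps):
    return code * 2 * eps

eps = 1e-3
actual, predicted = 3.14159, 3.13000
code = quantize(actual - predicted, eps)        # prediction error ~0.0116
recovered = predicted + dequantize(code, eps)
print(code, abs(actual - recovered) <= eps)  # 6 True
```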
Step 3: Encoding
Quantized values are compressed using Huffman coding, exploiting the fact that small prediction errors are more common than large ones.
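To see why entropy coding pays off, look at the distribution of quantization codes on smooth data (a simulation sketch assuming Gaussian prediction errors): small codes dominate, so their entropy is a small fraction of the 32 bits a raw float occupies.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
errors = rng.normal(0.0, 1e-3, 100_000)           # simulated prediction errors
eps = 1e-3
codes = np.round(errors / (2 * eps)).astype(int)  # bin width 2 * eps

# Empirical entropy of the code distribution: the skew toward code 0
# is exactly what Huffman coding exploits
counts = Counter(codes.tolist())
probs = np.array(list(counts.values())) / codes.size
entropy = -(probs * np.log2(probs)).sum()
print(f"distinct codes: {len(counts)}, entropy: {entropy:.2f} bits/value")
```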
Comparison: When to Use Each Compressor
| Compressor | Best For | Speed | Ratio |
|---|---|---|---|
| SZ2 | Smooth, correlated data | Medium | High |
| ZFP | Random access needs | Fast | Medium |
| TTHRESH | Multi-dimensional tensors | Slow | Very High |
| FPZIP | Speed-critical streaming | Very Fast | Low |
EBLCs in Machine Learning
Why do ML practitioners care about scientific compression algorithms? Several key applications:
1. Federated Learning Communication
Model updates (gradients or weights) transmitted between clients and servers can be compressed with EBLCs. Research shows compression ratios of 10-100x with minimal accuracy impact.
2. Model Checkpointing
Training large models requires frequent checkpoints. Compressing checkpoints with error bounds saves significant storage while preserving the ability to resume training.
3. Gradient Compression
In distributed training, gradient communication is often the bottleneck. EBLCs can compress gradients more effectively than simple quantization schemes.
Real-World Results: In our federated learning research, using SZ2 with adaptive error bounds reduced communication by up to 67% compared to uncompressed baselines, with negligible impact on final model accuracy.
Getting Started
Ready to try EBLCs? Each of the compressors above is available as an open-source library (SZ, zfp, TTHRESH, fpzip), several with Python bindings.
Quick Example with SZ
import numpy as np
import pysz
# Create some data
data = np.random.randn(1000, 1000).astype(np.float32)
# Compress with relative error bound of 1e-3
compressor = pysz.SZ(rel_err_bound=1e-3)
compressed = compressor.compress(data)
decompressed = compressor.decompress(compressed, data.shape)
# Check compression ratio
ratio = data.nbytes / len(compressed)
print(f"Compression ratio: {ratio:.1f}x")
Key Takeaways
- EBLCs provide guaranteed error bounds — You control exactly how much precision is lost
- Different compressors suit different data — SZ2 for smooth data, ZFP for random access, FPZIP for speed
- 10-100x compression is achievable — often with negligible impact on downstream tasks
- Growing importance in ML — Essential for efficient federated learning, checkpointing, and distributed training
Next in this series: How we use distortion as a feedback signal to adaptively tune error bounds in federated learning—achieving better compression without sacrificing model accuracy.