LZ4 Format Guide

About LZ4 Format

LZ4 is an extremely fast lossless compression algorithm developed by Yann Collet in 2011. Where most compression algorithms optimize for compression ratio, LZ4 is designed primarily for speed: it can compress data at over 500 MB/s and decompress at several gigabytes per second on modern hardware, approaching memory bandwidth. This makes LZ4 a common choice for real-time compression scenarios where latency is critical. The LZ4 frame format uses xxHash32 for integrity verification.
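
Much of the decompression speed comes from how simple the compressed block layout is. The following is a minimal sketch of a decoder for a single raw LZ4 block, written in pure Python for readability rather than speed. The function name lz4_block_decompress is hypothetical, the sketch ignores the frame header and checksums, and it is not hardened against malformed input.

```python
def lz4_block_decompress(src: bytes) -> bytes:
    """Decode one raw LZ4 block (illustrative sketch, not a production decoder)."""
    out = bytearray()
    i, n = 0, len(src)
    while i < n:
        token = src[i]
        i += 1

        # High nibble: literal length; 15 means extra length bytes follow.
        lit_len = token >> 4
        if lit_len == 15:
            while True:
                b = src[i]
                i += 1
                lit_len += b
                if b != 255:
                    break
        if i + lit_len > n:
            raise ValueError("literal run exceeds block size")
        out += src[i:i + lit_len]
        i += lit_len

        # The final sequence of a block carries literals only.
        if i >= n:
            break

        # Two-byte little-endian offset pointing back into the output.
        offset = src[i] | (src[i + 1] << 8)
        i += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid match offset")

        # Low nibble: match length minus 4 (the minimum match length).
        match_len = token & 0x0F
        if match_len == 15:
            while True:
                b = src[i]
                i += 1
                match_len += b
                if b != 255:
                    break
        match_len += 4

        # Copy byte by byte so that overlapping matches repeat recent output.
        start = len(out) - offset
        for k in range(match_len):
            out.append(out[start + k])
    return bytes(out)


# Hand-built block: 3 literals "abc", a 9-byte match at offset 3, then 5 literals.
block = bytes([0x35]) + b"abc" + bytes([0x03, 0x00]) + bytes([0x50]) + b"xyz12"
print(lz4_block_decompress(block))
```

The decoder needs no entropy decoding or bit-level parsing: each step is a byte read, a literal copy, or a copy from earlier output, which is why optimized implementations can run close to memory speed.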

History of LZ4

Yann Collet created LZ4 in 2011 while working on real-time compression solutions. The algorithm was designed as an alternative to existing fast compressors such as LZO (1996) and Snappy (open-sourced by Google in 2011), offering better speed-to-ratio trade-offs. LZ4 was released as open source under a BSD license and rapidly gained adoption in performance-critical systems. In 2012, Collet added LZ4 HC (High Compression) mode, which sacrifices compression speed for significantly better ratios while keeping the same ultra-fast decompression. The success of LZ4 led Collet to develop Zstandard in 2015, which offers a broader range of speed-ratio trade-offs. Despite Zstandard's existence, LZ4 remains the preferred choice when absolute minimum latency is required. LZ4 has been part of the Linux kernel since version 3.11 (2013) and is available in ZFS through the OpenZFS implementations.

Key Features and Uses

LZ4's defining characteristic is its extreme speed: decompression operates near memory bandwidth (up to 5 GB/s on modern CPUs), making the compression practically transparent to applications. The algorithm offers two modes: LZ4 (default, fastest) and LZ4 HC (high compression, with levels up to 12, giving a better ratio at slower compression but the same fast decompression). The LZ4 frame format supports content size declaration, block checksums, content checksums (xxHash32), and independent blocks for random access. Dictionary compression is available for improving ratios on small data.
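
The frame options listed above are recorded as flag bits at the start of every .lz4 file. As a rough sketch based on the published LZ4 frame format description, the header can be inspected with the Python standard library alone. The function name parse_lz4_frame_header and the dictionary keys are hypothetical, and the sketch reads the header checksum byte without verifying it.

```python
import struct

LZ4_FRAME_MAGIC = 0x184D2204  # stored little-endian as 04 22 4D 18
BLOCK_MAX_SIZES = {4: 64 * 1024, 5: 256 * 1024, 6: 1024 * 1024, 7: 4 * 1024 * 1024}


def parse_lz4_frame_header(data: bytes) -> dict:
    """Read the frame descriptor of an LZ4 frame (illustrative sketch)."""
    if len(data) < 7:
        raise ValueError("too short for an LZ4 frame header")
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic != LZ4_FRAME_MAGIC:
        raise ValueError("not an LZ4 frame")

    flg, bd = data[4], data[5]
    if flg >> 6 != 0b01:
        raise ValueError("unsupported frame version")

    info = {
        "block_independence": bool(flg & 0x20),  # blocks decodable on their own
        "block_checksum": bool(flg & 0x10),      # xxHash32 after each block
        "content_checksum": bool(flg & 0x04),    # xxHash32 of the whole content
        "block_max_size": BLOCK_MAX_SIZES.get((bd >> 4) & 0x07),
        "content_size": None,
        "dict_id": None,
    }

    pos = 6
    if flg & 0x08:  # optional 8-byte uncompressed content size
        (info["content_size"],) = struct.unpack_from("<Q", data, pos)
        pos += 8
    if flg & 0x01:  # optional 4-byte dictionary ID
        (info["dict_id"],) = struct.unpack_from("<I", data, pos)
        pos += 4

    info["header_checksum_byte"] = data[pos]  # not verified in this sketch
    info["header_size"] = pos + 1
    return info


# Header bytes as commonly written by the lz4 command-line tool with default options.
header = bytes.fromhex("04224d186440a7")
print(parse_lz4_frame_header(header))
```

A real reader would also recompute the header checksum with xxHash32 and reject frames where it does not match.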

Common Applications

LZ4 is used in the most performance-sensitive layers of modern infrastructure:

  • Linux kernel: squashfs (read-only filesystem) and initramfs compression
  • ZFS: LZ4 is the default algorithm when compression is enabled in OpenZFS
  • Data systems and formats: ClickHouse, Apache Arrow, Apache Parquet, and RocksDB use LZ4 for column, page, and block compression
  • Game engines: Unity and Unreal Engine use LZ4 for asset compression
  • Networking: protocols use LZ4 for real-time data compression
  • Containers: some container tooling uses LZ4 for fast image layer decompression during startup

Advantages and Disadvantages

Advantages

  • Fastest Decompression: Multi-GB/s speeds, approaching memory bandwidth
  • Minimal CPU Usage: Near-zero CPU overhead for decompression
  • Real-time Capable: Transparent compression for filesystems and databases
  • HC Mode: High compression mode with same fast decompression
  • Kernel Integration: Built into the Linux kernel for squashfs and initramfs
  • ZFS Default: Default algorithm when compression is enabled in OpenZFS
  • Data Infrastructure: Supported by ClickHouse, Arrow, Parquet, RocksDB
  • Low Latency: Ideal for latency-sensitive real-time applications
  • Open Source: BSD licensed, same author as Zstandard

Disadvantages

  • Lower Ratios: Significantly lower compression than gzip, zstd, or xz
  • Single File Only: Cannot archive directories; must be combined with tar
  • No Encryption: No built-in password protection
  • Desktop Support: Not widely recognized by casual users
  • Windows Support: Not natively supported; requires third-party tools
  • Larger Files: Output needs more storage and bandwidth than ratio-focused formats
  • Not for Distribution: Poor choice for downloads where size matters
  • Niche Format: Primarily used in backend/infrastructure, not user-facing
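
Because LZ4 compresses a single byte stream, a directory has to be bundled into one archive before compression, as noted under "Single File Only". The sketch below shows only that bundling step with Python's standard tarfile module. The helper name tar_directory_to_bytes is hypothetical, and the LZ4 step itself is left out because it needs the lz4 command-line tool or a third-party binding.

```python
import io
import tarfile
from pathlib import Path


def tar_directory_to_bytes(directory: str) -> bytes:
    """Bundle a directory into an uncompressed tar archive held in memory."""
    root = Path(directory).resolve()
    buf = io.BytesIO()
    # mode="w" writes a plain tar with no compression; LZ4 would be applied afterwards.
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(str(root), arcname=root.name)
    return buf.getvalue()
```

The returned bytes can then be handed to whichever LZ4 compressor is available, producing what is conventionally named a .tar.lz4 file.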