Blosc & bcolz: efficient compressed data containers
Compressed data containers can be critical when you want to process datasets that exceed the capacity of your computer. Blosc and bcolz let you do that very efficiently.
at this point; there is no better publicly available option that I'm aware of. That's not just "yet another compressor library" case.” — Ivan Smirnov (advocating for Blosc inclusion in h5py)
both chunked and is compressible
• It is meant for both memory and persistent storage (disk)
• Main goal: to demonstrate that compression can accelerate data access (both on disk and in-memory)
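To make "chunked and compressible" concrete, here is a minimal sketch of such a container. It is not the bcolz implementation: the class name `ChunkedCompressedArray` and its methods are hypothetical, and the standard-library `zlib` stands in for Blosc so that the example runs without extra dependencies. The point is only that each chunk is compressed on its own, so reading one item decompresses one chunk rather than the whole dataset.

```python
import zlib
from array import array


class ChunkedCompressedArray:
    """Hypothetical container: fixed-size chunks of int64, each compressed separately."""

    def __init__(self, chunk_len=1024, level=1):
        self.chunk_len = chunk_len   # number of items per chunk
        self.level = level           # zlib level (zlib is a stand-in for Blosc here)
        self.chunks = []             # compressed bytes of every full chunk
        self.tail = array("q")       # uncompressed items that do not fill a chunk yet
        self.length = 0

    def extend(self, values):
        for v in values:
            self.tail.append(v)
            self.length += 1
            if len(self.tail) == self.chunk_len:
                # A chunk is full: compress it and start a new tail.
                self.chunks.append(zlib.compress(self.tail.tobytes(), self.level))
                self.tail = array("q")

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        if not 0 <= i < self.length:
            raise IndexError(i)
        chunk_idx, offset = divmod(i, self.chunk_len)
        if chunk_idx == len(self.chunks):
            return self.tail[offset]
        # Decompress only the chunk that holds item i.
        chunk = array("q")
        chunk.frombytes(zlib.decompress(self.chunks[chunk_idx]))
        return chunk[offset]

    def compressed_nbytes(self):
        return sum(len(c) for c in self.chunks) + self.tail.itemsize * len(self.tail)


a = ChunkedCompressedArray(chunk_len=100)
a.extend(range(1000))
print(len(a), a[457], a.compressed_nbytes(), "bytes vs", 1000 * 8, "uncompressed")
```

Whether compression actually accelerates access depends on the codec, the data, and the hardware; this sketch shows the layout, not the speed-up.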
• This can lead to a poor use of space to accommodate variable-length data (potentially large zero-paddings)
• Blosc2 addresses this shortcoming by using superchunks of variable-length chunks
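The cost of zero-padding can be shown with a little arithmetic. The sketch below assumes the shortcoming is a fixed-length layout in which every record is padded to the size of the longest one. The records are made up, `zlib` again stands in for Blosc, and the "superchunk" is just a Python list of independently compressed chunks, not the Blosc2 API.

```python
import zlib

# Made-up variable-length records; their lengths differ widely.
records = [b"a" * 10, b"b" * 200, b"c" * 3, b"d" * 1000]

# Fixed-length layout: every record is zero-padded to the longest one.
slot = max(len(r) for r in records)
padded = [r.ljust(slot, b"\x00") for r in records]
fixed_nbytes = slot * len(records)
padding_nbytes = fixed_nbytes - sum(len(r) for r in records)

# Variable-length layout: one independently compressed chunk per record.
superchunk = [zlib.compress(r) for r in records]
restored = [zlib.decompress(c) for c in superchunk]

print("fixed layout:", fixed_nbytes, "bytes, of which padding:", padding_nbytes)
print("round trip ok:", restored == records)
```

With these records the fixed layout needs 4000 bytes, 2787 of which are padding; the variable-length chunks carry no padding at all.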
Blosc2 on ARM achieves one of the best bandwidth/Watt ratios in the market
• Profound implications for the density of data storage devices (e.g. arrays of disks driven by ARM)
[Figure: two panels, "Not using NEON" and "Using NEON"]