Signet Forge 0.1.0
C++20 Parquet library with AI-native extensions
byte_stream_split.hpp File Reference

BYTE_STREAM_SPLIT encoding and decoding (Parquet encoding type 9).

#include <bit>
#include <cstdint>
#include <cstring>
#include <vector>


Namespaces

namespace  signet
 
namespace  signet::forge
 
namespace  signet::forge::byte_stream_split
 BYTE_STREAM_SPLIT encoding functions for float and double types.
 

Functions

std::vector< uint8_t > signet::forge::byte_stream_split::encode_float (const float *values, size_t count)
 Encode float values using the BYTE_STREAM_SPLIT algorithm.
 
std::vector< uint8_t > signet::forge::byte_stream_split::encode_double (const double *values, size_t count)
 Encode double values using the BYTE_STREAM_SPLIT algorithm.
 
std::vector< float > signet::forge::byte_stream_split::decode_float (const uint8_t *data, size_t size, size_t count)
 Decode float values from BYTE_STREAM_SPLIT encoding.
 
std::vector< double > signet::forge::byte_stream_split::decode_double (const uint8_t *data, size_t size, size_t count)
 Decode double values from BYTE_STREAM_SPLIT encoding.
 

Detailed Description

BYTE_STREAM_SPLIT encoding and decoding (Parquet encoding type 9).

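Splits IEEE 754 float/double values by byte position, so that bytes of the same significance (the sign/exponent bytes, the mantissa bytes) from all values are stored contiguously. The encoding does not reduce the size of the data itself; it makes the output more compressible, which can substantially improve compression ratios with ZSTD, Snappy, or LZ4 for financial data (prices, rates, quantities) where successive values share exponent bytes. All functions reside in the signet::forge::byte_stream_split namespace.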
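
The byte-position split described above can be illustrated with a minimal, standalone sketch. This is not the Signet Forge implementation: split_floats and merge_floats are hypothetical names chosen for illustration, and the sketch assumes a little-endian host with 4-byte IEEE 754 floats, so that native byte order matches the little-endian order Parquet uses. For N values, the output holds four streams of N bytes each, where stream b contains byte b of every value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == 4, "sketch assumes 4-byte IEEE 754 floats");

// Hypothetical helper (not part of Signet Forge): scatter byte b of value i
// to position b * count + i, producing four contiguous byte streams.
std::vector<std::uint8_t> split_floats(const float* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    constexpr std::size_t kWidth = sizeof(float);
    std::vector<std::uint8_t> out(count * kWidth);
    for (std::size_t i = 0; i < count; ++i) {
        std::uint8_t bytes[kWidth];
        std::memcpy(bytes, &values[i], kWidth);  // native (assumed little-endian) order
        for (std::size_t b = 0; b < kWidth; ++b) {
            out[b * count + i] = bytes[b];       // stream b, element i
        }
    }
    return out;
}

// Hypothetical helper (not part of Signet Forge): gather the four streams
// back into floats. Returns an empty vector if the buffer is too small.
std::vector<float> merge_floats(const std::uint8_t* data, std::size_t size,
                                std::size_t count) {
    constexpr std::size_t kWidth = sizeof(float);
    if (count != 0 && (data == nullptr || size / kWidth < count)) {
        return {};
    }
    std::vector<float> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        std::uint8_t bytes[kWidth];
        for (std::size_t b = 0; b < kWidth; ++b) {
            bytes[b] = data[b * count + i];
        }
        std::memcpy(&out[i], bytes, kWidth);
    }
    return out;
}
```

Under these assumptions, prices such as 101.25f, 101.50f, and 101.75f share their most significant byte, so the last stream becomes a run of identical bytes that a general-purpose compressor handles well. A double variant would follow the same pattern with eight streams.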

See also
https://parquet.apache.org/documentation/latest/ (BYTE_STREAM_SPLIT)

Definition in file byte_stream_split.hpp.