Signet Forge 0.1.0
C++20 Parquet library with AI-native extensions
signet::forge::RleDecoder Class Reference

Streaming decoder for the Parquet RLE/Bit-Packing Hybrid scheme.

#include <rle.hpp>

Public Member Functions

 RleDecoder (const uint8_t *data, size_t size, int bit_width)
 Construct a decoder over a raw encoded byte buffer.
 
bool get (uint64_t *value)
 Read the next decoded value.
 
bool get_batch (uint64_t *out, size_t count)
 Read a batch of decoded values.
 

Static Public Member Functions

static std::vector< uint32_t > decode (const uint8_t *data, size_t size, int bit_width, size_t num_values)
 Decode values from an RLE-encoded buffer without a length prefix.
 
static std::vector< uint32_t > decode_with_length (const uint8_t *data, size_t size, int bit_width, size_t num_values)
 Decode from a buffer that starts with a 4-byte LE length prefix.
 

Detailed Description

Streaming decoder for the Parquet RLE/Bit-Packing Hybrid scheme.

Consumes a byte stream encoded by RleEncoder and yields unsigned integer values, either one at a time via get() or in bulk via get_batch(). The decoder handles both RLE runs and bit-packed groups transparently.

Note
The input data must NOT include a length prefix. For length-prefixed buffers (definition and repetition levels), use decode_with_length() instead.
See also
RleEncoder

Definition at line 467 of file rle.hpp.
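
The hybrid layout described above can be illustrated with a small standalone sketch. It follows the public Parquet format description (a varint header whose low bit selects a bit-packed or an RLE run) and is not the code in rle.hpp. The name decode_hybrid_sketch and the choice to stop silently on truncated input are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical standalone sketch of the hybrid format; not the code in rle.hpp.
std::vector<uint64_t> decode_hybrid_sketch(const uint8_t* data, size_t size,
                                           int bit_width, size_t max_values) {
    assert(data != nullptr || size == 0);
    std::vector<uint64_t> out;
    if (bit_width < 1 || bit_width > 64) return out;  // sketch skips width 0
    const size_t bw = static_cast<size_t>(bit_width);
    size_t pos = 0;
    while (out.size() < max_values && pos < size) {
        // Header: unsigned LEB128 varint, at most 10 bytes for 64 bits.
        uint64_t header = 0;
        bool complete = false;
        for (int shift = 0; shift < 64 && pos < size; shift += 7) {
            const uint8_t byte = data[pos++];
            header |= static_cast<uint64_t>(byte & 0x7F) << shift;
            if ((byte & 0x80) == 0) { complete = true; break; }
        }
        if (!complete) break;  // truncated or overlong header
        const uint64_t count = header >> 1;
        const size_t remaining = size - pos;
        if (header & 1) {
            // Bit-packed run: `count` groups of 8 values, bit_width bytes per group.
            if (count > remaining / bw) break;  // payload shorter than declared
            const size_t run_bytes = static_cast<size_t>(count) * bw;
            const size_t n = static_cast<size_t>(count) * 8;
            for (size_t i = 0; i < n && out.size() < max_values; ++i) {
                uint64_t v = 0;
                for (size_t b = 0; b < bw; ++b) {
                    const size_t bit = i * bw + b;  // LSB-first bit index in the run
                    const uint64_t set = (data[pos + bit / 8] >> (bit % 8)) & 1u;
                    v |= set << b;
                }
                out.push_back(v);
            }
            pos += run_bytes;
        } else {
            // RLE run: one value stored in ceil(bit_width / 8) little-endian bytes.
            const size_t nbytes = (bw + 7) / 8;
            if (remaining < nbytes) break;
            uint64_t v = 0;
            for (size_t b = 0; b < nbytes; ++b)
                v |= static_cast<uint64_t>(data[pos + b]) << (8 * b);
            pos += nbytes;
            for (uint64_t i = 0; i < count && out.size() < max_values; ++i)
                out.push_back(v);
        }
    }
    return out;
}
```

With bit_width 3, the bytes 0x0A 0x04 form an RLE run of five 4s, and 0x03 0x88 0xC6 0xFA form one bit-packed group holding 0 through 7.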

Constructor & Destructor Documentation

◆ RleDecoder()

signet::forge::RleDecoder::RleDecoder ( const uint8_t *  data,
size_t  size,
int  bit_width 
)
inline

Construct a decoder over a raw encoded byte buffer.

Parameters
data: Pointer to the RLE/Bit-Pack encoded bytes.
size: Size of the encoded data in bytes.
bit_width: Bits per value (0–64). Values outside this range are clamped to 0.

Definition at line 475 of file rle.hpp.
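
As a rough illustration of the clamping rule for bit_width, the sketch below stores the constructor arguments and derives a value mask. DecoderStateSketch and mask() are hypothetical names, not members of RleDecoder. The mask helper is included because shifting a 64-bit integer by 64 is undefined behaviour in C++, so a width of 64 needs its own branch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the decoder's stored state; not the real class.
struct DecoderStateSketch {
    const uint8_t* data;
    size_t size;
    int bit_width;

    DecoderStateSketch(const uint8_t* d, size_t n, int bw)
        : data(d), size(n),
          // Documented rule: widths outside 0..64 are clamped to 0.
          bit_width((bw < 0 || bw > 64) ? 0 : bw) {
        assert(d != nullptr || n == 0);  // a null pointer only makes sense for an empty buffer
    }

    // Mask with the low bit_width bits set (assumed helper, for illustration only).
    uint64_t mask() const {
        if (bit_width == 0) return 0;
        if (bit_width == 64) return ~uint64_t{0};  // avoid the undefined shift by 64
        return (uint64_t{1} << bit_width) - 1;
    }
};
```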

Member Function Documentation

◆ decode()

static std::vector< uint32_t > signet::forge::RleDecoder::decode ( const uint8_t *  data,
size_t  size,
int  bit_width,
size_t  num_values 
)
inline static

Decode values from an RLE-encoded buffer without a length prefix.

Convenience static method that constructs a decoder, reads up to num_values values, and returns them as a std::vector<uint32_t>. If bit_width == 0 and the payload is empty, it returns a vector of num_values zeros. It returns an empty vector if bit_width is invalid, or if bit_width == 0 and the payload is non-empty.

Parameters
data: Pointer to the encoded byte data.
size: Size of the encoded data in bytes.
bit_width: Bits per value (0–64).
num_values: Maximum number of values to decode.
Returns
Decoded values (may be shorter than num_values if the stream is exhausted).
See also
decode_with_length, RleEncoder::encode

Definition at line 626 of file rle.hpp.
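
The special cases described for decode() can be restated as a small decision function. This is a sketch of the documented contract only: decode_edge_case_sketch is a hypothetical name, and returning std::nullopt to mean "fall through to normal decoding" is a convention chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: resolves the cases that need no payload parsing.
// std::nullopt means the caller should run the normal decode path.
std::optional<std::vector<uint32_t>> decode_edge_case_sketch(
        const uint8_t* data, size_t size, int bit_width, size_t num_values) {
    assert(data != nullptr || size == 0);
    if (bit_width < 0 || bit_width > 64) {
        return std::vector<uint32_t>{};  // invalid width: empty result
    }
    if (bit_width == 0) {
        if (size == 0) {
            // Zero-width values carry no bits, so every value is 0.
            return std::vector<uint32_t>(num_values, 0u);
        }
        return std::vector<uint32_t>{};  // payload present but width 0: empty result
    }
    return std::nullopt;  // widths 1..64: decode the payload normally
}
```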

◆ decode_with_length()

static std::vector< uint32_t > signet::forge::RleDecoder::decode_with_length ( const uint8_t *  data,
size_t  size,
int  bit_width,
size_t  num_values 
)
inline static

Decode from a buffer that starts with a 4-byte LE length prefix.

Reads a 4-byte little-endian uint32 payload length, then delegates to decode() for the payload bytes. This is the format used by Parquet for definition and repetition level encoding.

Parameters
data: Pointer to the length-prefixed encoded data.
size: Total size of the buffer in bytes (must be >= 4).
bit_width: Bits per value (0–64).
num_values: Maximum number of values to decode.
Returns
Decoded values (empty if size < 4 or decode fails).
See also
decode, RleEncoder::encode_with_length

Definition at line 658 of file rle.hpp.
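
The length-prefix handling can be sketched as follows. PayloadView and split_length_prefix are hypothetical names. Treating a declared length that exceeds the remaining bytes as a failure is an assumption of this sketch, since the documentation only states that the result is empty when size < 4 or decoding fails.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical view of the payload that follows the 4-byte prefix.
struct PayloadView {
    const uint8_t* data;
    size_t size;
};

// Reads a little-endian uint32 length and returns the payload it describes.
std::optional<PayloadView> split_length_prefix(const uint8_t* data, size_t size) {
    assert(data != nullptr || size == 0);
    if (size < 4) return std::nullopt;  // no room for the prefix
    // Assemble byte by byte so the result does not depend on host endianness.
    const uint32_t len = static_cast<uint32_t>(data[0]) |
                         (static_cast<uint32_t>(data[1]) << 8) |
                         (static_cast<uint32_t>(data[2]) << 16) |
                         (static_cast<uint32_t>(data[3]) << 24);
    if (len > size - 4) return std::nullopt;  // assumed: overlong length is an error
    return PayloadView{data + 4, len};
}
```

The returned view would then be passed to a prefix-free decoder, which matches the delegation to decode() described above.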

◆ get()

bool signet::forge::RleDecoder::get ( uint64_t *  value)
inline

Read the next decoded value.

Reads from buffered RLE runs or bit-packed groups first, then parses the next header from the byte stream when buffers are exhausted. Includes guards against corrupt varints and oversized bit-packed allocations (capped at 8M values per group).

Parameters
[out] value: Pointer to receive the decoded value.
Returns
true if a value was read, false if the stream is exhausted.

Definition at line 495 of file rle.hpp.
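
The two guards mentioned for get() might look roughly like the sketch below. read_varint_guarded, bit_packed_value_count, and kMaxValuesPerGroupSketch are hypothetical names. Reading "8M" as 8 * 1024 * 1024 is an assumption, and the actual constant and checks in rle.hpp may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Assumed interpretation of the documented "8M values per group" cap.
constexpr uint64_t kMaxValuesPerGroupSketch = 8ull * 1024 * 1024;

// Reads an unsigned LEB128 varint, rejecting truncated or overlong encodings.
std::optional<uint64_t> read_varint_guarded(const uint8_t* data, size_t size,
                                            size_t* pos) {
    assert(pos != nullptr);
    uint64_t result = 0;
    for (int shift = 0; shift < 64; shift += 7) {
        if (*pos >= size) return std::nullopt;  // stream ended mid-varint
        const uint8_t byte = data[(*pos)++];
        result |= static_cast<uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return result;  // last byte of the varint
    }
    return std::nullopt;  // continuation bit still set after 10 bytes: corrupt
}

// Number of values in a bit-packed run, or nullopt if the header is an RLE
// header or declares more values than the cap allows.
std::optional<uint64_t> bit_packed_value_count(uint64_t header) {
    if ((header & 1) == 0) return std::nullopt;  // low bit 0 marks an RLE run
    const uint64_t groups = header >> 1;
    if (groups > kMaxValuesPerGroupSketch / 8) return std::nullopt;
    return groups * 8;  // each group holds 8 values
}
```

Checking the group count before multiplying avoids both an integer overflow and an oversized allocation driven by a corrupt header.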

◆ get_batch()

bool signet::forge::RleDecoder::get_batch ( uint64_t *  out,
size_t  count 
)
inline

Read a batch of decoded values.

Attempts to read count values into out. Returns false as soon as the stream is exhausted; in that case fewer than count values were read, and out may hold partial results.

Parameters
[out] out: Output array with space for at least count values.
count: Number of values to read.
Returns
true if all count values were read successfully.

Definition at line 601 of file rle.hpp.
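
The partial-result behaviour can be shown with a generic loop over any source that exposes bool get(uint64_t*). get_batch_sketch and CountingSourceSketch are hypothetical stand-ins written for this example. They do not use RleDecoder, and the real implementation may fill the output differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical source that yields 0, 1, 2, ... until `remaining` reaches zero.
struct CountingSourceSketch {
    uint64_t next = 0;
    size_t remaining = 0;

    bool get(uint64_t* value) {
        if (remaining == 0) return false;  // stream exhausted
        *value = next++;
        --remaining;
        return true;
    }
};

// Reads `count` values one at a time; stops at the first failed read.
template <typename Source>
bool get_batch_sketch(Source& src, uint64_t* out, size_t count) {
    assert(out != nullptr || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (!src.get(&out[i])) {
            return false;  // out[0..i) holds valid values, the rest is untouched
        }
    }
    return true;
}
```

A caller that needs to know how many values arrived before a failure has to track that separately, because the boolean result alone does not report it.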

