Streaming encoder for the Parquet RLE/Bit-Packing Hybrid scheme.
#include <rle.hpp>
|
| | RleEncoder (int bit_width) |
| | Construct an encoder for values of the given bit width.
|
| |
| void | put (uint64_t value) |
| | Add a single value to the encoding stream.
|
| |
| void | flush () |
| | Flush any pending values to the output buffer.
|
| |
| const std::vector< uint8_t > & | data () const |
| | Returns a reference to the encoded byte buffer (without length prefix).
|
| |
| size_t | encoded_size () const |
| | Returns the size of the encoded data in bytes.
|
| |
| void | reset () |
| | Reset the encoder to its initial state, preserving the bit width.
|
| |
|
| static std::vector< uint8_t > | encode (const uint32_t *values, size_t count, int bit_width) |
| | Encode an array of uint32 values using the RLE/Bit-Pack Hybrid scheme.
|
| |
| static std::vector< uint8_t > | encode_with_length (const uint32_t *values, size_t count, int bit_width) |
| | Encode with a 4-byte little-endian length prefix.
|
| |
Accepts a stream of unsigned integer values via put() and decides per-group whether to emit an RLE run (for repeated values) or a bit-packed group of 8. Call flush() after all values are written, then retrieve the encoded bytes from data().
- Note
- The encoded output does NOT include a length prefix. Use encode_with_length() when encoding definition or repetition levels, which require one.
- See also
- RleDecoder
-
https://parquet.apache.org/documentation/latest/
Definition at line 211 of file rle.hpp.
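As a rough illustration of the RLE half of the hybrid scheme, the sketch below follows the layout given in the Parquet format specification: a varint header holding the run length shifted left by one (low bit 0), followed by the repeated value in ceil(bit_width / 8) little-endian bytes. `append_varint` and `sketch_rle_run` are hypothetical helpers written for this example. They are not part of rle.hpp, and the actual internals of RleEncoder may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from rle.hpp): append v as an unsigned LEB128 varint.
inline void append_varint(std::vector<uint8_t>& out, uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<uint8_t>((v & 0x7F) | 0x80));  // 7 payload bits + continuation bit
        v >>= 7;
    }
    out.push_back(static_cast<uint8_t>(v));
}

// Hypothetical helper (not from rle.hpp): emit a single RLE run.
inline std::vector<uint8_t> sketch_rle_run(uint64_t value, uint64_t run_length, int bit_width) {
    assert(bit_width >= 1 && bit_width <= 64);
    std::vector<uint8_t> out;
    append_varint(out, run_length << 1);          // low bit 0 marks an RLE run
    const int value_bytes = (bit_width + 7) / 8;  // value is padded to whole bytes
    for (int i = 0; i < value_bytes; ++i) {
        out.push_back(static_cast<uint8_t>(value >> (8 * i)));  // little-endian byte order
    }
    return out;
}
```

Under these assumptions, eight repetitions of the value 4 at bit width 3 come out as the two bytes 0x10 0x04.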
◆ RleEncoder()
| signet::forge::RleEncoder::RleEncoder |
( |
int |
bit_width | ) |
|
|
inline explicit |
Construct an encoder for values of the given bit width.
- Parameters
-
| bit_width | Bits per value (0–64). Values outside this range are clamped to 0 (no encoding). |
Definition at line 217 of file rle.hpp.
◆ data()
| const std::vector< uint8_t > & signet::forge::RleEncoder::data |
( |
| ) |
const |
|
inline |
Returns a reference to the encoded byte buffer (without length prefix).
- Returns
- Const reference to the internal encoded output.
- Note
- Call flush() before accessing this to ensure all data is emitted.
Definition at line 308 of file rle.hpp.
◆ encode()
| static std::vector< uint8_t > signet::forge::RleEncoder::encode |
( |
const uint32_t * |
values, |
|
|
size_t |
count, |
|
|
int |
bit_width |
|
) |
| |
|
inline static |
Encode an array of uint32 values using the RLE/Bit-Pack Hybrid scheme.
Convenience static method that constructs an encoder, feeds all values, flushes, and returns the resulting byte buffer without a length prefix.
- Parameters
-
| values | Pointer to the input values. |
| count | Number of values to encode. |
| bit_width | Bits per value (0–64). Returns empty for invalid widths. |
- Returns
- Encoded byte buffer (empty on error or bit_width == 0).
- See also
- encode_with_length
Definition at line 342 of file rle.hpp.
◆ encode_with_length()
| static std::vector< uint8_t > signet::forge::RleEncoder::encode_with_length |
( |
const uint32_t * |
values, |
|
|
size_t |
count, |
|
|
int |
bit_width |
|
) |
| |
|
inline static |
Encode with a 4-byte little-endian length prefix.
Produces the same output as encode(), but prepends a 4-byte LE uint32 length prefix containing the payload size. This format is required by Parquet for definition and repetition level encoding.
- Parameters
-
| values | Pointer to the input values. |
| count | Number of values to encode. |
| bit_width | Bits per value (0–64). |
- Returns
- Length-prefixed encoded byte buffer.
- See also
- encode, RleDecoder::decode_with_length
Definition at line 371 of file rle.hpp.
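The length-prefixed layout can be sketched independently of the encoder. The example below takes an already encoded payload and prepends its size as a 4-byte little-endian uint32, as the description above states. `sketch_prepend_length` is a hypothetical name used only for illustration; it is not a member of RleEncoder.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from rle.hpp): prefix a payload with its size as a
// 4-byte little-endian uint32.
inline std::vector<uint8_t> sketch_prepend_length(const std::vector<uint8_t>& payload) {
    assert(payload.size() <= UINT32_MAX);  // the prefix can only describe 32-bit sizes
    const uint32_t n = static_cast<uint32_t>(payload.size());
    std::vector<uint8_t> out;
    out.reserve(payload.size() + 4);
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<uint8_t>(n >> (8 * i)));  // least significant byte first
    }
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}
```

A two-byte payload 0x10 0x04 would therefore be stored as 0x02 0x00 0x00 0x00 0x10 0x04.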
◆ encoded_size()
| size_t signet::forge::RleEncoder::encoded_size |
( |
| ) |
const |
|
inline |
Returns the size of the encoded data in bytes.
- Returns
- Number of bytes in the encoded output.
Definition at line 313 of file rle.hpp.
◆ flush()
| void signet::forge::RleEncoder::flush |
( |
| ) |
|
|
inline |
Flush any pending values to the output buffer.
Must be called after all put() calls to finalize the encoding. Any partial bit-packed group (fewer than 8 values) is zero-padded to 8 values before emission.
Definition at line 272 of file rle.hpp.
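The zero-padding of a partial group can be pictured with a small standalone sketch. It packs up to 8 values into exactly bit_width bytes, least significant bit first as in the Parquet format specification, and treats missing values as zero. `sketch_pack_group` is a hypothetical helper, limited here to widths of 1 to 32; it omits the bit-packed run header and is not the code in rle.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from rle.hpp): bit-pack one group of 8 values,
// zero-padding when fewer than 8 are supplied.
inline std::vector<uint8_t> sketch_pack_group(const uint32_t* values, size_t count, int bit_width) {
    assert(count <= 8 && bit_width >= 1 && bit_width <= 32);
    // 8 values * bit_width bits always fills exactly bit_width bytes.
    std::vector<uint8_t> out(static_cast<size_t>(bit_width), 0);
    size_t bit_pos = 0;
    for (size_t i = 0; i < 8; ++i) {
        const uint32_t v = (i < count) ? values[i] : 0u;  // pad the tail with zeros
        for (int b = 0; b < bit_width; ++b, ++bit_pos) {
            if ((v >> b) & 1u) {
                out[bit_pos / 8] |= static_cast<uint8_t>(1u << (bit_pos % 8));
            }
        }
    }
    return out;
}
```

With the values 0 through 7 at bit width 3 this yields 0x88 0xC6 0xFA, matching the worked example in the Parquet specification. A partial group {1, 2, 3} yields 0xD1 0x00 0x00.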
◆ put()
| void signet::forge::RleEncoder::put |
( |
uint64_t |
value | ) |
|
|
inline |
Add a single value to the encoding stream.
Values are buffered internally and flushed as RLE runs or bit-packed groups. If bit_width is 0, this is a no-op (all values are implicitly zero).
- Parameters
-
| value | The unsigned integer value to encode (must fit in bit_width bits). |
Definition at line 228 of file rle.hpp.
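The requirement that a value fit in bit_width bits can be checked before calling put(). The sketch below is one way to write that check. `sketch_fits` is a hypothetical helper, not part of rle.hpp, and it assumes that only zero fits when bit_width is 0, in line with the note that all values are implicitly zero in that case.

```cpp
#include <cstdint>

// Hypothetical helper (not from rle.hpp): true if value is representable in
// bit_width bits. Shifting by 64 or more is undefined, so that case is
// short-circuited before the shift is evaluated.
constexpr bool sketch_fits(uint64_t value, int bit_width) {
    return bit_width <= 0 ? value == 0
                          : (bit_width >= 64 || (value >> bit_width) == 0);
}
```

For instance, 7 fits in 3 bits while 8 does not, so a caller could reject or mask 8 before passing it to an encoder constructed with bit width 3.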
◆ reset()
| void signet::forge::RleEncoder::reset |
( |
| ) |
|
|
inline |
Reset the encoder to its initial state, preserving the bit width.
Clears all internal buffers and accumulators so the encoder can be reused for a new encoding session.
Definition at line 319 of file rle.hpp.
The documentation for this class was generated from the following file: