Signet Forge 0.1.0
C++20 Parquet library with AI-native extensions
signet::forge::DictionaryDecoder< T > Class Template Reference

Dictionary decoder for Parquet PLAIN_DICTIONARY / RLE_DICTIONARY encoding.

#include <dictionary.hpp>

Public Member Functions

 DictionaryDecoder (const uint8_t *dict_data, size_t dict_size, size_t num_dict_entries, PhysicalType type)
 Construct a decoder by parsing the raw PLAIN-encoded dictionary page.
 
expected< std::vector< T > > decode (const uint8_t *indices_data, size_t indices_size, size_t num_values) const
 Decode an RLE_DICTIONARY indices page into original typed values.
 
size_t dictionary_size () const
 Number of entries in the dictionary.
 

Detailed Description

template<typename T>
class signet::forge::DictionaryDecoder< T >

Dictionary decoder for Parquet PLAIN_DICTIONARY / RLE_DICTIONARY encoding.

Reconstructs original typed values from a PLAIN-encoded dictionary page and an RLE_DICTIONARY-encoded indices page. The constructor parses the dictionary page; decode() then maps indices back to values.
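
The index-to-value mapping described above can be sketched independently of the library. This is a minimal illustration, not the Signet Forge implementation: lookup_indices is a hypothetical helper, it takes indices that have already been decoded, and it uses std::optional in place of the library's expected type.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not part of Signet Forge): map already-decoded
// dictionary indices back to their values. Any index that falls outside
// the dictionary is treated as corrupt input and yields std::nullopt.
template <typename T>
std::optional<std::vector<T>> lookup_indices(const std::vector<T>& dictionary,
                                             const std::vector<uint32_t>& indices) {
    std::vector<T> out;
    out.reserve(indices.size());
    for (uint32_t idx : indices) {
        if (idx >= dictionary.size()) {
            return std::nullopt;  // out-of-bounds index: refuse to decode
        }
        out.push_back(dictionary[idx]);
    }
    return out;
}
```

With a dictionary of {"red", "green", "blue"}, the indices {2, 0, 0, 1} map to {"blue", "red", "red", "green"}, while any index of 3 or more makes the whole call fail rather than return partial data.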

Template Parameters
T    The value type (std::string, int32_t, int64_t, float, or double).
See also
DictionaryEncoder, RleDecoder

Definition at line 413 of file dictionary.hpp.

Constructor & Destructor Documentation

◆ DictionaryDecoder()

template<typename T >
signet::forge::DictionaryDecoder< T >::DictionaryDecoder (const uint8_t *dict_data, size_t dict_size, size_t num_dict_entries, PhysicalType type)
inline

Construct a decoder by parsing the raw PLAIN-encoded dictionary page.

Decodes num_dict_entries values from dict_data using the appropriate PLAIN format for type T. The decoded values are stored internally for index-based lookup during decode().

Parameters
dict_data           Pointer to PLAIN-encoded dictionary page bytes.
dict_size           Size of the dictionary page in bytes.
num_dict_entries    Number of entries in the dictionary.
type                The Parquet physical type (retained for metadata; actual decoding dispatches on template type T).

Definition at line 426 of file dictionary.hpp.
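
As an illustration of what parsing a PLAIN-encoded dictionary page involves, the sketch below handles two of the supported value types. It relies on the Parquet PLAIN layout (INT32 as 4 little-endian bytes; BYTE_ARRAY as a 4-byte little-endian length followed by that many bytes). The function names and the std::optional return type are assumptions made for this example and are not taken from dictionary.hpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <vector>

// Read a little-endian uint32 from p (caller guarantees 4 readable bytes).
inline uint32_t read_le32(const uint8_t* p) {
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Hypothetical sketch: parse n PLAIN-encoded INT32 dictionary entries.
std::optional<std::vector<int32_t>> parse_plain_int32(const uint8_t* data,
                                                      size_t size, size_t n) {
    if (n > size / 4) return std::nullopt;  // page too short for n entries
    std::vector<int32_t> out(n);
    for (size_t i = 0; i < n; ++i) {
        const uint32_t bits = read_le32(data + 4 * i);
        std::memcpy(&out[i], &bits, sizeof bits);  // reinterpret as signed
    }
    return out;
}

// Hypothetical sketch: parse n PLAIN-encoded BYTE_ARRAY entries as strings.
std::optional<std::vector<std::string>> parse_plain_strings(const uint8_t* data,
                                                            size_t size, size_t n) {
    std::vector<std::string> out;
    size_t pos = 0;
    for (size_t i = 0; i < n; ++i) {
        if (size - pos < 4) return std::nullopt;    // missing length prefix
        const uint32_t len = read_le32(data + pos);
        pos += 4;
        if (size - pos < len) return std::nullopt;  // payload runs past the page
        out.emplace_back(reinterpret_cast<const char*>(data + pos), len);
        pos += len;
    }
    return out;
}
```

Both functions validate every read against the page size before touching the buffer, since num_dict_entries and the embedded lengths come from untrusted file data.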

Member Function Documentation

◆ decode()

template<typename T >
expected< std::vector< T > > signet::forge::DictionaryDecoder< T >::decode (const uint8_t *indices_data, size_t indices_size, size_t num_values) const
inline

Decode an RLE_DICTIONARY indices page into original typed values.

Reads the 1-byte bit_width prefix, decodes the RLE/Bit-Packing Hybrid index stream, and maps each index back to its dictionary value. Returns a CORRUPT_DATA error if any index is out of bounds.

Parameters
indices_data    Pointer to the indices page (1-byte bit_width + RLE payload).
indices_size    Size of the indices page in bytes.
num_values      Number of values to decode.
Returns
Decoded values, or CORRUPT_DATA error on out-of-bounds index (CWE-754).
See also
DictionaryEncoder::indices_page

Definition at line 451 of file dictionary.hpp.
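
The index-stream step of decode() can be sketched as follows. This is an illustrative decoder for the Parquet RLE/Bit-Packing Hybrid layout (a 1-byte bit_width, then runs that each begin with a ULEB128 header whose low bit selects RLE or bit-packed). It is not the library's RleDecoder: the function name decode_hybrid_indices, the std::optional return type, and the strict rejection of zero-length runs and truncated bit-packed runs are all assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the index-decoding step; not the Signet Forge API.
// Assumed layout: [bit_width: 1 byte][run][run]..., each run led by a ULEB128 header.
std::optional<std::vector<uint32_t>> decode_hybrid_indices(
        const uint8_t* data, size_t size, size_t num_values) {
    if (size < 1) return std::nullopt;
    const unsigned bit_width = data[0];
    if (bit_width > 32) return std::nullopt;
    size_t pos = 1;
    std::vector<uint32_t> out;

    while (out.size() < num_values) {
        // Read the ULEB128 run header (at most 9 bytes accepted here).
        uint64_t header = 0;
        for (unsigned shift = 0;; shift += 7) {
            if (pos >= size || shift > 56) return std::nullopt;
            const uint8_t b = data[pos++];
            header |= static_cast<uint64_t>(b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;
        }
        const uint64_t run = header >> 1;
        if (run == 0) return std::nullopt;  // assumption: empty runs are corrupt

        if ((header & 1) == 0) {
            // RLE run: one value stored in ceil(bit_width / 8) little-endian bytes.
            const size_t nbytes = (bit_width + 7) / 8;
            if (size - pos < nbytes) return std::nullopt;
            uint32_t value = 0;
            for (size_t i = 0; i < nbytes; ++i)
                value |= static_cast<uint32_t>(data[pos + i]) << (8 * i);
            pos += nbytes;
            for (uint64_t i = 0; i < run && out.size() < num_values; ++i)
                out.push_back(value);
        } else {
            // Bit-packed run: `run` groups of 8 values, packed LSB first.
            if (bit_width > 0 && run > (size - pos) / bit_width) return std::nullopt;
            uint64_t bit = 0;  // bit offset relative to pos
            for (uint64_t g = 0; g < run && out.size() < num_values; ++g) {
                for (int k = 0; k < 8; ++k) {
                    uint32_t v = 0;
                    for (unsigned b = 0; b < bit_width; ++b, ++bit) {
                        const uint32_t one = (data[pos + (bit >> 3)] >> (bit & 7)) & 1u;
                        v |= one << b;
                    }
                    if (out.size() < num_values) out.push_back(v);  // drop padding
                }
            }
            pos += static_cast<size_t>(run) * bit_width;  // 8 values * bit_width bits per group
        }
    }
    return out;
}
```

For example, the page bytes 02 06 01 03 E4 E4 (bit_width 2, an RLE run of three 1s, then one bit-packed group holding 0 1 2 3 0 1 2 3) decode to eleven indices. In a full decoder each resulting index would then be checked against the dictionary size before lookup.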

◆ dictionary_size()

template<typename T >
size_t signet::forge::DictionaryDecoder< T >::dictionary_size ( ) const
inline

Number of entries in the dictionary.

Returns
Dictionary cardinality (as parsed from the dictionary page).

Definition at line 482 of file dictionary.hpp.


The documentation for this class was generated from the following file: