Signet Forge 0.1.0
C++20 Parquet library with AI-native extensions
PLAIN-encoded Parquet column decoder.
#include <column_reader.hpp>
Public Member Functions | |
| ColumnReader (PhysicalType type, const uint8_t *data, size_t size, int64_t num_values, int32_t type_length=-1) | |
| Construct a reader over raw PLAIN-encoded page data. | |
| expected< bool > | read_bool () |
| Read a single BOOLEAN value (bit-packed, LSB first). | |
| expected< int32_t > | read_int32 () |
| Read a single INT32 value (4 bytes little-endian). | |
| expected< int64_t > | read_int64 () |
| Read a single INT64 value (8 bytes little-endian). | |
| expected< float > | read_float () |
| Read a single FLOAT value (4 bytes little-endian, IEEE 754). | |
| expected< double > | read_double () |
| Read a single DOUBLE value (8 bytes little-endian, IEEE 754). | |
| expected< std::string > | read_string () |
| Read a single BYTE_ARRAY value as a std::string. | |
| expected< std::string_view > | read_string_view () |
| Read a single BYTE_ARRAY value as a non-owning std::string_view. | |
| expected< std::vector< uint8_t > > | read_bytes () |
| Read a single BYTE_ARRAY or FIXED_LEN_BYTE_ARRAY value as raw bytes. | |
| expected< void > | read_batch_bool (bool *out, size_t count) |
| Read a batch of BOOLEAN values into out. | |
| expected< void > | read_batch_int32 (int32_t *out, size_t count) |
| Read a batch of INT32 values via bulk memcpy. | |
| expected< void > | read_batch_int64 (int64_t *out, size_t count) |
| Read a batch of INT64 values via bulk memcpy. | |
| expected< void > | read_batch_float (float *out, size_t count) |
| Read a batch of FLOAT values via bulk memcpy. | |
| expected< void > | read_batch_double (double *out, size_t count) |
| Read a batch of DOUBLE values via bulk memcpy. | |
| expected< void > | read_batch_string (std::string *out, size_t count) |
| Read a batch of BYTE_ARRAY values as strings. | |
| template<typename T > | |
| expected< T > | read () |
| Read a single value of type T, dispatching to the correct typed reader. | |
| template<typename T > | |
| expected< void > | read_batch (T *out, size_t count) |
| Read a batch of count values of type T into out. | |
| int64_t | values_remaining () const |
| Number of values not yet read from this page. | |
| bool | has_next () const |
| Whether there is at least one more value to read. | |
| PhysicalType | type () const |
| The Parquet physical type of this column. | |
| size_t | position () const |
| Current byte offset within the page data buffer. | |
PLAIN-encoded Parquet column decoder.
Wraps a raw data page buffer and decodes values one at a time or in batches. The reader maintains a cursor position and a count of values read, returning an error on type mismatch, buffer overrun, or exhaustion.
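The cursor-and-count bookkeeping described above can be sketched in isolation. The following is a minimal illustration and not the Signet Forge implementation: `MiniInt32Reader` is a hypothetical stand-in that handles INT32 only, and `std::optional` is used in place of the library's `expected<T>`, whose interface is not shown on this page. Type-mismatch checking is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a PLAIN page reader (INT32 only).
// std::optional replaces expected<T>: an empty optional means "error".
class MiniInt32Reader {
public:
    MiniInt32Reader(const uint8_t* data, size_t size, int64_t num_values)
        : data_(data), size_(size), num_values_(num_values) {}

    // Decodes the next value, or returns std::nullopt on exhaustion
    // (all declared values consumed) or overrun (fewer than 4 bytes left).
    std::optional<int32_t> read_int32() {
        if (values_read_ >= num_values_) return std::nullopt;       // exhaustion
        if (size_ - pos_ < sizeof(int32_t)) return std::nullopt;    // buffer overrun
        uint32_t raw = 0;
        for (size_t i = 0; i < 4; ++i) {
            // Byte i of the page is bits [8*i, 8*i + 7] of the value.
            raw |= static_cast<uint32_t>(data_[pos_ + i]) << (8 * i);
        }
        pos_ += 4;          // advance the byte cursor
        ++values_read_;     // advance the value count
        return static_cast<int32_t>(raw);
    }

    int64_t values_remaining() const { return num_values_ - values_read_; }
    bool has_next() const { return values_read_ < num_values_; }
    size_t position() const { return pos_; }

private:
    const uint8_t* data_;
    size_t size_;
    int64_t num_values_;
    size_t pos_ = 0;
    int64_t values_read_ = 0;
};
```

A caller would typically loop while `has_next()` is true and stop at the first empty result, since the declared value count and the actual buffer size can disagree in a corrupt page.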
Definition at line 46 of file column_reader.hpp.
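| ColumnReader (PhysicalType type, const uint8_t *data, size_t size, int64_t num_values, int32_t type_length=-1) | inline |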
Construct a reader over raw PLAIN-encoded page data.
| type | The physical type of the column. |
| data | Pointer to the start of the page data buffer. |
| size | Size of the page data in bytes. |
| num_values | Number of values encoded in this page. |
| type_length | For FIXED_LEN_BYTE_ARRAY columns, the fixed byte length per value (ignored for other types). |
Definition at line 56 of file column_reader.hpp.
| bool has_next () const | inline |
Whether there is at least one more value to read.
Definition at line 566 of file column_reader.hpp.
| size_t position () const | inline |
Current byte offset within the page data buffer.
Definition at line 576 of file column_reader.hpp.
| template<typename T > expected< T > read () | inline |
Read a single value of type T, dispatching to the correct typed reader.
Supported types: bool, int32_t, int64_t, float, double, std::string, std::string_view, std::vector<uint8_t>.
| T | The value type to decode. |
Definition at line 504 of file column_reader.hpp.
| template<typename T > expected< void > read_batch (T *out, size_t count) | inline |
Read a batch of count values of type T into out.
Dispatches to the correct typed batch reader. Supported types: bool, int32_t, int64_t, float, double, std::string.
| T | The value type to decode. |
| out | Pre-allocated buffer of at least count elements. |
| count | Number of values to read. |
Definition at line 537 of file column_reader.hpp.
| expected< void > read_batch_bool (bool *out, size_t count) | inline |
Read a batch of BOOLEAN values into out.
| out | Pre-allocated buffer of at least count elements. |
| count | Number of values to read. |
Definition at line 319 of file column_reader.hpp.
| expected< void > read_batch_double (double *out, size_t count) | inline |
Read a batch of DOUBLE values via bulk memcpy.
| out | Pre-allocated buffer of at least count elements. |
| count | Number of values to read. |
Definition at line 433 of file column_reader.hpp.
| expected< void > read_batch_float (float *out, size_t count) | inline |
Read a batch of FLOAT values via bulk memcpy.
| out | Pre-allocated buffer of at least count elements. |
| count | Number of values to read. |
Definition at line 405 of file column_reader.hpp.
| expected< void > read_batch_int32 (int32_t *out, size_t count) | inline |
Read a batch of INT32 values via bulk memcpy.
| out | Pre-allocated buffer of at least count elements. |
| count | Number of values to read. |
Definition at line 349 of file column_reader.hpp.
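A bulk-memcpy batch read can be sketched as a single bounds check followed by one copy. This is an illustration under stated assumptions, not the library's code: `copy_int32_batch` is a hypothetical free function, it reports failure with a plain `bool` instead of `expected<void>`, and it assumes a little-endian host, where the page bytes already match the in-memory layout of `int32_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `count` INT32 values from a PLAIN page into `out` and advances `pos`.
// Returns false, copying nothing, if the remaining buffer is too short.
// Assumption: little-endian host; a big-endian host would need a byte swap
// per value after the copy.
inline bool copy_int32_batch(const uint8_t* data, size_t size, size_t& pos,
                             int32_t* out, size_t count) {
    if (pos > size) return false;
    // Dividing the remaining bytes avoids overflow in count * sizeof(int32_t).
    if (count > (size - pos) / sizeof(int32_t)) return false;
    std::memcpy(out, data + pos, count * sizeof(int32_t));
    pos += count * sizeof(int32_t);
    return true;
}
```

Checking the whole range up front means a failed call leaves both `out` and `pos` untouched, which is simpler for callers than a partially filled buffer.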
| expected< void > read_batch_int64 (int64_t *out, size_t count) | inline |
Read a batch of INT64 values via bulk memcpy.
| out | Pre-allocated buffer of at least count elements. |
| count | Number of values to read. |
Definition at line 377 of file column_reader.hpp.
| expected< void > read_batch_string (std::string *out, size_t count) | inline |
Read a batch of BYTE_ARRAY values as strings.
| out | Pre-allocated buffer of at least count std::string elements. |
| count | Number of values to read. |
Definition at line 461 of file column_reader.hpp.
| expected< bool > read_bool () | inline |
Read a single BOOLEAN value (bit-packed, LSB first).
Definition at line 76 of file column_reader.hpp.
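"Bit-packed, LSB first" means that value i is stored in byte i / 8 at bit i % 8, counting from the least significant bit. A minimal sketch of that layout follows; `unpack_bool` and `unpack_bools` are hypothetical helper names, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the boolean at zero-based index `i` of a bit-packed buffer.
// The caller must ensure that byte i / 8 is inside the buffer.
inline bool unpack_bool(const uint8_t* data, size_t i) {
    return ((data[i / 8] >> (i % 8)) & 1u) != 0;
}

// Unpacks `count` booleans starting at bit 0 into `out`.
// Returns false if `size` bytes cannot hold `count` bits.
inline bool unpack_bools(const uint8_t* data, size_t size,
                         bool* out, size_t count) {
    const size_t bytes_needed = count / 8 + (count % 8 != 0 ? 1 : 0);
    if (bytes_needed > size) return false;
    for (size_t i = 0; i < count; ++i) {
        out[i] = unpack_bool(data, i);
    }
    return true;
}
```

For example, the byte `0x05` (binary `00000101`) decodes to true, false, true, followed by five false values.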
| expected< std::vector< uint8_t > > read_bytes () | inline |
Read a single BYTE_ARRAY or FIXED_LEN_BYTE_ARRAY value as raw bytes.
For BYTE_ARRAY, reads a 4-byte LE length prefix then the payload. For FIXED_LEN_BYTE_ARRAY, reads exactly type_length bytes.
Definition at line 264 of file column_reader.hpp.
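The two layouts differ only in where the length comes from: BYTE_ARRAY carries it in the page, FIXED_LEN_BYTE_ARRAY takes it from the column's `type_length`. The sketch below illustrates both. The function names are hypothetical, and `std::optional` stands in for the library's `expected<T>`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Decodes one BYTE_ARRAY value: 4-byte little-endian length, then payload.
inline std::optional<std::vector<uint8_t>> decode_byte_array(
        const uint8_t* data, size_t size, size_t& pos) {
    if (pos > size || size - pos < 4) return std::nullopt;   // no room for prefix
    uint32_t len = 0;
    for (size_t i = 0; i < 4; ++i) {
        len |= static_cast<uint32_t>(data[pos + i]) << (8 * i);
    }
    if (size - pos - 4 < len) return std::nullopt;           // payload truncated
    const uint8_t* first = data + pos + 4;
    pos += 4 + static_cast<size_t>(len);
    return std::vector<uint8_t>(first, first + len);
}

// Decodes one FIXED_LEN_BYTE_ARRAY value: exactly type_length bytes, no prefix.
inline std::optional<std::vector<uint8_t>> decode_fixed_len(
        const uint8_t* data, size_t size, size_t& pos, int32_t type_length) {
    if (type_length < 0 || pos > size) return std::nullopt;
    const size_t len = static_cast<size_t>(type_length);
    if (size - pos < len) return std::nullopt;
    const uint8_t* first = data + pos;
    pos += len;
    return std::vector<uint8_t>(first, first + len);
}
```

Both helpers validate the length against the remaining bytes before copying, so a corrupt length prefix cannot cause a read past the end of the page.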
| expected< double > read_double () | inline |
Read a single DOUBLE value (8 bytes little-endian, IEEE 754).
Definition at line 167 of file column_reader.hpp.
| expected< float > read_float () | inline |
Read a single FLOAT value (4 bytes little-endian, IEEE 754).
Definition at line 145 of file column_reader.hpp.
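Decoding a little-endian IEEE 754 value comes down to assembling the 32-bit pattern and reinterpreting it as a `float`. The sketch below shows one portable way to do that; `decode_float_le` is a hypothetical helper, and the library may simply `memcpy` the bytes directly on little-endian hosts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Decodes a 4-byte little-endian IEEE 754 single-precision value.
// The caller must guarantee that p points to at least 4 readable bytes.
inline float decode_float_le(const uint8_t* p) {
    static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");
    uint32_t bits = 0;
    for (size_t i = 0; i < 4; ++i) {
        bits |= static_cast<uint32_t>(p[i]) << (8 * i);
    }
    float value;
    // memcpy is the well-defined way to reinterpret the bit pattern.
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

Assembling the integer with shifts keeps the result independent of host byte order, at the cost of a few extra instructions compared with a direct copy.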
| expected< int32_t > read_int32 () | inline |
Read a single INT32 value (4 bytes little-endian).
Definition at line 101 of file column_reader.hpp.
| expected< int64_t > read_int64 () | inline |
Read a single INT64 value (8 bytes little-endian).
Definition at line 123 of file column_reader.hpp.
| expected< std::string > read_string () | inline |
Read a single BYTE_ARRAY value as a std::string.
PLAIN encoding: 4-byte LE length prefix followed by raw bytes.
Definition at line 192 of file column_reader.hpp.
| expected< std::string_view > read_string_view () | inline |
Read a single BYTE_ARRAY value as a non-owning std::string_view.
The returned view points directly into the page data buffer, so it is only valid as long as the underlying buffer is alive.
Definition at line 228 of file column_reader.hpp.
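The lifetime constraint follows from how a non-owning view is built: it stores a pointer into the page buffer and a length, and copies nothing. A minimal sketch, in which `view_byte_array` is a hypothetical helper and `std::optional` stands in for the library's `expected<T>`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Returns a view of the next BYTE_ARRAY payload without copying it.
// The view aliases `data`; it dangles once that buffer is freed or reused.
inline std::optional<std::string_view> view_byte_array(
        const uint8_t* data, size_t size, size_t& pos) {
    if (pos > size || size - pos < 4) return std::nullopt;   // no room for prefix
    uint32_t len = 0;
    for (size_t i = 0; i < 4; ++i) {
        len |= static_cast<uint32_t>(data[pos + i]) << (8 * i);
    }
    if (size - pos - 4 < len) return std::nullopt;           // payload truncated
    std::string_view view(reinterpret_cast<const char*>(data + pos + 4), len);
    pos += 4 + static_cast<size_t>(len);
    return view;
}
```

If a value has to outlive the page buffer, copy it into a `std::string` (for example `std::string(view)`) before the buffer is released.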
| PhysicalType type () const | inline |
The Parquet physical type of this column.
Definition at line 571 of file column_reader.hpp.
| int64_t values_remaining () const | inline |
Number of values not yet read from this page.
Definition at line 561 of file column_reader.hpp.