Signet Forge 0.1.0
C++20 Parquet library with AI-native extensions
Per-page min/max statistics for predicate pushdown.
#include <column_index.hpp>
Public Types | |
| enum class | BoundaryOrder : int32_t { UNORDERED = 0 , ASCENDING = 1 , DESCENDING = 2 } |
| Ordering of min values across pages, used to short-circuit filtering. | |
Public Member Functions | |
| bool | valid () const |
| Check if deserialization was successful. | |
| void | serialize (thrift::CompactEncoder &enc) const |
| Serialize this ColumnIndex to a Thrift compact encoder. | |
| void | deserialize (thrift::CompactDecoder &dec) |
| Deserialize this ColumnIndex from a Thrift compact decoder. | |
| std::vector< size_t > | filter_pages (const std::string &min_val, const std::string &max_val, PhysicalType physical_type=PhysicalType::BYTE_ARRAY) const |
| Filter pages by a value range for predicate pushdown. | |
Public Attributes | |
| bool | valid_ = true |
| False if deserialization failed (M-V7). | |
| std::vector< bool > | null_pages |
| True if the corresponding page is all nulls. | |
| std::vector< std::string > | min_values |
| Binary-encoded minimum value per page. | |
| std::vector< std::string > | max_values |
| Binary-encoded maximum value per page. | |
| BoundaryOrder | boundary_order = BoundaryOrder::UNORDERED |
| Boundary order of min values. | |
| std::vector< int64_t > | null_counts |
| Null count per page (optional). | |
Per-page min/max statistics for predicate pushdown.
Stores binary-encoded min/max values for each data page in a column chunk, along with null-page flags, boundary ordering, and optional null counts. Readers use filter_pages() to eliminate pages whose value ranges do not overlap the query predicate.
Definition at line 147 of file column_index.hpp.
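To make the layout of these fields concrete, here is a minimal sketch of how per-page statistics of this shape could be computed from raw page values. It is not the library's writer code: `ColumnIndexSketch`, `Page`, and `build_index_sketch` are hypothetical names, values are plain strings compared lexicographically, and an all-null page is assumed to store empty min/max strings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in mirroring the per-page attributes of ColumnIndex.
struct ColumnIndexSketch {
    std::vector<bool> null_pages;           // true if the page holds only nulls
    std::vector<std::string> min_values;    // smallest non-null value per page
    std::vector<std::string> max_values;    // largest non-null value per page
    std::vector<std::int64_t> null_counts;  // number of nulls per page
};

// A page is modelled as a list of optional values; std::nullopt is a null.
using Page = std::vector<std::optional<std::string>>;

// Compute one entry per page in each statistics vector.
ColumnIndexSketch build_index_sketch(const std::vector<Page>& pages) {
    ColumnIndexSketch idx;
    for (const Page& page : pages) {
        std::int64_t nulls = 0;
        std::optional<std::string> lo, hi;
        for (const auto& v : page) {
            if (!v) { ++nulls; continue; }
            if (!lo || *v < *lo) lo = *v;
            if (!hi || *v > *hi) hi = *v;
        }
        // A page without any non-null value is flagged as a null page.
        idx.null_pages.push_back(!lo.has_value());
        idx.min_values.push_back(lo.value_or(""));  // assumed placeholder
        idx.max_values.push_back(hi.value_or(""));  // assumed placeholder
        idx.null_counts.push_back(nulls);
    }
    // Every vector carries exactly one entry per page.
    assert(idx.min_values.size() == pages.size());
    return idx;
}
```

All four vectors are indexed by page number, which is why a reader can decide page by page whether to skip it without touching the page data.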
| enum class signet::forge::ColumnIndex::BoundaryOrder : int32_t | strong |
Ordering of min values across pages, used to short-circuit filtering.
| Enumerator | |
|---|---|
| UNORDERED | Min values have no particular order. |
| ASCENDING | Min values are non-decreasing across pages. |
| DESCENDING | Min values are non-increasing across pages. |
Definition at line 154 of file column_index.hpp.
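The following sketch illustrates one way an ordering hint like this can short-circuit a scan. Whether `filter_pages()` uses `boundary_order` in exactly this way is not stated here, so treat it as an assumption: `OrderSketch`, `ScanResult`, and `scan_by_min` are hypothetical names, and values are compared as lexicographic strings. With ASCENDING min values, the first non-null page whose min exceeds the upper bound ends the scan, because every later page has a min at least as large.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical mirror of ColumnIndex::BoundaryOrder.
enum class OrderSketch : std::int32_t { UNORDERED = 0, ASCENDING = 1, DESCENDING = 2 };

struct ScanResult {
    std::vector<std::size_t> pages;  // pages whose min is <= max_val
    std::size_t examined = 0;        // how many pages the loop inspected
};

// Collect pages whose min does not exceed max_val, stopping early when
// the min values are known to be non-decreasing.
ScanResult scan_by_min(const std::vector<std::string>& min_values,
                       const std::vector<bool>& null_pages,
                       const std::string& max_val,
                       OrderSketch order) {
    assert(min_values.size() == null_pages.size());
    ScanResult r;
    for (std::size_t i = 0; i < min_values.size(); ++i) {
        ++r.examined;
        if (null_pages[i]) continue;  // all-null pages carry no usable min
        if (min_values[i] > max_val) {
            // Under ASCENDING order no later page can have a smaller min.
            if (order == OrderSketch::ASCENDING) break;
            continue;
        }
        r.pages.push_back(i);
    }
    return r;
}
```

For min values `a, d, k, p, w` and an upper bound of `e`, the ASCENDING scan inspects three pages and stops, while the UNORDERED scan inspects all five; both return pages 0 and 1.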
| void signet::forge::ColumnIndex::deserialize (thrift::CompactDecoder &dec) | inline |
Deserialize this ColumnIndex from a Thrift compact decoder.
| Parameter | Description |
|---|---|
| dec | The decoder to read from. |
Definition at line 215 of file column_index.hpp.
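| std::vector<size_t> signet::forge::ColumnIndex::filter_pages (const std::string &min_val, const std::string &max_val, PhysicalType physical_type=PhysicalType::BYTE_ARRAY) const | inline |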
Filter pages by a value range for predicate pushdown.
Given a range [min_val, max_val] (binary-encoded, using the same encoding as min_values/max_values), returns the indices of pages that might contain matching data. All-null pages are always excluded. Any other page is excluded only if its max is strictly less than min_val or its min is strictly greater than max_val, so a page whose range merely touches one of the bounds is kept.
| Parameter | Description |
|---|---|
| min_val | Lower bound of the query range (binary-encoded). |
| max_val | Upper bound of the query range (binary-encoded). |
| physical_type | Physical type for typed comparison (default: BYTE_ARRAY = lexicographic). |
Definition at line 294 of file column_index.hpp.
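The exclusion rule described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation at line 294: `PageStatsSketch`, `SketchType`, `encode_i32_le`, `decode_i32_le`, `less_than`, and `filter_pages_sketch` are hypothetical names, and INT32 statistics are assumed to be stored as 4-byte little-endian integers, as in Parquet's plain encoding. The INT32 branch shows why a typed comparison matters: little-endian bytes do not sort in numeric order when compared lexicographically.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical subset of physical types; the real PhysicalType is not shown here.
enum class SketchType { BYTE_ARRAY, INT32 };

// Hypothetical stand-in for the per-page fields of ColumnIndex.
struct PageStatsSketch {
    std::vector<bool> null_pages;
    std::vector<std::string> min_values;
    std::vector<std::string> max_values;
};

// Encode a signed 32-bit value as 4 little-endian bytes (assumed encoding).
inline std::string encode_i32_le(std::int32_t v) {
    const auto u = static_cast<std::uint32_t>(v);
    std::string s(4, '\0');
    for (int i = 0; i < 4; ++i) s[i] = static_cast<char>((u >> (8 * i)) & 0xFF);
    return s;
}

// Decode 4 little-endian bytes back into a signed 32-bit value.
inline std::int32_t decode_i32_le(const std::string& s) {
    assert(s.size() == 4);
    std::uint32_t u = 0;
    for (int i = 3; i >= 0; --i) u = (u << 8) | static_cast<unsigned char>(s[i]);
    return static_cast<std::int32_t>(u);
}

// True if a sorts strictly before b under the given type.
inline bool less_than(const std::string& a, const std::string& b, SketchType t) {
    if (t == SketchType::INT32) return decode_i32_le(a) < decode_i32_le(b);
    return a < b;  // lexicographic comparison for BYTE_ARRAY
}

// Return indices of pages whose [min, max] overlaps [min_val, max_val].
std::vector<std::size_t> filter_pages_sketch(const PageStatsSketch& idx,
                                             const std::string& min_val,
                                             const std::string& max_val,
                                             SketchType t = SketchType::BYTE_ARRAY) {
    assert(idx.min_values.size() == idx.null_pages.size());
    assert(idx.max_values.size() == idx.null_pages.size());
    std::vector<std::size_t> kept;
    for (std::size_t i = 0; i < idx.null_pages.size(); ++i) {
        if (idx.null_pages[i]) continue;                          // all nulls
        if (less_than(idx.max_values[i], min_val, t)) continue;   // entirely below
        if (less_than(max_val, idx.min_values[i], t)) continue;   // entirely above
        kept.push_back(i);
    }
    return kept;
}
```

Because both exclusion tests are strict, a page whose max equals `min_val` or whose min equals `max_val` stays in the result. The returned indices are candidates only: a kept page may still contain no matching row, so the reader has to apply the predicate to the decoded values.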
| void signet::forge::ColumnIndex::serialize (thrift::CompactEncoder &enc) const | inline |
Serialize this ColumnIndex to a Thrift compact encoder.
| Parameter | Description |
|---|---|
| enc | The encoder to write to. |
Definition at line 168 of file column_index.hpp.
| bool signet::forge::ColumnIndex::valid () const | inline |
Check if deserialization was successful.
Definition at line 164 of file column_index.hpp.
| BoundaryOrder signet::forge::ColumnIndex::boundary_order = BoundaryOrder::UNORDERED |
Boundary order of min values.
Definition at line 159 of file column_index.hpp.
| std::vector<std::string> signet::forge::ColumnIndex::max_values |
Binary-encoded maximum value per page.
Definition at line 151 of file column_index.hpp.
| std::vector<std::string> signet::forge::ColumnIndex::min_values |
Binary-encoded minimum value per page.
Definition at line 150 of file column_index.hpp.
| std::vector<int64_t> signet::forge::ColumnIndex::null_counts |
Null count per page (optional).
Definition at line 161 of file column_index.hpp.
| std::vector<bool> signet::forge::ColumnIndex::null_pages |
True if the corresponding page is all nulls.
Definition at line 149 of file column_index.hpp.
| bool signet::forge::ColumnIndex::valid_ = true |
False if deserialization failed (M-V7).
Definition at line 148 of file column_index.hpp.