Signet Forge 0.1.0
C++20 Parquet library with AI-native extensions
signet::forge::ColumnIndexBuilder Class Reference

Builder that accumulates per-page statistics during column writing.

#include <column_index.hpp>

Public Member Functions

void start_page ()
 Start a new page. Must be called before set_min(), set_max(), and the other per-page setters.
 
void set_min (const std::string &min_val)
 Record the minimum value for the current page (binary-encoded).
 
void set_max (const std::string &max_val)
 Record the maximum value for the current page (binary-encoded).
 
void set_null_page (bool is_null)
 Mark the current page as all-nulls (or not).
 
void set_null_count (int64_t count)
 Record the null count for the current page.
 
void set_first_row_index (int64_t row_index)
 Record the first row index for the current page (relative to row group).
 
void set_page_location (int64_t offset, int32_t compressed_size)
 Record the page location (file offset and compressed size) for the current page.
 
ColumnIndex build_column_index (PhysicalType pt=PhysicalType::BYTE_ARRAY) const
 Finalize and return the ColumnIndex from accumulated page info.
 
OffsetIndex build_offset_index () const
 Finalize and return the OffsetIndex from accumulated page info.
 
void reset ()
 Reset the builder, discarding all accumulated page info.
 
size_t num_pages () const
 Number of pages accumulated so far.
 

Detailed Description

Builder that accumulates per-page statistics during column writing.

Usage:

for (each page being written) {
    builder.start_page();
    builder.set_min(...);
    builder.set_max(...);
    builder.set_null_page(false);
    builder.set_null_count(0);
    builder.set_first_row_index(row_offset);
    builder.set_page_location(file_offset, compressed_size);
}
See also
ColumnIndex (output of build_column_index()): per-page min/max statistics for predicate pushdown.
OffsetIndex (output of build_offset_index()): page locations for random access within a column chunk.

Definition at line 395 of file column_index.hpp.

Member Function Documentation

◆ build_column_index()

ColumnIndex signet::forge::ColumnIndexBuilder::build_column_index ( PhysicalType  pt = PhysicalType::BYTE_ARRAY) const
inline

Finalize and return the ColumnIndex from accumulated page info.

Automatically detects boundary order from the min_values sequence.

Parameters
pt  Physical type of the column, used for type-aware boundary order detection (for example, signed comparison for INT32/INT64).
Returns
A fully populated ColumnIndex ready for serialization.

Definition at line 459 of file column_index.hpp.
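
The boundary order detection described above can be illustrated with a small sketch. It is not the library's implementation: the names Order and detect_order are hypothetical, the enum values simply mirror Parquet's BoundaryOrder (UNORDERED, ASCENDING, DESCENDING), and the sketch assumes the per-page minimums have already been decoded to int64_t so that the comparison is signed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical enum for this sketch; values mirror Parquet's BoundaryOrder.
enum class Order { Unordered = 0, Ascending = 1, Descending = 2 };

// Classify a sequence of per-page minimums using signed comparison.
// Assumption of this sketch: an empty, single-element, or all-equal
// sequence is reported as Ascending.
inline Order detect_order(const std::vector<std::int64_t>& mins) {
    bool ascending = true;
    bool descending = true;
    for (std::size_t i = 1; i < mins.size(); ++i) {
        if (mins[i] < mins[i - 1]) ascending = false;   // a drop rules out ascending
        if (mins[i] > mins[i - 1]) descending = false;  // a rise rules out descending
    }
    if (ascending) return Order::Ascending;
    if (descending) return Order::Descending;
    return Order::Unordered;
}
```

Signed comparison matters here because a sequence such as -5, 0, 7 is ascending as INT64 values, while a byte-wise or unsigned comparison of the encoded values would place -5 after 7.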

◆ build_offset_index()

OffsetIndex signet::forge::ColumnIndexBuilder::build_offset_index ( ) const
inline

Finalize and return the OffsetIndex from accumulated page info.

Returns
A fully populated OffsetIndex ready for serialization.

Definition at line 492 of file column_index.hpp.

◆ num_pages()

size_t signet::forge::ColumnIndexBuilder::num_pages ( ) const
inline

Number of pages accumulated so far.

Definition at line 513 of file column_index.hpp.

◆ reset()

void signet::forge::ColumnIndexBuilder::reset ( )
inline

Reset the builder, discarding all accumulated page info.

Definition at line 508 of file column_index.hpp.

◆ set_first_row_index()

void signet::forge::ColumnIndexBuilder::set_first_row_index ( int64_t  row_index)
inline

Record the first row index for the current page (relative to row group).

Parameters
row_index  Zero-based row index of the first row in this page.

Definition at line 436 of file column_index.hpp.

◆ set_max()

void signet::forge::ColumnIndexBuilder::set_max ( const std::string &  max_val)
inline

Record the maximum value for the current page (binary-encoded).

Parameters
max_val  The binary-encoded maximum value.

Definition at line 412 of file column_index.hpp.

◆ set_min()

void signet::forge::ColumnIndexBuilder::set_min ( const std::string &  min_val)
inline

Record the minimum value for the current page (binary-encoded).

Parameters
min_val  The binary-encoded minimum value.

Definition at line 404 of file column_index.hpp.
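
This page does not spell out what "binary-encoded" means for set_min() and set_max(). As a hedged illustration, the sketch below packs an INT64 value into the 8-byte little-endian form that Parquet's PLAIN encoding uses for that type. Whether Signet Forge expects exactly this layout is an assumption, and encode_int64_le and decode_int64_le are hypothetical helpers, not library functions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: encode a signed 64-bit value as 8 little-endian bytes.
inline std::string encode_int64_le(std::int64_t value) {
    const auto bits = static_cast<std::uint64_t>(value);
    std::string out(8, '\0');
    for (int i = 0; i < 8; ++i) {
        // Byte i holds bits [8*i, 8*i + 7], least significant byte first.
        out[static_cast<std::size_t>(i)] =
            static_cast<char>((bits >> (8 * i)) & 0xFFu);
    }
    return out;
}

// Hypothetical inverse of encode_int64_le; requires exactly 8 bytes.
inline std::int64_t decode_int64_le(const std::string& bytes) {
    assert(bytes.size() == 8);
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i) {
        const auto byte = static_cast<unsigned char>(bytes[static_cast<std::size_t>(i)]);
        bits |= static_cast<std::uint64_t>(byte) << (8 * i);
    }
    return static_cast<std::int64_t>(bits);
}
```

Under that assumption, a writer would pass the result straight through, for example builder.set_min(encode_int64_le(page_min)) and builder.set_max(encode_int64_le(page_max)).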

◆ set_null_count()

void signet::forge::ColumnIndexBuilder::set_null_count ( int64_t  count)
inline

Record the null count for the current page.

Parameters
count  Number of null values in the current page.

Definition at line 428 of file column_index.hpp.

◆ set_null_page()

void signet::forge::ColumnIndexBuilder::set_null_page ( bool  is_null)
inline

Mark the current page as all-nulls (or not).

Parameters
is_null  True if the page contains only null values.

Definition at line 420 of file column_index.hpp.
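
The values passed to set_null_count(), set_null_page(), set_min(), and set_max() are typically gathered in one pass over a page. The following is a minimal sketch of that pass for a nullable INT64 page. PageStats and summarize_page are hypothetical names introduced for illustration, and treating an empty page as "not a null page" is a choice made by this sketch, not documented library behavior.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical summary of one page; not a Signet Forge type.
struct PageStats {
    std::optional<std::int64_t> min;  // empty when the page has no non-null value
    std::optional<std::int64_t> max;
    std::int64_t null_count = 0;
    bool is_null_page = false;        // true only if every value is null
};

// Scan one page of nullable values and collect its statistics.
inline PageStats summarize_page(
    const std::vector<std::optional<std::int64_t>>& values) {
    PageStats stats;
    for (const auto& v : values) {
        if (!v) {
            ++stats.null_count;
            continue;
        }
        if (!stats.min || *v < *stats.min) stats.min = *v;
        if (!stats.max || *v > *stats.max) stats.max = *v;
    }
    stats.is_null_page =
        !values.empty() &&
        stats.null_count == static_cast<std::int64_t>(values.size());
    return stats;
}
```

For an all-null page there is no meaningful minimum or maximum, so a caller would presumably set the null-page flag and skip or blank the min/max values; the exact expectation is not stated on this page.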

◆ set_page_location()

void signet::forge::ColumnIndexBuilder::set_page_location ( int64_t  offset, int32_t  compressed_size )
inline

Record the page location (file offset and compressed size) for the current page.

Parameters
offset  Absolute file offset of the page.
compressed_size  Page size in compressed bytes.

Definition at line 445 of file column_index.hpp.
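
To show how the arguments of set_page_location() and set_first_row_index() relate across pages, here is a sketch that derives them as running sums. It assumes the pages of a column chunk are written back to back and that each compressed size covers the whole on-disk page. PageInput, PageSlot, and layout_pages are hypothetical names, not part of the library.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical input: what the writer knows about each page.
struct PageInput {
    std::int64_t num_rows;
    std::int32_t compressed_size;
};

// Hypothetical output: the values to hand to the builder for each page.
struct PageSlot {
    std::int64_t offset;           // absolute file offset of the page
    std::int32_t compressed_size;  // size of the page in compressed bytes
    std::int64_t first_row_index;  // relative to the start of the row group
};

// Assumes pages are contiguous, starting at chunk_start.
inline std::vector<PageSlot> layout_pages(
    std::int64_t chunk_start, const std::vector<PageInput>& pages) {
    std::vector<PageSlot> slots;
    slots.reserve(pages.size());
    std::int64_t offset = chunk_start;
    std::int64_t row = 0;
    for (const auto& page : pages) {
        slots.push_back({offset, page.compressed_size, row});
        offset += page.compressed_size;  // next page starts right after this one
        row += page.num_rows;            // next page's first row in the row group
    }
    return slots;
}
```

Each resulting slot maps onto one loop iteration of the usage pattern: builder.set_first_row_index(slot.first_row_index) followed by builder.set_page_location(slot.offset, slot.compressed_size).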

◆ start_page()

void signet::forge::ColumnIndexBuilder::start_page ( )
inline

Start a new page. Must be called before set_min(), set_max(), and the other per-page setters.

Definition at line 398 of file column_index.hpp.


The documentation for this class was generated from the following file:

column_index.hpp