Signet Forge 0.1.0
C++20 Parquet library with AI-native extensions
signet::forge::ColumnBatch Class Reference

A column-major batch of feature rows for ML inference and WAL serialization.

#include <column_batch.hpp>

Public Member Functions

 ColumnBatch ()=default
 Default constructor (empty batch, no schema).
 
 ColumnBatch (ColumnBatch &&)=default
 Move constructor.
 
ColumnBatch & operator= (ColumnBatch &&)=default
 Move assignment.
 
 ColumnBatch (const ColumnBatch &)=default
 Copy constructor.
 
ColumnBatch & operator= (const ColumnBatch &)=default
 Copy assignment.
 
expected< void > push_row (const double *values, size_t count)
 Append one row of feature values.
 
expected< void > push_row (std::initializer_list< double > values)
 Append one row from an initializer list (e.g. push_row({1.0, 2.0})).
 
expected< void > push_row (const std::vector< double > &values)
 Append one row from a vector.
 
size_t num_rows () const noexcept
 Number of rows currently in the batch.
 
size_t num_columns () const noexcept
 Number of columns defined by the schema.
 
bool empty () const noexcept
 True if the batch contains no rows.
 
const std::vector< ColumnDesc > & schema () const noexcept
 The schema (column descriptors) this batch was created with.
 
TensorView column_view (size_t col_idx) const
 Zero-copy TensorView over a single column's contiguous double array.
 
std::span< const double > column_span (size_t col_idx) const
 Span accessor for a single column — zero-copy, range-checked.
 
expected< OwnedTensor > as_tensor (TensorDataType output_dtype=TensorDataType::FLOAT32) const
 Assemble all columns into a single 2D [rows x cols] OwnedTensor.
 
StreamRecord to_stream_record (int64_t timestamp_ns=0, uint32_t type_id=0x434F4C42u) const
 Serialize the batch into a WAL StreamRecord.
 
void clear ()
 Clear all row data while preserving the schema.
 
void reserve (size_t rows)
 Pre-allocate storage for the given number of rows in each column.
 

Static Public Member Functions

static ColumnBatch with_schema (std::vector< ColumnDesc > schema, size_t reserve_rows=64)
 Create an empty ColumnBatch with the given schema.
 
static expected< ColumnBatch > from_stream_record (const StreamRecord &rec)
 Deserialize a StreamRecord payload back into a ColumnBatch.
 

Public Attributes

std::string source_id
 Exchange / feed identifier.
 
std::string symbol
 Instrument symbol.
 
int64_t seq_first = 0
 First WAL sequence number in this batch.
 
int64_t seq_last = 0
 Last WAL sequence number in this batch.
 
int64_t created_ns = 0
 Batch creation timestamp (ns since epoch).
 

Detailed Description

A column-major batch of feature rows for ML inference and WAL serialization.

Data is stored in column-major layout (columns_[col][row]) so each column is a contiguous double array suitable for zero-copy wrapping as a TensorView or ONNX OrtValue without transposition.

Typically shared across threads via SharedColumnBatch (std::shared_ptr).

See also
SharedColumnBatch, make_column_batch, EventBus

Definition at line 73 of file column_batch.hpp.

Constructor & Destructor Documentation

◆ ColumnBatch() [1/3]

signet::forge::ColumnBatch::ColumnBatch ( )
default

Default constructor (empty batch, no schema).

◆ ColumnBatch() [2/3]

signet::forge::ColumnBatch::ColumnBatch ( ColumnBatch &&  )
default

Move constructor.

◆ ColumnBatch() [3/3]

signet::forge::ColumnBatch::ColumnBatch ( const ColumnBatch &  )
default

Copy constructor.

Member Function Documentation

◆ as_tensor()

expected< OwnedTensor > signet::forge::ColumnBatch::as_tensor ( TensorDataType  output_dtype = TensorDataType::FLOAT32) const
inline

Assemble all columns into a single 2D [rows x cols] OwnedTensor.

Uses BatchTensorBuilder internally. The default output type is FLOAT32 for direct ONNX Runtime consumption.

Parameters
output_dtype  Desired element type (default FLOAT32).
Returns
OwnedTensor of shape {num_rows, num_columns}, or Error if empty.

Definition at line 198 of file column_batch.hpp.
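
The conversion described above amounts to a transpose plus a narrowing cast. The sketch below shows that idea on plain vectors; to_row_major_f32 is a hypothetical helper, the row-major output order is an assumption (this page only states the shape), and the real BatchTensorBuilder may work differently.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: turn column-major doubles into one row-major
// float buffer of shape [rows x cols]. Returns nullopt for an empty
// or ragged input, mirroring the "Error if empty" contract above.
std::optional<std::vector<float>> to_row_major_f32(
        const std::vector<std::vector<double>>& columns) {
    if (columns.empty() || columns.front().empty()) return std::nullopt;
    const std::size_t cols = columns.size();
    const std::size_t rows = columns.front().size();
    for (const auto& column : columns) {
        if (column.size() != rows) return std::nullopt;
    }
    std::vector<float> out(rows * cols);
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            // Element (r, c) lands at r * cols + c in row-major order.
            out[r * cols + c] = static_cast<float>(columns[c][r]);
        }
    }
    return out;
}
```

Note that narrowing from double to float loses precision, which is one reason a caller might request a different output_dtype.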

◆ clear()

void signet::forge::ColumnBatch::clear ( )
inline

Clear all row data while preserving the schema.

Definition at line 377 of file column_batch.hpp.

◆ column_span()

std::span< const double > signet::forge::ColumnBatch::column_span ( size_t  col_idx) const
inline

Span accessor for a single column — zero-copy, range-checked.

Definition at line 177 of file column_batch.hpp.

◆ column_view()

TensorView signet::forge::ColumnBatch::column_view ( size_t  col_idx) const
inline

Zero-copy TensorView over a single column's contiguous double array.

Shape: {num_rows_}, dtype: FLOAT64. The view is valid as long as this ColumnBatch is alive and unmodified.

Definition at line 166 of file column_batch.hpp.

◆ empty()

bool signet::forge::ColumnBatch::empty ( ) const
inline noexcept

True if the batch contains no rows.

Definition at line 156 of file column_batch.hpp.

◆ from_stream_record()

static expected< ColumnBatch > signet::forge::ColumnBatch::from_stream_record ( const StreamRecord &  rec)
inline static

Deserialize a StreamRecord payload back into a ColumnBatch.

Inverse of to_stream_record(). Reads the binary column-major format and reconstructs the schema, columns, and row data.

Parameters
rec  StreamRecord previously produced by to_stream_record().
Returns
Reconstructed ColumnBatch, or Error on truncated/corrupt payload.

Definition at line 312 of file column_batch.hpp.

◆ num_columns()

size_t signet::forge::ColumnBatch::num_columns ( ) const
inline noexcept

Number of columns defined by the schema.

Definition at line 154 of file column_batch.hpp.

◆ num_rows()

size_t signet::forge::ColumnBatch::num_rows ( ) const
inline noexcept

Number of rows currently in the batch.

Definition at line 152 of file column_batch.hpp.

◆ operator=() [1/2]

ColumnBatch & signet::forge::ColumnBatch::operator= ( ColumnBatch &&  )
default

Move assignment.

◆ operator=() [2/2]

ColumnBatch & signet::forge::ColumnBatch::operator= ( const ColumnBatch &  )
default

Copy assignment.

◆ push_row() [1/3]

expected< void > signet::forge::ColumnBatch::push_row ( const double *  values,
size_t  count 
)
inline

Append one row of feature values.

count must equal num_columns().

Definition at line 121 of file column_batch.hpp.

◆ push_row() [2/3]

expected< void > signet::forge::ColumnBatch::push_row ( const std::vector< double > &  values)
inline

Append one row from a vector.

Parameters
values  Feature values (must match num_columns()).
Returns
Error on schema mismatch.

Definition at line 143 of file column_batch.hpp.

◆ push_row() [3/3]

expected< void > signet::forge::ColumnBatch::push_row ( std::initializer_list< double >  values)
inline

Append one row from an initializer list (e.g. push_row({1.0, 2.0})).

Parameters
values  Feature values (must match num_columns()).
Returns
Error on schema mismatch.

Definition at line 135 of file column_batch.hpp.

◆ reserve()

void signet::forge::ColumnBatch::reserve ( size_t  rows)
inline

Pre-allocate storage for the given number of rows in each column.

Parameters
rows  Number of rows to reserve capacity for.

Definition at line 384 of file column_batch.hpp.

◆ schema()

const std::vector< ColumnDesc > & signet::forge::ColumnBatch::schema ( ) const
inline noexcept

The schema (column descriptors) this batch was created with.

Definition at line 159 of file column_batch.hpp.

◆ to_stream_record()

StreamRecord signet::forge::ColumnBatch::to_stream_record ( int64_t  timestamp_ns = 0,
uint32_t  type_id = 0x434F4C42u 
) const
inline

Serialize the batch into a WAL StreamRecord.

The binary payload uses little-endian column-major format. The default type_id 0x434F4C42 ("COLB") identifies ColumnBatch records in the WAL.

Parameters
timestamp_ns  Override timestamp (0 = use created_ns).
type_id  Record type tag for WAL routing.
Returns
StreamRecord with the serialized batch payload.

Definition at line 234 of file column_batch.hpp.

◆ with_schema()

static ColumnBatch signet::forge::ColumnBatch::with_schema ( std::vector< ColumnDesc >  schema,
size_t  reserve_rows = 64 
)
inline static

Create an empty ColumnBatch with the given schema.

Parameters
schema  Column descriptors in order.
reserve_rows  Pre-allocate space for this many rows.

Definition at line 92 of file column_batch.hpp.

Member Data Documentation

◆ created_ns

int64_t signet::forge::ColumnBatch::created_ns = 0

Batch creation timestamp (ns since epoch).

Definition at line 83 of file column_batch.hpp.

◆ seq_first

int64_t signet::forge::ColumnBatch::seq_first = 0

First WAL sequence number in this batch.

Definition at line 81 of file column_batch.hpp.

◆ seq_last

int64_t signet::forge::ColumnBatch::seq_last = 0

Last WAL sequence number in this batch.

Definition at line 82 of file column_batch.hpp.

◆ source_id

std::string signet::forge::ColumnBatch::source_id

Exchange / feed identifier.

Definition at line 79 of file column_batch.hpp.

◆ symbol

std::string signet::forge::ColumnBatch::symbol

Instrument symbol.

Definition at line 80 of file column_batch.hpp.


The documentation for this class was generated from the following file: