Signet Forge 0.1.0
C++20 Parquet library with AI-native extensions
signet::forge::VectorWriter Class Reference

Buffers float vectors and encodes them as FIXED_LEN_BYTE_ARRAY PLAIN data.

#include <vector_type.hpp>

Public Member Functions

 VectorWriter (VectorColumnSpec spec)
 Construct a VectorWriter for the given column specification.
 
void add (const float *data)
 Add a single vector from a float32 pointer (must point to dimension floats).
 
bool add_batch (const float *data, size_t num_vectors)
 Add a batch of vectors (num_vectors vectors, each dimension elements, row-major).
 
std::vector< uint8_t > flush ()
 Flush the buffered vectors and return the encoded page bytes.
 
size_t num_vectors () const noexcept
 Number of vectors currently buffered (since last flush).
 
const VectorColumnSpec & spec () const noexcept
 The column spec this writer was constructed with.
 

Static Public Member Functions

static ColumnDescriptor make_descriptor (const std::string &name, const VectorColumnSpec &spec)
 Create a ColumnDescriptor for a vector column with the given name and spec.
 

Detailed Description

Buffers float vectors and encodes them as FIXED_LEN_BYTE_ARRAY PLAIN data.

Input is always float* (float32). When the spec's element_type is FLOAT16 or FLOAT64, the writer converts during add(). The internal buffer stores data in the target element format, ready for direct page embedding.
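
The convert-on-add behaviour described above can be illustrated with a minimal sketch. ToySpec, ToyElementType and ToyVectorWriter are hypothetical stand-ins, not the declarations in vector_type.hpp; the sketch covers only FLOAT32 and FLOAT64 (FLOAT16 narrowing is omitted) and assumes a little-endian host, since it copies native bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the library's element type and column spec.
enum class ToyElementType { FLOAT32, FLOAT64 };

struct ToySpec {
    std::size_t dimension;
    ToyElementType element_type;
};

// Toy writer: input is always float32, the buffer holds the target format.
class ToyVectorWriter {
public:
    explicit ToyVectorWriter(ToySpec spec) : spec_(spec) {}

    // Append one vector; data must point to at least spec_.dimension floats.
    void add(const float* data) {
        for (std::size_t i = 0; i < spec_.dimension; ++i) {
            if (spec_.element_type == ToyElementType::FLOAT64) {
                // Widen to double before storing (8 bytes per element).
                const double widened = static_cast<double>(data[i]);
                append_bytes(&widened, sizeof widened);
            } else {
                // FLOAT32: store the 4 input bytes unchanged.
                append_bytes(&data[i], sizeof(float));
            }
        }
        ++count_;
    }

    std::size_t num_vectors() const noexcept { return count_; }

    // Hand back the buffered bytes and reset for the next page.
    std::vector<std::uint8_t> flush() {
        std::vector<std::uint8_t> out = std::move(buffer_);
        buffer_.clear();
        count_ = 0;
        return out;
    }

private:
    void append_bytes(const void* src, std::size_t n) {
        const auto* p = static_cast<const std::uint8_t*>(src);
        buffer_.insert(buffer_.end(), p, p + n);
    }

    ToySpec spec_;
    std::vector<std::uint8_t> buffer_;
    std::size_t count_ = 0;
};
```

Because conversion happens in add(), flush() in this sketch only has to hand over the buffer; no second pass over the data is needed.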

Usage:

w.add(my_embedding_ptr); // 768 floats
w.add_batch(batch_ptr, 100); // 100 vectors of 768 floats each
auto page_bytes = w.flush(); // encoded FIXED_LEN_BYTE_ARRAY data
See also
VectorReader, VectorColumnSpec

Definition at line 220 of file vector_type.hpp.

Constructor & Destructor Documentation

◆ VectorWriter()

signet::forge::VectorWriter::VectorWriter ( VectorColumnSpec  spec)
inline explicit

Construct a VectorWriter for the given column specification.

Parameters
spec: Column specification (dimension and element type).

Definition at line 224 of file vector_type.hpp.

Member Function Documentation

◆ add()

void signet::forge::VectorWriter::add ( const float *  data)
inline

Add a single vector from a float32 pointer. The pointer must reference at least dimension contiguous floats, where dimension comes from the column spec.

Definition at line 228 of file vector_type.hpp.

◆ add_batch()

bool signet::forge::VectorWriter::add_batch ( const float *  data,
size_t  num_vectors 
)
inline

Add a batch of num_vectors vectors, each of dimension elements, stored contiguously in row-major order.

Returns
true on success, false on overflow (batch rejected entirely).
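
The reference does not say which quantity overflows. One plausible reading is that the total byte count of the batch must fit in size_t. Under that assumption, an all-or-nothing check could look like the sketch below; batch_fits is a hypothetical helper, not part of the library.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper: true when num_vectors * dimension * element_size
// can be computed without wrapping around size_t.
inline bool batch_fits(std::size_t num_vectors,
                       std::size_t dimension,
                       std::size_t element_size) {
    constexpr std::size_t kMax = std::numeric_limits<std::size_t>::max();

    // A zero-sized vector cannot overflow anything.
    if (dimension == 0 || element_size == 0) return true;

    // Check dimension * element_size by dividing instead of multiplying.
    if (dimension > kMax / element_size) return false;
    const std::size_t bytes_per_vector = dimension * element_size;

    // Check num_vectors * bytes_per_vector the same way.
    return num_vectors <= kMax / bytes_per_vector;
}
```

Running a check of this kind before touching the buffer is what makes it possible to reject a batch entirely rather than leave it half-written.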

Definition at line 264 of file vector_type.hpp.

◆ flush()

std::vector< uint8_t > signet::forge::VectorWriter::flush ( )
inline

Flush the buffered vectors and return the encoded page bytes.

The returned buffer contains PLAIN-encoded FIXED_LEN_BYTE_ARRAY data: consecutive vectors of bytes_per_vector() bytes each, with no length prefix (the type_length is known from the column descriptor).

After flush, the writer is reset and ready for the next page.
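
Since the page has no length prefixes, vector i starts at byte offset i * bytes_per_vector. The sketch below reads one float32 vector back from such a buffer. read_float32_vector is a hypothetical helper written for illustration (the library's own reader is VectorReader), and it assumes FLOAT32 elements on a little-endian host.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy vector `index` out of a page of back-to-back float32 vectors.
// Returns an empty vector when the index is out of range.
inline std::vector<float> read_float32_vector(
        const std::vector<std::uint8_t>& page,
        std::size_t dimension,
        std::size_t index) {
    const std::size_t bytes_per_vector = dimension * sizeof(float);
    if (bytes_per_vector == 0) return {};

    // Number of whole vectors stored in the page.
    const std::size_t count = page.size() / bytes_per_vector;
    if (index >= count) return {};

    // memcpy avoids alignment and aliasing problems with the byte buffer.
    std::vector<float> out(dimension);
    std::memcpy(out.data(),
                page.data() + index * bytes_per_vector,
                bytes_per_vector);
    return out;
}
```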

Definition at line 281 of file vector_type.hpp.

◆ make_descriptor()

static ColumnDescriptor signet::forge::VectorWriter::make_descriptor ( const std::string &  name,
const VectorColumnSpec &  spec 
)
inline static

Create a ColumnDescriptor for a vector column with the given name and spec.

The descriptor maps to:
physical_type = FIXED_LEN_BYTE_ARRAY
logical_type = FLOAT32_VECTOR
type_length = dimension * element_size
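
As a worked instance of the type_length formula, a 768-dimensional FLOAT32 column has type_length = 768 * 4 = 3072 bytes. The sketch below encodes that arithmetic. SketchElementType, sketch_element_size and sketch_type_length are hypothetical names; the 2-byte and 8-byte sizes are the standard IEEE 754 half and double widths, not values taken from the library source.

```cpp
#include <cstddef>

// Hypothetical mirror of the element types named in this reference.
enum class SketchElementType { FLOAT16, FLOAT32, FLOAT64 };

// Bytes per element for each type.
constexpr std::size_t sketch_element_size(SketchElementType t) {
    switch (t) {
        case SketchElementType::FLOAT16: return 2;
        case SketchElementType::FLOAT32: return 4;
        case SketchElementType::FLOAT64: return 8;
    }
    return 0;  // unreachable for valid enum values
}

// type_length = dimension * element_size
constexpr std::size_t sketch_type_length(std::size_t dimension,
                                         SketchElementType t) {
    return dimension * sketch_element_size(t);
}

// Checked at compile time: 768 float32 elements occupy 3072 bytes.
static_assert(sketch_type_length(768, SketchElementType::FLOAT32) == 3072,
              "768 * 4 bytes");
```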

Definition at line 300 of file vector_type.hpp.

◆ num_vectors()

size_t signet::forge::VectorWriter::num_vectors ( ) const
inline noexcept

Number of vectors currently buffered (since last flush).

Definition at line 289 of file vector_type.hpp.

◆ spec()

const VectorColumnSpec & signet::forge::VectorWriter::spec ( ) const
inline noexcept

The column spec this writer was constructed with.

Definition at line 292 of file vector_type.hpp.


The documentation for this class was generated from the following file: