Signet Forge 0.1.0
C++20 Parquet library with AI-native extensions
signet::forge::QuantizedVectorWriter Class Reference

Accumulates float32 vectors, quantizes them, and produces FIXED_LEN_BYTE_ARRAY page data suitable for Parquet column chunks.

#include <quantized_vector.hpp>

Public Member Functions

 QuantizedVectorWriter (QuantizationParams params)
 Construct a writer with the given quantization parameters.
 
void add (const float *data)
 Add a single float32 vector (quantized internally).
 
void add_raw (const uint8_t *data)
 Add pre-quantized raw bytes for one vector.
 
void add_batch (const float *data, size_t num_vectors)
 Add a batch of float32 vectors (quantized internally).
 
std::vector< uint8_t > flush ()
 Flush accumulated data as FIXED_LEN_BYTE_ARRAY page bytes.
 
size_t num_vectors () const
 Number of vectors currently buffered.
 
const QuantizationParams & params () const
 Access the quantization parameters.
 

Static Public Member Functions

static ColumnDescriptor make_descriptor (const std::string &name, const QuantizationParams &params)
 Create a ColumnDescriptor suitable for a quantized vector column.
 

Detailed Description

Accumulates float32 vectors, quantizes them, and produces FIXED_LEN_BYTE_ARRAY page data suitable for Parquet column chunks.

See also
QuantizedVectorReader, Quantizer, QuantizationParams

Definition at line 264 of file quantized_vector.hpp.

Constructor & Destructor Documentation

◆ QuantizedVectorWriter()

signet::forge::QuantizedVectorWriter::QuantizedVectorWriter ( QuantizationParams  params)
inline explicit

Construct a writer with the given quantization parameters.

Definition at line 267 of file quantized_vector.hpp.

Member Function Documentation

◆ add()

void signet::forge::QuantizedVectorWriter::add ( const float *  data)
inline

Add a single float32 vector (quantized internally).

Parameters
data: Pointer to dim floats.

Definition at line 1244 of file quantized_vector.hpp.
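This page does not describe the scheme behind "quantized internally". As an illustration only, the sketch below shows one common approach, symmetric int8 quantization with a single scale. The helper name toy_quantize_int8 and its scale parameter are hypothetical and are not part of Signet Forge; the library's actual scheme is determined by QuantizationParams and may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the Signet Forge API): quantize `dim` floats to
// int8 codes with one shared scale and a zero-point of 0.
inline std::vector<std::uint8_t> toy_quantize_int8(const float* data,
                                                   std::size_t dim,
                                                   float scale) {
    assert(data != nullptr && scale > 0.0f);
    std::vector<std::uint8_t> out(dim);
    for (std::size_t i = 0; i < dim; ++i) {
        // Round to the nearest code, then clamp to the symmetric int8 range.
        const float code = std::clamp(std::round(data[i] / scale), -127.0f, 127.0f);
        // Store the two's-complement byte of the signed code.
        out[i] = static_cast<std::uint8_t>(static_cast<std::int8_t>(code));
    }
    return out;
}
```

Under this assumption each vector occupies exactly dim bytes, which is the kind of fixed width a FIXED_LEN_BYTE_ARRAY column requires.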

◆ add_batch()

void signet::forge::QuantizedVectorWriter::add_batch ( const float *  data,
size_t  num_vectors 
)
inline

Add a batch of float32 vectors (quantized internally).

Parameters
data: Pointer to num_vectors * dim contiguous floats.
num_vectors: Number of vectors to add.

Definition at line 1258 of file quantized_vector.hpp.
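The batch layout described above (num_vectors * dim contiguous floats) implies that vector i starts at offset i * dim. The following sketch illustrates that indexing; vector_at and for_each_vector are hypothetical helpers written for this example and do not appear in the library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: pointer to the first float of vector `i` in a
// row-major batch where every vector has `dim` elements.
inline const float* vector_at(const float* data, std::size_t dim, std::size_t i) {
    return data + i * dim;
}

// Hypothetical stand-in for a batch add: walks the batch and hands each
// vector to a per-vector callback, in order.
template <typename AddOne>
void for_each_vector(const float* data, std::size_t num_vectors,
                     std::size_t dim, AddOne add_one) {
    assert(data != nullptr || num_vectors == 0);
    for (std::size_t i = 0; i < num_vectors; ++i) {
        add_one(vector_at(data, dim, i));
    }
}
```

A caller holding a std::vector<float> of embeddings would pass its data() pointer together with size() / dim as the vector count, assuming the size is an exact multiple of dim.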

◆ add_raw()

void signet::forge::QuantizedVectorWriter::add_raw ( const uint8_t *  data)
inline

Add pre-quantized raw bytes for one vector.

Parameters
data: Pointer to bytes_per_vector() bytes.

Definition at line 1252 of file quantized_vector.hpp.

◆ flush()

std::vector< uint8_t > signet::forge::QuantizedVectorWriter::flush ( )
inline

Flush accumulated data as FIXED_LEN_BYTE_ARRAY page bytes.

After flush, the writer is empty and ready for a new page.

Returns
A byte buffer containing all buffered quantized vectors.

Definition at line 1272 of file quantized_vector.hpp.
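The accumulate-then-flush behaviour described above can be sketched with a toy class. ToyPageBuffer is a hypothetical stand-in, not the Signet Forge implementation; it only assumes that buffered vectors are stored back to back and that flush hands the bytes to the caller and resets the count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy stand-in (not the library class): buffers fixed-width byte vectors
// and releases them as one contiguous page on flush().
class ToyPageBuffer {
public:
    explicit ToyPageBuffer(std::size_t bytes_per_vector)
        : bytes_per_vector_(bytes_per_vector) {
        assert(bytes_per_vector_ > 0);
    }

    // Append exactly bytes_per_vector_ bytes for one vector.
    void add_raw(const std::uint8_t* data) {
        buf_.insert(buf_.end(), data, data + bytes_per_vector_);
        ++count_;
    }

    // Return the buffered bytes and leave the buffer empty for the next page.
    std::vector<std::uint8_t> flush() {
        count_ = 0;
        return std::exchange(buf_, {});
    }

    std::size_t num_vectors() const { return count_; }

private:
    std::size_t bytes_per_vector_;
    std::size_t count_ = 0;
    std::vector<std::uint8_t> buf_;
};
```

In this sketch the returned page is always num_vectors() * bytes_per_vector bytes long, and a second flush with nothing added returns an empty buffer.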

◆ make_descriptor()

ColumnDescriptor signet::forge::QuantizedVectorWriter::make_descriptor ( const std::string &  name,
const QuantizationParams &  params 
)
inline static

Create a ColumnDescriptor suitable for a quantized vector column.

Physical type: FIXED_LEN_BYTE_ARRAY. Logical type: FLOAT32_VECTOR. The quantization metadata is stored separately in key-value metadata.

Parameters
name: Column name.
params: Quantization parameters (used to derive type_length).
Returns
A ColumnDescriptor with type_length = bytes_per_vector().

Definition at line 1279 of file quantized_vector.hpp.
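How bytes_per_vector() is derived from the parameters is not shown on this page. As a rough sketch, assuming the parameters carry a dimension and a bit width per element, the fixed length could be computed by rounding the total bit count up to whole bytes. ToyParams, its fields, and toy_bytes_per_vector are hypothetical names used only for this example; the real QuantizationParams fields may differ.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical parameter struct (not the library's QuantizationParams).
struct ToyParams {
    std::size_t dim;               // elements per vector
    std::size_t bits_per_element;  // e.g. 8 for int8 codes, 4 for int4 codes
};

// Hypothetical derivation of a FIXED_LEN_BYTE_ARRAY type_length:
// total bits per vector, rounded up to a whole number of bytes.
constexpr std::size_t toy_bytes_per_vector(const ToyParams& p) {
    return (p.dim * p.bits_per_element + 7) / 8;
}
```

Under this assumption a 128-dimensional int8 column has a type_length of 128, and the same column at 4 bits per element has a type_length of 64.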

◆ num_vectors()

size_t signet::forge::QuantizedVectorWriter::num_vectors ( ) const
inline

Number of vectors currently buffered.

Definition at line 291 of file quantized_vector.hpp.

◆ params()

const QuantizationParams & signet::forge::QuantizedVectorWriter::params ( ) const
inline

Access the quantization parameters.

Definition at line 306 of file quantized_vector.hpp.


The documentation for this class was generated from the following file: