Signet Forge 0.1.0
C++20 Parquet library with AI-native extensions
signet::forge Namespace Reference

Namespaces

namespace  byte_stream_split
 BYTE_STREAM_SPLIT encoding functions for float and double types.
 
namespace  commercial
 Commercial licensing and evaluation-tier usage enforcement.
 
namespace  crypto
 
namespace  delta
 
namespace  detail
 Internal implementation details for dictionary encoding.
 
namespace  detail_mmap
 
namespace  detail_mmap_reader
 
namespace  detail_reader
 
namespace  dora
 
namespace  eu_ai_act
 
namespace  gdpr
 
namespace  mifid2
 
namespace  regulatory
 
namespace  risk
 
namespace  simd
 Platform-optimized SIMD routines for common vector operations.
 
namespace  thrift
 
namespace  xxhash
 xxHash64 hashing functions for Parquet bloom filter support.
 
namespace  z_order
 Z-order curve (Morton code) utilities for spatial sort keys.
 

Classes

class  Arena
 Bump-pointer arena allocator for batch Parquet reads. More...
 
class  ArrowExporter
 Exports Signet Forge tensors and columns as Arrow C Data Interface structs. More...
 
class  ArrowImporter
 Imports Arrow C Data Interface arrays into Signet TensorView or OwnedTensor. More...
 
struct  Art15Metrics
 Computed accuracy, robustness, and drift metrics per EU AI Act Art.15. More...
 
class  Art15MetricsCalculator
 Computes Art.15 accuracy, robustness, and drift metrics from inference records. More...
 
class  AuditChainVerifier
 Verifies hash chain integrity. More...
 
class  AuditChainWriter
 Builds SHA-256 hash chains during Parquet writes. More...
 
struct  AuditMetadata
 Chain summary stored in Parquet key-value metadata. More...
 
class  BatchTensorBuilder
 Builds a single contiguous 2D tensor from multiple column tensors, suitable for passing to an ML inference engine (ONNX Runtime, etc.). More...
 
struct  BufferInfo
 Simple C-contiguous buffer descriptor for Python interop. More...
 
struct  ClockSyncStatus
 NTP/PTP clock synchronization status for MiFID II RTS 25 Art.3. More...
 
class  CodecRegistry
 Thread-safe singleton registry of compression codec implementations. More...
 
struct  Column
 Typed column descriptor for the Schema::build() variadic API. More...
 
class  ColumnBatch
 A column-major batch of feature rows for ML inference and WAL serialization. More...
 
struct  ColumnDesc
 Describes a single column in a ColumnBatch schema. More...
 
struct  ColumnDescriptor
 Descriptor for a single column in a Parquet schema. More...
 
struct  ColumnFileStats
 Per-column statistics from ParquetReader::file_stats(). More...
 
struct  ColumnIndex
 Per-page min/max statistics for predicate pushdown. More...
 
class  ColumnIndexBuilder
 Builder that accumulates per-page statistics during column writing. More...
 
class  ColumnReader
 PLAIN-encoded Parquet column decoder. More...
 
class  ColumnStatistics
 Per-column-chunk statistics tracker. More...
 
class  ColumnToTensor
 Provides static methods to convert Parquet column data into tensor form. More...
 
class  ColumnWriter
 PLAIN encoding writer for a single Parquet column. More...
 
struct  ColumnWriteStats
 Per-column statistics produced by ParquetWriter::close(). More...
 
struct  ComplianceReport
 The generated compliance report returned to the caller. More...
 
class  CompressionCodec
 Abstract base class for all compression/decompression codecs. More...
 
class  DataClassificationOntology
 A named collection of data classification rules forming a formal ontology. More...
 
struct  DataClassificationRule
 Per-field data classification and handling policy. More...
 
class  DecisionLogReader
 Reads AI decision log Parquet files and verifies hash chain integrity. More...
 
class  DecisionLogWriter
 Writes AI trading decision records to Parquet files with cryptographic hash chaining for tamper-evident audit trails. More...
 
struct  DecisionRecord
 A single AI-driven trading decision with full provenance. More...
 
class  Dequantizer
 Dequantizes INT8/INT4 quantized vectors back to float32. More...
 
class  DictionaryDecoder
 Dictionary decoder for Parquet PLAIN_DICTIONARY / RLE_DICTIONARY encoding. More...
 
class  DictionaryEncoder
 Dictionary encoder for Parquet PLAIN_DICTIONARY / RLE_DICTIONARY encoding. More...
 
struct  DLDataType
 DLPack data type descriptor. More...
 
struct  DLDevice
 DLPack device descriptor (type + ordinal). More...
 
struct  DLManagedTensor
 DLPack managed tensor – the exchange object for from_dlpack(). More...
 
struct  DLTensor
 DLPack tensor descriptor (non-owning). More...
 
struct  DreadScore
 DREAD risk quantification — 5 factors scored 1..10. More...
 
struct  Error
 Lightweight error value carrying an ErrorCode and a human-readable message. More...
 
class  EUAIActReporter
 EU AI Act compliance report generator (Regulation (EU) 2024/1689). More...
 
class  EventBus
 Multi-tier event bus for routing SharedColumnBatch events. More...
 
struct  EventBusOptions
 Configuration options for EventBus. More...
 
class  expected
 A lightweight result type that holds either a success value of type T or an Error. More...
 
class  expected< void >
 Specialization of expected for void — used for operations that return success or error only. More...
 
struct  FeatureGroupDef
 Schema definition for a single feature group. More...
 
class  FeatureReader
 Point-in-time correct ML feature store reader over Parquet files. More...
 
struct  FeatureReaderOptions
 Configuration options for FeatureReader::open(). More...
 
struct  FeatureVector
 A single versioned observation for one entity. More...
 
class  FeatureWriter
 Append-only writer for a single feature group. More...
 
struct  FeatureWriterOptions
 Configuration options for FeatureWriter::create(). More...
 
struct  FileStats
 Aggregate file-level statistics returned by ParquetReader::file_stats(). More...
 
struct  HashChainEntry
 A single link in the cryptographic hash chain. More...
 
class  HumanOverrideLogReader
 Reads human override log Parquet files and verifies hash chain integrity. More...
 
class  HumanOverrideLogWriter
 Writes human override events to Parquet files with cryptographic hash chaining for tamper-evident audit trails. More...
 
struct  HumanOverrideRecord
 A single human oversight event with full provenance. More...
 
struct  HybridQueryOptions
 Per-query filter options passed to HybridReader::read(). More...
 
class  HybridReader
 Reads StreamRecords across historical Parquet files and (optionally) a live StreamingSink ring buffer snapshot. More...
 
struct  HybridReaderOptions
 Options for constructing a HybridReader via HybridReader::create(). More...
 
struct  IncidentPlaybook
 An ordered sequence of response steps for a specific incident type. More...
 
class  IncidentResponseTracker
 Tracks execution progress of a playbook during an active incident. More...
 
class  InferenceLogReader
 Reads ML inference log Parquet files and verifies hash chain integrity. More...
 
class  InferenceLogWriter
 Writes ML inference records to Parquet files with cryptographic hash chaining for tamper-evident audit trails. More...
 
struct  InferenceRecord
 A single ML inference event with full operational metadata. More...
 
class  LogRetentionManager
 Manages log file lifecycle: retention, archival, and deletion. More...
 
class  MappedSegment
 RAII cross-platform memory-mapped file segment. More...
 
class  MiFID2Reporter
 MiFID II RTS 24 algorithmic trading compliance report generator. More...
 
struct  Mitigation
 A specific mitigation control for a threat. More...
 
class  MmapParquetReader
 
class  MmapReader
 Low-level memory-mapped file handle. More...
 
class  MpmcRing
 Lock-free bounded multi-producer multi-consumer ring buffer. More...
 
class  MpscRingBuffer
 Multiple-producer single-consumer (MPSC) bounded ring buffer. More...
 
struct  native_type_of
 Maps a Parquet PhysicalType back to its corresponding C++ native type. More...
 
struct  native_type_of< PhysicalType::BOOLEAN >
 
struct  native_type_of< PhysicalType::BYTE_ARRAY >
 
struct  native_type_of< PhysicalType::DOUBLE >
 
struct  native_type_of< PhysicalType::FLOAT >
 
struct  native_type_of< PhysicalType::INT32 >
 
struct  native_type_of< PhysicalType::INT64 >
 
class  NumpyBridge
 Exports and imports Signet tensors via DLPack, enabling zero-copy interoperability with PyTorch, NumPy, JAX, and other DLPack-aware frameworks. More...
 
struct  OffsetIndex
 Page locations for random access within a column chunk. More...
 
struct  OnnxInputSet
 A set of named ONNX tensors for multi-input model inference. More...
 
struct  OnnxTensorInfo
 Contains all information needed to create an OrtValue externally. More...
 
class  OverrideRateMonitor
 Sliding-window override rate monitor — EU AI Act Art.14(5). More...
 
struct  OverrideRateMonitorOptions
 Options for the override rate monitor. More...
 
class  OwnedTensor
 An owning tensor that manages its own memory via a std::vector<uint8_t> buffer. More...
 
struct  PageLocation
 File offset and size descriptor for a single data page. More...
 
struct  parquet_type_of
 Maps a C++ type to its corresponding Parquet PhysicalType at compile time. More...
 
struct  parquet_type_of< bool >
 
struct  parquet_type_of< double >
 
struct  parquet_type_of< float >
 
struct  parquet_type_of< int32_t >
 
struct  parquet_type_of< int64_t >
 
struct  parquet_type_of< std::string >
 
class  ParquetReader
 Parquet file reader with typed column access and full encoding support. More...
 
class  ParquetWriter
 Streaming Parquet file writer with row-based and column-based APIs. More...
 
class  PlaybookRegistry
 Registry of incident response playbooks indexed by incident type. More...
 
struct  PlaybookStep
 A single step in an incident response playbook. More...
 
struct  QuantizationParams
 Parameters that fully describe a quantization mapping. More...
 
class  QuantizedVectorReader
 Reads quantized page data (FIXED_LEN_BYTE_ARRAY) and dequantizes to float32 on demand. More...
 
class  QuantizedVectorWriter
 Accumulates float32 vectors, quantizes them, and produces FIXED_LEN_BYTE_ARRAY page data suitable for Parquet column chunks. More...
 
class  Quantizer
 Quantizes float32 vectors to INT8 or INT4 representation. More...
 
struct  RegulatoryChange
 A tracked regulatory change record. More...
 
class  RegulatoryChangeMonitor
 Registry and tracker for regulatory changes affecting the system. More...
 
struct  ReportOptions
 Query and formatting parameters for compliance report generation. More...
 
struct  RetentionPolicy
 Retention policy configuration for log lifecycle management. More...
 
struct  RetentionSummary
 Summary of a retention enforcement pass. More...
 
class  RleDecoder
 Streaming decoder for the Parquet RLE/Bit-Packing Hybrid scheme. More...
 
class  RleEncoder
 Streaming encoder for the Parquet RLE/Bit-Packing Hybrid scheme. More...
 
class  RowLineageTracker
 Per-row lineage tracking inspired by Iceberg V3-style data governance. More...
 
class  Schema
 Immutable schema description for a Parquet file. More...
 
class  SchemaBuilder
 Fluent builder for constructing a Schema one column at a time. More...
 
class  SnappyCodec
 Bundled Snappy compression codec (header-only, no external dependency). More...
 
class  SplitBlockBloomFilter
 Parquet-spec Split Block Bloom Filter for probabilistic set membership. More...
 
class  SpscRingBuffer
 Lock-free single-producer single-consumer (SPSC) bounded ring buffer. More...
 
class  StreamingSink
 Background-thread Parquet compaction sink fed by a lock-free ring buffer. More...
 
struct  StreamRecord
 A single record flowing through the StreamingSink pipeline. More...
 
struct  TensorShape
 Describes the shape of a tensor as a vector of dimension sizes. More...
 
class  TensorView
 A lightweight, non-owning view into a contiguous block of typed memory, interpreted as a multi-dimensional tensor. More...
 
struct  ThreatEntry
 A single identified threat in the threat model. More...
 
struct  ThreatModel
 A threat model for a specific component or the entire system. More...
 
struct  ThreatModelAnalysis
 Analysis result from validating a threat model. More...
 
class  ThreatModelAnalyzer
 Validates threat model coverage and produces audit-ready JSON. More...
 
struct  VectorColumnSpec
 Configuration for a vector column: dimensionality and element precision. More...
 
class  VectorReader
 Reads FIXED_LEN_BYTE_ARRAY page data back into float vectors. More...
 
class  VectorWriter
 Buffers float vectors and encodes them as FIXED_LEN_BYTE_ARRAY PLAIN data. More...
 
struct  WalEntry
 A single decoded WAL record returned by WalReader::next() or read_all(). More...
 
class  WalManager
 Manages multiple rolling WAL segment files in a directory. More...
 
struct  WalManagerOptions
 Configuration options for WalManager::open(). More...
 
struct  WalMmapOptions
 Configuration options for WalMmapWriter::open(). More...
 
class  WalMmapWriter
 High-performance WAL writer using a ring of N memory-mapped segments. More...
 
class  WalReader
 Sequential WAL file reader for crash recovery and replay. More...
 
class  WalWriter
 Append-only Write-Ahead Log writer with CRC-32 integrity per record. More...
 
struct  WalWriterOptions
 Configuration options for WalWriter::open(). More...
 
struct  WriterOptions
 Configuration options for ParquetWriter. More...
 
struct  WriteStats
 File-level write statistics returned by ParquetWriter::close(). More...
 

Typedefs

using TDT = TensorDataType
 Convenience alias for TensorDataType (shorter schema declarations).
 
using SharedColumnBatch = std::shared_ptr< ColumnBatch >
 Thread-safe shared pointer to a ColumnBatch – the unit transferred between producer and consumer threads via EventBus.
 
using WalRecord = WalEntry
 Alias so callers can use either WalEntry or WalRecord.
 
template<PhysicalType PT>
using native_type_of_t = typename native_type_of< PT >::type
 Convenience alias: native_type_of_t<PhysicalType::INT64> == int64_t.
 

Enumerations

enum class  ReportFormat { JSON , NDJSON , CSV }
 Output serialization format for compliance reports. More...
 
enum class  TimestampGranularity { NANOS , MICROS , MILLIS }
 Timestamp granularity for MiFID II RTS 24 Art.2(2) compliance. More...
 
enum class  ComplianceStandard { MIFID2_RTS24 , EU_AI_ACT_ART12 , EU_AI_ACT_ART13 , EU_AI_ACT_ART19 }
 Which regulatory standard a compliance report satisfies. More...
 
enum class  DataClassification : int32_t { PUBLIC = 0 , INTERNAL = 1 , RESTRICTED = 2 , HIGHLY_RESTRICTED = 3 }
 Data confidentiality level per DORA Art.8 + ISO 27001 Annex A. More...
 
enum class  DataSensitivity : int32_t {
  NEUTRAL = 0 , PSEUDONYMISED = 1 , ANONYMISED = 2 , PII = 3 ,
  FINANCIAL_PII = 4 , BIOMETRIC = 5 , HEALTH = 6
}
 Data sensitivity per GDPR Art.9 special categories. More...
 
enum class  RegulatoryRegime : int32_t {
  NONE = 0 , GDPR = 1 , MIFID2 = 2 , DORA = 3 ,
  EU_AI_ACT = 4 , SOX = 5 , SEC_17A4 = 6 , PCI_DSS = 7 ,
  HIPAA = 8
}
 Regulatory regime(s) applicable to the data. More...
 
enum class  DecisionType : int32_t {
  SIGNAL = 0 , ORDER_NEW = 1 , ORDER_CANCEL = 2 , ORDER_MODIFY = 3 ,
  POSITION_OPEN = 4 , POSITION_CLOSE = 5 , RISK_OVERRIDE = 6 , NO_ACTION = 7
}
 Classification of the AI-driven trading decision. More...
 
enum class  RiskGateResult : int32_t { PASSED = 0 , REJECTED = 1 , MODIFIED = 2 , THROTTLED = 3 }
 Outcome of the pre-trade risk gate evaluation. More...
 
enum class  OrderType : int32_t {
  MARKET = 0 , LIMIT = 1 , STOP = 2 , STOP_LIMIT = 3 ,
  PEGGED = 4 , OTHER = 99
}
 Order type classification for MiFID II RTS 24 Annex I Table 2 Field 7. More...
 
enum class  TimeInForce : int32_t {
  DAY = 0 , GTC = 1 , IOC = 2 , FOK = 3 ,
  GTD = 4 , OTHER = 99
}
 Time-in-force classification for MiFID II RTS 24 Annex I Table 2 Field 8. More...
 
enum class  BuySellIndicator : int32_t { BUY = 0 , SELL = 1 , SHORT_SELL = 2 }
 Buy/sell direction for MiFID II RTS 24 Annex I Table 2 Field 6. More...
 
enum class  OverrideSource : int32_t { ALGORITHMIC = 0 , HUMAN = 1 , AUTOMATED = 2 }
 Source of a decision or override — EU AI Act Art.14(4). More...
 
enum class  OverrideAction : int32_t {
  APPROVE = 0 , MODIFY = 1 , REJECT = 2 , ESCALATE = 3 ,
  HALT = 4
}
 What action the human override took — EU AI Act Art.14(4). More...
 
enum class  HaltReason : int32_t {
  MANUAL = 0 , SAFETY_THRESHOLD = 1 , ANOMALY_DETECTED = 2 , REGULATORY = 3 ,
  MAINTENANCE = 4 , EXTERNAL = 5
}
 Reason for system halt — EU AI Act Art.14(4) "stop button". More...
 
enum class  IncidentPhase : int32_t {
  PREPARATION = 0 , DETECTION = 1 , CONTAINMENT = 2 , ERADICATION = 3 ,
  RECOVERY = 4 , LESSONS_LEARNED = 5
}
 NIST SP 800-61 incident response lifecycle phases. More...
 
enum class  IncidentSeverity : int32_t { P4_LOW = 0 , P3_MEDIUM = 1 , P2_HIGH = 2 , P1_CRITICAL = 3 }
 Incident severity per DORA Art.10(1) classification. More...
 
enum class  EscalationLevel : int32_t { L1_OPERATIONS = 0 , L2_ENGINEERING = 1 , L3_MANAGEMENT = 2 , L4_REGULATORY = 3 }
 Escalation hierarchy for incident routing. More...
 
enum class  NotificationChannel : int32_t { INTERNAL_LOG = 0 , EMAIL = 1 , PAGER = 2 , REGULATORY = 3 }
 Notification channel for incident communications. More...
 
enum class  InferenceType : int32_t {
  CLASSIFICATION = 0 , REGRESSION = 1 , EMBEDDING = 2 , GENERATION = 3 ,
  RANKING = 4 , ANOMALY = 5 , RECOMMENDATION = 6 , CUSTOM = 255
}
 Classification of the ML inference operation. More...
 
enum class  QuantizationScheme : int32_t { SYMMETRIC_INT8 = 0 , ASYMMETRIC_INT8 = 1 , SYMMETRIC_INT4 = 2 }
 Identifies the quantization method used for vector compression. More...
 
enum class  RegulatoryChangeType : int32_t {
  NEW_REGULATION = 0 , AMENDMENT = 1 , GUIDANCE = 2 , TECHNICAL_STANDARD = 3 ,
  ENFORCEMENT = 4 , DEPRECATION = 5
}
 Type of regulatory change being tracked. More...
 
enum class  RegulatoryImpact : int32_t {
  NONE = 0 , INFORMATIONAL = 1 , LOW = 2 , MEDIUM = 3 ,
  HIGH = 4 , CRITICAL = 5
}
 Impact level of a regulatory change on the system. More...
 
enum class  ChangeComplianceStatus : int32_t {
  NOT_ASSESSED = 0 , ASSESSED = 1 , IN_PROGRESS = 2 , IMPLEMENTED = 3 ,
  VERIFIED = 4 , NOT_APPLICABLE = 5
}
 Compliance status for a tracked regulatory change. More...
 
enum class  TensorDataType : int32_t {
  FLOAT32 = 0 , FLOAT64 = 1 , INT32 = 2 , INT64 = 3 ,
  INT8 = 4 , UINT8 = 5 , INT16 = 6 , FLOAT16 = 7 ,
  BOOL = 8
}
 Element data type for tensor storage, mapping to ONNX/PyTorch/TF type enums. More...
 
enum class  StrideCategory : int32_t {
  SPOOFING = 0 , TAMPERING = 1 , REPUDIATION = 2 , INFORMATION_DISCLOSURE = 3 ,
  DENIAL_OF_SERVICE = 4 , ELEVATION_OF_PRIVILEGE = 5
}
 Microsoft STRIDE threat categories. More...
 
enum class  ThreatSeverity : int32_t { LOW = 0 , MEDIUM = 1 , HIGH = 2 , CRITICAL = 3 }
 Threat severity classification per NIST SP 800-30. More...
 
enum class  MitigationStatus : int32_t {
  NOT_MITIGATED = 0 , PARTIAL = 1 , MITIGATED = 2 , ACCEPTED = 3 ,
  TRANSFERRED = 4
}
 Mitigation status for a threat. More...
 
enum class  VectorElementType : int32_t { FLOAT32 = 0 , FLOAT64 = 1 , FLOAT16 = 2 }
 Specifies the numerical precision of each element within a vector column. More...
 
enum class  WalLifecycleMode : uint8_t { Development = 0 , Benchmark = 1 , Production = 2 }
 Controls safety guardrails for WAL segment lifecycle operations. More...
 
enum class  ErrorCode {
  OK = 0 , IO_ERROR , INVALID_FILE , CORRUPT_FOOTER ,
  CORRUPT_PAGE , CORRUPT_DATA , INVALID_ARGUMENT , UNSUPPORTED_ENCODING ,
  UNSUPPORTED_COMPRESSION , UNSUPPORTED_TYPE , SCHEMA_MISMATCH , OUT_OF_RANGE ,
  THRIFT_DECODE_ERROR , ENCRYPTION_ERROR , HASH_CHAIN_BROKEN , LICENSE_ERROR ,
  LICENSE_LIMIT_EXCEEDED , INTERNAL_ERROR
}
 Error codes returned by all Signet Forge operations. More...
 
enum class  OnnxTensorType : int32_t {
  UNDEFINED = 0 , FLOAT = 1 , UINT8 = 2 , INT8 = 3 ,
  UINT16 = 4 , INT16 = 5 , INT32 = 6 , INT64 = 7 ,
  STRING = 8 , BOOL = 9 , FLOAT16 = 10 , DOUBLE = 11 ,
  UINT32 = 12 , UINT64 = 13 , BFLOAT16 = 16
}
 ONNX tensor element data types, mirroring OrtTensorElementDataType. More...
 
enum class  PhysicalType : int32_t {
  BOOLEAN = 0 , INT32 = 1 , INT64 = 2 , INT96 = 3 ,
  FLOAT = 4 , DOUBLE = 5 , BYTE_ARRAY = 6 , FIXED_LEN_BYTE_ARRAY = 7
}
 Parquet physical (storage) types as defined in parquet.thrift. More...
 
enum class  LogicalType : int32_t {
  NONE = 0 , STRING = 1 , ENUM = 2 , UUID = 3 ,
  DATE = 4 , TIME_MS = 5 , TIME_US = 6 , TIME_NS = 7 ,
  TIMESTAMP_MS = 8 , TIMESTAMP_US = 9 , TIMESTAMP_NS = 10 , DECIMAL = 11 ,
  JSON = 12 , BSON = 13 , FLOAT16 = 14 , FLOAT32_VECTOR = 100
}
 Parquet logical types (from parquet.thrift LogicalType union). More...
 
enum class  ConvertedType : int32_t {
  NONE = -1 , UTF8 = 0 , MAP = 1 , MAP_KEY_VALUE = 2 ,
  LIST = 3 , ENUM = 4 , DECIMAL = 5 , DATE = 6 ,
  TIME_MILLIS = 7 , TIME_MICROS = 8 , TIMESTAMP_MILLIS = 9 , TIMESTAMP_MICROS = 10 ,
  UINT_8 = 11 , UINT_16 = 12 , UINT_32 = 13 , UINT_64 = 14 ,
  INT_8 = 15 , INT_16 = 16 , INT_32 = 17 , INT_64 = 18 ,
  JSON = 19 , BSON = 20 , INTERVAL = 21
}
 Legacy Parquet converted types for backward compatibility with older readers. More...
 
enum class  Encoding : int32_t {
  PLAIN = 0 , PLAIN_DICTIONARY = 2 , RLE = 3 , BIT_PACKED = 4 ,
  DELTA_BINARY_PACKED = 5 , DELTA_LENGTH_BYTE_ARRAY = 6 , DELTA_BYTE_ARRAY = 7 , RLE_DICTIONARY = 8 ,
  BYTE_STREAM_SPLIT = 9
}
 Parquet page encoding types. More...
 
enum class  Compression : int32_t {
  UNCOMPRESSED = 0 , SNAPPY = 1 , GZIP = 2 , LZO = 3 ,
  BROTLI = 4 , LZ4 = 5 , ZSTD = 6 , LZ4_RAW = 7
}
 Parquet compression codecs. More...
 
enum class  PageType : int32_t { DATA_PAGE = 0 , INDEX_PAGE = 1 , DICTIONARY_PAGE = 2 , DATA_PAGE_V2 = 3 }
 Parquet page types within a column chunk. More...
 
enum class  Repetition : int32_t { REQUIRED = 0 , OPTIONAL = 1 , REPEATED = 2 }
 Parquet field repetition types (nullability / cardinality). More...
 
DLPack type definitions (matching dlpack.h v0.8)

Self-contained DLPack struct definitions for zero-dependency interop.

enum class  DLDeviceType : int32_t {
  kDLCPU = 1 , kDLCUDA = 2 , kDLCUDAHost = 3 , kDLROCM = 10 ,
  kDLMetal = 8 , kDLVulkan = 7
}
 DLPack device type, matching DLDeviceType from dlpack.h. More...
 
enum class  DLDataTypeCode : uint8_t { kDLInt = 0 , kDLUInt = 1 , kDLFloat = 2 , kDLBfloat = 4 }
 DLPack data type code, matching DLDataTypeCode from dlpack.h. More...
 

Functions

int64_t now_ns ()
 Return the current time as nanoseconds since the Unix epoch (UTC).
 
std::string hash_to_hex (const std::array< uint8_t, 32 > &hash)
 Convert a 32-byte SHA-256 hash to a lowercase hexadecimal string (64 chars).
 
expected< std::array< uint8_t, 32 > > hex_to_hash (const std::string &hex)
 Convert a 64-character lowercase hex string back to a 32-byte hash.
 
std::string generate_chain_id ()
 Generate a simple chain identifier based on the current timestamp.
 
expected< AuditMetadata > build_audit_metadata (const AuditChainWriter &writer, const std::string &chain_id)
 Build an AuditMetadata from a populated AuditChainWriter.
 
expected< std::vector< HashChainEntry > > deserialize_and_verify_chain (const uint8_t *chain_data, size_t chain_size)
 Deserialize and verify a chain from serialized bytes in one call.
 
SharedColumnBatch make_column_batch (std::vector< ColumnDesc > schema, size_t reserve_rows=64)
 Convenience factory: create a shared batch with a given schema.
 
Schema decision_log_schema ()
 Build the Parquet schema for AI decision log files.
 
Schema human_override_log_schema ()
 Build the Parquet schema for human override log files.
 
Schema inference_log_schema ()
 Build the Parquet schema for ML inference log files.
 
constexpr size_t tensor_element_size (TensorDataType dtype) noexcept
 Returns the byte size of a single element of the given tensor data type.
 
const char * tensor_dtype_name (TensorDataType dtype) noexcept
 Returns a human-readable name for a TensorDataType.
 
float f16_to_f32 (uint16_t h) noexcept
 Convert a 16-bit IEEE 754 half-precision value to a 32-bit float.
 
uint16_t f32_to_f16 (float val) noexcept
 Convert a 32-bit float to a 16-bit IEEE 754 half-precision value.
 
SchemaBuilder & add_vector_column (SchemaBuilder &builder, const std::string &name, uint32_t dimension, VectorElementType elem=VectorElementType::FLOAT32)
 Add a vector column to a SchemaBuilder.
 
void append_le32 (std::vector< uint8_t > &buf, uint32_t val)
 Append a uint32_t in little-endian byte order to a byte buffer.
 
void append_le64 (std::vector< uint8_t > &buf, uint64_t val)
 Append a uint64_t in little-endian byte order to a byte buffer.
 
expected< std::vector< uint8_t > > compress (Compression codec, const uint8_t *data, size_t size)
 Compress data using the specified codec via the global CodecRegistry.
 
expected< std::vector< uint8_t > > decompress (Compression codec, const uint8_t *data, size_t size, size_t uncompressed_size)
 Decompress data using the specified codec via the global CodecRegistry.
 
Compression auto_select_compression (const uint8_t *sample_data, size_t sample_size)
 Automatically select the best available compression codec.
 
void register_snappy_codec ()
 Register the bundled Snappy codec with the global CodecRegistry.
 
size_t encode_varint (std::vector< uint8_t > &buf, uint64_t value)
 Encode an unsigned varint (LEB128) into a byte buffer.
 
uint64_t decode_varint (const uint8_t *data, size_t &pos, size_t size)
 Decode an unsigned varint (LEB128) from a byte buffer.
 
void bit_pack_8 (std::vector< uint8_t > &out, const uint64_t *values, int bit_width)
 Pack exactly 8 values at the given bit width into a byte buffer.
 
void bit_unpack_8 (const uint8_t *src, uint64_t *values, int bit_width)
 Unpack exactly 8 values at the given bit width from a byte buffer.
 
expected< BufferInfo > to_buffer_info (const TensorView &tensor)
 Create a BufferInfo from a TensorView for Python buffer protocol export.
 
expected< OnnxInputSet > prepare_inputs_for_onnx (const std::vector< std::pair< std::string, TensorView > > &inputs)
 Prepare a batch of named TensorViews for ONNX Runtime inference.
 
const char * onnx_type_name (OnnxTensorType t)
 Return a human-readable string for an OnnxTensorType value.
 
expected< size_t > validate_mmap_page_value_count (int64_t num_values, const char *context)
 
expected< size_t > validate_page_value_count (int64_t num_values, const char *context)
 
bool has_encrypted_page_header_prefix (const uint8_t *data, size_t size) noexcept
 
uint32_t load_le32 (const uint8_t *data) noexcept
 
template<typename T >
std::vector< uint8_t > to_le_bytes (T value)
 Convert an arithmetic value to its little-endian byte representation.
 
std::vector< uint8_t > to_le_bytes (const std::string &value)
 Overload for std::string – returns raw bytes (no endian conversion needed).
 
template<typename T >
from_le_bytes (const std::vector< uint8_t > &bytes)
 Reconstruct an arithmetic value from its little-endian byte representation.
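
The encode_varint and decode_varint entries above describe unsigned LEB128, where each byte carries seven payload bits and the high bit signals that more bytes follow. A minimal sketch of that scheme is shown here. The sketch namespace and the function bodies are assumptions for illustration; only the signatures are taken from the listing, and the library's handling of truncated or overlong input may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for encode_varint / decode_varint. The signatures
// mirror the listing; the bodies are an illustrative assumption.
namespace sketch {

// Appends `value` as unsigned LEB128 and returns the number of bytes written.
inline std::size_t encode_varint(std::vector<std::uint8_t>& buf, std::uint64_t value) {
    std::size_t written = 0;
    while (value >= 0x80) {
        // Low 7 bits plus a continuation flag in the high bit.
        buf.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
        ++written;
    }
    buf.push_back(static_cast<std::uint8_t>(value));  // final byte, high bit clear
    return written + 1;
}

// Decodes one unsigned LEB128 value starting at data[pos] and advances pos.
// Stops at the end of the buffer or after 10 bytes (the maximum for 64 bits).
inline std::uint64_t decode_varint(const std::uint8_t* data, std::size_t& pos, std::size_t size) {
    std::uint64_t result = 0;
    int shift = 0;
    while (pos < size && shift < 64) {
        const std::uint8_t byte = data[pos++];
        result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) break;  // no continuation bit: value complete
        shift += 7;
    }
    return result;
}

}  // namespace sketch
```

Under this scheme the value 300 occupies two bytes, 0xAC followed by 0x02, and values below 128 occupy a single byte.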
 
Format string mappings

Conversion functions between Parquet/Tensor types and Arrow format strings.

const char * parquet_to_arrow_format (PhysicalType pt)
 Map a Parquet PhysicalType to an Arrow format string.
 
const char * tensor_dtype_to_arrow_format (TensorDataType dtype)
 Map a TensorDataType to an Arrow format string.
 
expected< TensorDataType > arrow_format_to_tensor_dtype (const char *format)
 Map an Arrow format string to a TensorDataType.
 
expected< TensorDataType > physical_to_tensor_dtype (PhysicalType pt)
 Map a PhysicalType to a TensorDataType (for column export).
 
size_t physical_type_byte_size (PhysicalType pt)
 Return the byte size for a PhysicalType (primitive types only).
 
Type conversion: TensorDataType <-> DLDataType
DLDataType to_dlpack_dtype (TensorDataType dtype)
 Convert a Signet TensorDataType to a DLPack DLDataType.
 
expected< TensorDataType > from_dlpack_dtype (DLDataType dl_dtype)
 Convert a DLPack DLDataType back to a Signet TensorDataType.
 
Type conversion: TensorDataType <-> OnnxTensorType
OnnxTensorType to_onnx_type (TensorDataType dtype)
 Convert a Signet TensorDataType to the corresponding OnnxTensorType.
 
expected< TensorDataType > from_onnx_type (OnnxTensorType ort_type)
 Convert an OnnxTensorType back to a Signet TensorDataType.
 
Zero-copy tensor export for ONNX Runtime
expected< OnnxTensorInfo > prepare_for_onnx (const TensorView &tensor)
 Prepare a TensorView for ONNX Runtime consumption (zero-copy).
 
expected< OnnxTensorInfo > prepare_for_onnx (const OwnedTensor &tensor)
 Prepare an OwnedTensor for ONNX Runtime consumption (zero-copy).
 

Variables

constexpr size_t HASH_CHAIN_ENTRY_SIZE = 112
 Chain summary stored in Parquet key-value metadata.
 
constexpr uint8_t kEncryptedPageHeaderMagic [4] = {'S', 'P', 'H', '1'}
 
template<typename T >
constexpr PhysicalType parquet_type_of_v = parquet_type_of<T>::value
 Convenience variable template: parquet_type_of_v<double> == PhysicalType::DOUBLE.
 
constexpr int32_t PARQUET_VERSION = 2
 Parquet format version written to the file footer.
 
constexpr const char * SIGNET_CREATED_BY = "SignetStack signet-forge version 0.1.0"
 Default "created_by" string embedded in every Parquet footer.
 
constexpr uint32_t PARQUET_MAGIC = 0x31524150
 "PAR1" magic bytes (little-endian uint32) — marks a standard Parquet file.
 
constexpr uint32_t PARQUET_MAGIC_ENCRYPTED = 0x45524150
 "PARE" magic bytes (little-endian uint32) — marks a Parquet file with an encrypted footer.
 
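
The two magic constants above are the ASCII tags "PAR1" and "PARE" read as little-endian 32-bit integers, so the first byte in the file becomes the least significant byte of the constant. The following sketch checks that arithmetic at compile time. The magic_le helper and the sketch namespace are hypothetical and not part of the library.

```cpp
#include <cstdint>

namespace sketch {

// Packs four ASCII characters into a uint32 the way a little-endian load
// would see them: the first character is the least significant byte.
// Hypothetical helper for illustration only.
constexpr std::uint32_t magic_le(char a, char b, char c, char d) {
    return static_cast<std::uint32_t>(static_cast<unsigned char>(a))
         | static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8
         | static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16
         | static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24;
}

// 'P' = 0x50, 'A' = 0x41, 'R' = 0x52, '1' = 0x31, 'E' = 0x45.
static_assert(magic_le('P', 'A', 'R', '1') == 0x31524150u, "PAR1 magic");
static_assert(magic_le('P', 'A', 'R', 'E') == 0x45524150u, "PARE magic");

}  // namespace sketch
```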

Typedef Documentation

◆ native_type_of_t

template<PhysicalType PT>
using signet::forge::native_type_of_t = typename native_type_of<PT>::type

Convenience alias: native_type_of_t<PhysicalType::INT64> == int64_t.

Definition at line 198 of file types.hpp.
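
A trait of this kind is usually built from one explicit specialization per enumerator plus an alias template. The sketch below shows that pattern under stated assumptions: the enum is a local copy of the PhysicalType values listed on this page, and every mapping other than INT64 to int64_t (in particular BYTE_ARRAY to std::string) is a guess based on the parquet_type_of specializations, not a confirmed definition from types.hpp.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

namespace sketch {

// Local copy of the relevant PhysicalType enumerators (values from this page).
enum class PhysicalType : std::int32_t {
    BOOLEAN = 0, INT32 = 1, INT64 = 2, FLOAT = 4, DOUBLE = 5, BYTE_ARRAY = 6
};

// Primary template is left undefined so unsupported types fail to compile.
template <PhysicalType PT> struct native_type_of;

template <> struct native_type_of<PhysicalType::BOOLEAN>    { using type = bool; };
template <> struct native_type_of<PhysicalType::INT32>      { using type = std::int32_t; };
template <> struct native_type_of<PhysicalType::INT64>      { using type = std::int64_t; };
template <> struct native_type_of<PhysicalType::FLOAT>      { using type = float; };
template <> struct native_type_of<PhysicalType::DOUBLE>     { using type = double; };
// Assumed mapping: mirrors parquet_type_of< std::string >.
template <> struct native_type_of<PhysicalType::BYTE_ARRAY> { using type = std::string; };

// Alias template that strips the `typename ...::type` boilerplate.
template <PhysicalType PT>
using native_type_of_t = typename native_type_of<PT>::type;

static_assert(std::is_same<native_type_of_t<PhysicalType::INT64>, std::int64_t>::value,
              "INT64 maps to int64_t");

}  // namespace sketch
```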

◆ SharedColumnBatch

using signet::forge::SharedColumnBatch = std::shared_ptr<ColumnBatch>

Thread-safe shared pointer to a ColumnBatch – the unit transferred between producer and consumer threads via EventBus.

Definition at line 400 of file column_batch.hpp.

◆ TDT

Convenience alias for TensorDataType (shorter schema declarations).

Definition at line 46 of file column_batch.hpp.

◆ WalRecord

Alias so callers can use either WalEntry or WalRecord.

Definition at line 207 of file wal.hpp.

Enumeration Type Documentation

◆ BuySellIndicator

enum class signet::forge::BuySellIndicator : int32_t
strong

Buy/sell direction for MiFID II RTS 24 Annex I Table 2 Field 6.

Enumerator
BUY 
SELL 
SHORT_SELL 

Short selling (RTS 24 Annex I Field 16)

Definition at line 90 of file decision_log.hpp.

◆ ChangeComplianceStatus

enum class signet::forge::ChangeComplianceStatus : int32_t
strong

Compliance status for a tracked regulatory change.

Enumerator
NOT_ASSESSED 

Impact assessment not yet performed.

ASSESSED 

Impact assessed, action plan pending.

IN_PROGRESS 

Implementation underway.

IMPLEMENTED 

Changes implemented.

VERIFIED 

Compliance verified by review/testing.

NOT_APPLICABLE 

Change does not apply to this system.

Definition at line 65 of file regulatory_monitor.hpp.

◆ ComplianceStandard

Which regulatory standard a compliance report satisfies.

Enumerator
MIFID2_RTS24 

MiFID II RTS 24 — algorithmic trading records.

EU_AI_ACT_ART12 

EU AI Act Article 12 — operational logging.

EU_AI_ACT_ART13 

EU AI Act Article 13 — transparency disclosure.

EU_AI_ACT_ART19 

EU AI Act Article 19 — conformity assessment summary.

Definition at line 43 of file compliance_types.hpp.

◆ Compression

enum class signet::forge::Compression : int32_t
strong

Parquet compression codecs.

Snappy is bundled (header-only); ZSTD, LZ4, and Gzip require linking external libraries enabled via CMake options.

See also
WriterOptions::compression, WriterOptions::auto_compression
Enumerator
UNCOMPRESSED 

No compression.

SNAPPY 

Snappy compression (bundled, header-only).

GZIP 

Gzip/deflate compression (requires SIGNET_ENABLE_GZIP).

LZO 

LZO compression (not currently supported).

BROTLI 

Brotli compression (not currently supported).

LZ4 

LZ4 block compression (requires SIGNET_ENABLE_LZ4).

ZSTD 

Zstandard compression (requires SIGNET_ENABLE_ZSTD).

LZ4_RAW 

LZ4 raw (unframed) block compression.

Definition at line 115 of file types.hpp.

◆ ConvertedType

enum class signet::forge::ConvertedType : int32_t
strong

Legacy Parquet converted types for backward compatibility with older readers.

Prefer LogicalType for new code. ConvertedType is written to the Thrift footer only when a corresponding LogicalType mapping exists (e.g. STRING → UTF8).

See also
LogicalType
Enumerator
NONE 

No converted type annotation.

UTF8 

UTF-8 encoded string.

MAP 

Map (nested group).

MAP_KEY_VALUE 

Map key-value pair.

LIST 

List (nested group).

ENUM 

Enum string.

DECIMAL 

Fixed-point decimal.

DATE 

Date (days since epoch).

TIME_MILLIS 

Time in milliseconds.

TIME_MICROS 

Time in microseconds.

TIMESTAMP_MILLIS 

Timestamp in milliseconds.

TIMESTAMP_MICROS 

Timestamp in microseconds.

UINT_8 

Unsigned 8-bit integer.

UINT_16 

Unsigned 16-bit integer.

UINT_32 

Unsigned 32-bit integer.

UINT_64 

Unsigned 64-bit integer.

INT_8 

Signed 8-bit integer.

INT_16 

Signed 16-bit integer.

INT_32 

Signed 32-bit integer.

INT_64 

Signed 64-bit integer.

JSON 

JSON document.

BSON 

BSON document.

INTERVAL 

Time interval.

Definition at line 67 of file types.hpp.

◆ DataClassification

enum class signet::forge::DataClassification : int32_t
strong

Data confidentiality level per DORA Art.8 + ISO 27001 Annex A.

Enumerator
PUBLIC 

No confidentiality requirement.

INTERNAL 

Business-internal, not for external sharing.

RESTRICTED 

Regulated data (GDPR, FCA, MiFID II)

HIGHLY_RESTRICTED 

Cryptographic keys, trading secrets, PII.

Definition at line 46 of file data_classification.hpp.

◆ DataSensitivity

enum class signet::forge::DataSensitivity : int32_t
strong

Data sensitivity per GDPR Art.9 special categories.

Enumerator
NEUTRAL 

No special sensitivity.

PSEUDONYMISED 

Identifiable only with additional key (Art.25)

ANONYMISED 

Irreversibly de-identified (Art.4(1))

PII 

Personally Identifiable Information.

FINANCIAL_PII 

Financial account data, trading activity.

BIOMETRIC 

Biometric data (Art.9 special category)

HEALTH 

Health/genetic data (Art.9 special category)

Definition at line 54 of file data_classification.hpp.

◆ DecisionType

enum class signet::forge::DecisionType : int32_t
strong

Classification of the AI-driven trading decision.

Covers the full lifecycle of order management decisions that must be logged under MiFID II RTS 24 and EU AI Act Article 12.

Enumerator
SIGNAL 

Raw model signal/prediction.

ORDER_NEW 

Decision to submit a new order.

ORDER_CANCEL 

Decision to cancel an existing order.

ORDER_MODIFY 

Decision to modify an existing order.

POSITION_OPEN 

Decision to open a position.

POSITION_CLOSE 

Decision to close a position.

RISK_OVERRIDE 

Risk gate override/rejection.

NO_ACTION 

Model evaluated but no action taken.

Definition at line 48 of file decision_log.hpp.

◆ DLDataTypeCode

enum class signet::forge::DLDataTypeCode : uint8_t
strong

DLPack data type code, matching DLDataTypeCode from dlpack.h.

Enumerator
kDLInt 

Signed integer.

kDLUInt 

Unsigned integer.

kDLFloat 

IEEE floating point.

kDLBfloat 

Brain floating point (bfloat16)

Definition at line 50 of file numpy_bridge.hpp.

◆ DLDeviceType

enum class signet::forge::DLDeviceType : int32_t
strong

DLPack device type, matching DLDeviceType from dlpack.h.

Only kDLCPU and kDLCUDAHost are supported for import by NumpyBridge. Other device types are defined for completeness and forward compatibility.

Enumerator
kDLCPU 

System main memory.

kDLCUDA 

NVIDIA CUDA GPU memory.

kDLCUDAHost 

CUDA pinned host memory.

kDLROCM 

AMD ROCm GPU memory.

kDLMetal 

Apple Metal GPU memory.

kDLVulkan 

Vulkan GPU memory.

Definition at line 40 of file numpy_bridge.hpp.

◆ Encoding

enum class signet::forge::Encoding : int32_t
strong

Parquet page encoding types.

Each data page stores values using one of these encodings. The writer selects encoding per-column (or auto-selects based on data characteristics).

See also
WriterOptions::default_encoding, WriterOptions::auto_encoding
Enumerator
PLAIN 

Values stored back-to-back in their native binary layout.

PLAIN_DICTIONARY 

Legacy dictionary encoding (Parquet 1.0).

RLE 

Run-length / bit-packed hybrid (used for booleans and def/rep levels).

BIT_PACKED 

Deprecated — superseded by RLE.

DELTA_BINARY_PACKED 

Delta encoding for INT32/INT64 (compact for sorted/sequential data).

DELTA_LENGTH_BYTE_ARRAY 

Delta-encoded lengths + concatenated byte arrays.

DELTA_BYTE_ARRAY 

Incremental/prefix encoding for byte arrays.

RLE_DICTIONARY 

Modern dictionary encoding (Parquet 2.0) — dict page + RLE indices.

BYTE_STREAM_SPLIT 

Byte-stream split for FLOAT/DOUBLE (transposes byte lanes for better compression).

Definition at line 98 of file types.hpp.
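
The byte-lane transposition that BYTE_STREAM_SPLIT performs can be illustrated with a short standalone sketch. It uses only the standard library. The names byte_stream_split_sketch and byte_stream_merge_sketch are hypothetical; they are not the functions of the signet::forge::byte_stream_split namespace, whose signatures are not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper, not the library's API: writes byte k of every float
// into stream k, so the output holds 4 consecutive streams of n bytes each.
std::vector<uint8_t> byte_stream_split_sketch(const std::vector<float>& values) {
    const std::size_t n = values.size();
    std::vector<uint8_t> out(n * sizeof(float));
    for (std::size_t i = 0; i < n; ++i) {
        uint8_t raw[sizeof(float)];
        std::memcpy(raw, &values[i], sizeof(float));
        for (std::size_t k = 0; k < sizeof(float); ++k) {
            out[k * n + i] = raw[k];
        }
    }
    return out;
}

// Inverse transposition: gathers byte k of value i from stream k.
std::vector<float> byte_stream_merge_sketch(const std::vector<uint8_t>& encoded) {
    assert(encoded.size() % sizeof(float) == 0);
    const std::size_t n = encoded.size() / sizeof(float);
    std::vector<float> values(n);
    for (std::size_t i = 0; i < n; ++i) {
        uint8_t raw[sizeof(float)];
        for (std::size_t k = 0; k < sizeof(float); ++k) {
            raw[k] = encoded[k * n + i];
        }
        std::memcpy(&values[i], raw, sizeof(float));
    }
    return values;
}
```

Grouping same-position bytes together tends to place the slowly varying sign and exponent bytes next to each other, which is why a general-purpose codec applied afterwards often compresses the result better.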

◆ ErrorCode

enum class signet::forge::ErrorCode
strong

Error codes returned by all Signet Forge operations.

Every function in the library that can fail returns an expected<T> whose error payload carries one of these codes together with a human-readable message string. Codes are grouped by subsystem so that callers can pattern-match on categories (I/O, format corruption, unsupported features, licensing) without inspecting the message text.

Enumerator
OK 

Operation completed successfully (no error).

IO_ERROR 

A file-system or stream I/O operation failed (open, read, write, rename).

INVALID_FILE 

The file is not a valid Parquet file (e.g. missing or wrong magic bytes).

CORRUPT_FOOTER 

The Parquet footer (FileMetaData) is missing, truncated, or malformed.

CORRUPT_PAGE 

A data page failed integrity checks (bad CRC, truncated, or exceeds size limits).

CORRUPT_DATA 

Decoded data is corrupt or inconsistent (e.g. out-of-range dictionary index).

INVALID_ARGUMENT 

A caller-supplied argument is outside the valid range or violates a precondition.

UNSUPPORTED_ENCODING 

The file uses an encoding not supported by this build (e.g. BYTE_STREAM_SPLIT on integers).

UNSUPPORTED_COMPRESSION 

The file uses a compression codec not linked into this build (ZSTD, LZ4, Gzip).

UNSUPPORTED_TYPE 

The file contains a Parquet physical or logical type that is not implemented.

SCHEMA_MISMATCH 

The requested column name or type does not match the file schema.

OUT_OF_RANGE 

An index, offset, or size value is outside the valid range.

THRIFT_DECODE_ERROR 

The Thrift Compact Protocol decoder encountered invalid or malicious input.

ENCRYPTION_ERROR 

An encryption or decryption operation failed (bad key, tampered ciphertext, PME error).

HASH_CHAIN_BROKEN 

The cryptographic audit hash chain is broken, indicating data tampering.

LICENSE_ERROR 

The commercial license is missing, invalid, or the build is misconfigured.

LICENSE_LIMIT_EXCEEDED 

An evaluation-tier usage limit has been exceeded (rows, users, nodes, or time).

INTERNAL_ERROR 

An unexpected internal error that does not fit any other category.

Definition at line 49 of file error.hpp.

◆ EscalationLevel

enum class signet::forge::EscalationLevel : int32_t
strong

Escalation hierarchy for incident routing.

Enumerator
L1_OPERATIONS 

First-line operations team.

L2_ENGINEERING 

Engineering / DevOps on-call.

L3_MANAGEMENT 

Management / CISO notification.

L4_REGULATORY 

Regulatory authority notification (DORA Art.19)

Definition at line 64 of file incident_response.hpp.

◆ HaltReason

enum class signet::forge::HaltReason : int32_t
strong

Reason for system halt — EU AI Act Art.14(4) "stop button".

Enumerator
MANUAL 

Operator manually halted the system.

SAFETY_THRESHOLD 

Override rate exceeded safety threshold.

ANOMALY_DETECTED 

Anomalous behavior detected.

REGULATORY 

Regulatory or compliance-driven halt.

MAINTENANCE 

Scheduled maintenance halt.

EXTERNAL 

External event (market halt, circuit breaker)

Definition at line 76 of file human_oversight.hpp.

◆ IncidentPhase

enum class signet::forge::IncidentPhase : int32_t
strong

NIST SP 800-61 incident response lifecycle phases.

Enumerator
PREPARATION 

Pre-incident readiness.

DETECTION 

Anomaly detection / alert triage.

CONTAINMENT 

Limit blast radius.

ERADICATION 

Remove root cause.

RECOVERY 

Restore normal operations.

LESSONS_LEARNED 

Post-incident review (DORA Art.13)

Definition at line 46 of file incident_response.hpp.

◆ IncidentSeverity

enum class signet::forge::IncidentSeverity : int32_t
strong

Incident severity per DORA Art.10(1) classification.

Enumerator
P4_LOW 

Minor, no customer impact.

P3_MEDIUM 

Limited impact, workaround available.

P2_HIGH 

Significant impact, SLA breach risk.

P1_CRITICAL 

Major outage, data loss, regulatory notification required.

Definition at line 56 of file incident_response.hpp.

◆ InferenceType

enum class signet::forge::InferenceType : int32_t
strong

Classification of the ML inference operation.

Covers common ML workloads from classical models to LLM generation.

Enumerator
CLASSIFICATION 

Binary or multi-class classification.

REGRESSION 

Continuous value prediction.

EMBEDDING 

Vector embedding computation.

GENERATION 

LLM text generation.

RANKING 

Ranking/scoring of candidates.

ANOMALY 

Anomaly/outlier detection.

RECOMMENDATION 

Recommendation system inference.

CUSTOM 

Application-specific inference type.

Definition at line 47 of file inference_log.hpp.

◆ LogicalType

enum class signet::forge::LogicalType : int32_t
strong

Parquet logical types (from parquet.thrift LogicalType union).

Logical types add semantic meaning on top of a PhysicalType. For example, a STRING column is stored as BYTE_ARRAY but interpreted as UTF-8 text.

See also
PhysicalType, ColumnDescriptor
Enumerator
NONE 

No logical annotation — raw physical type.

STRING 

UTF-8 string (stored as BYTE_ARRAY).

ENUM 

Enum string (stored as BYTE_ARRAY).

UUID 

RFC 4122 UUID (stored as FIXED_LEN_BYTE_ARRAY(16)).

DATE 

Calendar date — INT32, days since 1970-01-01.

TIME_MS 

Time of day — INT32, milliseconds since midnight.

TIME_US 

Time of day — INT64, microseconds since midnight.

TIME_NS 

Time of day — INT64, nanoseconds since midnight.

TIMESTAMP_MS 

Timestamp — INT64, milliseconds since Unix epoch.

TIMESTAMP_US 

Timestamp — INT64, microseconds since Unix epoch.

TIMESTAMP_NS 

Timestamp — INT64, nanoseconds since Unix epoch.

DECIMAL 

Fixed-point decimal (INT32/INT64/FIXED_LEN_BYTE_ARRAY).

JSON 

JSON document (stored as BYTE_ARRAY).

BSON 

BSON document (stored as BYTE_ARRAY).

FLOAT16 

IEEE 754 half-precision float (FIXED_LEN_BYTE_ARRAY(2)).

FLOAT32_VECTOR 

ML embedding vector — FIXED_LEN_BYTE_ARRAY(dim*4).

Signet AI-native extension; stored as standard Parquet types with logical annotation only.

Definition at line 41 of file types.hpp.

◆ MitigationStatus

enum class signet::forge::MitigationStatus : int32_t
strong

Mitigation status for a threat.

Enumerator
NOT_MITIGATED 

No mitigation in place.

PARTIAL 

Some controls, residual risk remains.

MITIGATED 

Fully mitigated by implemented controls.

ACCEPTED 

Risk accepted per organizational policy.

TRANSFERRED 

Risk transferred (insurance, third-party)

Definition at line 68 of file threat_model.hpp.

◆ NotificationChannel

enum class signet::forge::NotificationChannel : int32_t
strong

Notification channel for incident communications.

Enumerator
INTERNAL_LOG 

System log only.

EMAIL 

Email to responsible parties.

PAGER 

PagerDuty / on-call alert.

REGULATORY 

Formal regulatory notification (DORA Art.19(1))

Definition at line 72 of file incident_response.hpp.

◆ OnnxTensorType

enum class signet::forge::OnnxTensorType : int32_t
strong

ONNX tensor element data types, mirroring OrtTensorElementDataType.

Numeric values match the ONNX Runtime C API's OrtTensorElementDataType enum exactly, so they can be cast directly via static_cast<> when constructing OrtValues.

Note
Not all types have Signet TensorDataType equivalents. Use from_onnx_type() to convert, which returns an error for unsupported types (STRING, UINT16, UINT32, UINT64, BFLOAT16).
See also
to_onnx_type, from_onnx_type, onnx_type_name
Enumerator
UNDEFINED 

No type (invalid / uninitialized)

FLOAT 

32-bit IEEE float (float32)

UINT8 

8-bit unsigned integer

INT8 

8-bit signed integer

UINT16 

16-bit unsigned integer

INT16 

16-bit signed integer

INT32 

32-bit signed integer

INT64 

64-bit signed integer

STRING 

Variable-length string.

BOOL 

Boolean (1 byte)

FLOAT16 

16-bit IEEE float (float16)

DOUBLE 

64-bit IEEE float (float64)

UINT32 

32-bit unsigned integer

UINT64 

64-bit unsigned integer

BFLOAT16 

Brain floating-point (bfloat16)

Definition at line 39 of file onnx_bridge.hpp.

◆ OrderType

enum class signet::forge::OrderType : int32_t
strong

Order type classification for MiFID II RTS 24 Annex I Table 2 Field 7.

Enumerator
MARKET 

Market order.

LIMIT 

Limit order.

STOP 

Stop order.

STOP_LIMIT 

Stop-limit order.

PEGGED 

Pegged order.

OTHER 

Other order type.

Definition at line 70 of file decision_log.hpp.

◆ OverrideAction

enum class signet::forge::OverrideAction : int32_t
strong

What action the human override took — EU AI Act Art.14(4).

Enumerator
APPROVE 

Human approved the AI system's output as-is.

MODIFY 

Human modified the AI system's output.

REJECT 

Human rejected the AI system's output entirely.

ESCALATE 

Human escalated to a higher authority.

HALT 

Human triggered system halt ("stop button")

Definition at line 67 of file human_oversight.hpp.

◆ OverrideSource

enum class signet::forge::OverrideSource : int32_t
strong

Source of a decision or override — EU AI Act Art.14(4).

Tracks whether an output was produced algorithmically or overridden by a human.

Enumerator
ALGORITHMIC 

Original AI system output (no human intervention)

HUMAN 

Human operator override.

AUTOMATED 

Automated safety system override (e.g. risk gate)

Definition at line 60 of file human_oversight.hpp.

◆ PageType

enum class signet::forge::PageType : int32_t
strong

Parquet page types within a column chunk.

Enumerator
DATA_PAGE 

Data page (Parquet 1.0 format).

INDEX_PAGE 

Index page (reserved, not used by Signet).

DICTIONARY_PAGE 

Dictionary page — contains the value dictionary for RLE_DICTIONARY columns.

DATA_PAGE_V2 

Data page v2 (Parquet 2.0 format with separate rep/def level sections).

Definition at line 127 of file types.hpp.

◆ PhysicalType

enum class signet::forge::PhysicalType : int32_t
strong

Parquet physical (storage) types as defined in parquet.thrift.

Every column in a Parquet file stores values in one of these physical representations. The mapping from C++ types is provided by parquet_type_of.

See also
LogicalType, parquet_type_of, ColumnDescriptor
Enumerator
BOOLEAN 

1-bit boolean, bit-packed in pages.

INT32 

32-bit signed integer (little-endian).

INT64 

64-bit signed integer (little-endian).

INT96 

96-bit value (deprecated — legacy Impala timestamps).

FLOAT 

IEEE 754 single-precision float.

DOUBLE 

IEEE 754 double-precision float.

BYTE_ARRAY 

Variable-length byte sequence (strings, binary).

FIXED_LEN_BYTE_ARRAY 

Fixed-length byte array (UUID, vectors, decimals).

Definition at line 20 of file types.hpp.

◆ QuantizationScheme

enum class signet::forge::QuantizationScheme : int32_t
strong

Identifies the quantization method used for vector compression.

See also
QuantizationParams, Quantizer, Dequantizer
Enumerator
SYMMETRIC_INT8 

value = round(float / scale), range [-127, 127].

ASYMMETRIC_INT8 

value = round((float - zero_point) / scale), range [0, 255].

SYMMETRIC_INT4 

value = round(float / scale), range [-7, 7], nibble-packed.

Definition at line 60 of file quantized_vector.hpp.
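
As a rough illustration of the SYMMETRIC_INT8 formula above, the following standalone sketch applies value = round(float / scale) to a single element. The function names are hypothetical, and the clamp to [-127, 127] and the multiply-by-scale inverse are assumptions of this sketch; the actual Quantizer and Dequantizer interfaces are not shown on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper illustrating SYMMETRIC_INT8: round(x / scale),
// clamped to [-127, 127]. The clamp is an assumption of this sketch.
int8_t quantize_symmetric_int8_sketch(float x, float scale) {
    assert(scale > 0.0f);
    const float q = std::round(x / scale);
    const float clamped = std::max(-127.0f, std::min(127.0f, q));
    return static_cast<int8_t>(clamped);
}

// Conventional approximate inverse (assumed): multiply by the same scale.
float dequantize_symmetric_int8_sketch(int8_t q, float scale) {
    return static_cast<float>(q) * scale;
}
```

With scale = 0.5, an input of 1.0 maps to the stored value 2, and inputs beyond ±63.5 saturate at ±127.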

◆ RegulatoryChangeType

enum class signet::forge::RegulatoryChangeType : int32_t
strong

Type of regulatory change being tracked.

Enumerator
NEW_REGULATION 

Entirely new regulation enacted.

AMENDMENT 

Existing regulation amended.

GUIDANCE 

Supervisory guidance or interpretation.

TECHNICAL_STANDARD 

RTS/ITS (Regulatory/Implementing Technical Standards)

ENFORCEMENT 

Enforcement action or precedent.

DEPRECATION 

Regulation repealed or superseded.

Definition at line 45 of file regulatory_monitor.hpp.

◆ RegulatoryImpact

enum class signet::forge::RegulatoryImpact : int32_t
strong

Impact level of a regulatory change on the system.

Enumerator
NONE 

No system impact.

INFORMATIONAL 

Awareness only, no action needed.

LOW 

Minor documentation update.

MEDIUM 

Code/configuration changes required.

HIGH 

Significant architectural changes.

CRITICAL 

Immediate action required (compliance deadline)

Definition at line 55 of file regulatory_monitor.hpp.

◆ RegulatoryRegime

enum class signet::forge::RegulatoryRegime : int32_t
strong

Regulatory regime(s) applicable to the data.

Enumerator
NONE 
GDPR 

EU General Data Protection Regulation.

MIFID2 

Markets in Financial Instruments Directive II.

DORA 

Digital Operational Resilience Act.

EU_AI_ACT 

EU Artificial Intelligence Act.

SOX 

Sarbanes-Oxley Act.

SEC_17A4 

SEC Rule 17a-4 (records retention)

PCI_DSS 

Payment Card Industry Data Security Standard.

HIPAA 

Health Insurance Portability and Accountability Act.

Definition at line 65 of file data_classification.hpp.

◆ Repetition

enum class signet::forge::Repetition : int32_t
strong

Parquet field repetition types (nullability / cardinality).

See also
ColumnDescriptor::repetition
Enumerator
REQUIRED 

Exactly one value per row (non-nullable).

OPTIONAL 

Zero or one value per row (nullable).

REPEATED 

Zero or more values per row (list).

Definition at line 140 of file types.hpp.

◆ ReportFormat

enum class signet::forge::ReportFormat
strong

Output serialization format for compliance reports.

Enumerator
JSON 

Pretty-printed JSON object (default)

NDJSON 

Newline-delimited JSON — one record per line (streaming-friendly)

CSV 

Comma-separated values with header row.

Definition at line 25 of file compliance_types.hpp.

◆ RiskGateResult

enum class signet::forge::RiskGateResult : int32_t
strong

Outcome of the pre-trade risk gate evaluation.

Records whether the order passed, was rejected, modified, or throttled by the risk management system.

Enumerator
PASSED 

All risk checks passed.

REJECTED 

Order rejected by risk gate.

MODIFIED 

Order modified by risk gate (e.g., size reduced)

THROTTLED 

Order delayed by rate limiting.

Definition at line 62 of file decision_log.hpp.

◆ StrideCategory

enum class signet::forge::StrideCategory : int32_t
strong

Microsoft STRIDE threat categories.

Enumerator
SPOOFING 

Authentication bypass, identity impersonation.

TAMPERING 

Unauthorized data modification.

REPUDIATION 

Denying actions without proof.

INFORMATION_DISCLOSURE 

Unauthorized data exposure.

DENIAL_OF_SERVICE 

Resource exhaustion, availability attacks.

ELEVATION_OF_PRIVILEGE 

Gaining unauthorized access levels.

Definition at line 50 of file threat_model.hpp.

◆ TensorDataType

enum class signet::forge::TensorDataType : int32_t
strong

Element data type for tensor storage, mapping to ONNX/PyTorch/TF type enums.

Enumerator
FLOAT32 

IEEE 754 single-precision (4 bytes)

FLOAT64 

IEEE 754 double-precision (8 bytes)

INT32 

Signed 32-bit integer.

INT64 

Signed 64-bit integer.

INT8 

Signed 8-bit integer.

UINT8 

Unsigned 8-bit integer.

INT16 

Signed 16-bit integer.

FLOAT16 

IEEE 754 half-precision (2 bytes)

BOOL 

Boolean (1 byte)

Definition at line 148 of file tensor_bridge.hpp.

◆ ThreatSeverity

enum class signet::forge::ThreatSeverity : int32_t
strong

Threat severity classification per NIST SP 800-30.

Enumerator
LOW 

DREAD composite < 4.0.

MEDIUM 

DREAD composite 4.0 - 6.9.

HIGH 

DREAD composite 7.0 - 8.9.

CRITICAL 

DREAD composite >= 9.0.

Definition at line 60 of file threat_model.hpp.
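
The DREAD thresholds listed above can be read as a simple mapping from a composite score to a severity. The sketch below is a standalone illustration with a local enum; ThreatSeveritySketch and classify_dread_sketch are hypothetical names. Treating scores between the documented bands (for example 6.95) as belonging to the lower band is an assumption.

```cpp
#include <cassert>

// Local stand-in for signet::forge::ThreatSeverity (illustration only).
enum class ThreatSeveritySketch { LOW, MEDIUM, HIGH, CRITICAL };

// Maps a DREAD composite score to a severity using the documented cut-offs:
// < 4.0 LOW, 4.0-6.9 MEDIUM, 7.0-8.9 HIGH, >= 9.0 CRITICAL.
ThreatSeveritySketch classify_dread_sketch(double composite) {
    assert(composite >= 0.0);  // sketch precondition, not a library rule
    if (composite >= 9.0) return ThreatSeveritySketch::CRITICAL;
    if (composite >= 7.0) return ThreatSeveritySketch::HIGH;
    if (composite >= 4.0) return ThreatSeveritySketch::MEDIUM;
    return ThreatSeveritySketch::LOW;
}
```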

◆ TimeInForce

enum class signet::forge::TimeInForce : int32_t
strong

Time-in-force classification for MiFID II RTS 24 Annex I Table 2 Field 8.

Enumerator
DAY 

Day order (valid until end of trading day)

GTC 

Good-Till-Cancelled.

IOC 

Immediate-Or-Cancel.

FOK 

Fill-Or-Kill.

GTD 

Good-Till-Date.

OTHER 

Other.

Definition at line 80 of file decision_log.hpp.

◆ TimestampGranularity

Timestamp granularity for MiFID II RTS 24 Art.2(2) compliance.

Controls the sub-second precision emitted in ISO 8601 timestamp fields. RTS 24 requires nanosecond precision for high-frequency trading; lower granularities may be appropriate for non-HFT reporting regimes.

Enumerator
NANOS 

9 sub-second digits (default, MiFID II HFT compliant)

MICROS 

6 sub-second digits

MILLIS 

3 sub-second digits

Definition at line 36 of file compliance_types.hpp.

◆ VectorElementType

enum class signet::forge::VectorElementType : int32_t
strong

Specifies the numerical precision of each element within a vector column.

Enumerator
FLOAT32 

IEEE 754 single-precision (4 bytes per element)

FLOAT64 

IEEE 754 double-precision (8 bytes per element)

FLOAT16 

IEEE 754 half-precision (2 bytes per element)

Definition at line 47 of file vector_type.hpp.

◆ WalLifecycleMode

enum class signet::forge::WalLifecycleMode : uint8_t
strong

Controls safety guardrails for WAL segment lifecycle operations.

In Production mode, destructive operations like reset_on_open are denied.

Enumerator
Development 

Permissive: allows reset_on_open.

Benchmark 

Same as Development, for benchmark harnesses.

Production 

Strict: reset_on_open is rejected.

Definition at line 766 of file wal.hpp.

Function Documentation

◆ add_vector_column()

SchemaBuilder & signet::forge::add_vector_column ( SchemaBuilder builder,
const std::string &  name,
uint32_t  dimension,
VectorElementType  elem = VectorElementType::FLOAT32 
)
inline

Add a vector column to a SchemaBuilder.

Creates a FIXED_LEN_BYTE_ARRAY column with FLOAT32_VECTOR logical type, type_length set to dimension * element_size.

Usage:

    auto schema = add_vector_column(
            Schema::builder("embeddings")
                .column<int64_t>("id")
                .column<std::string>("text"),
            "embedding", 768)
        .build();

Parameters
builder  The SchemaBuilder to add the column to.
name  Column name.
dimension  Number of elements per vector (e.g. 768).
elem  Element type (default FLOAT32).
Returns
Reference to the builder for chaining.

Definition at line 788 of file vector_type.hpp.
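
The type_length arithmetic described above (dimension * element_size) can be sketched on its own. The enum and function names below are hypothetical stand-ins; the element sizes are the ones documented for VectorElementType (4, 8, and 2 bytes). Widening to 64 bits before multiplying is a choice of this sketch, not a statement about the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Local stand-in for signet::forge::VectorElementType (illustration only).
enum class VectorElementTypeSketch { FLOAT32, FLOAT64, FLOAT16 };

// Bytes per element, as documented for each enumerator.
std::size_t element_size_sketch(VectorElementTypeSketch elem) {
    switch (elem) {
        case VectorElementTypeSketch::FLOAT32: return 4;
        case VectorElementTypeSketch::FLOAT64: return 8;
        case VectorElementTypeSketch::FLOAT16: return 2;
    }
    return 0;
}

// FIXED_LEN_BYTE_ARRAY length for one vector: dimension * element_size.
uint64_t vector_type_length_sketch(uint32_t dimension, VectorElementTypeSketch elem) {
    assert(dimension > 0);  // sketch precondition
    return static_cast<uint64_t>(dimension) * element_size_sketch(elem);
}
```

For the 768-dimensional FLOAT32 column in the usage example, this gives a type_length of 3072 bytes.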

◆ append_le32()

void signet::forge::append_le32 ( std::vector< uint8_t > &  buf,
uint32_t  val 
)
inline

Append a uint32_t in little-endian byte order to a byte buffer.

Parameters
buf  The destination byte buffer.
val  The 32-bit value to append (4 bytes).

Definition at line 32 of file column_writer.hpp.

◆ append_le64()

void signet::forge::append_le64 ( std::vector< uint8_t > &  buf,
uint64_t  val 
)
inline

Append a uint64_t in little-endian byte order to a byte buffer.

Parameters
buf  The destination byte buffer.
val  The 64-bit value to append (8 bytes).

Definition at line 42 of file column_writer.hpp.
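
A minimal standalone sketch of little-endian appending, in the spirit of append_le32() and append_le64(), is shown below. The _sketch names and the shared generic helper are hypothetical; the library's own implementation may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical generic helper: appends the low `nbytes` bytes of `val`,
// least-significant byte first, independent of host byte order.
void append_le_sketch(std::vector<uint8_t>& buf, uint64_t val, std::size_t nbytes) {
    assert(nbytes <= 8);
    for (std::size_t i = 0; i < nbytes; ++i) {
        buf.push_back(static_cast<uint8_t>((val >> (8 * i)) & 0xFF));
    }
}

void append_le32_sketch(std::vector<uint8_t>& buf, uint32_t val) {
    append_le_sketch(buf, val, 4);
}

void append_le64_sketch(std::vector<uint8_t>& buf, uint64_t val) {
    append_le_sketch(buf, val, 8);
}
```

Appending 0x11223344 with the 32-bit variant yields the bytes 44 33 22 11.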

◆ arrow_format_to_tensor_dtype()

expected< TensorDataType > signet::forge::arrow_format_to_tensor_dtype ( const char *  format)
inline

Map an Arrow format string to a TensorDataType.

Supports the standard single-character format codes for primitive numeric types and booleans. Multi-character format codes (e.g. "tss:" for timestamp) are not supported and return UNSUPPORTED_TYPE.

Parameters
format  Arrow format string (must not be null or empty).
Returns
The corresponding TensorDataType on success, or an UNSUPPORTED_TYPE error for unrecognized format strings.
See also
tensor_dtype_to_arrow_format

Definition at line 218 of file arrow_bridge.hpp.

◆ auto_select_compression()

Compression signet::forge::auto_select_compression ( const uint8_t *  sample_data,
size_t  sample_size 
)
inline

Automatically select the best available compression codec.

Priority order: ZSTD > Snappy > LZ4_RAW > LZ4 > UNCOMPRESSED.

  • ZSTD gives the best compression ratio for most Parquet workloads.
  • Snappy is the fastest but has moderate compression.
  • LZ4 is a good middle ground between speed and ratio.
Parameters
sample_data  Reserved for future data-aware heuristics (e.g. entropy estimation). Currently unused.
sample_size  Reserved for future use. Currently unused.
Returns
The Compression enumerator for the highest-priority codec that is registered, or Compression::UNCOMPRESSED if none are available.
See also
CodecRegistry::has(), compress()

Definition at line 270 of file codec.hpp.
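
The priority walk described above can be sketched without the library. In the example below, the is_registered predicate is a hypothetical stand-in for a check such as CodecRegistry::has(), whose exact signature is not shown on this page, and CompressionSketch is a local enum used only for illustration.

```cpp
#include <cassert>
#include <functional>

// Local stand-in for the relevant signet::forge::Compression enumerators.
enum class CompressionSketch { UNCOMPRESSED, SNAPPY, LZ4, ZSTD, LZ4_RAW };

// Returns the first codec in the documented priority order
// (ZSTD > Snappy > LZ4_RAW > LZ4) that the predicate reports as available.
CompressionSketch auto_select_compression_sketch(
    const std::function<bool(CompressionSketch)>& is_registered) {
    assert(static_cast<bool>(is_registered));
    static const CompressionSketch priority[] = {
        CompressionSketch::ZSTD, CompressionSketch::SNAPPY,
        CompressionSketch::LZ4_RAW, CompressionSketch::LZ4};
    for (CompressionSketch codec : priority) {
        if (is_registered(codec)) {
            return codec;
        }
    }
    return CompressionSketch::UNCOMPRESSED;
}
```

In a build where only Snappy and LZ4 are available, this walk selects Snappy; with no codecs available it falls back to UNCOMPRESSED.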

◆ bit_pack_8()

void signet::forge::bit_pack_8 ( std::vector< uint8_t > &  out,
const uint64_t *  values,
int  bit_width 
)
inline

Pack exactly 8 values at the given bit width into a byte buffer.

Each value occupies bit_width bits, packed LSB-first in little-endian byte order. Appends exactly bit_width bytes to out (since 8 values * bit_width bits = bit_width bytes). If bit_width is 0, no bytes are emitted (all values are implicitly zero).

Parameters
out  Output byte buffer to append packed bytes to.
values  Pointer to exactly 8 unsigned values to pack.
bit_width  Bits per value (0–64).
See also
bit_unpack_8

Definition at line 116 of file rle.hpp.

◆ bit_unpack_8()

void signet::forge::bit_unpack_8 ( const uint8_t *  src,
uint64_t *  values,
int  bit_width 
)
inline

Unpack exactly 8 values at the given bit width from a byte buffer.

Reverses the packing performed by bit_pack_8(). Reads bit_width bytes from src and unpacks 8 values of bit_width bits each, stored LSB-first. If bit_width is 0, all output values are set to zero.

Parameters
src  Pointer to at least bit_width bytes of packed data.
values  Output array of exactly 8 unsigned values.
bit_width  Bits per value (0–64).
See also
bit_pack_8

Definition at line 154 of file rle.hpp.
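
The LSB-first packing of 8 values described for bit_pack_8() and bit_unpack_8() can be sketched bit by bit. The functions below are standalone illustrations with hypothetical _sketch names; they favor clarity over speed and are not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs 8 values of `bit_width` bits each, LSB-first, appending exactly
// `bit_width` bytes to `out`.
void bit_pack_8_sketch(std::vector<uint8_t>& out, const uint64_t* values, int bit_width) {
    assert(bit_width >= 0 && bit_width <= 64);
    const std::size_t start = out.size();
    out.resize(start + static_cast<std::size_t>(bit_width), 0);
    std::size_t bit_pos = 0;
    for (int i = 0; i < 8; ++i) {
        for (int b = 0; b < bit_width; ++b, ++bit_pos) {
            if ((values[i] >> b) & 1u) {
                out[start + bit_pos / 8] |= static_cast<uint8_t>(1u << (bit_pos % 8));
            }
        }
    }
}

// Reverses bit_pack_8_sketch: reads `bit_width` bytes and fills 8 values.
void bit_unpack_8_sketch(const uint8_t* src, uint64_t* values, int bit_width) {
    assert(bit_width >= 0 && bit_width <= 64);
    std::size_t bit_pos = 0;
    for (int i = 0; i < 8; ++i) {
        uint64_t v = 0;
        for (int b = 0; b < bit_width; ++b, ++bit_pos) {
            if ((src[bit_pos / 8] >> (bit_pos % 8)) & 1u) {
                v |= (uint64_t{1} << b);
            }
        }
        values[i] = v;
    }
}
```

Packing the values 0 through 7 at a bit width of 3 produces the three bytes 0x88, 0xC6, 0xFA, which matches the bit-packing example in the Parquet format specification.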

◆ build_audit_metadata()

expected< AuditMetadata > signet::forge::build_audit_metadata ( const AuditChainWriter writer,
const std::string &  chain_id 
)
inline

Build an AuditMetadata from a populated AuditChainWriter.

Extracts the chain summary from a writer that has accumulated entries. The chain_id must be provided by the caller (use generate_chain_id() to create one, or reuse an existing ID for chain continuation).

Parameters
writer  The chain writer to extract metadata from.
chain_id  Unique chain identifier for this audit trail.
Returns
The populated AuditMetadata, or an error if the writer has no entries.

Definition at line 967 of file audit_chain.hpp.

◆ compress()

expected< std::vector< uint8_t > > signet::forge::compress ( Compression  codec,
const uint8_t *  data,
size_t  size 
)
inline

Compress data using the specified codec via the global CodecRegistry.

For Compression::UNCOMPRESSED, returns a verbatim copy of the input without consulting the registry.

Parameters
codec  The compression type to use.
data  Pointer to the raw input bytes.
size  Number of bytes to compress.
Returns
The compressed byte buffer, or an Error if the codec is not registered or compression fails.
See also
decompress(), auto_select_compression()

Definition at line 183 of file codec.hpp.

◆ decision_log_schema()

Schema signet::forge::decision_log_schema ( )
inline

Build the Parquet schema for AI decision log files.

Columns:

  • timestamp_ns INT64 (TIMESTAMP_NS) – Decision timestamp
  • strategy_id INT32 – Strategy identifier
  • model_version BYTE_ARRAY (STRING) – Model version hash
  • decision_type INT32 – DecisionType enum value
  • input_features BYTE_ARRAY (STRING) – JSON array of floats
  • model_output DOUBLE – Primary model output
  • confidence DOUBLE – Model confidence [0,1]
  • risk_result INT32 – RiskGateResult enum value
  • order_id BYTE_ARRAY (STRING) – Associated order ID
  • symbol BYTE_ARRAY (STRING) – Trading symbol
  • price DOUBLE – Decision price
  • quantity DOUBLE – Decision quantity
  • venue BYTE_ARRAY (STRING) – Execution venue
  • notes BYTE_ARRAY (STRING) – Free-text notes
  • chain_seq INT64 – Hash chain sequence number
  • chain_hash BYTE_ARRAY (STRING) – Hex entry hash
  • prev_hash BYTE_ARRAY (STRING) – Hex previous hash

Definition at line 511 of file decision_log.hpp.

◆ decode_varint()

uint64_t signet::forge::decode_varint ( const uint8_t *  data,
size_t &  pos,
size_t  size 
)
inline

Decode an unsigned varint (LEB128) from a byte buffer.

Reads a variable-length encoded unsigned integer starting at data[pos]. On success, pos is advanced past the consumed bytes. Returns 0 with pos unchanged if the buffer is exhausted before a terminating byte is found. Includes overflow protection: decoding stops if the shift exceeds 63 bits.

Parameters
data  Pointer to the encoded byte stream.
pos  Current read position (updated on return).
size  Total size of the byte stream.
Returns
The decoded unsigned integer, or 0 on failure.
See also
encode_varint

Definition at line 84 of file rle.hpp.

◆ decompress()

expected< std::vector< uint8_t > > signet::forge::decompress ( Compression  codec,
const uint8_t *  data,
size_t  size,
size_t  uncompressed_size 
)
inline

Decompress data using the specified codec via the global CodecRegistry.

For Compression::UNCOMPRESSED, returns a verbatim copy of the input without consulting the registry.

Parameters
codec  The compression type that was used to compress the data.
data  Pointer to the compressed input bytes.
size  Number of compressed bytes.
uncompressed_size  Expected size of the decompressed output (from the Parquet page header).
Returns
The decompressed byte buffer, or an Error if the codec is not registered or decompression fails.
See also
compress(), auto_select_compression()

Definition at line 213 of file codec.hpp.

◆ deserialize_and_verify_chain()

expected< std::vector< HashChainEntry > > signet::forge::deserialize_and_verify_chain ( const uint8_t *  chain_data,
size_t  chain_size 
)
inline

Deserialize and verify a chain from serialized bytes in one call.

Combines deserialization with AuditChainVerifier::verify(), returning HASH_CHAIN_BROKEN when verification fails.

Parameters
chain_data  Pointer to the serialized chain bytes.
chain_size  Size of the serialized chain in bytes.
Returns
The verified entries, or an error if deserialization or verification fails.
See also
AuditChainVerifier::verify

Definition at line 998 of file audit_chain.hpp.

◆ encode_varint()

size_t signet::forge::encode_varint ( std::vector< uint8_t > &  buf,
uint64_t  value 
)
inline

Encode an unsigned varint (LEB128) into a byte buffer.

Appends the variable-length encoding of value to buf. Each output byte uses 7 data bits and 1 continuation bit (MSB), following the unsigned LEB128 convention used by the Parquet wire format.

Parameters
buf  Output byte buffer to append to.
value  The unsigned integer to encode.
Returns
Number of bytes written (1–10).
See also
decode_varint

Definition at line 62 of file rle.hpp.
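
The unsigned LEB128 scheme described for encode_varint() and decode_varint() can be sketched as a standalone pair of functions. The _sketch names are hypothetical. Leaving pos unchanged when the shift limit is hit is an assumption of this sketch; the documentation above only states that behavior for an exhausted buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends `value` as unsigned LEB128: 7 data bits per byte, MSB set on
// every byte except the last. Returns the number of bytes written.
std::size_t encode_varint_sketch(std::vector<uint8_t>& buf, uint64_t value) {
    std::size_t written = 1;
    while (value >= 0x80) {
        buf.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
        ++written;
    }
    buf.push_back(static_cast<uint8_t>(value));
    return written;
}

// Decodes one varint starting at data[pos]. On success advances `pos`;
// on failure returns 0 and leaves `pos` unchanged.
uint64_t decode_varint_sketch(const uint8_t* data, std::size_t& pos, std::size_t size) {
    assert(data != nullptr || size == 0);
    uint64_t result = 0;
    int shift = 0;
    std::size_t p = pos;
    while (p < size && shift <= 63) {
        const uint8_t byte = data[p++];
        result |= static_cast<uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            pos = p;
            return result;
        }
        shift += 7;
    }
    return 0;  // buffer exhausted or shift limit reached
}
```

For example, the value 300 encodes to the two bytes 0xAC 0x02, and the largest uint64_t value takes 10 bytes. Because a failed decode and a genuine zero both return 0, a caller of a function with this contract would typically compare pos before and after the call to tell them apart.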

◆ f16_to_f32()

float signet::forge::f16_to_f32 ( uint16_t  h)
inline noexcept

Convert a 16-bit IEEE 754 half-precision value to a 32-bit float.

Handles normals, subnormals, infinities, NaNs, and signed zero.

Definition at line 59 of file vector_type.hpp.
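
A half-to-single conversion covering the cases listed above (normals, subnormals, infinities, NaNs, signed zero) might look like the following standalone sketch. f16_to_f32_sketch is a hypothetical name and the bit manipulation shown is one common approach, not necessarily the one used in vector_type.hpp. It assumes float is a 32-bit IEEE 754 type.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit IEEE 754 float");

// Expands a binary16 bit pattern into the equivalent binary32 value.
float f16_to_f32_sketch(uint16_t h) {
    const uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
    const uint32_t exp = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits = 0;
    if (exp == 0) {
        if (mant == 0) {
            bits = sign;  // signed zero
        } else {
            // Subnormal half: shift the mantissa up until its implicit
            // leading 1 appears, adjusting the exponent accordingly.
            int e = -1;
            do {
                mant <<= 1;
                ++e;
            } while ((mant & 0x400u) == 0);
            mant &= 0x3FFu;
            bits = sign | (static_cast<uint32_t>(127 - 15 - e) << 23) | (mant << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (mant << 13);  // infinity or NaN
    } else {
        bits = sign | ((exp + (127 - 15)) << 23) | (mant << 13);  // normal
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Every half-precision value is exactly representable in single precision, so this direction needs no rounding; only the reverse conversion does.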

◆ f32_to_f16()

uint16_t signet::forge::f32_to_f16 ( float  val)
inline noexcept

Convert a 32-bit float to a 16-bit IEEE 754 half-precision value.

Rounds to nearest even. Handles overflow to infinity and subnormals.

Definition at line 100 of file vector_type.hpp.

◆ from_dlpack_dtype()

expected< TensorDataType > signet::forge::from_dlpack_dtype ( DLDataType  dl_dtype)
inline

Convert a DLPack DLDataType back to a Signet TensorDataType.

Returns an error for types that have no Signet equivalent (e.g. uint16, uint32, uint64, bfloat16, or multi-lane SIMD types).

Parameters
dl_dtype  The DLPack data type descriptor.
Returns
The corresponding TensorDataType, or UNSUPPORTED_TYPE for multi-lane types, bfloat16, or unsupported bit widths.
See also
to_dlpack_dtype

Definition at line 191 of file numpy_bridge.hpp.

◆ from_le_bytes()

template<typename T >
T signet::forge::from_le_bytes ( const std::vector< uint8_t > &  bytes)
inline

Reconstruct an arithmetic value from its little-endian byte representation.

If bytes contains fewer than sizeof(T) bytes the result is value-initialized (zero).

Template Parameters
T  An arithmetic type.
Parameters
bytes  At least sizeof(T) bytes in little-endian order.
Returns
The reconstructed value.

Definition at line 66 of file statistics.hpp.
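
A minimal sketch of the little-endian round trip, including the zero result for short input described above. The names sketch_to_le_bytes, sketch_from_le_bytes, and sketch_host_is_little_endian are hypothetical; the runtime endianness probe is a choice made for this example and may differ from how statistics.hpp detects byte order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Runtime probe: true when the lowest-addressed byte holds the low bits.
inline bool sketch_host_is_little_endian() {
    const std::uint16_t probe = 1;
    std::uint8_t first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Hypothetical stand-in for to_le_bytes<T>.
template <typename T>
std::vector<std::uint8_t> sketch_to_le_bytes(T value) {
    static_assert(std::is_arithmetic<T>::value, "T must be arithmetic");
    std::vector<std::uint8_t> out(sizeof(T));
    std::memcpy(out.data(), &value, sizeof(T));
    if (!sketch_host_is_little_endian()) {
        std::reverse(out.begin(), out.end());
    }
    return out;
}

// Hypothetical stand-in for from_le_bytes<T>: short input yields T{}.
template <typename T>
T sketch_from_le_bytes(const std::vector<std::uint8_t>& bytes) {
    static_assert(std::is_arithmetic<T>::value, "T must be arithmetic");
    T value{};
    if (bytes.size() < sizeof(T)) {
        return value;  // value-initialized (zero)
    }
    std::uint8_t tmp[sizeof(T)];
    std::copy_n(bytes.begin(), sizeof(T), tmp);
    if (!sketch_host_is_little_endian()) {
        std::reverse(tmp, tmp + sizeof(T));
    }
    std::memcpy(&value, tmp, sizeof(T));
    return value;
}
```

Because a short buffer silently yields zero, callers that need to tell "zero" apart from "missing" would have to check the buffer size themselves.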

◆ from_onnx_type()

expected< TensorDataType > signet::forge::from_onnx_type ( OnnxTensorType  ort_type)
inline

Convert an OnnxTensorType back to a Signet TensorDataType.

Parameters
ort_type  The ONNX tensor element type.
Returns
The corresponding TensorDataType, or UNSUPPORTED_TYPE for types that have no Signet equivalent (STRING, UINT16, UINT32, UINT64, BFLOAT16, UNDEFINED).
See also
to_onnx_type

Definition at line 90 of file onnx_bridge.hpp.

◆ generate_chain_id()

std::string signet::forge::generate_chain_id ( )
inline

Generate a simple chain identifier based on the current timestamp.

Format: "chain-<hex_timestamp_ns>" (e.g. "chain-1a2b3c4d5e6f7890"). This is NOT cryptographically random – it is a human-readable identifier for correlating chain segments across files.

Definition at line 203 of file audit_chain.hpp.
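
The documented format can be illustrated with a small sketch. The helper names below are hypothetical and this is not the code in audit_chain.hpp; in particular, whether the library zero-pads the hex digits is not stated above, so the sketch prints them unpadded.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a nanosecond timestamp as "chain-<hex_timestamp_ns>".
// Split out from the clock read so the formatting can be checked in isolation.
inline std::string sketch_chain_id_from_ns(std::int64_t ns) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "chain-%llx",
                  static_cast<unsigned long long>(ns));
    return std::string(buf);
}

// Hypothetical stand-in for generate_chain_id(): timestamp-derived, so two
// calls within the same nanosecond reading would produce the same ID.
inline std::string sketch_generate_chain_id() {
    using namespace std::chrono;
    const auto ns = duration_cast<nanoseconds>(
                        system_clock::now().time_since_epoch())
                        .count();
    return sketch_chain_id_from_ns(static_cast<std::int64_t>(ns));
}
```

Since the identifier is derived only from the clock, it is suitable for correlation and not as a secret or a uniqueness guarantee.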

◆ has_encrypted_page_header_prefix()

bool signet::forge::has_encrypted_page_header_prefix ( const uint8_t *  data,
size_t  size 
)
inline noexcept

Definition at line 122 of file reader.hpp.

◆ hash_to_hex()

std::string signet::forge::hash_to_hex ( const std::array< uint8_t, 32 > &  hash)
inline

Convert a 32-byte SHA-256 hash to a lowercase hexadecimal string (64 chars).

Definition at line 150 of file audit_chain.hpp.

◆ hex_to_hash()

expected< std::array< uint8_t, 32 > > signet::forge::hex_to_hash ( const std::string &  hex)
inline

Convert a 64-character lowercase hex string back to a 32-byte hash.

Returns an error if the string is not exactly 64 hex characters.

Definition at line 164 of file audit_chain.hpp.
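
The hash_to_hex / hex_to_hash pair can be sketched as below. The names are hypothetical, and std::optional stands in for the library's expected<> type, whose interface is not shown on this page. Rejecting uppercase digits is an assumption of this sketch, chosen to match the "lowercase" wording above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for hash_to_hex: 32 bytes -> 64 lowercase hex chars.
inline std::string sketch_hash_to_hex(const std::array<std::uint8_t, 32>& hash) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(64);
    for (std::uint8_t byte : hash) {
        out.push_back(digits[byte >> 4]);    // high nibble first
        out.push_back(digits[byte & 0x0F]);  // then low nibble
    }
    return out;
}

// Hypothetical stand-in for hex_to_hash; std::nullopt models the error case.
inline std::optional<std::array<std::uint8_t, 32>>
sketch_hex_to_hash(const std::string& hex) {
    if (hex.size() != 64) {
        return std::nullopt;
    }
    auto nibble = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        return -1;  // assumption: uppercase and other characters are rejected
    };
    std::array<std::uint8_t, 32> out{};
    for (std::size_t i = 0; i < 32; ++i) {
        const int hi = nibble(hex[2 * i]);
        const int lo = nibble(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) {
            return std::nullopt;
        }
        out[i] = static_cast<std::uint8_t>((hi << 4) | lo);
    }
    return out;
}
```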

◆ human_override_log_schema()

Schema signet::forge::human_override_log_schema ( )
inline

Build the Parquet schema for human override log files.

Columns mirror HumanOverrideRecord fields plus hash chain + row lineage.

Definition at line 259 of file human_oversight.hpp.

◆ inference_log_schema()

Schema signet::forge::inference_log_schema ( )
inline

Build the Parquet schema for ML inference log files.

Columns:

timestamp_ns     INT64 (TIMESTAMP_NS)  – Inference timestamp
model_id         BYTE_ARRAY (STRING)   – Model identifier
model_version    BYTE_ARRAY (STRING)   – Model version hash
inference_type   INT32                 – InferenceType enum value
input_embedding  BYTE_ARRAY (STRING)   – JSON array of floats
input_hash       BYTE_ARRAY (STRING)   – SHA-256 of raw input
output_hash      BYTE_ARRAY (STRING)   – SHA-256 of raw output
output_score     DOUBLE                – Primary output score
latency_ns       INT64                 – Inference latency (ns)
batch_size       INT32                 – Batch size
input_tokens     INT32                 – Input token count
output_tokens    INT32                 – Output token count
user_id_hash     BYTE_ARRAY (STRING)   – Hashed user ID
session_id       BYTE_ARRAY (STRING)   – Session identifier
metadata_json    BYTE_ARRAY (STRING)   – Additional JSON metadata
chain_seq        INT64                 – Hash chain sequence number
chain_hash       BYTE_ARRAY (STRING)   – Hex entry hash
prev_hash        BYTE_ARRAY (STRING)   – Hex previous hash

Definition at line 441 of file inference_log.hpp.

◆ load_le32()

uint32_t signet::forge::load_le32 ( const uint8_t *  data)
inline noexcept

Definition at line 129 of file reader.hpp.

◆ make_column_batch()

SharedColumnBatch signet::forge::make_column_batch ( std::vector< ColumnDesc >  schema,
size_t  reserve_rows = 64 
)
inline

Convenience factory: create a shared batch with a given schema.

Definition at line 403 of file column_batch.hpp.

◆ now_ns()

int64_t signet::forge::now_ns ( )
inline

Return the current time as nanoseconds since the Unix epoch (UTC).

Gap R-5 (MiFID II RTS 25 Art.2-3): Uses system_clock for UTC traceability. The previous implementation used steady_clock, which has no relationship to UTC (arbitrary epoch, typically boot time) — every timestamp it produced was regulatory-invalid under RTS 25.

system_clock::now() returns UTC wall-clock time suitable for regulatory timestamp fields. On POSIX systems this is clock_gettime(CLOCK_REALTIME).

Guarantees strictly increasing timestamps across concurrent callers (MiFID II RTS 24 Art.2 timestamp ordering, CWE-362 race guard). If system_clock returns a value <= the last observed one (e.g. due to an NTP step adjustment), the result is bumped to last_ns + 1 to preserve hash chain ordering invariants.

Definition at line 110 of file audit_chain.hpp.
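
One way to obtain the "bump to last_ns + 1" behaviour across threads is a compare-and-swap loop over a shared atomic. The sketch below illustrates that technique under stated assumptions: the function names are hypothetical, and the actual synchronization mechanism used in audit_chain.hpp is not described on this page.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Core bump logic, with the raw clock reading passed in so it can be
// exercised deterministically. Returns a value strictly greater than any
// value previously returned for the same `last`.
inline std::int64_t sketch_monotonic_bump(std::atomic<std::int64_t>& last,
                                          std::int64_t raw_ns) {
    std::int64_t prev = last.load(std::memory_order_relaxed);
    for (;;) {
        // Use the clock reading if it moved forward, otherwise last + 1.
        const std::int64_t next = (raw_ns > prev) ? raw_ns : prev + 1;
        if (last.compare_exchange_weak(prev, next,
                                       std::memory_order_acq_rel,
                                       std::memory_order_relaxed)) {
            return next;
        }
        // CAS failed: `prev` now holds the value another caller published.
    }
}

// Hypothetical stand-in for now_ns(): UTC nanoseconds since the Unix epoch.
inline std::int64_t sketch_now_ns() {
    static std::atomic<std::int64_t> last{0};
    using namespace std::chrono;
    const auto raw = duration_cast<nanoseconds>(
                         system_clock::now().time_since_epoch())
                         .count();
    return sketch_monotonic_bump(last, static_cast<std::int64_t>(raw));
}
```

Note the trade-off: after a backwards clock step, returned values run slightly ahead of wall-clock time until the clock catches up.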

◆ onnx_type_name()

const char * signet::forge::onnx_type_name ( OnnxTensorType  t)
inline

Return a human-readable string for an OnnxTensorType value.

Useful for diagnostics, logging, and error messages. Returns "UNKNOWN" for values not in the OnnxTensorType enumeration.

Parameters
t  The ONNX tensor type.
Returns
A static string literal (e.g. "FLOAT", "INT64", "UNDEFINED"). Never returns nullptr.

Definition at line 298 of file onnx_bridge.hpp.

◆ parquet_to_arrow_format()

const char * signet::forge::parquet_to_arrow_format ( PhysicalType  pt)
inline

Map a Parquet PhysicalType to an Arrow format string.

Arrow format strings for primitive types: "b" = bool, "c" = int8, "C" = uint8, "s" = int16, "S" = uint16, "i" = int32, "I" = uint32, "l" = int64, "L" = uint64, "f" = float32, "g" = float64, "e" = float16

Parameters
pt  The Parquet physical type to convert.
Returns
Single-character Arrow format string, or nullptr for types that have no direct Arrow primitive mapping (BYTE_ARRAY, FIXED_LEN_BYTE_ARRAY, INT96).
See also
tensor_dtype_to_arrow_format

Definition at line 176 of file arrow_bridge.hpp.

◆ physical_to_tensor_dtype()

expected< TensorDataType > signet::forge::physical_to_tensor_dtype ( PhysicalType  pt)
inline

Map a PhysicalType to a TensorDataType (for column export).

Parameters
pt  The Parquet physical type.
Returns
The corresponding TensorDataType, or UNSUPPORTED_TYPE for types with no fixed-width numeric tensor equivalent (BYTE_ARRAY, FIXED_LEN_BYTE_ARRAY, INT96).

Definition at line 252 of file arrow_bridge.hpp.

◆ physical_type_byte_size()

size_t signet::forge::physical_type_byte_size ( PhysicalType  pt)
inline

Return the byte size for a PhysicalType (primitive types only).

Parameters
pt  The Parquet physical type.
Returns
Byte size of one element (1 for BOOLEAN, 4 for INT32/FLOAT, 8 for INT64/DOUBLE), or 0 for variable-length types.
Note
BOOLEAN returns 1 (byte-aligned storage), even though Arrow uses 1-bit packing in validity bitmaps.

Definition at line 272 of file arrow_bridge.hpp.

◆ prepare_for_onnx() [1/2]

expected< OnnxTensorInfo > signet::forge::prepare_for_onnx ( const OwnedTensor tensor)
inline

Prepare an OwnedTensor for ONNX Runtime consumption (zero-copy).

Delegates to the TensorView overload via the OwnedTensor's view(). The OwnedTensor must remain valid for the lifetime of the returned info.

Parameters
tensor  The OwnedTensor to export (must be valid and contiguous).
Returns
OnnxTensorInfo (zero-copy, is_owner = false), or an error.
See also
prepare_for_onnx(const TensorView&)

Definition at line 222 of file onnx_bridge.hpp.

◆ prepare_for_onnx() [2/2]

expected< OnnxTensorInfo > signet::forge::prepare_for_onnx ( const TensorView tensor)
inline

Prepare a TensorView for ONNX Runtime consumption (zero-copy).

For all supported numeric types (FLOAT32, FLOAT64, INT32, INT64, INT8, UINT8, INT16, FLOAT16, BOOL), this is zero-copy: the returned OnnxTensorInfo.data points directly into the TensorView's memory.

The TensorView must remain valid for the lifetime of the returned info (is_owner will be false). The tensor must be contiguous; non-contiguous tensors are rejected – call clone() first to produce a contiguous copy.

Parameters
tensor  The TensorView to export (must be valid and contiguous).
Returns
OnnxTensorInfo ready for OrtApi::CreateTensorWithDataAsOrtValue. On failure, returns INTERNAL_ERROR for invalid tensors, or UNSUPPORTED_TYPE for non-contiguous tensors or unmappable dtypes.
See also
OnnxTensorInfo, prepare_inputs_for_onnx

Definition at line 176 of file onnx_bridge.hpp.

◆ prepare_inputs_for_onnx()

expected< OnnxInputSet > signet::forge::prepare_inputs_for_onnx ( const std::vector< std::pair< std::string, TensorView > > &  inputs)
inline

Prepare a batch of named TensorViews for ONNX Runtime inference.

Each pair is (input_name, tensor_view). All tensors must be valid and contiguous. If any tensor fails preparation, the entire call fails with an error message identifying the failing input by name.

Parameters
inputs  Non-empty vector of (name, TensorView) pairs. Names should match the model's input node names.
Returns
OnnxInputSet with all tensors prepared (zero-copy), or INTERNAL_ERROR for empty inputs, or any error that prepare_for_onnx() would return (prefixed with the input name).
See also
OnnxInputSet, prepare_for_onnx

Definition at line 264 of file onnx_bridge.hpp.

◆ register_snappy_codec()

void signet::forge::register_snappy_codec ( )
inline

Register the bundled Snappy codec with the global CodecRegistry.

Call this once at startup (e.g. from a top-level initializer or a codec_init function) to make Compression::SNAPPY available through compress() and decompress().

See also
SnappyCodec, CodecRegistry::register_codec()

Definition at line 608 of file snappy.hpp.

◆ tensor_dtype_name()

const char * signet::forge::tensor_dtype_name ( TensorDataType  dtype)
inline noexcept

Returns a human-readable name for a TensorDataType.

Definition at line 181 of file tensor_bridge.hpp.

◆ tensor_dtype_to_arrow_format()

const char * signet::forge::tensor_dtype_to_arrow_format ( TensorDataType  dtype)
inline

Map a TensorDataType to an Arrow format string.

Parameters
dtype  The tensor data type to convert.
Returns
Single-character Arrow format string, or nullptr if no mapping exists (should not occur for valid TensorDataType values).
See also
parquet_to_arrow_format, arrow_format_to_tensor_dtype

Definition at line 193 of file arrow_bridge.hpp.

◆ tensor_element_size()

constexpr size_t signet::forge::tensor_element_size ( TensorDataType  dtype)
inline constexpr noexcept

Returns the byte size of a single element of the given tensor data type.

Definition at line 165 of file tensor_bridge.hpp.

◆ to_buffer_info()

expected< BufferInfo > signet::forge::to_buffer_info ( const TensorView tensor)
inline

Create a BufferInfo from a TensorView for Python buffer protocol export.

The tensor must be valid and contiguous. The returned BufferInfo's data pointer points directly into the TensorView's memory (zero-copy). The TensorView must remain valid for the lifetime of the BufferInfo.

Strides are computed as C-contiguous byte strides (innermost dimension has stride = itemsize, outer dimensions are products of inner shapes).

Parameters
tensor  A valid, contiguous TensorView.
Returns
BufferInfo describing the tensor layout. On failure, returns INTERNAL_ERROR for invalid tensors, or UNSUPPORTED_TYPE for non-contiguous tensors or unmappable dtypes.
auto info = *to_buffer_info(view);
// Use with pybind11:
// return py::buffer_info(info.data, info.itemsize, info.format,
// info.ndim, info.shape, info.strides);
See also
BufferInfo, NumpyBridge

Definition at line 720 of file numpy_bridge.hpp.
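
The stride rule stated above (innermost stride equals itemsize, each outer stride is the product of the inner extents times itemsize) can be sketched independently of the library. The function name and the use of std::int64_t for shape and stride values are assumptions made for this example; BufferInfo's actual field types are not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Computes C-contiguous (row-major) byte strides for a given shape.
inline std::vector<std::int64_t> sketch_c_contiguous_strides(
    const std::vector<std::int64_t>& shape, std::int64_t itemsize) {
    std::vector<std::int64_t> strides(shape.size());
    std::int64_t stride = itemsize;
    // Walk from the innermost dimension outward, accumulating the product.
    for (std::size_t i = shape.size(); i-- > 0;) {
        strides[i] = stride;
        stride *= shape[i];
    }
    return strides;
}
```

For a float32 tensor of shape (2, 3, 4) this gives byte strides (48, 16, 4): moving one step along the first axis skips 3 × 4 elements of 4 bytes each.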

◆ to_dlpack_dtype()

DLDataType signet::forge::to_dlpack_dtype ( TensorDataType  dtype)
inline

Convert a Signet TensorDataType to a DLPack DLDataType.

This function is total – all TensorDataType values have a DLPack mapping. Lanes is always set to 1 (scalar).

Mapping:

TensorDataType  DLDataTypeCode  bits
FLOAT32         kDLFloat        32
FLOAT64         kDLFloat        64
FLOAT16         kDLFloat        16
INT32           kDLInt          32
INT64           kDLInt          64
INT16           kDLInt          16
INT8            kDLInt          8
UINT8           kDLUInt         8
BOOL            kDLUInt         8
Parameters
dtype  The Signet tensor data type to convert.
Returns
The corresponding DLDataType descriptor.
Note
BOOL is mapped to UInt/8 per DLPack convention (no native bool type).
See also
from_dlpack_dtype

Definition at line 135 of file numpy_bridge.hpp.
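
The mapping table above translates directly into a switch. The sketch below uses stand-in types so that it compiles without dlpack.h or the Signet headers: SketchDLDataType mirrors the code/bits/lanes layout of DLPack's DLDataType, and SketchDType is a hypothetical enum whose enumerator order is not meant to match TensorDataType.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for DLPack's type codes and DLDataType (see dlpack.h).
enum SketchDLCode : std::uint8_t {
    kSketchDLInt = 0,
    kSketchDLUInt = 1,
    kSketchDLFloat = 2
};
struct SketchDLDataType {
    std::uint8_t code;
    std::uint8_t bits;
    std::uint16_t lanes;
};

// Hypothetical stand-in for TensorDataType; order is illustrative only.
enum class SketchDType {
    FLOAT32, FLOAT64, FLOAT16, INT32, INT64, INT16, INT8, UINT8, BOOL
};

// Total mapping: every enumerator has an entry, lanes is always 1.
constexpr SketchDLDataType sketch_to_dlpack_dtype(SketchDType t) noexcept {
    switch (t) {
        case SketchDType::FLOAT32: return {kSketchDLFloat, 32, 1};
        case SketchDType::FLOAT64: return {kSketchDLFloat, 64, 1};
        case SketchDType::FLOAT16: return {kSketchDLFloat, 16, 1};
        case SketchDType::INT32:   return {kSketchDLInt, 32, 1};
        case SketchDType::INT64:   return {kSketchDLInt, 64, 1};
        case SketchDType::INT16:   return {kSketchDLInt, 16, 1};
        case SketchDType::INT8:    return {kSketchDLInt, 8, 1};
        case SketchDType::UINT8:   return {kSketchDLUInt, 8, 1};
        case SketchDType::BOOL:    return {kSketchDLUInt, 8, 1};
    }
    // Unreachable for valid enumerators; keeps all control paths returning.
    return {kSketchDLUInt, 8, 1};
}
```

Because UINT8 and BOOL share the same descriptor, the reverse direction cannot recover BOOL from the descriptor alone.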

◆ to_le_bytes() [1/2]

std::vector< uint8_t > signet::forge::to_le_bytes ( const std::string &  value)
inline

Overload for std::string – returns raw bytes (no endian conversion needed).

Parameters
value  The string whose bytes are copied verbatim.
Returns
A vector containing the string's byte content.

Definition at line 53 of file statistics.hpp.

◆ to_le_bytes() [2/2]

template<typename T >
std::vector< uint8_t > signet::forge::to_le_bytes ( T  value)
inline

Convert an arithmetic value to its little-endian byte representation.

On little-endian platforms (x86, and ARM in its usual little-endian configuration) this is a straight memcpy. On big-endian platforms the bytes are reversed.

Template Parameters
T  An arithmetic type (int32_t, double, etc.).
Parameters
value  The value to convert.
Returns
A vector of sizeof(T) bytes in little-endian order.

Definition at line 34 of file statistics.hpp.

◆ to_onnx_type()

OnnxTensorType signet::forge::to_onnx_type ( TensorDataType  dtype)
inline

Convert a Signet TensorDataType to the corresponding OnnxTensorType.

All TensorDataType values have a direct ONNX mapping; this function is total (never returns UNDEFINED for valid inputs).

Parameters
dtype  The Signet tensor data type.
Returns
The corresponding OnnxTensorType.
See also
from_onnx_type

Definition at line 68 of file onnx_bridge.hpp.

◆ validate_mmap_page_value_count()

expected< size_t > signet::forge::validate_mmap_page_value_count ( int64_t  num_values,
const char *  context 
)
inline

Definition at line 256 of file mmap_reader.hpp.

◆ validate_page_value_count()

expected< size_t > signet::forge::validate_page_value_count ( int64_t  num_values,
const char *  context 
)
inline

Definition at line 110 of file reader.hpp.

Variable Documentation

◆ HASH_CHAIN_ENTRY_SIZE

constexpr size_t signet::forge::HASH_CHAIN_ENTRY_SIZE = 112
inline constexpr

Size of a single serialized HashChainEntry in bytes.

Definition at line 89 of file audit_chain.hpp.

◆ kEncryptedPageHeaderMagic

constexpr uint8_t signet::forge::kEncryptedPageHeaderMagic[4] = {'S', 'P', 'H', '1'}
inline constexpr

Definition at line 120 of file reader.hpp.

◆ PARQUET_MAGIC

constexpr uint32_t signet::forge::PARQUET_MAGIC = 0x31524150
inline constexpr

"PAR1" magic bytes (little-endian uint32) — marks a standard Parquet file.

Definition at line 205 of file types.hpp.

◆ PARQUET_MAGIC_ENCRYPTED

constexpr uint32_t signet::forge::PARQUET_MAGIC_ENCRYPTED = 0x45524150
inline constexpr

"PARE" magic bytes (little-endian uint32) — marks a Parquet file with an encrypted footer.

Definition at line 207 of file types.hpp.
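
The two magic values are simply the ASCII bytes "PAR1" and "PARE" read as a little-endian uint32. The sketch below makes that relationship explicit; the constant values are copied from this page, while the names kSketchMagicPlain, kSketchMagicEncrypted, and sketch_load_le32 are hypothetical and do not refer to the library's own load_le32.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kSketchMagicPlain = 0x31524150;      // "PAR1"
constexpr std::uint32_t kSketchMagicEncrypted = 0x45524150;  // "PARE"

// Assembles four bytes into a uint32, lowest-addressed byte first.
// Independent of host byte order because it uses shifts, not memcpy.
constexpr std::uint32_t sketch_load_le32(const std::uint8_t* p) noexcept {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// True when the four bytes at `p` spell either magic marker.
constexpr bool sketch_is_parquet_magic(const std::uint8_t* p) noexcept {
    return sketch_load_le32(p) == kSketchMagicPlain ||
           sketch_load_le32(p) == kSketchMagicEncrypted;
}
```

Reading the bytes 'P' (0x50), 'A' (0x41), 'R' (0x52), '1' (0x31) in that order gives 0x31524150, which matches PARQUET_MAGIC.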

◆ parquet_type_of_v

template<typename T >
constexpr PhysicalType signet::forge::parquet_type_of_v = parquet_type_of<T>::value
inline constexpr

Convenience variable template: parquet_type_of_v<double> == PhysicalType::DOUBLE.

Definition at line 180 of file types.hpp.

◆ PARQUET_VERSION

constexpr int32_t signet::forge::PARQUET_VERSION = 2
inline constexpr

Parquet format version written to the file footer.

Definition at line 201 of file types.hpp.

◆ SIGNET_CREATED_BY

constexpr const char* signet::forge::SIGNET_CREATED_BY = "SignetStack signet-forge version 0.1.0"
inline constexpr

Default "created_by" string embedded in every Parquet footer.

Definition at line 203 of file types.hpp.