Signet Forge 0.1.0
C++20 Parquet library with AI-native extensions
Namespaces | |
| namespace | byte_stream_split |
| BYTE_STREAM_SPLIT encoding functions for float and double types. | |
| namespace | commercial |
| Commercial licensing and evaluation-tier usage enforcement. | |
| namespace | crypto |
| namespace | delta |
| namespace | detail |
| Internal implementation details for dictionary encoding. | |
| namespace | detail_mmap |
| namespace | detail_mmap_reader |
| namespace | detail_reader |
| namespace | dora |
| namespace | eu_ai_act |
| namespace | gdpr |
| namespace | mifid2 |
| namespace | regulatory |
| namespace | risk |
| namespace | simd |
| Platform-optimized SIMD routines for common vector operations. | |
| namespace | thrift |
| namespace | xxhash |
| xxHash64 hashing functions for Parquet bloom filter support. | |
| namespace | z_order |
| Z-order curve (Morton code) utilities for spatial sort keys. | |
Classes | |
| class | Arena |
| Bump-pointer arena allocator for batch Parquet reads. More... | |
| class | ArrowExporter |
| Exports Signet Forge tensors and columns as Arrow C Data Interface structs. More... | |
| class | ArrowImporter |
| Imports Arrow C Data Interface arrays into Signet TensorView or OwnedTensor. More... | |
| struct | Art15Metrics |
| Computed accuracy, robustness, and drift metrics per EU AI Act Art.15. More... | |
| class | Art15MetricsCalculator |
| Computes Art.15 accuracy, robustness, and drift metrics from inference records. More... | |
| class | AuditChainVerifier |
| Verifies hash chain integrity. More... | |
| class | AuditChainWriter |
| Builds SHA-256 hash chains during Parquet writes. More... | |
| struct | AuditMetadata |
| Chain summary stored in Parquet key-value metadata. More... | |
| class | BatchTensorBuilder |
| Builds a single contiguous 2D tensor from multiple column tensors, suitable for passing to an ML inference engine (ONNX Runtime, etc.). More... | |
| struct | BufferInfo |
| Simple C-contiguous buffer descriptor for Python interop. More... | |
| struct | ClockSyncStatus |
| NTP/PTP clock synchronization status for MiFID II RTS 25 Art.3. More... | |
| class | CodecRegistry |
| Thread-safe singleton registry of compression codec implementations. More... | |
| struct | Column |
| Typed column descriptor for the Schema::build() variadic API. More... | |
| class | ColumnBatch |
| A column-major batch of feature rows for ML inference and WAL serialization. More... | |
| struct | ColumnDesc |
| Describes a single column in a ColumnBatch schema. More... | |
| struct | ColumnDescriptor |
| Descriptor for a single column in a Parquet schema. More... | |
| struct | ColumnFileStats |
| Per-column statistics from ParquetReader::file_stats(). More... | |
| struct | ColumnIndex |
| Per-page min/max statistics for predicate pushdown. More... | |
| class | ColumnIndexBuilder |
| Builder that accumulates per-page statistics during column writing. More... | |
| class | ColumnReader |
| PLAIN-encoded Parquet column decoder. More... | |
| class | ColumnStatistics |
| Per-column-chunk statistics tracker. More... | |
| class | ColumnToTensor |
| Provides static methods to convert Parquet column data into tensor form. More... | |
| class | ColumnWriter |
| PLAIN encoding writer for a single Parquet column. More... | |
| struct | ColumnWriteStats |
| Per-column statistics produced by ParquetWriter::close(). More... | |
| struct | ComplianceReport |
| The generated compliance report returned to the caller. More... | |
| class | CompressionCodec |
| Abstract base class for all compression/decompression codecs. More... | |
| class | DataClassificationOntology |
| A named collection of data classification rules forming a formal ontology. More... | |
| struct | DataClassificationRule |
| Per-field data classification and handling policy. More... | |
| class | DecisionLogReader |
| Reads AI decision log Parquet files and verifies hash chain integrity. More... | |
| class | DecisionLogWriter |
| Writes AI trading decision records to Parquet files with cryptographic hash chaining for tamper-evident audit trails. More... | |
| struct | DecisionRecord |
| A single AI-driven trading decision with full provenance. More... | |
| class | Dequantizer |
| Dequantizes INT8/INT4 quantized vectors back to float32. More... | |
| class | DictionaryDecoder |
| Dictionary decoder for Parquet PLAIN_DICTIONARY / RLE_DICTIONARY encoding. More... | |
| class | DictionaryEncoder |
| Dictionary encoder for Parquet PLAIN_DICTIONARY / RLE_DICTIONARY encoding. More... | |
| struct | DLDataType |
| DLPack data type descriptor. More... | |
| struct | DLDevice |
| DLPack device descriptor (type + ordinal). More... | |
| struct | DLManagedTensor |
| DLPack managed tensor – the exchange object for from_dlpack(). More... | |
| struct | DLTensor |
| DLPack tensor descriptor (non-owning). More... | |
| struct | DreadScore |
| DREAD risk quantification — 5 factors scored 1..10. More... | |
| struct | Error |
| Lightweight error value carrying an ErrorCode and a human-readable message. More... | |
| class | EUAIActReporter |
| EU AI Act compliance report generator (Regulation (EU) 2024/1689). More... | |
| class | EventBus |
| Multi-tier event bus for routing SharedColumnBatch events. More... | |
| struct | EventBusOptions |
| Configuration options for EventBus. More... | |
| class | expected |
| A lightweight result type that holds either a success value of type T or an Error. More... | |
| class | expected< void > |
| Specialization of expected for void — used for operations that return success or error only. More... | |
| struct | FeatureGroupDef |
| Schema definition for a single feature group. More... | |
| class | FeatureReader |
| Point-in-time correct ML feature store reader over Parquet files. More... | |
| struct | FeatureReaderOptions |
| Configuration options for FeatureReader::open(). More... | |
| struct | FeatureVector |
| A single versioned observation for one entity. More... | |
| class | FeatureWriter |
| Append-only writer for a single feature group. More... | |
| struct | FeatureWriterOptions |
| Configuration options for FeatureWriter::create(). More... | |
| struct | FileStats |
| Aggregate file-level statistics returned by ParquetReader::file_stats(). More... | |
| struct | HashChainEntry |
| A single link in the cryptographic hash chain. More... | |
| class | HumanOverrideLogReader |
| Reads human override log Parquet files and verifies hash chain integrity. More... | |
| class | HumanOverrideLogWriter |
| Writes human override events to Parquet files with cryptographic hash chaining for tamper-evident audit trails. More... | |
| struct | HumanOverrideRecord |
| A single human oversight event with full provenance. More... | |
| struct | HybridQueryOptions |
| Per-query filter options passed to HybridReader::read(). More... | |
| class | HybridReader |
| Reads StreamRecords across historical Parquet files and (optionally) a live StreamingSink ring buffer snapshot. More... | |
| struct | HybridReaderOptions |
| Options for constructing a HybridReader via HybridReader::create(). More... | |
| struct | IncidentPlaybook |
| An ordered sequence of response steps for a specific incident type. More... | |
| class | IncidentResponseTracker |
| Tracks execution progress of a playbook during an active incident. More... | |
| class | InferenceLogReader |
| Reads ML inference log Parquet files and verifies hash chain integrity. More... | |
| class | InferenceLogWriter |
| Writes ML inference records to Parquet files with cryptographic hash chaining for tamper-evident audit trails. More... | |
| struct | InferenceRecord |
| A single ML inference event with full operational metadata. More... | |
| class | LogRetentionManager |
| Manages log file lifecycle: retention, archival, and deletion. More... | |
| class | MappedSegment |
| RAII cross-platform memory-mapped file segment. More... | |
| class | MiFID2Reporter |
| MiFID II RTS 24 algorithmic trading compliance report generator. More... | |
| struct | Mitigation |
| A specific mitigation control for a threat. More... | |
| class | MmapParquetReader |
| class | MmapReader |
| Low-level memory-mapped file handle. More... | |
| class | MpmcRing |
| Lock-free bounded multi-producer multi-consumer ring buffer. More... | |
| class | MpscRingBuffer |
| Multiple-producer single-consumer (MPSC) bounded ring buffer. More... | |
| struct | native_type_of |
| Maps a Parquet PhysicalType back to its corresponding C++ native type. More... | |
| struct | native_type_of< PhysicalType::BOOLEAN > |
| struct | native_type_of< PhysicalType::BYTE_ARRAY > |
| struct | native_type_of< PhysicalType::DOUBLE > |
| struct | native_type_of< PhysicalType::FLOAT > |
| struct | native_type_of< PhysicalType::INT32 > |
| struct | native_type_of< PhysicalType::INT64 > |
| class | NumpyBridge |
| Exports and imports Signet tensors via DLPack, enabling zero-copy interoperability with PyTorch, NumPy, JAX, and other DLPack-aware frameworks. More... | |
| struct | OffsetIndex |
| Page locations for random access within a column chunk. More... | |
| struct | OnnxInputSet |
| A set of named ONNX tensors for multi-input model inference. More... | |
| struct | OnnxTensorInfo |
| Contains all information needed to create an OrtValue externally. More... | |
| class | OverrideRateMonitor |
| Sliding-window override rate monitor — EU AI Act Art.14(5). More... | |
| struct | OverrideRateMonitorOptions |
| Options for the override rate monitor. More... | |
| class | OwnedTensor |
| An owning tensor that manages its own memory via a std::vector<uint8_t> buffer. More... | |
| struct | PageLocation |
| File offset and size descriptor for a single data page. More... | |
| struct | parquet_type_of |
| Maps a C++ type to its corresponding Parquet PhysicalType at compile time. More... | |
| struct | parquet_type_of< bool > |
| struct | parquet_type_of< double > |
| struct | parquet_type_of< float > |
| struct | parquet_type_of< int32_t > |
| struct | parquet_type_of< int64_t > |
| struct | parquet_type_of< std::string > |
| class | ParquetReader |
| Parquet file reader with typed column access and full encoding support. More... | |
| class | ParquetWriter |
| Streaming Parquet file writer with row-based and column-based APIs. More... | |
| class | PlaybookRegistry |
| Registry of incident response playbooks indexed by incident type. More... | |
| struct | PlaybookStep |
| A single step in an incident response playbook. More... | |
| struct | QuantizationParams |
| Parameters that fully describe a quantization mapping. More... | |
| class | QuantizedVectorReader |
| Reads quantized page data (FIXED_LEN_BYTE_ARRAY) and dequantizes to float32 on demand. More... | |
| class | QuantizedVectorWriter |
| Accumulates float32 vectors, quantizes them, and produces FIXED_LEN_BYTE_ARRAY page data suitable for Parquet column chunks. More... | |
| class | Quantizer |
| Quantizes float32 vectors to INT8 or INT4 representation. More... | |
| struct | RegulatoryChange |
| A tracked regulatory change record. More... | |
| class | RegulatoryChangeMonitor |
| Registry and tracker for regulatory changes affecting the system. More... | |
| struct | ReportOptions |
| Query and formatting parameters for compliance report generation. More... | |
| struct | RetentionPolicy |
| Retention policy configuration for log lifecycle management. More... | |
| struct | RetentionSummary |
| Summary of a retention enforcement pass. More... | |
| class | RleDecoder |
| Streaming decoder for the Parquet RLE/Bit-Packing Hybrid scheme. More... | |
| class | RleEncoder |
| Streaming encoder for the Parquet RLE/Bit-Packing Hybrid scheme. More... | |
| class | RowLineageTracker |
| Per-row lineage tracking inspired by Iceberg V3-style data governance. More... | |
| class | Schema |
| Immutable schema description for a Parquet file. More... | |
| class | SchemaBuilder |
| Fluent builder for constructing a Schema one column at a time. More... | |
| class | SnappyCodec |
| Bundled Snappy compression codec (header-only, no external dependency). More... | |
| class | SplitBlockBloomFilter |
| Parquet-spec Split Block Bloom Filter for probabilistic set membership. More... | |
| class | SpscRingBuffer |
| Lock-free single-producer single-consumer (SPSC) bounded ring buffer. More... | |
| class | StreamingSink |
| Background-thread Parquet compaction sink fed by a lock-free ring buffer. More... | |
| struct | StreamRecord |
| A single record flowing through the StreamingSink pipeline. More... | |
| struct | TensorShape |
| Describes the shape of a tensor as a vector of dimension sizes. More... | |
| class | TensorView |
| A lightweight, non-owning view into a contiguous block of typed memory, interpreted as a multi-dimensional tensor. More... | |
| struct | ThreatEntry |
| A single identified threat in the threat model. More... | |
| struct | ThreatModel |
| A threat model for a specific component or the entire system. More... | |
| struct | ThreatModelAnalysis |
| Analysis result from validating a threat model. More... | |
| class | ThreatModelAnalyzer |
| Validates threat model coverage and produces audit-ready JSON. More... | |
| struct | VectorColumnSpec |
| Configuration for a vector column: dimensionality and element precision. More... | |
| class | VectorReader |
| Reads FIXED_LEN_BYTE_ARRAY page data back into float vectors. More... | |
| class | VectorWriter |
| Buffers float vectors and encodes them as FIXED_LEN_BYTE_ARRAY PLAIN data. More... | |
| struct | WalEntry |
| A single decoded WAL record returned by WalReader::next() or read_all(). More... | |
| class | WalManager |
| Manages multiple rolling WAL segment files in a directory. More... | |
| struct | WalManagerOptions |
| Configuration options for WalManager::open(). More... | |
| struct | WalMmapOptions |
| Configuration options for WalMmapWriter::open(). More... | |
| class | WalMmapWriter |
| High-performance WAL writer using a ring of N memory-mapped segments. More... | |
| class | WalReader |
| Sequential WAL file reader for crash recovery and replay. More... | |
| class | WalWriter |
| Append-only Write-Ahead Log writer with CRC-32 integrity per record. More... | |
| struct | WalWriterOptions |
| Configuration options for WalWriter::open(). More... | |
| struct | WriterOptions |
| Configuration options for ParquetWriter. More... | |
| struct | WriteStats |
| File-level write statistics returned by ParquetWriter::close(). More... | |
Typedefs | |
| using | TDT = TensorDataType |
| Convenience alias for TensorDataType (shorter schema declarations). | |
| using | SharedColumnBatch = std::shared_ptr< ColumnBatch > |
| Thread-safe shared pointer to a ColumnBatch – the unit transferred between producer and consumer threads via EventBus. | |
| using | WalRecord = WalEntry |
| Alias so callers can use either WalEntry or WalRecord. | |
| template<PhysicalType PT> | |
| using | native_type_of_t = typename native_type_of< PT >::type |
| Convenience alias: native_type_of_t<PhysicalType::INT64> == int64_t. | |
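The `native_type_of` / `parquet_type_of` pair listed above is a two-way compile-time mapping between Parquet physical types and C++ types. The following is a minimal standalone sketch of that pattern, not the library's actual header: everything lives in a hypothetical `sketch` namespace, the enum values are copied from the Enumerations table of this page, and the pairing of `BYTE_ARRAY` with `std::string` is an assumption inferred from the listed specializations.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

namespace sketch {  // hypothetical namespace, not signet::forge

// Enumerators and values as listed for PhysicalType in this reference.
enum class PhysicalType : std::int32_t {
    BOOLEAN = 0, INT32 = 1, INT64 = 2, INT96 = 3,
    FLOAT = 4, DOUBLE = 5, BYTE_ARRAY = 6, FIXED_LEN_BYTE_ARRAY = 7
};

// Primary template left undefined here (a sketch choice), so an unmapped
// physical type is a compile error rather than a silent default.
template <PhysicalType PT> struct native_type_of;
template <> struct native_type_of<PhysicalType::BOOLEAN>    { using type = bool; };
template <> struct native_type_of<PhysicalType::INT32>      { using type = std::int32_t; };
template <> struct native_type_of<PhysicalType::INT64>      { using type = std::int64_t; };
template <> struct native_type_of<PhysicalType::FLOAT>      { using type = float; };
template <> struct native_type_of<PhysicalType::DOUBLE>     { using type = double; };
// Assumption: BYTE_ARRAY maps to std::string.
template <> struct native_type_of<PhysicalType::BYTE_ARRAY> { using type = std::string; };

template <PhysicalType PT>
using native_type_of_t = typename native_type_of<PT>::type;

// Reverse direction: C++ type -> PhysicalType.
template <typename T> struct parquet_type_of;
template <> struct parquet_type_of<bool>         { static constexpr PhysicalType value = PhysicalType::BOOLEAN; };
template <> struct parquet_type_of<std::int32_t> { static constexpr PhysicalType value = PhysicalType::INT32; };
template <> struct parquet_type_of<std::int64_t> { static constexpr PhysicalType value = PhysicalType::INT64; };
template <> struct parquet_type_of<float>        { static constexpr PhysicalType value = PhysicalType::FLOAT; };
template <> struct parquet_type_of<double>       { static constexpr PhysicalType value = PhysicalType::DOUBLE; };
template <> struct parquet_type_of<std::string>  { static constexpr PhysicalType value = PhysicalType::BYTE_ARRAY; };

template <typename T>
inline constexpr PhysicalType parquet_type_of_v = parquet_type_of<T>::value;

// The two identities quoted in the member descriptions on this page.
static_assert(std::is_same_v<native_type_of_t<PhysicalType::INT64>, std::int64_t>);
static_assert(parquet_type_of_v<double> == PhysicalType::DOUBLE);

}  // namespace sketch
```

Because both directions are resolved at compile time, generic reader or writer code can pick the right column type without a runtime switch.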
Enumerations | |
| enum class | ReportFormat { JSON , NDJSON , CSV } |
| Output serialization format for compliance reports. More... | |
| enum class | TimestampGranularity { NANOS , MICROS , MILLIS } |
| Timestamp granularity for MiFID II RTS 24 Art.2(2) compliance. More... | |
| enum class | ComplianceStandard { MIFID2_RTS24 , EU_AI_ACT_ART12 , EU_AI_ACT_ART13 , EU_AI_ACT_ART19 } |
| Which regulatory standard a compliance report satisfies. More... | |
| enum class | DataClassification : int32_t { PUBLIC = 0 , INTERNAL = 1 , RESTRICTED = 2 , HIGHLY_RESTRICTED = 3 } |
| Data confidentiality level per DORA Art.8 + ISO 27001 Annex A. More... | |
| enum class | DataSensitivity : int32_t { NEUTRAL = 0 , PSEUDONYMISED = 1 , ANONYMISED = 2 , PII = 3 , FINANCIAL_PII = 4 , BIOMETRIC = 5 , HEALTH = 6 } |
| Data sensitivity per GDPR Art.9 special categories. More... | |
| enum class | RegulatoryRegime : int32_t { NONE = 0 , GDPR = 1 , MIFID2 = 2 , DORA = 3 , EU_AI_ACT = 4 , SOX = 5 , SEC_17A4 = 6 , PCI_DSS = 7 , HIPAA = 8 } |
| Regulatory regime(s) applicable to the data. More... | |
| enum class | DecisionType : int32_t { SIGNAL = 0 , ORDER_NEW = 1 , ORDER_CANCEL = 2 , ORDER_MODIFY = 3 , POSITION_OPEN = 4 , POSITION_CLOSE = 5 , RISK_OVERRIDE = 6 , NO_ACTION = 7 } |
| Classification of the AI-driven trading decision. More... | |
| enum class | RiskGateResult : int32_t { PASSED = 0 , REJECTED = 1 , MODIFIED = 2 , THROTTLED = 3 } |
| Outcome of the pre-trade risk gate evaluation. More... | |
| enum class | OrderType : int32_t { MARKET = 0 , LIMIT = 1 , STOP = 2 , STOP_LIMIT = 3 , PEGGED = 4 , OTHER = 99 } |
| Order type classification for MiFID II RTS 24 Annex I Table 2 Field 7. More... | |
| enum class | TimeInForce : int32_t { DAY = 0 , GTC = 1 , IOC = 2 , FOK = 3 , GTD = 4 , OTHER = 99 } |
| Time-in-force classification for MiFID II RTS 24 Annex I Table 2 Field 8. More... | |
| enum class | BuySellIndicator : int32_t { BUY = 0 , SELL = 1 , SHORT_SELL = 2 } |
| Buy/sell direction for MiFID II RTS 24 Annex I Table 2 Field 6. More... | |
| enum class | OverrideSource : int32_t { ALGORITHMIC = 0 , HUMAN = 1 , AUTOMATED = 2 } |
| Source of a decision or override — EU AI Act Art.14(4). More... | |
| enum class | OverrideAction : int32_t { APPROVE = 0 , MODIFY = 1 , REJECT = 2 , ESCALATE = 3 , HALT = 4 } |
| What action the human override took — EU AI Act Art.14(4). More... | |
| enum class | HaltReason : int32_t { MANUAL = 0 , SAFETY_THRESHOLD = 1 , ANOMALY_DETECTED = 2 , REGULATORY = 3 , MAINTENANCE = 4 , EXTERNAL = 5 } |
| Reason for system halt — EU AI Act Art.14(4) "stop button". More... | |
| enum class | IncidentPhase : int32_t { PREPARATION = 0 , DETECTION = 1 , CONTAINMENT = 2 , ERADICATION = 3 , RECOVERY = 4 , LESSONS_LEARNED = 5 } |
| NIST SP 800-61 incident response lifecycle phases. More... | |
| enum class | IncidentSeverity : int32_t { P4_LOW = 0 , P3_MEDIUM = 1 , P2_HIGH = 2 , P1_CRITICAL = 3 } |
| Incident severity per DORA Art.10(1) classification. More... | |
| enum class | EscalationLevel : int32_t { L1_OPERATIONS = 0 , L2_ENGINEERING = 1 , L3_MANAGEMENT = 2 , L4_REGULATORY = 3 } |
| Escalation hierarchy for incident routing. More... | |
| enum class | NotificationChannel : int32_t { INTERNAL_LOG = 0 , EMAIL = 1 , PAGER = 2 , REGULATORY = 3 } |
| Notification channel for incident communications. More... | |
| enum class | InferenceType : int32_t { CLASSIFICATION = 0 , REGRESSION = 1 , EMBEDDING = 2 , GENERATION = 3 , RANKING = 4 , ANOMALY = 5 , RECOMMENDATION = 6 , CUSTOM = 255 } |
| Classification of the ML inference operation. More... | |
| enum class | QuantizationScheme : int32_t { SYMMETRIC_INT8 = 0 , ASYMMETRIC_INT8 = 1 , SYMMETRIC_INT4 = 2 } |
| Identifies the quantization method used for vector compression. More... | |
| enum class | RegulatoryChangeType : int32_t { NEW_REGULATION = 0 , AMENDMENT = 1 , GUIDANCE = 2 , TECHNICAL_STANDARD = 3 , ENFORCEMENT = 4 , DEPRECATION = 5 } |
| Type of regulatory change being tracked. More... | |
| enum class | RegulatoryImpact : int32_t { NONE = 0 , INFORMATIONAL = 1 , LOW = 2 , MEDIUM = 3 , HIGH = 4 , CRITICAL = 5 } |
| Impact level of a regulatory change on the system. More... | |
| enum class | ChangeComplianceStatus : int32_t { NOT_ASSESSED = 0 , ASSESSED = 1 , IN_PROGRESS = 2 , IMPLEMENTED = 3 , VERIFIED = 4 , NOT_APPLICABLE = 5 } |
| Compliance status for a tracked regulatory change. More... | |
| enum class | TensorDataType : int32_t { FLOAT32 = 0 , FLOAT64 = 1 , INT32 = 2 , INT64 = 3 , INT8 = 4 , UINT8 = 5 , INT16 = 6 , FLOAT16 = 7 , BOOL = 8 } |
| Element data type for tensor storage, mapping to ONNX/PyTorch/TF type enums. More... | |
| enum class | StrideCategory : int32_t { SPOOFING = 0 , TAMPERING = 1 , REPUDIATION = 2 , INFORMATION_DISCLOSURE = 3 , DENIAL_OF_SERVICE = 4 , ELEVATION_OF_PRIVILEGE = 5 } |
| Microsoft STRIDE threat categories. More... | |
| enum class | ThreatSeverity : int32_t { LOW = 0 , MEDIUM = 1 , HIGH = 2 , CRITICAL = 3 } |
| Threat severity classification per NIST SP 800-30. More... | |
| enum class | MitigationStatus : int32_t { NOT_MITIGATED = 0 , PARTIAL = 1 , MITIGATED = 2 , ACCEPTED = 3 , TRANSFERRED = 4 } |
| Mitigation status for a threat. More... | |
| enum class | VectorElementType : int32_t { FLOAT32 = 0 , FLOAT64 = 1 , FLOAT16 = 2 } |
| Specifies the numerical precision of each element within a vector column. More... | |
| enum class | WalLifecycleMode : uint8_t { Development = 0 , Benchmark = 1 , Production = 2 } |
| Controls safety guardrails for WAL segment lifecycle operations. More... | |
| enum class | ErrorCode { OK = 0 , IO_ERROR , INVALID_FILE , CORRUPT_FOOTER , CORRUPT_PAGE , CORRUPT_DATA , INVALID_ARGUMENT , UNSUPPORTED_ENCODING , UNSUPPORTED_COMPRESSION , UNSUPPORTED_TYPE , SCHEMA_MISMATCH , OUT_OF_RANGE , THRIFT_DECODE_ERROR , ENCRYPTION_ERROR , HASH_CHAIN_BROKEN , LICENSE_ERROR , LICENSE_LIMIT_EXCEEDED , INTERNAL_ERROR } |
| Error codes returned by all Signet Forge operations. More... | |
| enum class | OnnxTensorType : int32_t { UNDEFINED = 0 , FLOAT = 1 , UINT8 = 2 , INT8 = 3 , UINT16 = 4 , INT16 = 5 , INT32 = 6 , INT64 = 7 , STRING = 8 , BOOL = 9 , FLOAT16 = 10 , DOUBLE = 11 , UINT32 = 12 , UINT64 = 13 , BFLOAT16 = 16 } |
| ONNX tensor element data types, mirroring OrtTensorElementDataType. More... | |
| enum class | PhysicalType : int32_t { BOOLEAN = 0 , INT32 = 1 , INT64 = 2 , INT96 = 3 , FLOAT = 4 , DOUBLE = 5 , BYTE_ARRAY = 6 , FIXED_LEN_BYTE_ARRAY = 7 } |
| Parquet physical (storage) types as defined in parquet.thrift. More... | |
| enum class | LogicalType : int32_t { NONE = 0 , STRING = 1 , ENUM = 2 , UUID = 3 , DATE = 4 , TIME_MS = 5 , TIME_US = 6 , TIME_NS = 7 , TIMESTAMP_MS = 8 , TIMESTAMP_US = 9 , TIMESTAMP_NS = 10 , DECIMAL = 11 , JSON = 12 , BSON = 13 , FLOAT16 = 14 , FLOAT32_VECTOR = 100 } |
| Parquet logical types (from parquet.thrift LogicalType union). More... | |
| enum class | ConvertedType : int32_t { NONE = -1 , UTF8 = 0 , MAP = 1 , MAP_KEY_VALUE = 2 , LIST = 3 , ENUM = 4 , DECIMAL = 5 , DATE = 6 , TIME_MILLIS = 7 , TIME_MICROS = 8 , TIMESTAMP_MILLIS = 9 , TIMESTAMP_MICROS = 10 , UINT_8 = 11 , UINT_16 = 12 , UINT_32 = 13 , UINT_64 = 14 , INT_8 = 15 , INT_16 = 16 , INT_32 = 17 , INT_64 = 18 , JSON = 19 , BSON = 20 , INTERVAL = 21 } |
| Legacy Parquet converted types for backward compatibility with older readers. More... | |
| enum class | Encoding : int32_t { PLAIN = 0 , PLAIN_DICTIONARY = 2 , RLE = 3 , BIT_PACKED = 4 , DELTA_BINARY_PACKED = 5 , DELTA_LENGTH_BYTE_ARRAY = 6 , DELTA_BYTE_ARRAY = 7 , RLE_DICTIONARY = 8 , BYTE_STREAM_SPLIT = 9 } |
| Parquet page encoding types. More... | |
| enum class | Compression : int32_t { UNCOMPRESSED = 0 , SNAPPY = 1 , GZIP = 2 , LZO = 3 , BROTLI = 4 , LZ4 = 5 , ZSTD = 6 , LZ4_RAW = 7 } |
| Parquet compression codecs. More... | |
| enum class | PageType : int32_t { DATA_PAGE = 0 , INDEX_PAGE = 1 , DICTIONARY_PAGE = 2 , DATA_PAGE_V2 = 3 } |
| Parquet page types within a column chunk. More... | |
| enum class | Repetition : int32_t { REQUIRED = 0 , OPTIONAL = 1 , REPEATED = 2 } |
| Parquet field repetition types (nullability / cardinality). More... | |
DLPack type definitions (matching dlpack.h v0.8) | |
Self-contained DLPack struct definitions for zero-dependency interop. | |
| enum class | DLDeviceType : int32_t { kDLCPU = 1 , kDLCUDA = 2 , kDLCUDAHost = 3 , kDLROCM = 10 , kDLMetal = 8 , kDLVulkan = 7 } |
| DLPack device type, matching DLDeviceType from dlpack.h. More... | |
| enum class | DLDataTypeCode : uint8_t { kDLInt = 0 , kDLUInt = 1 , kDLFloat = 2 , kDLBfloat = 4 } |
| DLPack data type code, matching DLDataTypeCode from dlpack.h. More... | |
Functions | |
| int64_t | now_ns () |
| Return the current time as nanoseconds since the Unix epoch (UTC). | |
| std::string | hash_to_hex (const std::array< uint8_t, 32 > &hash) |
| Convert a 32-byte SHA-256 hash to a lowercase hexadecimal string (64 chars). | |
| expected< std::array< uint8_t, 32 > > | hex_to_hash (const std::string &hex) |
| Convert a 64-character lowercase hex string back to a 32-byte hash. | |
| std::string | generate_chain_id () |
| Generate a simple chain identifier based on the current timestamp. | |
| expected< AuditMetadata > | build_audit_metadata (const AuditChainWriter &writer, const std::string &chain_id) |
| Build an AuditMetadata from a populated AuditChainWriter. | |
| expected< std::vector< HashChainEntry > > | deserialize_and_verify_chain (const uint8_t *chain_data, size_t chain_size) |
| Deserialize and verify a chain from serialized bytes in one call. | |
| SharedColumnBatch | make_column_batch (std::vector< ColumnDesc > schema, size_t reserve_rows=64) |
| Convenience factory: create a shared batch with a given schema. | |
| Schema | decision_log_schema () |
| Build the Parquet schema for AI decision log files. | |
| Schema | human_override_log_schema () |
| Build the Parquet schema for human override log files. | |
| Schema | inference_log_schema () |
| Build the Parquet schema for ML inference log files. | |
| constexpr size_t | tensor_element_size (TensorDataType dtype) noexcept |
| Returns the byte size of a single element of the given tensor data type. | |
| const char * | tensor_dtype_name (TensorDataType dtype) noexcept |
| Returns a human-readable name for a TensorDataType. | |
| float | f16_to_f32 (uint16_t h) noexcept |
| Convert a 16-bit IEEE 754 half-precision value to a 32-bit float. | |
| uint16_t | f32_to_f16 (float val) noexcept |
| Convert a 32-bit float to a 16-bit IEEE 754 half-precision value. | |
| SchemaBuilder & | add_vector_column (SchemaBuilder &builder, const std::string &name, uint32_t dimension, VectorElementType elem=VectorElementType::FLOAT32) |
| Add a vector column to a SchemaBuilder. | |
| void | append_le32 (std::vector< uint8_t > &buf, uint32_t val) |
| Append a uint32_t in little-endian byte order to a byte buffer. | |
| void | append_le64 (std::vector< uint8_t > &buf, uint64_t val) |
| Append a uint64_t in little-endian byte order to a byte buffer. | |
| expected< std::vector< uint8_t > > | compress (Compression codec, const uint8_t *data, size_t size) |
| Compress data using the specified codec via the global CodecRegistry. | |
| expected< std::vector< uint8_t > > | decompress (Compression codec, const uint8_t *data, size_t size, size_t uncompressed_size) |
| Decompress data using the specified codec via the global CodecRegistry. | |
| Compression | auto_select_compression (const uint8_t *sample_data, size_t sample_size) |
| Automatically select the best available compression codec. | |
| void | register_snappy_codec () |
| Register the bundled Snappy codec with the global CodecRegistry. | |
| size_t | encode_varint (std::vector< uint8_t > &buf, uint64_t value) |
| Encode an unsigned varint (LEB128) into a byte buffer. | |
| uint64_t | decode_varint (const uint8_t *data, size_t &pos, size_t size) |
| Decode an unsigned varint (LEB128) from a byte buffer. | |
| void | bit_pack_8 (std::vector< uint8_t > &out, const uint64_t *values, int bit_width) |
| Pack exactly 8 values at the given bit width into a byte buffer. | |
| void | bit_unpack_8 (const uint8_t *src, uint64_t *values, int bit_width) |
| Unpack exactly 8 values at the given bit width from a byte buffer. | |
| expected< BufferInfo > | to_buffer_info (const TensorView &tensor) |
| Create a BufferInfo from a TensorView for Python buffer protocol export. | |
| expected< OnnxInputSet > | prepare_inputs_for_onnx (const std::vector< std::pair< std::string, TensorView > > &inputs) |
| Prepare a batch of named TensorViews for ONNX Runtime inference. | |
| const char * | onnx_type_name (OnnxTensorType t) |
| Return a human-readable string for an OnnxTensorType value. | |
| expected< size_t > | validate_mmap_page_value_count (int64_t num_values, const char *context) |
| expected< size_t > | validate_page_value_count (int64_t num_values, const char *context) |
| bool | has_encrypted_page_header_prefix (const uint8_t *data, size_t size) noexcept |
| uint32_t | load_le32 (const uint8_t *data) noexcept |
| template<typename T > | |
| std::vector< uint8_t > | to_le_bytes (T value) |
| Convert an arithmetic value to its little-endian byte representation. | |
| std::vector< uint8_t > | to_le_bytes (const std::string &value) |
| Overload for std::string – returns raw bytes (no endian conversion needed). | |
| template<typename T > | |
| T | from_le_bytes (const std::vector< uint8_t > &bytes) |
| Reconstruct an arithmetic value from its little-endian byte representation. | |
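`encode_varint` and `decode_varint` above are described as unsigned LEB128 routines. As an illustration of that encoding only, here is a standalone sketch that reuses the listed signatures inside a hypothetical `sketch` namespace. Two points are assumptions rather than documented behaviour: that the `size_t` return value is the number of bytes written, and that truncated input simply yields the bits read so far.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {  // hypothetical namespace, not signet::forge

// Append `value` as an unsigned LEB128 varint: 7 payload bits per byte,
// high bit set on every byte except the last. Returns bytes written
// (assumed meaning of the return value).
inline std::size_t encode_varint(std::vector<std::uint8_t>& buf, std::uint64_t value) {
    std::size_t written = 0;
    while (value >= 0x80) {
        buf.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
        ++written;
    }
    buf.push_back(static_cast<std::uint8_t>(value));
    return written + 1;
}

// Decode one varint starting at data[pos] and advance pos past it.
// Reading stops at `size` or once 64 bits have been consumed.
inline std::uint64_t decode_varint(const std::uint8_t* data, std::size_t& pos, std::size_t size) {
    std::uint64_t result = 0;
    int shift = 0;
    while (pos < size && shift < 64) {
        const std::uint8_t byte = data[pos++];
        result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) break;  // last byte of this varint
        shift += 7;
    }
    return result;
}

// Usage: 300 = 0b1_0010_1100 encodes as 0xAC 0x02.
inline void varint_self_check() {
    std::vector<std::uint8_t> buf;
    const std::size_t n = encode_varint(buf, 300);
    assert(n == 2 && buf[0] == 0xAC && buf[1] == 0x02);
    std::size_t pos = 0;
    assert(decode_varint(buf.data(), pos, buf.size()) == 300 && pos == 2);
    (void)n;
}

}  // namespace sketch
```

Small values dominate Parquet metadata (lengths, counts, run headers), so spending one byte on anything below 128 is the main benefit of this format.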
Format string mappings | |
Conversion functions between Parquet/Tensor types and Arrow format strings. | |
| const char * | parquet_to_arrow_format (PhysicalType pt) |
| Map a Parquet PhysicalType to an Arrow format string. | |
| const char * | tensor_dtype_to_arrow_format (TensorDataType dtype) |
| Map a TensorDataType to an Arrow format string. | |
| expected< TensorDataType > | arrow_format_to_tensor_dtype (const char *format) |
| Map an Arrow format string to a TensorDataType. | |
| expected< TensorDataType > | physical_to_tensor_dtype (PhysicalType pt) |
| Map a PhysicalType to a TensorDataType (for column export). | |
| size_t | physical_type_byte_size (PhysicalType pt) |
| Return the byte size for a PhysicalType (primitive types only). | |
Type conversion: TensorDataType <-> DLDataType | |
| DLDataType | to_dlpack_dtype (TensorDataType dtype) |
| Convert a Signet TensorDataType to a DLPack DLDataType. | |
| expected< TensorDataType > | from_dlpack_dtype (DLDataType dl_dtype) |
| Convert a DLPack DLDataType back to a Signet TensorDataType. | |
Type conversion: TensorDataType <-> OnnxTensorType | |
| OnnxTensorType | to_onnx_type (TensorDataType dtype) |
| Convert a Signet TensorDataType to the corresponding OnnxTensorType. | |
| expected< TensorDataType > | from_onnx_type (OnnxTensorType ort_type) |
| Convert an OnnxTensorType back to a Signet TensorDataType. | |
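The enumerator values given on this page for `TensorDataType` and `OnnxTensorType` are enough to sketch what a mapping between the two could look like. The code below is an illustrative reconstruction in a hypothetical `sketch` namespace; it uses `std::optional` as a stand-in for the library's `expected<TensorDataType>`, and which ONNX types the real `from_onnx_type` rejects is an assumption.

```cpp
#include <cstdint>
#include <optional>

namespace sketch {  // hypothetical namespace, not signet::forge

// Values copied from the Enumerations table of this reference.
enum class TensorDataType : std::int32_t {
    FLOAT32 = 0, FLOAT64 = 1, INT32 = 2, INT64 = 3, INT8 = 4,
    UINT8 = 5, INT16 = 6, FLOAT16 = 7, BOOL = 8
};
enum class OnnxTensorType : std::int32_t {
    UNDEFINED = 0, FLOAT = 1, UINT8 = 2, INT8 = 3, UINT16 = 4, INT16 = 5,
    INT32 = 6, INT64 = 7, STRING = 8, BOOL = 9, FLOAT16 = 10, DOUBLE = 11,
    UINT32 = 12, UINT64 = 13, BFLOAT16 = 16
};

// Every TensorDataType has a same-named ONNX counterpart, so this
// direction cannot fail for valid enumerators.
constexpr OnnxTensorType to_onnx_type(TensorDataType dtype) noexcept {
    switch (dtype) {
        case TensorDataType::FLOAT32: return OnnxTensorType::FLOAT;
        case TensorDataType::FLOAT64: return OnnxTensorType::DOUBLE;
        case TensorDataType::INT32:   return OnnxTensorType::INT32;
        case TensorDataType::INT64:   return OnnxTensorType::INT64;
        case TensorDataType::INT8:    return OnnxTensorType::INT8;
        case TensorDataType::UINT8:   return OnnxTensorType::UINT8;
        case TensorDataType::INT16:   return OnnxTensorType::INT16;
        case TensorDataType::FLOAT16: return OnnxTensorType::FLOAT16;
        case TensorDataType::BOOL:    return OnnxTensorType::BOOL;
    }
    return OnnxTensorType::UNDEFINED;  // out-of-range enum value
}

// The reverse direction is partial: ONNX types with no TensorDataType
// equivalent (STRING, UINT16, ...) yield an empty optional in this sketch.
constexpr std::optional<TensorDataType> from_onnx_type(OnnxTensorType t) noexcept {
    switch (t) {
        case OnnxTensorType::FLOAT:   return TensorDataType::FLOAT32;
        case OnnxTensorType::DOUBLE:  return TensorDataType::FLOAT64;
        case OnnxTensorType::INT32:   return TensorDataType::INT32;
        case OnnxTensorType::INT64:   return TensorDataType::INT64;
        case OnnxTensorType::INT8:    return TensorDataType::INT8;
        case OnnxTensorType::UINT8:   return TensorDataType::UINT8;
        case OnnxTensorType::INT16:   return TensorDataType::INT16;
        case OnnxTensorType::FLOAT16: return TensorDataType::FLOAT16;
        case OnnxTensorType::BOOL:    return TensorDataType::BOOL;
        default:                      return std::nullopt;
    }
}

static_assert(to_onnx_type(TensorDataType::FLOAT64) == OnnxTensorType::DOUBLE);
static_assert(!from_onnx_type(OnnxTensorType::STRING).has_value());

}  // namespace sketch
```

The asymmetry explains why the listed `to_onnx_type` returns a plain `OnnxTensorType` while `from_onnx_type` returns an `expected`: only the second conversion can meet an input it has no answer for.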
Zero-copy tensor export for ONNX Runtime | |
| expected< OnnxTensorInfo > | prepare_for_onnx (const TensorView &tensor) |
| Prepare a TensorView for ONNX Runtime consumption (zero-copy). | |
| expected< OnnxTensorInfo > | prepare_for_onnx (const OwnedTensor &tensor) |
| Prepare an OwnedTensor for ONNX Runtime consumption (zero-copy). | |
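Returning to the hash helpers at the top of the function list: `hash_to_hex` and `hex_to_hash` convert between a 32-byte SHA-256 digest and its 64-character lowercase hex form. The sketch below shows one way such a round trip can be written. It is not the library's code; it lives in a hypothetical `sketch` namespace, substitutes `std::optional` for `expected`, and rejects uppercase digits, which is an assumption based on the word "lowercase" in the description.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

namespace sketch {  // hypothetical namespace, not signet::forge

// 32 bytes -> 64 lowercase hex characters, high nibble first.
inline std::string hash_to_hex(const std::array<std::uint8_t, 32>& hash) {
    static constexpr char kDigits[] = "0123456789abcdef";
    std::string out;
    out.reserve(64);
    for (std::uint8_t byte : hash) {
        out.push_back(kDigits[byte >> 4]);
        out.push_back(kDigits[byte & 0x0F]);
    }
    return out;
}

// Returns 0..15 for 0-9 / a-f, or -1 for any other character
// (uppercase is rejected in this sketch).
inline int hex_nibble(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// 64 hex characters -> 32 bytes; empty optional on wrong length or bad digit.
inline std::optional<std::array<std::uint8_t, 32>> hex_to_hash(const std::string& hex) {
    if (hex.size() != 64) return std::nullopt;
    std::array<std::uint8_t, 32> hash{};
    for (std::size_t i = 0; i < 32; ++i) {
        const int hi = hex_nibble(hex[2 * i]);
        const int lo = hex_nibble(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) return std::nullopt;
        hash[i] = static_cast<std::uint8_t>((hi << 4) | lo);
    }
    return hash;
}

// Usage: a digest survives the round trip unchanged.
inline void hex_self_check() {
    std::array<std::uint8_t, 32> h{};
    h[0] = 0xAB;
    const auto back = hex_to_hash(hash_to_hex(h));
    assert(back.has_value() && *back == h);
    (void)back;
}

}  // namespace sketch
```

Validating length and digits before returning matters for audit metadata, where a malformed hash string should surface as an error instead of a silently wrong digest.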
Variables | |
| constexpr size_t | HASH_CHAIN_ENTRY_SIZE = 112 |
| Fixed size of one serialized HashChainEntry, in bytes. | |
| constexpr uint8_t | kEncryptedPageHeaderMagic [4] = {'S', 'P', 'H', '1'} |
| template<typename T > | |
| constexpr PhysicalType | parquet_type_of_v = parquet_type_of<T>::value |
| Convenience variable template: parquet_type_of_v<double> == PhysicalType::DOUBLE. | |
| constexpr int32_t | PARQUET_VERSION = 2 |
| Parquet format version written to the file footer. | |
| constexpr const char * | SIGNET_CREATED_BY = "SignetStack signet-forge version 0.1.0" |
| Default "created_by" string embedded in every Parquet footer. | |
| constexpr uint32_t | PARQUET_MAGIC = 0x31524150 |
| "PAR1" magic bytes (little-endian uint32) — marks a standard Parquet file. | |
| constexpr uint32_t | PARQUET_MAGIC_ENCRYPTED = 0x45524150 |
| "PARE" magic bytes (little-endian uint32) — marks a Parquet file with an encrypted footer. | |
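The two magic constants above are the ASCII strings "PAR1" and "PARE" read as little-endian 32-bit integers ('P' = 0x50 is the lowest byte). The following sketch shows how a four-byte prefix can be classified against them. The constant values come from this page; the `MagicKind` enum and `classify_magic` function are hypothetical names introduced for illustration, and `load_le32` here is a local reimplementation, not the library function of the same name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {  // hypothetical namespace, not signet::forge

constexpr std::uint32_t PARQUET_MAGIC           = 0x31524150;  // "PAR1"
constexpr std::uint32_t PARQUET_MAGIC_ENCRYPTED = 0x45524150;  // "PARE"

// Assemble the value byte by byte so the result does not depend on the
// host's endianness.
constexpr std::uint32_t load_le32(const std::uint8_t* p) noexcept {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

enum class MagicKind { Standard, EncryptedFooter, Unknown };  // hypothetical

// Classify the first four bytes of a buffer.
constexpr MagicKind classify_magic(const std::uint8_t* data, std::size_t size) noexcept {
    if (data == nullptr || size < 4) return MagicKind::Unknown;
    const std::uint32_t magic = load_le32(data);
    if (magic == PARQUET_MAGIC) return MagicKind::Standard;
    if (magic == PARQUET_MAGIC_ENCRYPTED) return MagicKind::EncryptedFooter;
    return MagicKind::Unknown;
}

// Usage: both documented markers are recognised; short input is not.
inline void magic_self_check() {
    const std::uint8_t plain[4] = {'P', 'A', 'R', '1'};
    const std::uint8_t enc[4]   = {'P', 'A', 'R', 'E'};
    assert(classify_magic(plain, 4) == MagicKind::Standard);
    assert(classify_magic(enc, 4) == MagicKind::EncryptedFooter);
    assert(classify_magic(plain, 3) == MagicKind::Unknown);
    (void)plain; (void)enc;
}

}  // namespace sketch
```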
| template<PhysicalType PT> using signet::forge::native_type_of_t = typename native_type_of<PT>::type |
Convenience alias: native_type_of_t<PhysicalType::INT64> == int64_t.
| using signet::forge::SharedColumnBatch = std::shared_ptr<ColumnBatch> |
Thread-safe shared pointer to a ColumnBatch – the unit transferred between producer and consumer threads via EventBus.
Definition at line 400 of file column_batch.hpp.
| using signet::forge::TDT = TensorDataType |
Convenience alias for TensorDataType (shorter schema declarations).
Definition at line 46 of file column_batch.hpp.
| using signet::forge::WalRecord = WalEntry |
| enum class signet::forge::BuySellIndicator : int32_t |
Buy/sell direction for MiFID II RTS 24 Annex I Table 2 Field 6.
| Enumerator | |
|---|---|
| BUY | |
| SELL | |
| SHORT_SELL | Short selling (RTS 24 Annex I Field 16) |
Definition at line 90 of file decision_log.hpp.
|
strong |
Compliance status for a tracked regulatory change.
Definition at line 65 of file regulatory_monitor.hpp.
|
strong |
Which regulatory standard a compliance report satisfies.
Definition at line 43 of file compliance_types.hpp.
|
strong |
Parquet compression codecs.
Snappy is bundled (header-only); ZSTD, LZ4, and Gzip require linking external libraries enabled via CMake options.
|
strong |
Legacy Parquet converted types for backward compatibility with older readers.
Prefer LogicalType for new code. ConvertedType is written to the Thrift footer only when a corresponding LogicalType mapping exists (e.g. STRING → UTF8).
|
strong |
Data confidentiality level per DORA Art.8 + ISO 27001 Annex A.
Definition at line 46 of file data_classification.hpp.
|
strong |
Data sensitivity per GDPR Art.9 special categories.
Definition at line 54 of file data_classification.hpp.
|
strong |
Classification of the AI-driven trading decision.
Covers the full lifecycle of order management decisions that must be logged under MiFID II RTS 24 and EU AI Act Article 12.
Definition at line 48 of file decision_log.hpp.
|
strong |
DLPack data type code, matching DLDataTypeCode from dlpack.h.
| Enumerator | |
|---|---|
| kDLInt | Signed integer. |
| kDLUInt | Unsigned integer. |
| kDLFloat | IEEE floating point. |
| kDLBfloat | Brain floating point (bfloat16) |
Definition at line 50 of file numpy_bridge.hpp.
|
strong |
DLPack device type, matching DLDeviceType from dlpack.h.
Only kDLCPU and kDLCUDAHost are supported for import by NumpyBridge. Other device types are defined for completeness and forward compatibility.
| Enumerator | |
|---|---|
| kDLCPU | System main memory. |
| kDLCUDA | NVIDIA CUDA GPU memory. |
| kDLCUDAHost | CUDA pinned host memory. |
| kDLROCM | AMD ROCm GPU memory. |
| kDLMetal | Apple Metal GPU memory. |
| kDLVulkan | Vulkan GPU memory. |
Definition at line 40 of file numpy_bridge.hpp.
|
strong |
Parquet page encoding types.
Each data page stores values using one of these encodings. The writer selects encoding per-column (or auto-selects based on data characteristics).
|
strong |
Error codes returned by all Signet Forge operations.
Every function in the library that can fail returns an expected<T> whose error payload carries one of these codes together with a human-readable message string. Codes are grouped by subsystem so that callers can pattern-match on categories (I/O, format corruption, unsupported features, licensing) without inspecting the message text.
|
strong |
Escalation hierarchy for incident routing.
Definition at line 64 of file incident_response.hpp.
|
strong |
Reason for system halt — EU AI Act Art.14(4) "stop button".
Definition at line 76 of file human_oversight.hpp.
|
strong |
NIST SP 800-61 incident response lifecycle phases.
Definition at line 46 of file incident_response.hpp.
|
strong |
Incident severity per DORA Art.10(1) classification.
Definition at line 56 of file incident_response.hpp.
|
strong |
Classification of the ML inference operation.
Covers common ML workloads from classical models to LLM generation.
Definition at line 47 of file inference_log.hpp.
|
strong |
Parquet logical types (from parquet.thrift LogicalType union).
Logical types add semantic meaning on top of a PhysicalType. For example, a STRING column is stored as BYTE_ARRAY but interpreted as UTF-8 text.
| Enumerator | |
|---|---|
| NONE | No logical annotation — raw physical type. |
| STRING | UTF-8 string (stored as BYTE_ARRAY). |
| ENUM | Enum string (stored as BYTE_ARRAY). |
| UUID | RFC 4122 UUID (stored as FIXED_LEN_BYTE_ARRAY(16)). |
| DATE | Calendar date — INT32, days since 1970-01-01. |
| TIME_MS | Time of day — INT32, milliseconds since midnight. |
| TIME_US | Time of day — INT64, microseconds since midnight. |
| TIME_NS | Time of day — INT64, nanoseconds since midnight. |
| TIMESTAMP_MS | Timestamp — INT64, milliseconds since Unix epoch. |
| TIMESTAMP_US | Timestamp — INT64, microseconds since Unix epoch. |
| TIMESTAMP_NS | Timestamp — INT64, nanoseconds since Unix epoch. |
| DECIMAL | Fixed-point decimal (INT32/INT64/FIXED_LEN_BYTE_ARRAY). |
| JSON | JSON document (stored as BYTE_ARRAY). |
| BSON | BSON document (stored as BYTE_ARRAY). |
| FLOAT16 | IEEE 754 half-precision float (FIXED_LEN_BYTE_ARRAY(2)). |
| FLOAT32_VECTOR | ML embedding vector — FIXED_LEN_BYTE_ARRAY(dim*4). Signet AI-native extension; stored as standard Parquet types with logical annotation only. |
|
strong |
Mitigation status for a threat.
Definition at line 68 of file threat_model.hpp.
|
strong |
Notification channel for incident communications.
| Enumerator | |
|---|---|
| INTERNAL_LOG | System log only. |
| | Email to responsible parties. |
| PAGER | PagerDuty / on-call alert. |
| REGULATORY | Formal regulatory notification (DORA Art.19(1)) |
Definition at line 72 of file incident_response.hpp.
|
strong |
ONNX tensor element data types, mirroring OrtTensorElementDataType.
Numeric values match the ONNX Runtime C API's OrtTensorElementDataType enum exactly, so they can be cast directly via static_cast<> when constructing OrtValues.
Definition at line 39 of file onnx_bridge.hpp.
|
strong |
Order type classification for MiFID II RTS 24 Annex I Table 2 Field 7.
| Enumerator | |
|---|---|
| MARKET | Market order. |
| LIMIT | Limit order. |
| STOP | Stop order. |
| STOP_LIMIT | Stop-limit order. |
| PEGGED | Pegged order. |
| OTHER | Other order type. |
Definition at line 70 of file decision_log.hpp.
|
strong |
What action the human override took — EU AI Act Art.14(4).
Definition at line 67 of file human_oversight.hpp.
|
strong |
Source of a decision or override — EU AI Act Art.14(4).
Tracks whether an output was produced algorithmically or overridden by a human.
| Enumerator | |
|---|---|
| ALGORITHMIC | Original AI system output (no human intervention) |
| HUMAN | Human operator override. |
| AUTOMATED | Automated safety system override (e.g. risk gate) |
Definition at line 60 of file human_oversight.hpp.
|
strong |
Parquet page types within a column chunk.
|
strong |
Parquet physical (storage) types as defined in parquet.thrift.
Every column in a Parquet file stores values in one of these physical representations. The mapping from C++ types is provided by parquet_type_of.
|
strong |
Identifies the quantization method used for vector compression.
Definition at line 60 of file quantized_vector.hpp.
|
strong |
Type of regulatory change being tracked.
Definition at line 45 of file regulatory_monitor.hpp.
|
strong |
Impact level of a regulatory change on the system.
Definition at line 55 of file regulatory_monitor.hpp.
|
strong |
Regulatory regime(s) applicable to the data.
Definition at line 65 of file data_classification.hpp.
|
strong |
Parquet field repetition types (nullability / cardinality).
| Enumerator | |
|---|---|
| REQUIRED | Exactly one value per row (non-nullable). |
| OPTIONAL | Zero or one value per row (nullable). |
| REPEATED | Zero or more values per row (list). |
|
strong |
Output serialization format for compliance reports.
| Enumerator | |
|---|---|
| JSON | Pretty-printed JSON object (default) |
| NDJSON | Newline-delimited JSON — one record per line (streaming-friendly) |
| CSV | Comma-separated values with header row. |
Definition at line 25 of file compliance_types.hpp.
|
strong |
Outcome of the pre-trade risk gate evaluation.
Records whether the order passed, was rejected, modified, or throttled by the risk management system.
| Enumerator | |
|---|---|
| PASSED | All risk checks passed. |
| REJECTED | Order rejected by risk gate. |
| MODIFIED | Order modified by risk gate (e.g., size reduced) |
| THROTTLED | Order delayed by rate limiting. |
Definition at line 62 of file decision_log.hpp.
|
strong |
Microsoft STRIDE threat categories.
Definition at line 50 of file threat_model.hpp.
|
strong |
Element data type for tensor storage, mapping to ONNX/PyTorch/TF type enums.
Definition at line 148 of file tensor_bridge.hpp.
|
strong |
Threat severity classification per NIST SP 800-30.
| Enumerator | |
|---|---|
| LOW | DREAD composite < 4.0. |
| MEDIUM | DREAD composite 4.0 - 6.9. |
| HIGH | DREAD composite 7.0 - 8.9. |
| CRITICAL | DREAD composite >= 9.0. |
Definition at line 60 of file threat_model.hpp.
|
strong |
Time-in-force classification for MiFID II RTS 24 Annex I Table 2 Field 8.
| Enumerator | |
|---|---|
| DAY | Day order (valid until end of trading day) |
| GTC | Good-Till-Cancelled. |
| IOC | Immediate-Or-Cancel. |
| FOK | Fill-Or-Kill. |
| GTD | Good-Till-Date. |
| OTHER | Other. |
Definition at line 80 of file decision_log.hpp.
|
strong |
Timestamp granularity for MiFID II RTS 24 Art.2(2) compliance.
Controls the sub-second precision emitted in ISO 8601 timestamp fields. RTS 24 requires nanosecond precision for high-frequency trading; lower granularities may be appropriate for non-HFT reporting regimes.
| Enumerator | |
|---|---|
| NANOS | 9 sub-second digits (default, MiFID II HFT compliant) |
| MICROS | 6 sub-second digits |
| MILLIS | 3 sub-second digits |
Definition at line 36 of file compliance_types.hpp.
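The effect of the three granularities on the sub-second field can be sketched as follows. This is an illustration only: `Granularity` and `subsecond_digits` are hypothetical names, not part of the library, and truncation (rather than rounding) at lower granularities is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the documented enumerators (not the library's type).
enum class Granularity { NANOS, MICROS, MILLIS };

// Render the sub-second part of a non-negative nanosecond timestamp
// with 9, 6, or 3 digits. Lower granularities truncate (assumption).
inline std::string subsecond_digits(std::int64_t ns_since_epoch, Granularity g) {
    const long long frac_ns = static_cast<long long>(ns_since_epoch % 1000000000LL);
    char buf[16] = {};
    switch (g) {
        case Granularity::NANOS:  std::snprintf(buf, sizeof buf, "%09lld", frac_ns); break;
        case Granularity::MICROS: std::snprintf(buf, sizeof buf, "%06lld", frac_ns / 1000); break;
        case Granularity::MILLIS: std::snprintf(buf, sizeof buf, "%03lld", frac_ns / 1000000); break;
    }
    return std::string(buf);
}

// Usage: the same instant rendered at each granularity.
inline void demo_subsecond_digits() {
    const std::int64_t t = 1700000000123456789LL;
    assert(subsecond_digits(t, Granularity::NANOS) == "123456789");
    assert(subsecond_digits(t, Granularity::MICROS) == "123456");
    assert(subsecond_digits(t, Granularity::MILLIS) == "123");
}
```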
|
strong |
Specifies the numerical precision of each element within a vector column.
| Enumerator | |
|---|---|
| FLOAT32 | IEEE 754 single-precision (4 bytes per element) |
| FLOAT64 | IEEE 754 double-precision (8 bytes per element) |
| FLOAT16 | IEEE 754 half-precision (2 bytes per element) |
Definition at line 47 of file vector_type.hpp.
|
inline |
Add a vector column to a SchemaBuilder.
Creates a FIXED_LEN_BYTE_ARRAY column with FLOAT32_VECTOR logical type, type_length set to dimension * element_size.
Usage: auto schema = add_vector_column( Schema::builder("embeddings") .column<int64_t>("id") .column<std::string>("text"), "embedding", 768) .build();
| builder | The SchemaBuilder to add the column to. |
| name | Column name. |
| dimension | Number of elements per vector (e.g. 768). |
| elem | Element type (default FLOAT32). |
Definition at line 788 of file vector_type.hpp.
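The type_length rule above (dimension * element_size) can be checked at compile time. A minimal sketch, assuming the element sizes listed for the vector element type on this page; `ElementType`, `element_size`, and `vector_type_length` are hypothetical names used for illustration.

```cpp
#include <cstddef>

// Hypothetical mirror of the documented element types.
enum class ElementType { FLOAT32, FLOAT64, FLOAT16 };

// Bytes per element, as stated in the enumerator descriptions.
constexpr std::size_t element_size(ElementType e) {
    switch (e) {
        case ElementType::FLOAT32: return 4;
        case ElementType::FLOAT64: return 8;
        case ElementType::FLOAT16: return 2;
    }
    return 0;  // not reached for valid enumerators
}

// FIXED_LEN_BYTE_ARRAY length for one vector value.
constexpr std::size_t vector_type_length(std::size_t dimension, ElementType e) {
    return dimension * element_size(e);
}

static_assert(vector_type_length(768, ElementType::FLOAT32) == 3072,
              "768 float32 elements occupy 3072 bytes");
```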
|
inline |
Append a uint32_t in little-endian byte order to a byte buffer.
| buf | The destination byte buffer. |
| val | The 32-bit value to append (4 bytes). |
Definition at line 32 of file column_writer.hpp.
|
inline |
Append a uint64_t in little-endian byte order to a byte buffer.
| buf | The destination byte buffer. |
| val | The 64-bit value to append (8 bytes). |
Definition at line 42 of file column_writer.hpp.
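A shift-based sketch of little-endian appends, independent of host byte order. The function names are hypothetical and the bodies are an assumption about one reasonable implementation, not the code in column_writer.hpp. The demo uses the PARQUET_MAGIC value listed on this page.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append the 4 bytes of val, least significant byte first.
inline void append_le32_sketch(std::vector<std::uint8_t>& buf, std::uint32_t val) {
    for (int shift = 0; shift < 32; shift += 8)
        buf.push_back(static_cast<std::uint8_t>(val >> shift));
}

// Append the 8 bytes of val, least significant byte first.
inline void append_le64_sketch(std::vector<std::uint8_t>& buf, std::uint64_t val) {
    for (int shift = 0; shift < 64; shift += 8)
        buf.push_back(static_cast<std::uint8_t>(val >> shift));
}

// Usage: 0x31524150 serializes to the ASCII bytes "PAR1".
inline void demo_append_le() {
    std::vector<std::uint8_t> buf;
    append_le32_sketch(buf, 0x31524150u);
    assert(buf.size() == 4);
    assert(buf[0] == 'P' && buf[1] == 'A' && buf[2] == 'R' && buf[3] == '1');
}
```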
|
inline |
Map an Arrow format string to a TensorDataType.
Supports the standard single-character format codes for primitive numeric types and booleans. Multi-character format codes (e.g. "tss:" for timestamp) are not supported and return UNSUPPORTED_TYPE.
| format | Arrow format string (must not be null or empty). |
Definition at line 218 of file arrow_bridge.hpp.
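The single-character rule can be sketched with a plain switch. This is not the library's function: `ElemType` and `parse_arrow_format` are hypothetical, the enum covers every primitive code of the Arrow C Data Interface rather than only the types Signet supports, and `std::optional` stands in for the library's expected<T> error channel.

```cpp
#include <cassert>
#include <optional>

// Hypothetical element-type enum covering the Arrow primitive format codes.
enum class ElemType { BOOL, INT8, UINT8, INT16, UINT16, INT32, UINT32,
                      INT64, UINT64, FLOAT16, FLOAT32, FLOAT64 };

// Accept exactly one character; reject null, empty, and multi-character formats.
inline std::optional<ElemType> parse_arrow_format(const char* format) {
    if (format == nullptr || format[0] == '\0' || format[1] != '\0')
        return std::nullopt;
    switch (format[0]) {
        case 'b': return ElemType::BOOL;
        case 'c': return ElemType::INT8;
        case 'C': return ElemType::UINT8;
        case 's': return ElemType::INT16;
        case 'S': return ElemType::UINT16;
        case 'i': return ElemType::INT32;
        case 'I': return ElemType::UINT32;
        case 'l': return ElemType::INT64;
        case 'L': return ElemType::UINT64;
        case 'e': return ElemType::FLOAT16;
        case 'f': return ElemType::FLOAT32;
        case 'g': return ElemType::FLOAT64;
        default:  return std::nullopt;
    }
}

// Usage: "f" is float32; the timestamp format "tss:" is rejected.
inline void demo_parse_arrow_format() {
    assert(parse_arrow_format("f").value() == ElemType::FLOAT32);
    assert(!parse_arrow_format("tss:").has_value());
}
```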
|
inline |
Automatically select the best available compression codec.
Priority order: ZSTD > Snappy > LZ4_RAW > LZ4 > UNCOMPRESSED.
| sample_data | Reserved for future data-aware heuristics (e.g. entropy estimation). Currently unused. |
| sample_size | Reserved for future use. Currently unused. |
|
inline |
Pack exactly 8 values at the given bit width into a byte buffer.
Each value occupies bit_width bits, packed LSB-first in little-endian byte order. Appends exactly bit_width bytes to out (since 8 values * bit_width bits = bit_width bytes). If bit_width is 0, no bytes are emitted (all values are implicitly zero).
| out | Output byte buffer to append packed bytes to. |
| values | Pointer to exactly 8 unsigned values to pack. |
| bit_width | Bits per value (0–64). |
|
inline |
Unpack exactly 8 values at the given bit width from a byte buffer.
Reverses the packing performed by bit_pack_8(). Reads bit_width bytes from src and unpacks 8 values of bit_width bits each, stored LSB-first. If bit_width is 0, all output values are set to zero.
| src | Pointer to at least bit_width bytes of packed data. |
| values | Output array of exactly 8 unsigned values. |
| bit_width | Bits per value (0–64). |
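The LSB-first layout described for packing and unpacking can be sketched one bit at a time. The names `pack8_sketch` and `unpack8_sketch` are hypothetical, and the bit-by-bit loop is chosen for clarity, not speed; the library's routines may be implemented differently. Ignoring bit widths outside 1 to 64 is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack 8 values of bit_width bits each, LSB-first; appends bit_width bytes.
inline void pack8_sketch(std::vector<std::uint8_t>& out,
                         const std::uint64_t* values, int bit_width) {
    if (bit_width <= 0 || bit_width > 64) return;  // width 0 emits nothing
    const std::size_t start = out.size();
    out.resize(start + static_cast<std::size_t>(bit_width), 0);
    std::size_t bit_pos = 0;
    for (int i = 0; i < 8; ++i)
        for (int b = 0; b < bit_width; ++b, ++bit_pos)
            if ((values[i] >> b) & 1u)
                out[start + bit_pos / 8] |= static_cast<std::uint8_t>(1u << (bit_pos % 8));
}

// Reverse of pack8_sketch: reads bit_width bytes from src into 8 values.
inline void unpack8_sketch(const std::uint8_t* src, std::uint64_t* values, int bit_width) {
    for (int i = 0; i < 8; ++i) values[i] = 0;
    if (bit_width <= 0 || bit_width > 64) return;  // width 0 yields all zeros
    std::size_t bit_pos = 0;
    for (int i = 0; i < 8; ++i)
        for (int b = 0; b < bit_width; ++b, ++bit_pos)
            if ((src[bit_pos / 8] >> (bit_pos % 8)) & 1u)
                values[i] |= (std::uint64_t{1} << b);
}

// Usage: values 0..7 at width 3 occupy exactly 3 bytes.
inline void demo_pack8() {
    const std::uint64_t in[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    std::vector<std::uint8_t> buf;
    pack8_sketch(buf, in, 3);
    assert(buf.size() == 3);
    assert(buf[0] == 0x88 && buf[1] == 0xC6 && buf[2] == 0xFA);
}
```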
|
inline |
Build an AuditMetadata from a populated AuditChainWriter.
Extracts the chain summary from a writer that has accumulated entries. The chain_id must be provided by the caller (use generate_chain_id() to create one, or reuse an existing ID for chain continuation).
| writer | The chain writer to extract metadata from. |
| chain_id | Unique chain identifier for this audit trail. |
Definition at line 967 of file audit_chain.hpp.
|
inline |
Compress data using the specified codec via the global CodecRegistry.
For Compression::UNCOMPRESSED, returns a verbatim copy of the input without consulting the registry.
| codec | The compression type to use. |
| data | Pointer to the raw input bytes. |
| size | Number of bytes to compress. |
|
inline |
Build the Parquet schema for AI decision log files.
Columns:
| Column | Physical type (logical type) | Description |
|---|---|---|
| timestamp_ns | INT64 (TIMESTAMP_NS) | Decision timestamp |
| strategy_id | INT32 | Strategy identifier |
| model_version | BYTE_ARRAY (STRING) | Model version hash |
| decision_type | INT32 | DecisionType enum value |
| input_features | BYTE_ARRAY (STRING) | JSON array of floats |
| model_output | DOUBLE | Primary model output |
| confidence | DOUBLE | Model confidence [0,1] |
| risk_result | INT32 | RiskGateResult enum value |
| order_id | BYTE_ARRAY (STRING) | Associated order ID |
| symbol | BYTE_ARRAY (STRING) | Trading symbol |
| price | DOUBLE | Decision price |
| quantity | DOUBLE | Decision quantity |
| venue | BYTE_ARRAY (STRING) | Execution venue |
| notes | BYTE_ARRAY (STRING) | Free-text notes |
| chain_seq | INT64 | Hash chain sequence number |
| chain_hash | BYTE_ARRAY (STRING) | Hex entry hash |
| prev_hash | BYTE_ARRAY (STRING) | Hex previous hash |
Definition at line 511 of file decision_log.hpp.
|
inline |
Decode an unsigned varint (LEB128) from a byte buffer.
Reads a variable-length encoded unsigned integer starting at data[pos]. On success, pos is advanced past the consumed bytes. Returns 0 with pos unchanged if the buffer is exhausted before a terminating byte is found. Includes overflow protection: decoding stops if the shift exceeds 63 bits.
| data | Pointer to the encoded byte stream. |
| pos | Current read position (updated on return). |
| size | Total size of the byte stream. |
|
inline |
Decompress data using the specified codec via the global CodecRegistry.
For Compression::UNCOMPRESSED, returns a verbatim copy of the input without consulting the registry.
| codec | The compression type that was used to compress the data. |
| data | Pointer to the compressed input bytes. |
| size | Number of compressed bytes. |
| uncompressed_size | Expected size of the decompressed output (from the Parquet page header). |
|
inline |
Deserialize and verify a chain from serialized bytes in one call.
Combines deserialization with AuditChainVerifier::verify(), returning HASH_CHAIN_BROKEN when verification fails.
| chain_data | Pointer to the serialized chain bytes. |
| chain_size | Size of the serialized chain in bytes. |
Definition at line 998 of file audit_chain.hpp.
|
inline |
Encode an unsigned varint (LEB128) into a byte buffer.
Appends the variable-length encoding of value to buf. Each output byte uses 7 data bits and 1 continuation bit (MSB), following the unsigned LEB128 convention used by the Parquet wire format.
| buf | Output byte buffer to append to. |
| value | The unsigned integer to encode. |
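The unsigned LEB128 scheme, with 7 data bits per byte and the MSB as continuation flag, can be sketched as a round-trip pair. The function names are hypothetical. Returning 0 with the position unchanged on truncated input follows the decode description on this page; doing the same when the 63-bit shift limit is exceeded is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append value as unsigned LEB128: low 7 bits first, MSB set on all but the last byte.
inline void encode_varint_sketch(std::vector<std::uint8_t>& buf, std::uint64_t value) {
    while (value >= 0x80) {
        buf.push_back(static_cast<std::uint8_t>(value | 0x80));
        value >>= 7;
    }
    buf.push_back(static_cast<std::uint8_t>(value));
}

// Decode starting at data[pos]; on success pos moves past the consumed bytes.
// Returns 0 and leaves pos untouched if no terminating byte is found in range.
inline std::uint64_t decode_varint_sketch(const std::uint8_t* data,
                                          std::size_t& pos, std::size_t size) {
    std::uint64_t result = 0;
    std::size_t p = pos;
    for (int shift = 0; shift <= 63 && p < size; shift += 7) {
        const std::uint8_t byte = data[p++];
        result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            pos = p;
            return result;
        }
    }
    return 0;
}

// Usage: 300 encodes to the two bytes 0xAC 0x02 and decodes back.
inline void demo_varint() {
    std::vector<std::uint8_t> buf;
    encode_varint_sketch(buf, 300);
    assert(buf.size() == 2 && buf[0] == 0xAC && buf[1] == 0x02);
    std::size_t pos = 0;
    assert(decode_varint_sketch(buf.data(), pos, buf.size()) == 300);
    assert(pos == 2);
}
```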
|
inlinenoexcept |
Convert a 16-bit IEEE 754 half-precision value to a 32-bit float.
Handles normals, subnormals, infinities, NaNs, and signed zero.
Definition at line 59 of file vector_type.hpp.
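The cases listed above (normals, subnormals, infinities, NaNs, signed zero) map onto a short bit-manipulation routine. The sketch below is one common way to write the conversion and is not taken from vector_type.hpp; `f16_to_f32_sketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Widen IEEE 754 binary16 bits to a binary32 float.
inline float f16_to_f32_sketch(std::uint16_t h) {
    const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    const std::uint32_t exp  = (h >> 10) & 0x1Fu;
    std::uint32_t mant = h & 0x3FFu;
    std::uint32_t bits = 0;
    if (exp == 0) {
        if (mant == 0) {
            bits = sign;  // signed zero
        } else {
            // Subnormal: shift the mantissa up until its implicit bit appears.
            int e = -1;
            do { ++e; mant <<= 1; } while ((mant & 0x400u) == 0);
            mant &= 0x3FFu;
            bits = sign | (static_cast<std::uint32_t>(127 - 15 - e) << 23) | (mant << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (mant << 13);  // infinity or NaN
    } else {
        bits = sign | ((exp + (127 - 15)) << 23) | (mant << 13);  // normal
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Usage: a few exactly representable values.
inline void demo_f16_to_f32() {
    assert(f16_to_f32_sketch(0x3C00) == 1.0f);
    assert(f16_to_f32_sketch(0xC000) == -2.0f);
    assert(f16_to_f32_sketch(0x7BFF) == 65504.0f);  // largest finite half
}
```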
|
inlinenoexcept |
Convert a 32-bit float to a 16-bit IEEE 754 half-precision value.
Rounds to nearest even. Handles overflow to infinity and subnormals.
Definition at line 100 of file vector_type.hpp.
|
inline |
Convert a DLPack DLDataType back to a Signet TensorDataType.
Returns an error for types that have no Signet equivalent (e.g. uint16, uint32, uint64, bfloat16, or multi-lane SIMD types).
| dl_dtype | The DLPack data type descriptor. |
Definition at line 191 of file numpy_bridge.hpp.
|
inline |
Reconstruct an arithmetic value from its little-endian byte representation.
If bytes contains fewer than sizeof(T) bytes the result is value-initialized (zero).
| T | An arithmetic type. |
| bytes | At least sizeof(T) bytes in little-endian order. |
Definition at line 66 of file statistics.hpp.
|
inline |
Convert an OnnxTensorType back to a Signet TensorDataType.
| ort_type | The ONNX tensor element type. |
Definition at line 90 of file onnx_bridge.hpp.
|
inline |
Generate a simple chain identifier based on the current timestamp.
Format: "chain-<hex_timestamp_ns>" (e.g. "chain-1a2b3c4d5e6f7890"). This is NOT cryptographically random – it is a human-readable identifier for correlating chain segments across files.
Definition at line 203 of file audit_chain.hpp.
|
inlinenoexcept |
Definition at line 122 of file reader.hpp.
|
inline |
Convert a 32-byte SHA-256 hash to a lowercase hexadecimal string (64 chars).
Definition at line 150 of file audit_chain.hpp.
|
inline |
Convert a 64-character lowercase hex string back to a 32-byte hash.
Returns an error if the string is not exactly 64 hex characters.
Definition at line 164 of file audit_chain.hpp.
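A round-trip sketch of the two hex conversions. The names are hypothetical, `std::optional` stands in for the library's expected<T> error return, and rejecting uppercase digits is an assumption based on the "lowercase hex string" wording.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

using Hash32 = std::array<std::uint8_t, 32>;

// 32 bytes -> 64 lowercase hex characters.
inline std::string hash_to_hex_sketch(const Hash32& h) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(64);
    for (std::uint8_t b : h) {
        out.push_back(digits[b >> 4]);
        out.push_back(digits[b & 0x0F]);
    }
    return out;
}

// Value of one lowercase hex digit, or -1 if the character is not one.
inline int hex_nibble(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// 64 lowercase hex characters -> 32 bytes; empty optional on any malformed input.
inline std::optional<Hash32> hex_to_hash_sketch(const std::string& hex) {
    if (hex.size() != 64) return std::nullopt;
    Hash32 out{};
    for (std::size_t i = 0; i < 32; ++i) {
        const int hi = hex_nibble(hex[2 * i]);
        const int lo = hex_nibble(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) return std::nullopt;
        out[i] = static_cast<std::uint8_t>((hi << 4) | lo);
    }
    return out;
}

// Usage: encode, then decode back to the same bytes.
inline void demo_hash_hex() {
    Hash32 h{};
    h[0] = 0xAB;
    const std::string hex = hash_to_hex_sketch(h);
    assert(hex.size() == 64 && hex.substr(0, 2) == "ab");
    assert(hex_to_hash_sketch(hex).value() == h);
}
```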
|
inline |
Build the Parquet schema for human override log files.
Columns mirror HumanOverrideRecord fields plus hash chain + row lineage.
Definition at line 259 of file human_oversight.hpp.
|
inline |
Build the Parquet schema for ML inference log files.
Columns:
| Column | Physical type (logical type) | Description |
|---|---|---|
| timestamp_ns | INT64 (TIMESTAMP_NS) | Inference timestamp |
| model_id | BYTE_ARRAY (STRING) | Model identifier |
| model_version | BYTE_ARRAY (STRING) | Model version hash |
| inference_type | INT32 | InferenceType enum value |
| input_embedding | BYTE_ARRAY (STRING) | JSON array of floats |
| input_hash | BYTE_ARRAY (STRING) | SHA-256 of raw input |
| output_hash | BYTE_ARRAY (STRING) | SHA-256 of raw output |
| output_score | DOUBLE | Primary output score |
| latency_ns | INT64 | Inference latency (ns) |
| batch_size | INT32 | Batch size |
| input_tokens | INT32 | Input token count |
| output_tokens | INT32 | Output token count |
| user_id_hash | BYTE_ARRAY (STRING) | Hashed user ID |
| session_id | BYTE_ARRAY (STRING) | Session identifier |
| metadata_json | BYTE_ARRAY (STRING) | Additional JSON metadata |
| chain_seq | INT64 | Hash chain sequence number |
| chain_hash | BYTE_ARRAY (STRING) | Hex entry hash |
| prev_hash | BYTE_ARRAY (STRING) | Hex previous hash |
Definition at line 441 of file inference_log.hpp.
|
inlinenoexcept |
Definition at line 129 of file reader.hpp.
|
inline |
Convenience factory: create a shared batch with a given schema.
Definition at line 403 of file column_batch.hpp.
|
inline |
Return the current time as nanoseconds since the Unix epoch (UTC).
Gap R-5 (MiFID II RTS 25 Art.2-3): Uses system_clock for UTC traceability. The previous implementation used steady_clock, which has no relationship to UTC (arbitrary epoch, typically boot time) — every timestamp it produced was regulatory-invalid under RTS 25.
system_clock::now() returns UTC wall-clock time suitable for regulatory timestamp fields. On POSIX systems this is clock_gettime(CLOCK_REALTIME).
Guarantees monotonically increasing timestamps across concurrent callers (MiFID II RTS 24 Art.2 timestamp ordering, CWE-362 race guard). If system_clock returns a value <= the last observed (e.g. due to NTP step adjustment), the result is bumped to last_ns + 1 to preserve hash chain ordering invariants.
Definition at line 110 of file audit_chain.hpp.
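The "bump to last_ns + 1" guarantee can be sketched with an atomic compare-and-swap loop. This is an illustration under assumptions, not the code in audit_chain.hpp: `monotonic_utc_now_ns` is a hypothetical name, and the sketch relies on system_clock counting from the Unix epoch (guaranteed since C++20, and the usual behaviour before that).

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Wall-clock nanoseconds since the Unix epoch, strictly increasing across callers.
inline std::int64_t monotonic_utc_now_ns() {
    static std::atomic<std::int64_t> last_ns{0};
    const auto since_epoch = std::chrono::system_clock::now().time_since_epoch();
    const std::int64_t now =
        std::chrono::duration_cast<std::chrono::nanoseconds>(since_epoch).count();
    std::int64_t prev = last_ns.load(std::memory_order_relaxed);
    for (;;) {
        // If the clock did not advance (or stepped back), bump past the last value.
        const std::int64_t next = (now > prev) ? now : prev + 1;
        // On failure compare_exchange_weak reloads prev, so the loop retries
        // against the value another thread just published.
        if (last_ns.compare_exchange_weak(prev, next, std::memory_order_relaxed))
            return next;
    }
}

// Usage: back-to-back calls never return equal or decreasing values.
inline void demo_monotonic_now() {
    const std::int64_t a = monotonic_utc_now_ns();
    const std::int64_t b = monotonic_utc_now_ns();
    assert(b > a);
}
```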
|
inline |
Return a human-readable string for an OnnxTensorType value.
Useful for diagnostics, logging, and error messages. Returns "UNKNOWN" for values not in the OnnxTensorType enumeration.
| t | The ONNX tensor type. |
Definition at line 298 of file onnx_bridge.hpp.
|
inline |
Map a Parquet PhysicalType to an Arrow format string.
Arrow format strings for primitive types: "b" = bool, "c" = int8, "C" = uint8, "s" = int16, "S" = uint16, "i" = int32, "I" = uint32, "l" = int64, "L" = uint64, "f" = float32, "g" = float64, "e" = float16
| pt | The Parquet physical type to convert. |
Definition at line 176 of file arrow_bridge.hpp.
|
inline |
Map a PhysicalType to a TensorDataType (for column export).
| pt | The Parquet physical type. |
Definition at line 252 of file arrow_bridge.hpp.
|
inline |
Return the byte size for a PhysicalType (primitive types only).
| pt | The Parquet physical type. |
Definition at line 272 of file arrow_bridge.hpp.
|
inline |
Prepare an OwnedTensor for ONNX Runtime consumption (zero-copy).
Delegates to the TensorView overload via the OwnedTensor's view(). The OwnedTensor must remain valid for the lifetime of the returned info.
| tensor | The OwnedTensor to export (must be valid and contiguous). |
Definition at line 222 of file onnx_bridge.hpp.
|
inline |
Prepare a TensorView for ONNX Runtime consumption (zero-copy).
For all supported numeric types (FLOAT32, FLOAT64, INT32, INT64, INT8, UINT8, INT16, FLOAT16, BOOL), this is zero-copy: the returned OnnxTensorInfo.data points directly into the TensorView's memory.
The TensorView must remain valid for the lifetime of the returned info (is_owner will be false). The tensor must be contiguous; non-contiguous tensors are rejected – call clone() first to produce a contiguous copy.
| tensor | The TensorView to export (must be valid and contiguous). |
Definition at line 176 of file onnx_bridge.hpp.
|
inline |
Prepare a batch of named TensorViews for ONNX Runtime inference.
Each pair is (input_name, tensor_view). All tensors must be valid and contiguous. If any tensor fails preparation, the entire call fails with an error message identifying the failing input by name.
| inputs | Non-empty vector of (name, TensorView) pairs. Names should match the model's input node names. |
Definition at line 264 of file onnx_bridge.hpp.
|
inline |
Register the bundled Snappy codec with the global CodecRegistry.
Call this once at startup (e.g. from a top-level initializer or a codec_init function) to make Compression::SNAPPY available through compress() and decompress().
Definition at line 608 of file snappy.hpp.
|
inlinenoexcept |
Returns a human-readable name for a TensorDataType.
Definition at line 181 of file tensor_bridge.hpp.
|
inline |
Map a TensorDataType to an Arrow format string.
| dtype | The tensor data type to convert. |
Definition at line 193 of file arrow_bridge.hpp.
|
inlineconstexprnoexcept |
Returns the byte size of a single element of the given tensor data type.
Definition at line 165 of file tensor_bridge.hpp.
|
inline |
Create a BufferInfo from a TensorView for Python buffer protocol export.
The tensor must be valid and contiguous. The returned BufferInfo's data pointer points directly into the TensorView's memory (zero-copy). The TensorView must remain valid for the lifetime of the BufferInfo.
Strides are computed as C-contiguous byte strides (innermost dimension has stride = itemsize, outer dimensions are products of inner shapes).
| tensor | A valid, contiguous TensorView. |
Definition at line 720 of file numpy_bridge.hpp.
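The C-contiguous stride rule can be sketched by walking the shape from the innermost dimension outwards. `c_contiguous_strides` is a hypothetical helper, not a library function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Byte strides for a C-contiguous (row-major) layout:
// innermost stride = itemsize, each outer stride = inner stride * inner extent.
inline std::vector<std::int64_t> c_contiguous_strides(
        const std::vector<std::int64_t>& shape, std::int64_t itemsize) {
    std::vector<std::int64_t> strides(shape.size());
    std::int64_t stride = itemsize;
    for (std::size_t i = shape.size(); i-- > 0;) {
        strides[i] = stride;
        stride *= shape[i];
    }
    return strides;
}

// Usage: a 2 x 3 x 4 float32 tensor has byte strides {48, 16, 4}.
inline void demo_strides() {
    const std::vector<std::int64_t> s = c_contiguous_strides({2, 3, 4}, 4);
    assert((s == std::vector<std::int64_t>{48, 16, 4}));
}
```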
|
inline |
Convert a Signet TensorDataType to a DLPack DLDataType.
This function is total – all TensorDataType values have a DLPack mapping. Lanes is always set to 1 (scalar).
Mapping:
| TensorDataType | DLDataTypeCode | bits |
|---|---|---|
| FLOAT32 | kDLFloat | 32 |
| FLOAT64 | kDLFloat | 64 |
| FLOAT16 | kDLFloat | 16 |
| INT32 | kDLInt | 32 |
| INT64 | kDLInt | 64 |
| INT16 | kDLInt | 16 |
| INT8 | kDLInt | 8 |
| UINT8 | kDLUInt | 8 |
| BOOL | kDLUInt | 8 |
| dtype | The Signet tensor data type to convert. |
Definition at line 135 of file numpy_bridge.hpp.
|
inline |
Overload for std::string – returns raw bytes (no endian conversion needed).
| value | The string whose bytes are copied verbatim. |
Definition at line 53 of file statistics.hpp.
|
inline |
Convert an arithmetic value to its little-endian byte representation.
On little-endian platforms (x86, ARM) this is a straight memcpy. On big-endian platforms the bytes are reversed.
| T | An arithmetic type (int32_t, double, etc.). |
| value | The value to convert. |
Returns sizeof(T) bytes in little-endian order.
Definition at line 34 of file statistics.hpp.
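The memcpy-or-reverse behaviour, together with the matching decode direction, can be sketched as below. The names are hypothetical, the run-time endianness probe is an assumption (the library may detect byte order differently), and the short-input case returns a value-initialized result as described for the decode function on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// True when the host stores the least significant byte first.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    std::uint8_t first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Arithmetic value -> sizeof(T) bytes, least significant byte first.
template <typename T>
std::vector<std::uint8_t> to_le_bytes_sketch(T value) {
    static_assert(std::is_arithmetic<T>::value, "arithmetic types only");
    std::vector<std::uint8_t> bytes(sizeof(T));
    std::memcpy(bytes.data(), &value, sizeof(T));
    if (!host_is_little_endian()) std::reverse(bytes.begin(), bytes.end());
    return bytes;
}

// Little-endian bytes -> arithmetic value; zero if too few bytes are given.
template <typename T>
T from_le_bytes_sketch(std::vector<std::uint8_t> bytes) {
    static_assert(std::is_arithmetic<T>::value, "arithmetic types only");
    if (bytes.size() < sizeof(T)) return T{};
    bytes.resize(sizeof(T));
    if (!host_is_little_endian()) std::reverse(bytes.begin(), bytes.end());
    T value{};
    std::memcpy(&value, bytes.data(), sizeof(T));
    return value;
}

// Usage: a double survives the round trip unchanged.
inline void demo_le_bytes() {
    const std::vector<std::uint8_t> b = to_le_bytes_sketch(3.5);
    assert(b.size() == sizeof(double));
    assert(from_le_bytes_sketch<double>(b) == 3.5);
}
```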
|
inline |
Convert a Signet TensorDataType to the corresponding OnnxTensorType.
All TensorDataType values have a direct ONNX mapping; this function is total (never returns UNDEFINED for valid inputs).
| dtype | The Signet tensor data type. |
Definition at line 68 of file onnx_bridge.hpp.
|
inline |
Definition at line 256 of file mmap_reader.hpp.
|
inline |
Definition at line 110 of file reader.hpp.
|
inlineconstexpr |
Size of a single serialized HashChainEntry in bytes.
Definition at line 89 of file audit_chain.hpp.
|
inlineconstexpr |
Definition at line 120 of file reader.hpp.
|
inlineconstexpr |
Convenience variable template: parquet_type_of_v<double> == PhysicalType::DOUBLE.
|
inlineconstexpr |