Signet Forge 0.1.0
C++20 Parquet library with AI-native extensions
split_block.hpp File Reference

Split Block Bloom Filter as specified by the Apache Parquet format.

#include "signet/bloom/xxhash.hpp"
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>


Classes

class  signet::forge::SplitBlockBloomFilter
 Parquet-spec Split Block Bloom Filter for probabilistic set membership.
 

Namespaces

namespace  signet
 
namespace  signet::forge
 

Detailed Description

Split Block Bloom Filter as specified by the Apache Parquet format.

Implements the bloom filter described in the Apache Parquet specification: https://github.com/apache/parquet-format/blob/master/BloomFilter.md

Key properties:

  • Filter is divided into 32-byte blocks (256 bits, 8 x uint32_t words).
  • Each insertion sets 8 bits (one per word) within a single block.
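  • Block selection and bit positions are derived deterministically from the 64-bit hash (in the Parquet specification, the upper 32 bits select the block and the lower 32 bits determine the bit positions).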
  • Uses xxHash64 for hashing typed values.
  • Total size is always a positive multiple of 32 bytes.

See also
signet::forge::xxhash for the hash function used by convenience methods.
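
The key properties listed above can be illustrated with a minimal sketch of the algorithm from the Parquet BloomFilter.md specification. This is not the signet::forge::SplitBlockBloomFilter implementation or its API: the class name SketchBloomFilter and the methods insert_hash, might_contain_hash, and size_bytes are hypothetical names chosen for illustration. The sketch assumes the caller already has a 64-bit hash (for example from xxHash64) and that the filter holds fewer than 2^32 blocks. The eight salt constants are the ones given in the Parquet specification.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration type; NOT the signet::forge::SplitBlockBloomFilter API.
class SketchBloomFilter {
    using Block = std::array<std::uint32_t, 8>;  // one 32-byte block: 8 x 32-bit words

public:
    static constexpr std::size_t kBytesPerBlock = 32;

    // Every block starts zeroed; the filter must hold at least one block.
    explicit SketchBloomFilter(std::size_t num_blocks) : blocks_(num_blocks) {
        assert(num_blocks > 0 && "filter needs at least one 32-byte block");
    }

    // Total size is always a positive multiple of 32 bytes.
    std::size_t size_bytes() const { return blocks_.size() * kBytesPerBlock; }

    // Sets one bit in each of the 8 words of a single block.
    void insert_hash(std::uint64_t hash) {
        Block& block = blocks_[block_index(hash)];
        const Block mask = make_mask(static_cast<std::uint32_t>(hash));
        for (std::size_t i = 0; i < 8; ++i) block[i] |= mask[i];
    }

    // Returns false if the hash was definitely never inserted,
    // true if it may have been (false positives are possible).
    bool might_contain_hash(std::uint64_t hash) const {
        const Block& block = blocks_[block_index(hash)];
        const Block mask = make_mask(static_cast<std::uint32_t>(hash));
        for (std::size_t i = 0; i < 8; ++i) {
            if ((block[i] & mask[i]) != mask[i]) return false;
        }
        return true;
    }

private:
    // Lower 32 bits of the hash -> one bit position (0..31) per word.
    static Block make_mask(std::uint32_t key) {
        // Salt constants from the Parquet BloomFilter.md specification.
        static constexpr std::array<std::uint32_t, 8> kSalt = {
            0x47b6137bU, 0x44974d91U, 0x8824ad5bU, 0xa2b7289dU,
            0x705495c7U, 0x2df1424bU, 0x9efc4947U, 0x5c6bfb31U};
        Block mask{};
        for (std::size_t i = 0; i < 8; ++i) {
            // Unsigned multiply wraps modulo 2^32; the top 5 bits pick the bit.
            mask[i] = std::uint32_t{1} << ((key * kSalt[i]) >> 27);
        }
        return mask;
    }

    // Upper 32 bits of the hash, scaled into [0, number of blocks).
    std::size_t block_index(std::uint64_t hash) const {
        const std::uint64_t n = static_cast<std::uint64_t>(blocks_.size());
        return static_cast<std::size_t>(((hash >> 32) * n) >> 32);
    }

    std::vector<Block> blocks_;
};
```

Because both the block index and the mask depend only on the hash, a lookup touches exactly one 32-byte block, which is the cache-locality motivation behind the split-block layout. In this sketch, hashing typed values is left to the caller; the real class documents xxHash64-based convenience methods for that step.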

Definition in file split_block.hpp.