[Notes]: Bloom Filters
Prelude
These are just some of my notes and references on Bloom filters while doing research for building an LSM tree based database.
Notes
A Bloom filter is a probablistic data structure that allows checks for set membership. These checks for set membership can return false positives but will never produce false negatives. More clearly: “if a negative match is returned, the element is guaranteed not to be a member of the set” (Alex Petrov - Database Internals).
Used to lower I/O costs incurred by the tiered structure of LSM-trees
Useful references
- Alex Petrov - Database Internals
- Jamie Talbot - What are Bloom filters?
- Onat - Let’s Implement a Bloom Filter
- Niv Dayan, et al. - Optimal Bloom Filters and Adaptive Merging for LSM-Trees
- Kirsch and Mitzenmacher - Building a Better Bloom Filter
- Issues with Bloom filters in RocksDB and LevelDB. A lot of great context and information double hashing schemes.
Read other posts