Notes: TiKV Sharding and Scheduling
Prelude
Notes on how TiKV does dynamic sharding with Raft. This is particularly relevant to my own efforts to add dynamic sharding and scheduling to a toy key-value store that I’m building. Incidentally, I will also be trying to use Raft, in contrast to the Paxos-based replication used by Google’s Spanner, AWS’s DynamoDB, and Meta’s ZippyDB.
Notes
TiKV has a module called the Placement Driver (PD) that aggregates statistics from heartbeats and makes resource scheduling decisions based on these statistics. TiKV names the PD after a similar component in Google’s Spanner.
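As a concrete (and heavily simplified) picture of what this might look like, here is a Go sketch of a PD-like component that caches the latest per-store heartbeat statistics and makes a toy scheduling decision. The `StoreHeartbeat` fields and the `PlacementDriver` type are illustrative assumptions, not PD’s actual API.

```go
// Minimal sketch (not PD's actual API): a scheduler that aggregates
// per-store heartbeats and keeps the most recent statistics, which a
// scheduling loop can later inspect. Field names are illustrative.
package main

import (
	"fmt"
	"sync"
	"time"
)

// StoreHeartbeat is a hypothetical heartbeat payload a node might report.
type StoreHeartbeat struct {
	StoreID       uint64
	RegionCount   int
	UsedBytes     uint64
	CapacityBytes uint64
	WriteBytesSec uint64
	ReportedAt    time.Time
}

// PlacementDriver caches the latest heartbeat per store.
type PlacementDriver struct {
	mu     sync.Mutex
	stores map[uint64]StoreHeartbeat
}

func NewPlacementDriver() *PlacementDriver {
	return &PlacementDriver{stores: make(map[uint64]StoreHeartbeat)}
}

// HandleHeartbeat records the newest statistics for a store.
func (pd *PlacementDriver) HandleHeartbeat(hb StoreHeartbeat) {
	pd.mu.Lock()
	defer pd.mu.Unlock()
	pd.stores[hb.StoreID] = hb
}

// MostLoadedStore is a toy scheduling decision: pick the store with the
// highest space usage ratio as a candidate to move regions away from.
func (pd *PlacementDriver) MostLoadedStore() (uint64, float64) {
	pd.mu.Lock()
	defer pd.mu.Unlock()
	var worstID uint64
	worstRatio := -1.0
	for id, hb := range pd.stores {
		ratio := float64(hb.UsedBytes) / float64(hb.CapacityBytes)
		if ratio > worstRatio {
			worstID, worstRatio = id, ratio
		}
	}
	return worstID, worstRatio
}

func main() {
	pd := NewPlacementDriver()
	pd.HandleHeartbeat(StoreHeartbeat{StoreID: 1, UsedBytes: 80, CapacityBytes: 100, ReportedAt: time.Now()})
	pd.HandleHeartbeat(StoreHeartbeat{StoreID: 2, UsedBytes: 20, CapacityBytes: 100, ReportedAt: time.Now()})
	id, ratio := pd.MostLoadedStore()
	fmt.Printf("most loaded store: %d (%.0f%% full)\n", id, ratio*100)
}
```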
TiKV is organized in the following way:
- Compute resources (VMs/containers) hosting the database are called nodes.
- TiKV does range-based sharding where a shard contains a contiguous block of keys. TiKV calls a shard a region.
- Multiple regions can be stored on a node.
- Each region’s data is replicated across multiple nodes, and a region’s replicas form a Raft group to maintain consistency (see the sketch after this list).
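A rough sketch of this terminology as data types, along with a key-to-region lookup over sorted key ranges. The `Region`, `Peer`, and `RegionMap` shapes are assumptions for illustration, not TiKV’s actual definitions.

```go
// Rough sketch of the terminology above with illustrative types: a node
// hosts many regions, each region covers a contiguous key range
// [StartKey, EndKey) and is replicated across a Raft group of peers.
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// Peer is one replica of a region, placed on a particular node.
type Peer struct {
	NodeID   uint64
	IsLeader bool
}

// Region is a contiguous key range plus its Raft group of replicas.
type Region struct {
	ID       uint64
	StartKey []byte // inclusive
	EndKey   []byte // exclusive; empty means "to the end of the keyspace"
	Peers    []Peer
}

// RegionMap holds regions sorted by StartKey so a key can be routed
// to its region with a binary search.
type RegionMap struct {
	regions []Region // sorted by StartKey
}

// Locate returns the region whose range contains key, or false if none does.
func (m *RegionMap) Locate(key []byte) (Region, bool) {
	// Find the first region whose StartKey is > key, then step back one.
	i := sort.Search(len(m.regions), func(i int) bool {
		return bytes.Compare(m.regions[i].StartKey, key) > 0
	})
	if i == 0 {
		return Region{}, false
	}
	r := m.regions[i-1]
	if len(r.EndKey) == 0 || bytes.Compare(key, r.EndKey) < 0 {
		return r, true
	}
	return Region{}, false
}

func main() {
	m := &RegionMap{regions: []Region{
		{ID: 1, StartKey: []byte(""), EndKey: []byte("m"), Peers: []Peer{{NodeID: 1, IsLeader: true}, {NodeID: 2}, {NodeID: 3}}},
		{ID: 2, StartKey: []byte("m"), EndKey: nil, Peers: []Peer{{NodeID: 2, IsLeader: true}, {NodeID: 3}, {NodeID: 1}}},
	}}
	r, _ := m.Locate([]byte("pear"))
	fmt.Println("key 'pear' lives in region", r.ID)
}
```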
TiKV stores metrics in a Prometheus instance, and PD queries this instance for metrics in addition to the statistics received via heartbeats.
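A minimal sketch of the scrape side of this, assuming the Go Prometheus client (`prometheus/client_golang`): the process exposes heartbeat-derived statistics on a `/metrics` endpoint for a Prometheus server to scrape. The metric name and label are made up, not TiKV’s actual metric names.

```go
// Hypothetical sketch: expose heartbeat-derived statistics on /metrics so
// a Prometheus server can scrape them. Metric and label names are made up.
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var regionWriteBytes = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "region_write_bytes",
		Help: "Approximate bytes written to a region, as reported by heartbeats.",
	},
	[]string{"region_id"},
)

func main() {
	prometheus.MustRegister(regionWriteBytes)

	// In a real system this would be updated whenever a heartbeat arrives.
	regionWriteBytes.WithLabelValues("42").Set(1 << 20)

	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":2112", nil)
}
```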
PD keeps a cache of read/write statistics for the top N regions of each store (a TiKV storage instance on a node). These statistics are used to determine whether a region is hot and thus a candidate for migration.
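A hedged sketch of such a hot-region cache: keep only the top N regions per store by write rate, and flag those above a threshold as migration candidates. The value of N, the threshold, and the type names are assumptions, not TiKV’s actual values.

```go
// Illustrative sketch of a hot-region cache: keep only the top-N regions
// per store by write throughput, and flag a region as "hot" when it
// exceeds a threshold. N, the threshold, and type names are assumptions.
package main

import (
	"fmt"
	"sort"
)

const (
	topN             = 3       // keep stats for this many regions per store
	hotWriteBytesSec = 1 << 20 // 1 MiB/s marks a region as hot (arbitrary)
)

type RegionStat struct {
	RegionID      uint64
	WriteBytesSec uint64
}

// HotRegionCache tracks the top-N write-heavy regions for each store.
type HotRegionCache struct {
	byStore map[uint64][]RegionStat
}

func NewHotRegionCache() *HotRegionCache {
	return &HotRegionCache{byStore: make(map[uint64][]RegionStat)}
}

// Update records a region's latest write rate and trims the store's
// cached entries back down to the top N.
func (c *HotRegionCache) Update(storeID uint64, stat RegionStat) {
	stats := c.byStore[storeID]
	// Replace an existing entry for the region, if any.
	replaced := false
	for i := range stats {
		if stats[i].RegionID == stat.RegionID {
			stats[i] = stat
			replaced = true
			break
		}
	}
	if !replaced {
		stats = append(stats, stat)
	}
	sort.Slice(stats, func(i, j int) bool {
		return stats[i].WriteBytesSec > stats[j].WriteBytesSec
	})
	if len(stats) > topN {
		stats = stats[:topN]
	}
	c.byStore[storeID] = stats
}

// MigrationCandidates returns the cached regions on a store that exceed
// the hot threshold and are therefore candidates to move elsewhere.
func (c *HotRegionCache) MigrationCandidates(storeID uint64) []RegionStat {
	var hot []RegionStat
	for _, s := range c.byStore[storeID] {
		if s.WriteBytesSec >= hotWriteBytesSec {
			hot = append(hot, s)
		}
	}
	return hot
}

func main() {
	c := NewHotRegionCache()
	c.Update(1, RegionStat{RegionID: 10, WriteBytesSec: 4 << 20})
	c.Update(1, RegionStat{RegionID: 11, WriteBytesSec: 512})
	fmt.Println("hot regions on store 1:", c.MigrationCandidates(1))
}
```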