Prelude

Notes on how TiKV does dynamic sharding with Raft. This is particularly relevant to my own efforts to add dynamic sharding and scheduling to a toy key-value store that I’m building. Incidentally, I will also be trying to use Raft, which contrasts with the Paxos-based replication used in Google’s Spanner, AWS’s DynamoDB, and Meta’s ZippyDB.

Notes

  • TiKV has a module called the Placement Driver (PD) that aggregates statistics from heartbeats and makes resource-scheduling decisions based on these statistics; a minimal sketch of this loop appears after these notes. TiKV names the PD after a similar component in Google’s Spanner.

  • TiKV is organized in the following way (a data-structure sketch of this layout also appears after these notes):

    • Compute resources (VMs/containers) hosting the database are called nodes
    • TiKV does range-based sharding where a shard contains a contiguous block of keys. TiKV calls a shard a region.
    • Multiple regions can be stored on a node.
    • Each region’s data is kept as multiple replicas on different nodes, and a region’s replicas are grouped together in a Raft group to maintain consistency.
  • TiKV exports metrics to a Prometheus instance, which can then be queried for metrics in addition to the statistics received via heartbeats

  • PD keeps a cache of read/write statistics for the top N regions of each store; this is used to determine whether a region is hot and therefore a candidate for migration (a sketch of such a cache appears after these notes).

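To make the heartbeat-driven scheduling loop concrete, here is a minimal Rust sketch of a PD-like component for my toy store. All names here (StoreHeartbeat, PlacementDriver, propose_balance, and the specific fields) are my own illustration rather than TiKV’s actual types or wire format, and the balancing rule is deliberately naive.

```rust
use std::collections::HashMap;

/// Statistics a node reports to the scheduler in each heartbeat.
/// Illustrative fields only, not TiKV's real heartbeat schema.
#[derive(Debug, Clone, Copy)]
struct StoreHeartbeat {
    store_id: u64,
    region_count: u64,
    used_bytes: u64,
    capacity_bytes: u64,
}

/// A toy placement driver: aggregates heartbeats and proposes moves.
#[derive(Default)]
struct PlacementDriver {
    stores: HashMap<u64, StoreHeartbeat>,
}

/// A scheduling decision: move one region from a loaded store to a lighter one.
#[derive(Debug)]
struct MoveRegion {
    from_store: u64,
    to_store: u64,
}

impl PlacementDriver {
    /// Record the latest heartbeat for a store.
    fn handle_heartbeat(&mut self, hb: StoreHeartbeat) {
        self.stores.insert(hb.store_id, hb);
    }

    /// Propose moving a region from the store with the most regions to the
    /// one with the fewest, if the imbalance exceeds a threshold.
    fn propose_balance(&self, threshold: u64) -> Option<MoveRegion> {
        let max = self.stores.values().max_by_key(|s| s.region_count)?;
        let min = self.stores.values().min_by_key(|s| s.region_count)?;
        if max.region_count >= min.region_count + threshold {
            Some(MoveRegion { from_store: max.store_id, to_store: min.store_id })
        } else {
            None
        }
    }
}

fn main() {
    let mut pd = PlacementDriver::default();
    pd.handle_heartbeat(StoreHeartbeat { store_id: 1, region_count: 90, used_bytes: 9_000, capacity_bytes: 100_000 });
    pd.handle_heartbeat(StoreHeartbeat { store_id: 2, region_count: 10, used_bytes: 1_000, capacity_bytes: 100_000 });
    // Prints Some(MoveRegion { from_store: 1, to_store: 2 })
    println!("{:?}", pd.propose_balance(20));
}
```
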
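Next, a sketch of the node/region/replica layout and range-based sharding described above, under the same caveat: Region, Replica, and RoutingTable are hypothetical names for my own experiment, not TiKV’s data structures.

```rust
use std::collections::BTreeMap;

/// Identifier of a node (a VM/container hosting replicas).
type NodeId = u64;

/// One copy of a region's data, placed on a particular node.
#[derive(Debug, Clone)]
struct Replica {
    node: NodeId,
}

/// A region: a contiguous range of keys [start_key, end_key),
/// replicated across several nodes; its replicas form one Raft group.
#[derive(Debug, Clone)]
struct Region {
    id: u64,
    start_key: Vec<u8>,
    end_key: Vec<u8>, // empty = unbounded upper end
    replicas: Vec<Replica>,
}

/// Range-based routing table: regions keyed by their start key.
struct RoutingTable {
    regions: BTreeMap<Vec<u8>, Region>,
}

impl RoutingTable {
    /// Find the region whose key range contains `key`.
    fn locate(&self, key: &[u8]) -> Option<&Region> {
        self.regions
            .range(..=key.to_vec())
            .next_back()
            .map(|(_, r)| r)
            .filter(|r| r.end_key.is_empty() || key < r.end_key.as_slice())
    }
}

fn main() {
    let mut regions = BTreeMap::new();
    let r = Region {
        id: 1,
        start_key: b"a".to_vec(),
        end_key: b"m".to_vec(),
        replicas: vec![Replica { node: 1 }, Replica { node: 2 }, Replica { node: 3 }],
    };
    regions.insert(r.start_key.clone(), r);
    let table = RoutingTable { regions };
    assert!(table.locate(b"foo").is_some()); // falls inside ["a", "m")
    assert!(table.locate(b"zzz").is_none()); // outside every region
}
```
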
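Finally, a sketch of a top-N hot-region cache in the spirit of the last bullet. Again, HotRegionCache and its fields are assumptions made for illustration, not PD’s actual implementation.

```rust
use std::collections::HashMap;

/// Per-region traffic reported in heartbeats (illustrative fields).
#[derive(Debug, Clone, Copy)]
struct RegionTraffic {
    region_id: u64,
    read_bytes: u64,
    write_bytes: u64,
}

/// Keeps only the top-N regions per store by read+write bytes, mirroring
/// the idea of PD's hot-region cache rather than its real logic.
struct HotRegionCache {
    top_n: usize,
    per_store: HashMap<u64, Vec<RegionTraffic>>,
}

impl HotRegionCache {
    fn new(top_n: usize) -> Self {
        Self { top_n, per_store: HashMap::new() }
    }

    /// Record the latest traffic sample for a region on a store, keeping the
    /// list sorted by total bytes and truncated to the top N entries.
    fn record(&mut self, store_id: u64, traffic: RegionTraffic) {
        let entries = self.per_store.entry(store_id).or_default();
        entries.retain(|t| t.region_id != traffic.region_id); // replace any old sample
        entries.push(traffic);
        entries.sort_by_key(|t| std::cmp::Reverse(t.read_bytes + t.write_bytes));
        entries.truncate(self.top_n);
    }

    /// A region is considered "hot" (a migration candidate) if it is still
    /// present in its store's top-N list.
    fn is_hot(&self, store_id: u64, region_id: u64) -> bool {
        self.per_store
            .get(&store_id)
            .map_or(false, |v| v.iter().any(|t| t.region_id == region_id))
    }
}

fn main() {
    let mut cache = HotRegionCache::new(2);
    cache.record(1, RegionTraffic { region_id: 10, read_bytes: 500, write_bytes: 500 });
    cache.record(1, RegionTraffic { region_id: 11, read_bytes: 50, write_bytes: 0 });
    cache.record(1, RegionTraffic { region_id: 12, read_bytes: 9_000, write_bytes: 100 });
    assert!(cache.is_hot(1, 12));
    assert!(!cache.is_hot(1, 11)); // evicted: only the top 2 are kept
}
```
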
Useful references