Consistent hashing was first described in the 1997 paper "Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web" by David Karger et al. It is used in distributed storage and caching systems such as Amazon Dynamo, memcached, Project Voldemort and Riak.
Consistent hashing is a very simple solution to a common problem: how do you decide which server in a distributed system should store or retrieve a value identified by a key, while at the same time being able to cope with server failures and network partitions?
Simply finding a server for a value is easy: just number your set of s servers from 0 to s - 1. When you want to store or retrieve a value, hash its key, take the result modulo s, and that gives you the server.
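In code, the naive scheme might look something like this (a minimal sketch; std::hash stands in for whatever hash function is used, and the function name is purely illustrative):

    #include <cstddef>
    #include <functional>
    #include <string>
    #include <vector>

    // Naive scheme: the key's hash, taken modulo the number of servers,
    // selects the server. Assumes at least one server, and only works
    // while the server list never changes.
    std::size_t naive_server_for(const std::string& key,
                                 const std::vector<std::string>& servers)
    {
        return std::hash<std::string>()(key) % servers.size();
    }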
The problem comes when servers fail or become unreachable through a network partition. At that point the number of servers changes, so hashing modulo s sends almost every key to a different server; the only option is to invalidate the caches on all servers, renumber them, and start again. Given that failures are commonplace in a system with hundreds or thousands of servers, this solution is not feasible.
In consistent hashing, the servers, as well as the keys, are hashed, and it is by this hash that they are looked up. The hash space is large, and is treated as if it wraps around to form a circle - hence the name hash ring. The process of creating a hash for each server is equivalent to placing it at a point on the circumference of this circle. When a key needs to be looked up, it is hashed, which again corresponds to a point on the circle. To find its server, one then simply moves round the circle clockwise from this point until the next server is found. If no server is found between that point and the end of the hash space, the first server is used - this is the "wrapping round" that makes the hash space circular.
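A minimal sketch of that lookup, assuming the ring is kept in a std::map from position to server so that lower_bound gives the first server at or clockwise of the key's position (the type and function names here are illustrative, not taken from the original source):

    #include <cstddef>
    #include <functional>
    #include <map>
    #include <string>

    // Position on the circle -> server at that position.
    typedef std::map<std::size_t, std::string> Ring;

    // Hash the key onto the circle and walk clockwise to the next server.
    // The ring must contain at least one server.
    const std::string& server_for(const Ring& ring, const std::string& key)
    {
        std::size_t pos = std::hash<std::string>()(key);
        Ring::const_iterator it = ring.lower_bound(pos); // first server at or after pos
        if (it == ring.end()) {
            it = ring.begin();                           // wrap round to the start of the ring
        }
        return it->second;
    }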
The only remaining problem is that in practice hashing algorithms are likely to result in clusters of servers on the ring (or, to be more precise, in some servers having a disproportionately large space before them), and this will result in greater load on the first server in each cluster - the one that owns the large gap before it - and less on the remainder. This can be ameliorated by adding each server to the ring a number of times in different places: the ring has a replica count that applies to all servers, and when a server is added, you loop from 0 to the count - 1 and, on each iteration, hash a string made from both the server and the loop variable to produce the position. This has the effect of distributing the servers more evenly over the ring. Note that this has nothing to do with server replication; each of the replicas represents the same physical server, and replication of data between servers is an entirely unrelated issue.
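A sketch of adding a server under that scheme, reusing the illustrative Ring type from the previous snippet; the string that is hashed is formed from the server name and the replica index:

    #include <cstddef>
    #include <functional>
    #include <map>
    #include <sstream>
    #include <string>

    typedef std::map<std::size_t, std::string> Ring;    // as in the previous sketch

    // Add one physical server to the ring at 'replicas' different points.
    void add_server(Ring& ring, const std::string& server, unsigned int replicas)
    {
        for (unsigned int r = 0; r < replicas; ++r) {
            std::ostringstream oss;
            oss << server << "-" << r;                   // e.g. "cache1-0", "cache1-1", ...
            std::size_t pos = std::hash<std::string>()(oss.str());
            ring[pos] = server;                          // every replica maps back to the same server
        }
    }

Removing a server is the mirror image: recompute the same positions and erase them from the map.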
I've written an example implementation of consistent hashing in C++. As you can imagine from the description above, it isn't terribly complicated. Here is the main class:
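What follows is a minimal sketch of what such a class might look like, consolidating the snippets above; the names (HashRing, AddNode, RemoveNode, GetNode) are illustrative rather than taken from the original source, and std::hash over the stringified node or key stands in for whatever hash function is actually used:

    #include <cstddef>
    #include <functional>
    #include <map>
    #include <sstream>
    #include <string>

    // Sketch of a hash ring along the lines described above.
    // Node and Data can be any types that can be written to a std::ostream.
    template <class Node, class Data>
    class HashRing
    {
    public:
        typedef std::map<std::size_t, Node> NodeMap;

        explicit HashRing(unsigned int replicas) : replicas_(replicas) {}

        void AddNode(const Node& node)
        {
            for (unsigned int r = 0; r < replicas_; ++r) {
                ring_[HashOf(node, r)] = node;
            }
        }

        void RemoveNode(const Node& node)
        {
            for (unsigned int r = 0; r < replicas_; ++r) {
                ring_.erase(HashOf(node, r));
            }
        }

        // Assumes at least one node has been added.
        const Node& GetNode(const Data& data) const
        {
            std::ostringstream oss;
            oss << data;
            typename NodeMap::const_iterator it =
                ring_.lower_bound(std::hash<std::string>()(oss.str()));
            if (it == ring_.end()) {
                it = ring_.begin();                      // wrap round the ring
            }
            return it->second;
        }

    private:
        // Position of the r-th replica of a node: hash of "node-r".
        static std::size_t HashOf(const Node& node, unsigned int r)
        {
            std::ostringstream oss;
            oss << node << "-" << r;
            return std::hash<std::string>()(oss.str());
        }

        NodeMap ring_;
        unsigned int replicas_;
    };

A std::map is a natural fit here because it keeps the positions ordered, so lower_bound performs the clockwise walk in O(log n). Hypothetical usage (the server names are made up):

    HashRing<std::string, std::string> ring(100);        // 100 replicas per server
    ring.AddNode("cache1.example.com");
    ring.AddNode("cache2.example.com");
    const std::string& server = ring.GetNode("user:42");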
A few points to note:
- I've used hash from <map>; HASH_NAMESPACE is needed because g++ puts the non-standard hash in a different namespace than that which other compilers do. (A C++11 compiler offers std::hash and std::unordered_map as a standard alternative.)
- The Node and Data types need to have operator << defined for a std::ostream. They are written to an ostringstream in order to "stringify" them before getting the hash; a hypothetical example follows this list.
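For illustration, a node type satisfying that requirement might look like the following (CacheServer is hypothetical, not part of the original source):

    #include <ostream>
    #include <string>

    // Hypothetical node type: identified by host and port.
    struct CacheServer
    {
        std::string host;
        unsigned short port;
    };

    // The stream operator used to "stringify" a node before hashing it.
    std::ostream& operator<<(std::ostream& os, const CacheServer& server)
    {
        return os << server.host << ":" << server.port;
    }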
I've also written an example program that simulates using a cluster of cache servers to store and retrieve some data.
You can browse the source code and example program here:
Here is a compressed tar archive containing the source code, example program and makefile:
Copyright (C) 2010 Martin Broadhurst