group uses Raft to elect leaders.
This section explains the Raft consensus algorithm in simple terms. The goal is
to give you just enough to understand the basic concepts, without going into
detail about why the algorithm is correct. For a detailed explanation of Raft,
please read the original thesis by Diego Ongaro.
ElectionTimeout, that term can end without a leader.
AppendEntries messages to the followers with logs containing state updates.
When the leader sends AppendEntries with zero logs (updates), that’s considered
a Heartbeat. The leader sends Heartbeats to all followers at regular intervals.
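
As a rough illustration of the point above, a Heartbeat can be modeled as an AppendEntries message whose entry list is empty. The type and field names below (`LogEntry`, `AppendEntries`, `LeaderID`, `IsHeartbeat`) are hypothetical and chosen for this sketch; they are not taken from any particular Raft implementation.

```go
package main

import "fmt"

// LogEntry is a hypothetical log record carrying one state update.
type LogEntry struct {
	Term  uint64
	Index uint64
	Data  []byte
}

// AppendEntries is a simplified, illustrative version of the message
// a leader sends to its followers.
type AppendEntries struct {
	Term     uint64
	LeaderID uint64
	Entries  []LogEntry
}

// IsHeartbeat reports whether the message carries zero log entries.
func (m AppendEntries) IsHeartbeat() bool {
	return len(m.Entries) == 0
}

func main() {
	heartbeat := AppendEntries{Term: 3, LeaderID: 1}
	update := AppendEntries{
		Term:     3,
		LeaderID: 1,
		Entries:  []LogEntry{{Term: 3, Index: 7, Data: []byte("x=1")}},
	}
	fmt.Println(heartbeat.IsHeartbeat()) // true
	fmt.Println(update.IsHeartbeat())    // false
}
```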
If a follower doesn’t receive a Heartbeat for the ElectionTimeout duration
(generally between 150 ms and 300 ms), it assumes the leader may be down and
converts its state to candidate (as mentioned in Server States). It then
requests votes by sending a RequestVote call to the other servers. If it gets
votes from a majority, the candidate becomes the leader. On becoming leader, it
sends Heartbeats to all other servers to establish its authority.
Every communication request contains a term number. If a server receives a
request with a stale term number, it rejects the request.
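
A minimal sketch of the stale-term check follows. The `Server` and `Request` types are hypothetical. The step where a server adopts a newer term from an incoming request is an assumption taken from the Raft paper, not from the text above.

```go
package main

import "fmt"

// Request is a hypothetical envelope; every request carries a term number.
type Request struct {
	Term uint64
}

// Server tracks only the current term for the purposes of this sketch.
type Server struct {
	CurrentTerm uint64
}

// Accept rejects requests with a stale term. If the request carries a
// newer term, the server adopts it (assumption from the Raft paper).
func (s *Server) Accept(req Request) bool {
	if req.Term < s.CurrentTerm {
		return false // stale term: reject
	}
	if req.Term > s.CurrentTerm {
		s.CurrentTerm = req.Term
	}
	return true
}

func main() {
	s := &Server{CurrentTerm: 5}
	fmt.Println(s.Accept(Request{Term: 4})) // false
	fmt.Println(s.Accept(Request{Term: 6})) // true
	fmt.Println(s.CurrentTerm)              // 6
}
```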
AppendEntries in parallel to other servers.
On receiving a RequestVote RPC, the server denies its vote if its log is more
up-to-date than the candidate’s. It also denies its vote if a minimum
ElectionTimeout hasn’t passed since the last Heartbeat from the leader.
Otherwise, it grants the vote and resets its ElectionTimeout timer.
Whether one log is more up-to-date than another is determined as follows:
AppendEntries.
The significant difference between cluster configuration changes and typical
Log Entries is that followers don’t wait for a commit confirmation from the
leader before enabling the new configuration.
A server can respond to both AppendEntries and RequestVote without checking its
current configuration. This mechanism allows new servers to participate without
officially being part of the cluster. Without it, a new server could never
accept the log entries that precede the configuration entry adding it.
When a new server joins, it won’t have any logs, and they need to be streamed
to it. To ensure cluster availability, Raft allows this server to join the
cluster as a non-voting member. Once it has caught up, voting can be enabled.
This also allows the cluster to remove the server if it’s too slow to catch up,
before it is given voting rights (sort of like getting a green card to allow
assimilation before citizenship, with its voting rights, is awarded).
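
One way to picture the non-voting stage is a membership list in which only voting members count toward the quorum. The sketch below is illustrative only: the `Member` type, the `MatchIndex` field, and the `maxLag` threshold for deciding that a server has "caught up" are all hypothetical.

```go
package main

import "fmt"

// Member is a hypothetical cluster member record.
type Member struct {
	ID         uint64
	Voting     bool
	MatchIndex uint64 // highest log index known to be replicated on this member
}

// quorumSize returns the number of votes needed, counting only voting members.
func quorumSize(members []Member) int {
	voters := 0
	for _, m := range members {
		if m.Voting {
			voters++
		}
	}
	return voters/2 + 1
}

// promoteIfCaughtUp enables voting once the member's log is within maxLag
// entries of the leader's last index. It reports whether a promotion happened.
func promoteIfCaughtUp(m *Member, leaderLastIndex, maxLag uint64) bool {
	if !m.Voting && m.MatchIndex+maxLag >= leaderLastIndex {
		m.Voting = true
		return true
	}
	return false
}

func main() {
	members := []Member{
		{ID: 1, Voting: true, MatchIndex: 100},
		{ID: 2, Voting: true, MatchIndex: 100},
		{ID: 3, Voting: true, MatchIndex: 98},
		{ID: 4, Voting: false, MatchIndex: 0}, // new server, still catching up
	}
	fmt.Println(quorumSize(members)) // 2

	members[3].MatchIndex = 97
	fmt.Println(promoteIfCaughtUp(&members[3], 100, 5)) // true
	fmt.Println(quorumSize(members))                    // 3
}
```

While the new server is non-voting, the quorum stays at two of three voters, so a slow newcomer cannot stall commits.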
--raft superflag’s snapshot-after-entries and snapshot-after-duration options
respectively. Snapshots are created only when conditions set by both of these
options have been met.
RegisterClient RPC. This creates a new client ID, which is used for all
subsequent RPCs.