SKR 5302: Advanced Distributed Computing

6. Chapter 6: Coordination and Agreement

6.4. Elections Algorithm

Elections Algorithm

  • Need to find one process that is the coordinator
  •  Assume
    •  Each process has a unique identifier (for example, network address)
    • One process per machine
    • Every process knows the process number of every other process
    • Processes don’t know which processes are down and which ones are still running
  •  End result of the algorithm: all processes agree on who is the new coordinator/leader 
  • Bully algorithm & Ring Algorithm

Ring-based Elections

Does NOT use a token

Assume

  •  processes are ordered
  •  each process knows its successor and the successor’s successor, and so on (needed in case of failures)

 Process P detects that the coordinator is dead

  •  sends an ELECTION message to its successor
  •  includes its process number in the message
  •  each process that receives it, adds its own process number and then forwards it to its successor
  •  eventually it gets back that message, now what does it do?

Figure 6.7: A ring-based election in progress

A process notices that coordinator is not responding

  • it starts an election (any process can start one)

Election algorithm 

  • P sends an ELECTION message to processes with higher numbers
  •  If no one responds, P wins the election
  •  If some process with higher process number responds
    •  P’s job is done, that process takes over 
    •  the receiver sends an OK message to P
    •  receiver starts an election process

Eventually all processes give up, except one

This process sends out a message saying that it is the new “COORDINATOR”

A process that was down, when it comes back up starts a new election of its own

The bully algorithm

  • Allows processes to crash during an election
  • Assumes that message delivery between processes is reliable
  • Assumes that the system is synchronous
    • Uses timeouts to detect a process failure
  • Assumes that each process knows which processes have higher identifiers, and that it can communicate with all such processes. 
  • Three types of message
    • Election message – to announce an election
    • Answer message – response to an election message 
    • Coordinator message – to announce the identity of the elected process – the new ‘coordinator’.
  • A process begins an election when it notices, through timeouts, that the coordinator has failed.
  • Upper bound on the time between sending a message to another process and receiving a response:
    • T = 2Ttrans + Tprocess
        Ttrans: maximum message transmission delay
        Tprocess: maximum delay for processing a message
  • The process that knows it has the highest identifier can elect itself as the coordinator simply by sending a coordinator message to all processes with lower identifiers. 
  • A process with a lower identifier can begin an election by sending an election message to those processes that have a higher identifier and awaiting answer messages in response. 
  • If none arrives within time T, the process considers itself as the coordinator and sends a coordinator message to all processes with lower identifiers. 
  • Otherwise, the process waits a further period T’ for a coordinator message to arrive from the new coordinator. 
  • If none arrives, it begins another election.


  • If a process pi receives a coordinator message, it sets its variable electedi to the identifier of the coordinator contained within it.
  • If a process receives an election message, it sends back an answer message and begins another election – unless it has begun one already.
  • When a process is started to replace a crashed process, it begins an election.
  • If it has the higher process identifier, then it will decide that it is the coordinator and announce this to the other processes.
  • Thus, it will become the coordinator, even though the current coordinator is functioning. 

    Figure 6.8: The bully algorithm