DAJ --- A Toolkit for the Simulation of Distributed Algorithms in Java -- B.2 Algorithm

Go backward to B.1 Problem
Go up to B Distributed Snapshots
Go forward to B.3 Program

B.2 Algorithm

We implement the solution of the problem by an application of the Chandy-Lamport Algorithm for determining consistent global snapshots of a network⁵. In our case, this algorithm is applied as follows:

Initially all nodes are in mode RUNNING exchanging value messages as described in the problem statement. At some time, node 0 switches to mode SNAPPING and sends out snap messages (see below); every node that receives a snap message, also switches to mode SNAPPING.
When a node switches to mode SNAPPING, it determines the current value of its local deposit in a variable snapValue and sends snap messages to all output channels.
Each node in mode SNAPPING continues normal operation. However the value of every message from every input channel is added to snapValue until a snap message arrives through that channel.
If snap messages have arrived through all input channels of a node, the current snapValue represents the node's part of the network value.

The following figure illustrates the basic idea of the algorithm: the stream of messages passing through a channel is split by the snap message into two parts; a node's snapValue eventually represents the value of the node after having processed all messages within the partition bounded by the snap messages of the incoming and outgoing channels. Since every message is contained in exactly one such partition, the sum of all snapValues represents the total network value.

Distributed Snapshots: Basic Idea

In order to broadcast a node's snapValue to every other node of the network, the following (simple-minded) strategy is applied:

When a node has determined its final snapValue, it sends this value (together with its node number) to all neighbors and continues normal operation.
A node that receives some other node's snapValue determines whether it has already seen this value. If not, this value is added to the node's local variable totalValue and the message is forwarded to all neighbor nodes.
If a node has seen and forwarded the snapValue of every other node, it terminates execution.

The program thus ultimately terminates in a state where every node holds the same totalValue which represents the value of the network.

Maintainer: Wolfgang Schreiner
Last Modification: November 14, 1997