B.2 Algorithm
We implement the solution of the problem by an application of the
Chandy-Lamport Algorithm for determining consistent global
snapshots of a network5. In our case, this algorithm is applied as follows:
- Initially all nodes are in mode RUNNING exchanging value messages
as described in the problem statement. At some time, node 0 switches to mode
SNAPPING and sends out snap messages (see below); every node that
receives a snap message, also switches to mode SNAPPING.
- When a node switches to mode SNAPPING, it determines the current
value of its local deposit in a variable snapValue and sends snap
messages to all output channels.
- Each node in mode SNAPPING continues normal operation. However
the value of every message from every input channel is added to snapValue
until a snap message arrives through that channel.
- If snap messages have arrived through all input channels of a
node, the current snapValue represents the node's part of the network
value.
The following figure illustrates the basic
idea of the algorithm: the stream of messages passing through a channel is
split by the snap message into two parts; a node's snapValue eventually
represents the value of the node after having processed all messages within
the partition bounded by the snap messages of the incoming and outgoing
channels. Since every message is contained in exactly one such partition, the
sum of all snapValues represents the total network value.
Distributed Snapshots: Basic Idea
In order to broadcast a node's snapValue to every other node of the
network, the following (simple-minded) strategy is applied:
- When a node has determined its final snapValue, it sends this
value (together with its node number) to all neighbors and continues normal
operation.
- A node that receives some other node's snapValue determines
whether it has already seen this value. If not, this value is added to the
node's local variable totalValue and the message is forwarded to all
neighbor nodes.
- If a node has seen and forwarded the snapValue of every other
node, it terminates execution.
The program thus ultimately terminates in a state where every node holds the
same totalValue which represents the value of the network.
Maintainer: Wolfgang Schreiner
Last Modification: November 14, 1997