Vassal 4:Design Blog 6

17-Dec-21 Vassal 4 Messaging protocols

Ensuring State Consistency in Vassal

There are two aspects to ensuring state consistency across multiple Vassal clients.

The first aspect is to ensure that all clients receive the same sequence of Command Packets in the same order. (Packet Order Violation)

The second aspect is to identify and reject any Command Packet that attempts to change the State of any Component in an inconsistent way. (State Consistency Clash)

These two aspects are relatively independent. Packet Order Violations may, but do not necessarily, result in State Consistency Clashes. State Consistency Clashes may occur independently of Packet Order Violations, but will likely indicate Vassal bugs or bad MVC interaction decisions.

Ensuring Command Packet Ordering

Overall, V3 does fairly well at ensuring Command Packet ordering, but it has a fundamental flaw that allows Command Packets to be executed out of order in some cases.

The current V3 design of a central 'dumb' multicasting messaging server has the property of forcing all communication between clients through a single FIFO queue that enforces a nominal sequential ordering of Command Packets. This ensures the same Command Packets are delivered in the same order to all attached Clients.

However, the current design does not reflect a Command Packet back to the client that originated it, so that client never actually knows where its own Command Packet falls in the order delivered to other clients. When the next Packet arrives from Client B, Client A has no idea whether the last packet it sent was received by Client B before or after Client B sent that packet.

Each current client assumes that any Command Packet it sends is received by all other clients BEFORE it receives the next inward Command Packet from the server. This is where our current V3 State inconsistencies arise.

There are 2 approaches to resolve this problem, Pessimistic and Optimistic.

A Pessimistic approach enforces the ordering of Command Packets by tightly controlling when Clients can emit them. Each Client must ask the server for permission to send a Packet, then wait to receive that permission along with a sequence number to use when sending it. Packet Order Violations are prevented from happening in the first place by making all clients wait their turn for permission to send. The disadvantage of this approach is that it is slow: there is an additional round-trip between client and server for every Command. It is also very heavy-handed, because it prevents simultaneous actions that change the state of unrelated components, e.g. two players making simultaneous changes in their private windows.

An optimistic approach allows Command Packets to be generated and sent arbitrarily, leaving it up to the clients to detect and deal with Packet Order Violations.

I propose an optimistic approach. My feeling is that the rate of UI event generation is relatively low, and that simultaneous UI events will be relatively rare and easily detected.

The main change we need to make to support this is to have the Vassal Server do a full Multicast, reflecting each Command Packet back to the originating client as well as to all other clients.

This ensures that all clients see an identical set of messages delivered in the same order (Totally ordered FIFO). The order Packets are received by the Server defines the order the Packets must be executed by all clients, INCLUDING the clients that generate the Packets.
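As a rough illustration of the reflecting server described above, here is a minimal in-memory sketch. The class name `ReflectingServer`, its methods, and the tuple packet format are all assumptions made for the example; nothing here is the actual Vassal Server API, and a real server would deliver over the network rather than into lists.

```python
from collections import deque


class ReflectingServer:
    """Hypothetical stand-in for the Vassal Server: one FIFO queue, full multicast."""

    def __init__(self):
        self.queue = deque()   # arrival order at the server defines the global order
        self.inboxes = {}      # client id -> list of (sender, payload) in delivery order

    def connect(self, client_id):
        self.inboxes[client_id] = []

    def submit(self, sender, payload):
        # Packets are queued strictly in the order the server receives them.
        self.queue.append((sender, payload))

    def deliver_all(self):
        # Every packet goes to every client, INCLUDING the one that sent it.
        while self.queue:
            packet = self.queue.popleft()
            for inbox in self.inboxes.values():
                inbox.append(packet)


server = ReflectingServer()
for name in ("A", "B", "C"):
    server.connect(name)

server.submit("A", "move piece 1")
server.submit("B", "flip piece 2")
server.deliver_all()
```

After `deliver_all()`, every inbox holds the same two packets in the same order, and client A's inbox includes its own packet. That echo is what lets A learn where its packet fell in the global order.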

If a client is idle and not generating UI events, then it is not possible for it to see a Packet Order Violation, as it is just seeing the Packets from other Clients correctly ordered by the Server.

Packet Order Violations can only be seen by a client when it transmits Packets, and the Violation can only be between its own Packet and a Packet from another Client. A Client will never see a Packet Order Violation between packets from other clients; the Totally Ordered FIFO server ensures this.

I suggest we prevent a client from generating a new Command Packet (i.e. lock the UI) until its last UI-generated Command Packet has been received back from the Server. Then the only time a Client can see a Packet Order Violation is when it sends a Packet and then receives a Packet from a different Client BEFORE it receives its own Packet back from the Server.

Given that UI events are generated at human speed, it is unlikely that this need to wait for each Command packet to echo back will be noticeable.

Implementing this has two important consequences:

 1. It sets the start of any potential roll-back as just prior to the last Packet sent.
 2. Once a Packet a client sends is received back from the Server, no future Packet can arrive from any client that violates the Packet Ordering seen by this client. So any potential for roll-back ends when we receive a Packet we sent back from the Server.

Recovering from Packet Order Violations

UI Event initiates creation of a Command Packet:

 1. Create the Command Packet and record in the roll-back cache the initial state of all Components it updates.
 2. Send the Command Packet to the Server.
 3. If a Packet arrives from another client, then
    3.1 Roll-back Game State by applying all initial states in the roll-back cache created in 1.
    3.2 Apply the newly arrived packet to the Game State
 4. If more Packets arrive from another client, then
    4.1 Apply the newly arrived packet to the Game State
 5. When the Command Packet we sent returns then
    5.1 If we had to do a roll-back in step 3.1 then
        5.1.1 Re-apply the Command Packet we created in 1 to our Game State, now in the correct order
 6. Clear the roll-back cache.

This undoes the actions generated by our UI event, applies any foreign packets that arrived while our packet was in transit to the server, then re-applies the results of our UI event on top of those packets. NOTE that this may now cause a State Clash in step 5.1.1 when we try to re-apply our packet.
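The numbered steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Client` class, its method names, and the use of a plain dict for Game State are all hypothetical, every Component touched is assumed to exist already, and State Clash checking is deliberately left out because it is handled separately from ordering.

```python
class Client:
    """Hypothetical client following steps 1 to 6; Game State is a plain dict."""

    def __init__(self, name, state):
        self.name = name
        self.state = dict(state)
        self.pending = None        # our packet awaiting its echo (UI is locked while set)
        self.rollback_cache = {}   # component -> state before our packet was applied
        self.rolled_back = False

    def ui_event(self, changes):
        # Steps 1 and 2: record initial states, apply locally, return the packet to send.
        assert self.pending is None, "UI is locked until the last packet echoes back"
        self.rollback_cache = {k: self.state[k] for k in changes}
        self.state.update(changes)
        self.pending = (self.name, dict(changes))
        return self.pending

    def receive(self, packet):
        sender, changes = packet
        if sender == self.name:
            # Steps 5 and 6: our own packet has come back from the server.
            if self.rolled_back:
                self.state.update(changes)   # 5.1.1: re-apply in the correct order
            self.pending = None
            self.rollback_cache = {}
            self.rolled_back = False
            return
        if self.pending is not None and not self.rolled_back:
            # Step 3.1: a foreign packet overtook ours, so undo our local change.
            self.state.update(self.rollback_cache)
            self.rolled_back = True
        # Steps 3.2 and 4.1: apply the foreign packet.
        self.state.update(changes)


initial = {"x": 0, "y": 0}
a, b = Client("A", initial), Client("B", initial)
pa = a.ui_event({"x": 1})
pb = b.ui_event({"y": 2})
# Assume the server received B's packet first, then A's.
for packet in (pb, pa):
    a.receive(packet)
    b.receive(packet)
```

In this run Client A sees B's packet before its own echo, so it rolls back, applies B's change, then re-applies its own. Client B sees its own echo first and never rolls back. Both end in the same state.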

At this level of the protocol, we are just ensuring that all Packets are executed in the same order on all clients. We are not concerned with any State Clashes, these are handled independently of the ordering protocol.

Handling State Clashes

With complete MVC separation, the State of all Components is stored in the Model and there is only one Command type: Change Component State. The plethora of Commands we have in V3 are mostly doing UI stuff, and this code will be moved to the V component, fired off by Property Listeners in the C Component.

The first thing we need to do is add a Lamport clock to every Component, included as part of its state.

A Lamport clock is just a counter that is incremented each time the state of the component is updated. This 'Clock' or 'Evolution' level is transmitted in Command Packets between clients as part of the Component State. This allows a client that receives a Command to compare the Evolution level of any components referenced in the Command Packet with the current Evolution level of those same components in the receiving client.

A Change Component State Command that we receive from client B might look something like:

 Component: x
 Evolution Level: 23
 New State Value: xyzzy

This Command says that client B would like to change Component 'x' with a current evolution level of 23 to have a new State value of 'xyzzy'. If our client already has the Evolution level of Component x at 24 or greater, then we have a State Clash. Our client has made a change to Component x that Client B did not know about when it created the Command.
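A minimal sketch of this check, using the values from the example above. The `Component` and `ChangeState` classes and the `apply_command` function are names invented for illustration, and the prior state value 'plugh' is made up; only the comparison of Evolution levels comes from the design.

```python
from dataclasses import dataclass


@dataclass
class Component:
    state: str
    evolution: int = 0   # incremented on every state update


@dataclass(frozen=True)
class ChangeState:
    component: str
    evolution: int       # level the sender saw when it created the Command
    new_state: str


def apply_command(model, cmd):
    """Apply cmd if the sender's view was current; return False on a State Clash."""
    target = model[cmd.component]
    if target.evolution > cmd.evolution:
        return False     # we hold a change the sender did not know about
    target.state = cmd.new_state
    target.evolution += 1
    return True


model = {"x": Component("plugh", evolution=23)}
cmd = ChangeState("x", 23, "xyzzy")
first = apply_command(model, cmd)     # levels match, so the change is applied
second = apply_command(model, cmd)    # x is now at level 24, so this is a State Clash
```

The second call stands in for a Command from another client that was also built against level 23: by the time it arrives, Component x has moved on to 24, so it is rejected.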

When our client detects a State Clash in a message from Client B, we just ignore it. All other clients will also ignore it, except for Client B, who has to clean itself up.

When our client detects a State Clash in a message from ourselves coming back from the Server, then we are in step 5.1.1 of the Packet Order Violation protocol, trying to re-apply a Command Packet that we sent out of sequence, but an intervening Packet has arrived in the meantime and changed the state out from under us. In this case, we still just ignore the Packet and display a subtle 'Packet sync error, action cancelled' message. We have already undone the original UI Event during the roll-back phase.

We are making no attempt to causally order events (so no need for expensive Vector Clocks etc.). If a Packet comes through and doesn't generate a State Clash, we do not really care if we don't process it in exactly the same order as the client that issued it. The key is that a) there is no State Clash, so there is no impact on UI Events we have generated, and b) all clients will process it in the same way. If a cleaned-up Packet Order Violation does not generate a State Clash, then it must be affecting a different part of the Game State that does not impact on what we are doing. For example, if all players are moving units about in their own private windows, this may generate Packet Order Violations, but there is no intersection between the Components that are being modified, so we can allow the updates. The overall Game State will be 'Eventually Consistent'.

Other Issues

Transient Data

We have been discussing sharing cursors and drag shadows between clients. We REALLY do not want these to be generating state changes. And probably don't want these appearing in log files? These should probably be implemented as 'Non-command' packets that are not subject to Packet Order or State Clash checking.
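One way this could look, purely as a sketch: packets carry a kind, and the receiving client routes transient ones straight to view-only data. The `Packet` and `Receiver` classes, the "transient" and "command" kind names, and the payload fields are all assumptions, and keeping transient packets out of the log is only the tentative preference expressed above.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    kind: str        # "command" or "transient"; both names are assumptions
    payload: dict


@dataclass
class Receiver:
    log: list = field(default_factory=list)        # what would end up in a log file
    cursors: dict = field(default_factory=dict)    # view-only data, never part of Game State
    commands_seen: int = 0

    def receive(self, packet):
        if packet.kind == "transient":
            # No ordering check, no clash check, no logging: just update the view.
            self.cursors[packet.payload["player"]] = packet.payload["pos"]
            return
        # Command Packets would go through ordering and State Clash checks here.
        self.commands_seen += 1
        self.log.append(packet.payload)


rx = Receiver()
rx.receive(Packet("transient", {"player": "B", "pos": (10, 20)}))
rx.receive(Packet("transient", {"player": "B", "pos": (12, 24)}))
rx.receive(Packet("command", {"component": "x", "new_state": "xyzzy"}))
```

Only the latest cursor position is kept, and neither cursor update touches the log or the command count.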

State Dependent UI Interactions

UI Events take time to be generated in a client. Right-clicking on a piece to generate a pop-up menu, then taking your time to select an option. Clicking on a button, then thinking, do I really want to do that? Dragging a piece off a stack or Deck, then later completing the drop (or not, snapping the piece back).

In V3, we have State-dependent elements that enable or disable UI elements or affect the operation of GKCs and Triggers through Beanshell expressions. These elements are dependent on the State of Components (e.g. {Strength > 10}), but do not necessarily change them, so will not necessarily generate State Clashes if we get a Packet Order Violation.

This leads to several potential problems:

1. I right-click on a piece and a menu is displayed. Before I select one of the options, a Command comes in from another client that updates a Global Variable that causes a Restrict Command trait to disable half of the menu options.

We can probably deal with this through the MVC interactions. The V4 equivalent of the 'Restrict Command' trait (which probably will be completely different, I know, but this is just for illustration) would have a Property Listener on the Global Variable that, via the Controller, will cause the Menu Options to disable themselves, even while the menu is showing.

2. The equivalent of Trigger actions that read some piece of state to determine an action. The State Clash checker only checks whether a client has changed the state of a Component that we want to change (write). Where we have a state-dependent check that affects 'program flow' (e.g. the equivalent of Trigger and GKC expressions), we may need to include a state check for the States that these checks depend on. Using a V3 example, if a Trigger Action is dependent on {Strength < Max - 2}, and the UI Event does not update Strength or Max, then we need to include a check on the Evolution level of both the Strength and Max properties. This involves a) recognising that these properties are used in the expression and b) locating the actual Components that hold the values of these properties. I'm trying to think of a sneaky way around this.
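To make the idea of checking read dependencies concrete, here is a small sketch in which a packet carries the Evolution levels of the properties it read as well as those it writes. The function `find_clashes`, the dict layout, and the 'Moved' property are all hypothetical, and the sketch assumes the hard parts, (a) and (b) above, have already been solved by whoever built the packet.

```python
def find_clashes(evolution_levels, reads, writes):
    """Return the components whose level has moved past what the sender saw.

    evolution_levels: component -> current level held by this client
    reads, writes:    component -> level the sender saw when it built the packet
    """
    seen = {**reads, **writes}
    return sorted(name for name, level in seen.items()
                  if evolution_levels[name] > level)


# The Trigger reads Strength and Max but only writes Moved (names are illustrative).
current = {"Strength": 8, "Max": 3, "Moved": 5}
packet_reads = {"Strength": 7, "Max": 3}
packet_writes = {"Moved": 5}
clashes = find_clashes(current, packet_reads, packet_writes)
write_only = find_clashes(current, {}, packet_writes)
```

Here Strength has moved from level 7 to 8 since the sender evaluated the expression, so checking reads reports a clash on Strength, while a write-only check would have let the packet through.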

Glossary

UI Event: An action taken in a client by a player (e.g. selecting a menu option, clicking a toolbar button, mouse operations) that generates one or more Commands that need to be communicated to other clients.

Command: A description of a change of state in a component. e.g., changing the value of a Global Property from 2 to 3. Changing the position of a Game Piece.

Command Packet: A set of Commands generated by a single UI event that are combined and sent as a single un-doable entity.

Component: A Vassal game component that has an internal, changeable 'state', where changes in that state need to be communicated to other connected clients.

State: The data that makes up the current status of a Component. For example, the State of a Property will be its current value. The State of a Game Piece will be made up of lots of data, including, for example, its current map and x, y location.

State Evolution: Each component that has State, will have a counter as part of the state that is incremented by 1 each time the State of that Component is updated. The counter changes as the Component evolves over a Game Session and allows clients to test if the state described in a Command is consistent with the State recorded for that Component in the Client.

Client: A Vassal client capable of sending and receiving Command Packets via a Vassal Server during a Game Session, maintaining an internal Game State over time.

Composite State: The collection of all the states of sub-components of a given component. For example, a Game Piece will have some inherent state (e.g. location), but will also have sub-components (e.g. V3 traits) that have their own internal state.

Game State: The Composite State of the current game Session made up of the state of all Components currently existing in the current Game Session.

Game Session: The evolution of a Game State over time as Clients issue Command Packets.