Testing state handoff

Before implementing the DeltaCrdt library to create a replicated data store, I first want to build the interface for the data store. Yesterday, I created the StateHandoff module to play around with cluster node join/leave events, but it doesn’t actually do anything yet.

I create a new test module which will help me think about the public interface for this module and guide my design decsions as I implement the desired behavior.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
defmodule Minotaur.Cluster.StateHandoffTest do
  @moduledoc false

  use ExUnit.Case
  @moduletag :cluster_test

  alias Minotaur.ClusterHelper, as: Cluster
  alias Minotaur.GameEngine.Game
  alias Minotaur.StateHandoff

  @stash_key :the_key
  @stash_value %Game{id: "ABYZ"}

  setup ctx do
    cluster = start_supervised!({Cluster.Manager, ctx})

    [cluster: cluster]
  end

  describe "when state is stashed by one node" do
    setup [:start_cluster, :stash_state]

    test "state can be picked up from another node", ctx do
      res = Cluster.call(ctx.cluster, ctx.node2, StateHandoff, :pickup, [@stash_key])
      assert {:ok, @stash_value} == res
    end
  end

  defp stash_state(%{cluster: cluster, node1: node1}) do
    :ok = Cluster.call(cluster, node1, StateHandoff, :stash, [@stash_key, @stash_value])
  end

  defp start_cluster(%{cluster: cluster}) do
    node1 = Cluster.start_node(cluster, environment: cluster_env())
    node2 = Cluster.start_node(cluster, environment: cluster_env())

    [node1: node1, node2: node2]
  end

  defp cluster_env do
    endpoint_env =
      Application.get_env(:minotaur, MinotaurWeb.Endpoint, [])
      |> Keyword.put(:server, false)

    [minotaur: [{MinotaurWeb.Endpoint, endpoint_env}]]
  end
end

The cluster setup logic is duplicated from my existing Minotaur.Cluster.GameSessionTest module, but I’m not too worried about refactoring this code for reuse at this point.

The test module does not compile because StateHandoff does not yet implement the stash or pickup functions. The test is guiding me to write these functions, but I will only write enough to make the module compile. Once it compiles, I’ll continue to work in small steps to make the test pass.

defmodule Minotaur.Cluster.StateHandoffTest do
  # ...

  def stash(_key, _value) do
    :ok
  end

  def pickup(_key) do
    nil
  end  
end

I now have a compiling, but failing test.

Breaking down the problem

Thinking about what I need to make this test case pass, I realize I can break the problem down further. I can first implement the feature for stashing and picking up state with a single node, then build another feature off that which supports multiple nodes interacting with the store. Splitting the work into two features will allow me to work in smaller steps and focus on less scope of work at a time.

I’ll leave this first test case in a failing state for now since it covers the second feature behavior. I create a new test case to validate a single node can pick up stashed state.

20
21
22
23
24
25
26
27
28
29
30
31
32
describe "when state is stashed by one node" do
  setup [:start_cluster, :stash_state]

  test "state can be picked up from same node", ctx do
    res = Cluster.call(ctx.cluster, ctx.node1, StateHandoff, :pickup, [@stash_key])
    assert {:ok, @stash_value} == res
  end

  test "state can be picked up from another node", ctx do
    res = Cluster.call(ctx.cluster, ctx.node2, StateHandoff, :pickup, [@stash_key])
    assert {:ok, @stash_value} == res
  end
end

I can now focus on implementing DeltaCrdt for a single node and delay worrying about how the state will be synced across the cluster.

I install DeltaCrdt as a dependency in my mix.exs file. Then, I update the StateHandoff module to start a linked CRDT process and store the pid as the GenServer state.

14
15
16
17
18
19
def init(_opts) do
  :net_kernel.monitor_nodes(true, node_type: :visible)
  {:ok, crdt_pid} = DeltaCrdt.start_link(DeltaCrdt.AWLWWMap)

  {:ok, crdt_pid}
end

I update the stash function to call the GenServer with a :stash event and then implement the callback handler for the event.

21
22
23
24
25
26
27
28
29
30
31
32
def stash(key, value) do
  GenServer.call(__MODULE__, {:stash, key, value})
end

def pickup(_key) do
  nil
end

def handle_call({:stash, key, value}, _from, crdt_pid) do
  DeltaCrdt.put(crdt_pid, key, value)
  {:reply, :ok, crdt_pid}
end

I run the test case which fails because the StateHandoff GenServer is not yet started. It is time to add the handoff module to the top-level application supervision tree so it starts up on each node.

The test still fails, but it doesn’t blow up before it reaches the assertion in the test case.

1) test when state is stashed by one node state can be picked up from same node (Minotaur.Cluster.StateHandoffTest)
   test/cluster/state_handoff_test.exs:23
   Assertion with == failed
   code:  assert {:ok, @stash_value} == res
   left:  {:ok, %Minotaur.GameEngine.Game{id: "ABYZ", round_end_time: nil, world: nil, players: nil, registered_actions: %{}, round: 1}}
   right: nil

Now I need to implement the pickup function to retrieve the stored data from the CRDT process.

25
26
27
28
29
30
31
32
33
34
35
36
37
def pickup(key) do
  GenServer.call(__MODULE__, {:pickup, key})
end

def handle_call({:stash, key, value}, _from, crdt_pid) do
  DeltaCrdt.put(crdt_pid, key, value)
  {:reply, :ok, crdt_pid}
end

def handle_call({:pickup, key}, _from, crdt_pid) do
  value = DeltaCrdt.get(crdt_pid, key)
  {:reply, {:ok, value}, crdt_pid}
end

The test case for a single node is now passing!

I don’t have much time today to continue, but I’m happy with the small progress and a clear path for where to go next.