A third option

I am still working to find a reliable way to test process handoff in a cluster. LocalCluster seemed promising, but its internal mechanism for spawning remote nodes uses :slave.start_link which requires converting the calling process node into a distributed node and that is not going to work for the scenario I’m testing. I’m going to need to hack the library code or create my own implementation to support my use case.

Now that I’ve spent some time working with both LocalCluster and ExUnitClusteredCase, I have a fairly good idea of what I need to build my own implementation. I need to start remote nodes with :peer.start_link which will allow linking the remote node processes to the calling process without needing a base distributed node.

As I look online for a few more examples of using the :peer library for spawning nodes, I come across another package, ExUnitCluster, which is a very simple implementation of what I’m trying to build with :peer.start_link. The source code is only about 200 lines of code including empty lines so I don’t even bother adding the package as a dependency and instead just copy the file contents to a new module in my project.

I create a simple test case to demo the module:

test "demo ex_unit_cluster", ctx do
  cluster = start_supervised!({ClusterHelper.Manager, ctx})
  node1 = ClusterHelper.start_node(cluster)
  node2 = ClusterHelper.start_node(cluster)

  IO.inspect(ClusterHelper.call(cluster, node1, Node, :list, []))
  IO.inspect(ClusterHelper.call(cluster, node2, Node, :list, []))end

The resulting logs:

01:01:55.430 [error] Running MinotaurWeb.Endpoint with Bandit 1.5.0 at http failed, port 4002 already in use
01:01:56.143 [error] Running MinotaurWeb.Endpoint with Bandit 1.5.0 at http failed, port 4002 already in use
[:"Elixir_Minotaur_Cluster_GameSessionTest_test_demo_ex_unit_cluster-4-16088@127.0.0.1"]
[:"Elixir_Minotaur_Cluster_GameSessionTest_test_demo_ex_unit_cluster-450-16088@127.0.0.1"]

The result shows me two things. First, the hardcoded endpoint server port is having a conflict just like it was with LocalCluster before overriding the application environment. This will need to be addressed to allow running simultaneous Phoenix application nodes in the test environment. The second thing is that the two spawned nodes are connected to each other and there is not a master node in the cluster like was the case with LocalCluster. This is exactly what I’ve been trying to accomplish!

Since I added the ExUnitCluster source code directly to my project, it is very simple to implement the same environment override option that LocalCluster used. I change the code for loading application environment from this:

for {app, _, _} <- Application.loaded_applications() do
  for {key, val} <- Application.get_all_env(app) do
    peer_call(pid, Application, :put_env, [app, key, val])
  end
end

To this:

for {app, _, _} <- Application.loaded_applications() do
  base = Application.get_all_env(app)

  environment =
    opts
    |> Keyword.get(:environment, [])
    |> Keyword.get(app, [])
    |> Keyword.merge(base, fn _, v, _ -> v end)

  for {key, val} <- environment do
    peer_call(pid, Application, :put_env, [app, key, val])
  end
end

The change is so straightforward, I fork the repo, add some test cases, and open a pull request.

With my cluster test setup in place, I can finally finish this first cluster test case and get back to my original problem!

Wait what did my wife just say? She is going into labor with our second child! The project will have to wait…