Understanding Elixir OTP Applications - Part 1, Distribution
I love Elixir. Coming from Ruby, I find myself in familiar territory with the syntax, yet I also rejoice at some of the things Elixir does differently. I'm doing a series on the principles behind Elixir and Erlang and how they work underneath. This post covers distributed systems and processes.
Elixir is built on top of Erlang, a language developed at Ericsson, the telecom company. When the original engineers designed Erlang's Open Telecom Platform (OTP), they had clear ideas about the characteristics the system would need in order to run their telecom architecture reliably.
OTP was designed to be distributed, fault-tolerant, real-time, highly-available, and hot-swappable.
That's a lot to digest - what does that all mean? In this article I'm going to go over how the distributed portion of this works.
Erlang's and Elixir's concepts are fairly interchangeable. Many Elixir modules are abstractions around Erlang's own abstractions, so you can think of them as the same concept. In this article I'm going to discuss how Elixir's Processes and GenServer work, since they are at the core of its distributed systems.
Communicating with distributed processes through mailboxes
OTP was designed to work on distributed networks. It manages this through its mailbox system. Any Elixir Process can spawn other processes - they share no memory and are therefore completely isolated. To communicate, a Process sends a message to another Process's mailbox, which is identified by its Process ID (PID). Each Process can send messages to other processes and can also listen on its own mailbox to handle incoming messages.
First let's spawn a simple Process that returns a message to the parent's mailbox:
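Here's a minimal sketch of that step. The brief Process.sleep/1 pause and the IO.inspect/2 label are additions for illustration only, so that the message has time to arrive when the snippet is run as a script.

```elixir
# Capture the PID of the current Process so the spawned function can refer to it.
parent = self()

# Spawn an isolated Process that sends {:ok, its_own_pid} back to the parent.
child = spawn(fn -> send(parent, {:ok, self()}) end)

# Assumption: pause briefly so the message has time to be delivered.
Process.sleep(50)

# Peek at the parent's mailbox without removing anything from it.
{:messages, messages} = Process.info(parent, :messages)
IO.inspect(messages, label: "mailbox")
```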
First we're getting the PID for the current Process so we can use that within the anonymous function we're passing to spawn.
Inside the spawned function, we're calling send/2. The first argument is the Process we are sending the message to, and the second is the message itself. In this case we're sending a tuple with :ok and the PID of the spawned Process. The message can be in any format you like - in Elixir it's common to use tuples so we can match against the first element in order to know how to process the second element.
We can inspect the mailbox using Process.info/2 and see that our message has already arrived.
Next let's process those messages:
parent = self()
spawn fn -> send(parent, {:ok, self()}) end
receive do
  {:ok, pid} -> "Got ok from #{inspect pid}"
  _ -> "Unknown message type"
end
The receive/1 macro lets us wait for a message to arrive in the mailbox of the current Process; it blocks until a message matches one of its clauses. Inside the receive block we can pattern match on the shape of the message we received. If the syntax of receive looks unusual, that's because it's a macro defined in Kernel.SpecialForms.
Ok, now let's create a loop and spawn a linked Process. Because the function calls itself again after handling each message, the Process doesn't just exit after the first one. With this setup we could store state or, as in this simple example, just return a message, mimicking a primitive bot.
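Since receive waits until a matching message arrives, it's often paired with an after clause so the caller doesn't wait forever. The sketch below is illustrative only; the one-second timeout and the :got_ok, :unknown_message and :timeout return values are arbitrary choices made for this example.

```elixir
parent = self()
spawn(fn -> send(parent, {:ok, self()}) end)

result =
  receive do
    {:ok, pid} -> {:got_ok, pid}
    _other -> :unknown_message
  after
    # Assumption: give up after 1_000 ms if nothing has arrived.
    1_000 -> :timeout
  end

IO.inspect(result)
```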
defmodule Bot do
  def start_link do
    Task.start_link(fn -> loop() end)
  end

  defp loop() do
    receive do
      {:help, caller} ->
        send caller, "You can ask me what today's date is by passing :date_today"
        loop()
      {:date_today, caller} ->
        send caller, "Today's date is #{Date.utc_today()}"
        loop()
    end
  end
end
{:ok, pid} = Bot.start_link()
send pid, {:help, self()}
flush()
=> "You can ask me what today's date is by passing :date_today"
=> :ok
send pid, {:date_today, self()}
flush()
=> "Today's date is 2020-11-20"
=> :ok
Notice how the spawned task calls loop() again each time a message is processed inside receive. This is the basis for GenServer.
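The same recursive loop can also hold state, as mentioned earlier, by passing the current value as an argument to the next call. Here is a minimal sketch; the Counter module and its message shapes are hypothetical names made up for this illustration.

```elixir
defmodule Counter do
  # Hypothetical example module, not part of Elixir or OTP.
  def start_link(initial \\ 0) do
    Task.start_link(fn -> loop(initial) end)
  end

  defp loop(count) do
    receive do
      :increment ->
        # Recurse with the new state; nothing is mutated in place.
        loop(count + 1)

      {:get, caller} ->
        send(caller, {:count, count})
        loop(count)
    end
  end
end

{:ok, pid} = Counter.start_link()
send(pid, :increment)
send(pid, :increment)
send(pid, {:get, self()})

count =
  receive do
    {:count, value} -> value
  after
    # Assumption: stop waiting after one second.
    1_000 -> :timeout
  end

IO.inspect(count)
```

Messages sent from one Process to another arrive in the order they were sent, so both :increment messages are handled before the :get request and the value printed is 2.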
Understanding the GenServer Interface
When building the OTP libraries, the developers recognized that this pattern would be common, so they created an abstraction to handle looping and message processing for us.
A module that implements GenServer can define various callbacks. We're going to use init/1, handle_cast/2, and handle_call/3. We can then use GenServer to start a Process that loops, waiting for messages sent with GenServer.cast/2 or GenServer.call/3. A cast is asynchronous and triggers our handle_cast/2 callback; a call is synchronous, waits for a reply, and triggers handle_call/3.
Let's create a key-value store, a simplified version of Redis or Memcached. For convenience and readability we'll add start/1, set/3, get/2, and delete/2 functions to wrap the GenServer.cast/2 and GenServer.call/3 calls.
defmodule KeyValueStore do
  use GenServer

  # Client API: thin wrappers around GenServer.cast/2 and GenServer.call/3.

  def start(initial_state \\ %{}) do
    GenServer.start(__MODULE__, initial_state)
  end

  # Asynchronous: returns :ok immediately without waiting for the server.
  def set(server, key, value) do
    GenServer.cast(server, {:set, key, value})
  end

  # Synchronous: blocks until the server replies with the value (or nil).
  def get(server, key) do
    GenServer.call(server, {:get, key})
  end

  def delete(server, key) do
    GenServer.cast(server, {:delete, key})
  end

  # Server callbacks: these run inside the GenServer Process.

  @impl true
  def init(state) do
    {:ok, state}
  end

  @impl true
  def handle_cast({:set, key, value}, state) do
    {:noreply, Map.put(state, key, value)}
  end

  def handle_cast({:delete, key}, state) do
    {:noreply, Map.delete(state, key)}
  end

  @impl true
  def handle_call({:get, key}, _from, state) do
    # Map.get/2 returns nil when the key is missing.
    {:reply, Map.get(state, key), state}
  end
end
Now we can start our KeyValueStore, pass key-value pairs to it, and retrieve them later.
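As a rough sketch of how that looks in practice, the snippet below starts the store, writes a key, reads it back, and then deletes it. The module is repeated in condensed form only so that the snippet runs on its own; the key :language and the value "Elixir" are arbitrary choices for illustration.

```elixir
defmodule KeyValueStore do
  use GenServer

  # Condensed copy of the module from this article.
  def start(initial_state \\ %{}), do: GenServer.start(__MODULE__, initial_state)
  def set(server, key, value), do: GenServer.cast(server, {:set, key, value})
  def get(server, key), do: GenServer.call(server, {:get, key})
  def delete(server, key), do: GenServer.cast(server, {:delete, key})

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_cast({:set, key, value}, state), do: {:noreply, Map.put(state, key, value)}
  def handle_cast({:delete, key}, state), do: {:noreply, Map.delete(state, key)}

  @impl true
  def handle_call({:get, key}, _from, state), do: {:reply, Map.get(state, key), state}
end

{:ok, store} = KeyValueStore.start()

# cast/2 returns :ok right away; the write happens inside the server Process.
:ok = KeyValueStore.set(store, :language, "Elixir")
before_delete = KeyValueStore.get(store, :language)

:ok = KeyValueStore.delete(store, :language)
after_delete = KeyValueStore.get(store, :language)

IO.inspect({before_delete, after_delete})
```

Because messages between two processes are handled in the order they were sent, each get sees the effect of the cast before it, so this prints {"Elixir", nil}.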