Using Akka cluster-sharding and Akka HTTP on Kubernetes
This article describes the implementation of an application that serves data over HTTP, with the data stored in cluster-sharded actors and the application deployed on Kubernetes.
Use case: an application that serves data over HTTP at a high request rate, with latency on the order of 10 ms, while only limited database IOPS are available.
My initial idea was to cache the data in memory, which worked pretty well for some time. But this meant larger instances, because the cached data was duplicated in every instance behind the load balancer. As an alternative, I wanted to use Kubernetes for this problem and do a proof of concept (PoC) of a distributed cache with Akka cluster-sharding and Akka HTTP on Kubernetes.
This article is by no means a complete tutorial to Akka cluster sharding or Kubernetes. It outlines knowledge I gained while doing this PoC. The code for this PoC can be found here.
Let’s dig into the details of this implementation.
To form an Akka Cluster, there needs to be a predefined, ordered set of contact points, often called seed nodes. Each Akka node will try to register itself with the first node from the list of seed nodes. Once all the seed nodes have joined the cluster, any new node can join the cluster programmatically.
The ordered part is important here, because if the first seed node changes frequently, the chance of a split-brain increases. More info about Akka Clustering can be found here.
So, the challenge with Kubernetes was the ordered set of predefined nodes, and this is where StatefulSets and Headless Services come to the rescue.
StatefulSet guarantees stable and ordered pod creation, which satisfies the requirement of our seed nodes, and Headless Service is responsible for their deterministic discovery in the network. So, the first node will be “<application>-0” and the second will be “<application>-1” and so on.
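The DNS name of each seed node follows the standard Kubernetes pattern for StatefulSet pods behind a Headless Service: <pod-name>.<service-name>.<namespace>.svc.cluster.local (assuming the default cluster domain, cluster.local). If the Headless Service is named after the application, the first seed node is reachable at <application>-0.<application>.<namespace>.svc.cluster.local.
- <application> is replaced by the actual name of the application
- <namespace> is replaced by the Kubernetes namespace the application runs in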
1. Start by creating the Kubernetes resources. First, the Headless Service, which is responsible for deterministic discovery of the seed nodes (Pods), can be created using the following manifest:
- port: 2551
Note that “clusterIP” is set to “None”, which indicates that it is a Headless Service.
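A minimal sketch of a complete Headless Service manifest, consistent with the port and the “clusterIP” setting mentioned above, might look like the following. The Service name, the app label, and the port name are assumptions chosen for illustration and are not taken from the PoC.

```yaml
# Sketch only: name, labels, and port name are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: distributed-cache          # also becomes part of each pod's DNS name
  labels:
    app: distributed-cache
spec:
  clusterIP: None                  # "None" makes this a Headless Service
  selector:
    app: distributed-cache         # must match the pod labels of the StatefulSet
  ports:
    - port: 2551                   # Akka remoting port
      name: akka-remoting
```

Because the Service has no cluster IP, a DNS lookup resolves directly to the individual pods, which is what gives each seed node a stable, predictable address.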
2. Create a StatefulSet, which is a manifest for ordered pod creation:
- name: distributed-cache
- name: AKKA_ACTOR_SYSTEM_NAME
- name: AKKA_REMOTING_BIND_PORT
- name: POD_NAME
- name: AKKA_REMOTING_BIND_DOMAIN
- name: AKKA_SEED_NODES
- containerPort: 2551
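A fuller sketch of a StatefulSet manifest built around the container name, environment variables, and container port listed above could look like this. Everything else is an assumption: the image, the replica count, the labels, the "default" namespace, the HTTP port 8080, all environment variable values, and the idea that the application builds its own hostname from POD_NAME and AKKA_REMOTING_BIND_DOMAIN.

```yaml
# Sketch only: image, replicas, namespace, and all env values are hypothetical.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: distributed-cache
spec:
  serviceName: distributed-cache   # name of the governing Headless Service
  replicas: 3
  selector:
    matchLabels:
      app: distributed-cache
  template:
    metadata:
      labels:
        app: distributed-cache
    spec:
      containers:
        - name: distributed-cache
          image: example/distributed-cache:latest   # placeholder image
          env:
            - name: AKKA_ACTOR_SYSTEM_NAME
              value: "distributed-cache"
            - name: AKKA_REMOTING_BIND_PORT
              value: "2551"
            - name: POD_NAME       # injected via the Downward API
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: AKKA_REMOTING_BIND_DOMAIN
              value: "distributed-cache.default.svc.cluster.local"
            - name: AKKA_SEED_NODES   # assumed format: comma-separated host:port
              value: "distributed-cache-0.distributed-cache.default.svc.cluster.local:2551,distributed-cache-1.distributed-cache.default.svc.cluster.local:2551"
          ports:
            - containerPort: 2551  # Akka remoting
            - containerPort: 8080  # assumed HTTP port of the application
```

With this layout the pods are created in order as distributed-cache-0, distributed-cache-1, and so on, so the seed node list can be written down before any pod exists.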
3. Create a service, which will be responsible for redirecting outside internet traffic to pods:
- port: 80
# this needs to match your container port
# DNS name your application should be exposed on
- host: "distributed-cache.com"
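The port, comments, and host entry above suggest a Service paired with an Ingress, so a sketch under that assumption might look like the following. The resource names, the app label, and the container port 8080 are hypothetical, and the Ingress uses the current networking.k8s.io/v1 API, which may differ from the API version used in the PoC.

```yaml
# Sketch only: resource names, labels, and targetPort are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: distributed-cache-http
spec:
  selector:
    app: distributed-cache
  ports:
    - port: 80
      # this needs to match your container port
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: distributed-cache
spec:
  rules:
    # DNS name your application should be exposed on
    - host: "distributed-cache.com"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: distributed-cache-http
                port:
                  number: 80
```

Keeping the HTTP Service separate from the Headless Service means external traffic is load-balanced across all pods, while cluster formation still relies on the per-pod DNS names.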
And the distributed cache is ready to use.
This article covers Akka cluster-sharding on Kubernetes, with its prerequisites of an ordered set of seed nodes and their deterministic discovery in the network, and how these can be met with StatefulSets and Headless Services.
This approach of caching data in a distributed fashion offered the following advantages:
- Fewer database lookups, saving database IOPS
- Efficient usage of resources; fewer instances as a result of no duplication of data
- Lower latencies to serve data