Unified deployment process with Kubernetes for a Kafka connector

Thu, 23 Aug 2018 | Martin Leuthold

THE CHALLENGE

Developing and deploying Kafka connectors involves different kinds of technologies and processes. Before our unified deployment process we had these four building blocks:


Locally, we either used the Confluent package to run a single-node Kafka Connect cluster for testing our code, or we used a local Docker Swarm setup, which included a dockerized ZooKeeper, broker and Kafka Connect cluster. This was the first step in our process.

In our second step, when the code was in a mature state, we needed to deploy it to the cloud. Our cloud infrastructure was either AWS ECS or a self-hosted Kubernetes cluster running on AWS resources. However, this step took a lot of time to get right, because we had to test against "real" cloud resources. Some of the drawbacks were:

UNIFIED DEPLOYMENT

With the company's support for Kubernetes and the decision within the team to use Kubernetes over AWS ECS, we strove for a full Kubernetes development environment that can be used both locally and in the cloud. We therefore ditched Docker Swarm completely, because we didn't have enough resources to maintain it alongside Kubernetes. In addition, going for Kubernetes allowed us to solve some minor issues we had with Kafka connector deployments in the past, e.g. where to put cron jobs? Our Kafka connector deployment consists of six components, which are shown in the picture below:

On the right-hand side we provide the Kafka broker and ZooKeeper as PODs in our Kubernetes environment. These PODs are only present when developing and deploying locally. When the Kafka connector is deployed in the cloud, the configuration points to the existing Kafka cluster, and the PODs for the Kafka broker and ZooKeeper are omitted.
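To illustrate what these local-only PODs can look like, here is a stripped-down sketch of the two manifests; the names, labels, ports and image tags are assumptions for illustration, not our actual deployment code:

# Hypothetical sketch of the local-only ZooKeeper and Kafka broker PODs.
# All names, ports and image tags are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: zookeeper
  labels:
    app: zookeeper
spec:
  containers:
    - name: zookeeper
      image: confluentinc/cp-zookeeper:5.0.0
      env:
        - name: ZOOKEEPER_CLIENT_PORT
          value: "2181"
---
apiVersion: v1
kind: Pod
metadata:
  name: kafka-broker
  labels:
    app: kafka-broker
spec:
  containers:
    - name: kafka-broker
      image: confluentinc/cp-kafka:5.0.0
      env:
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: "zookeeper:2181"                 # assumes a Service named "zookeeper"
        - name: KAFKA_ADVERTISED_LISTENERS
          value: "PLAINTEXT://kafka-broker:9092"  # assumes a Service named "kafka-broker"
        - name: KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR
          value: "1"                              # single broker, so replication factor 1

In the cloud deployment these two manifests are simply not applied; instead, the Kafka Connect worker's bootstrap configuration points to the already existing Kafka cluster.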

On the left-hand side (from left to right) we provide all the PODs necessary to operate a Kafka connector:

A detailed description of each POD is not part of this blog post, but could be the topic of another one (leave a comment if you want to hear more). We just wanted to illustrate how many components it takes to fully deploy a Kafka connector.

TECHNICAL IMPLEMENTATION

To get to the unified deployment process, we used Minikube, the Confluent Docker images for the broker, ZooKeeper and the Kafka connector itself, and a Makefile.

On the left-hand side of the diagram is the Makefile, which holds the targets "build", "deploy" and "undeploy". In addition, for local deployment the target "startMinikube" is provided to launch a local Kubernetes cluster.
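A minimal sketch of such a Makefile, assuming a hypothetical image name and a kubernetes/ directory containing the manifests (neither is taken from our actual repository), could look like this:

# Hypothetical Makefile sketch; image name and manifest path are illustrative.
# Note: recipe lines in a real Makefile must be indented with a tab character.
IMAGE ?= example/kafka-connector:local

startMinikube:
	minikube start --memory 8192 --cpus 4

build:
	# Build the connector image inside Minikube's Docker daemon, so no registry is needed.
	eval $$(minikube docker-env) && docker build -t $(IMAGE) .

deploy:
	kubectl apply -f kubernetes/

undeploy:
	kubectl delete -f kubernetes/

.PHONY: startMinikube build deploy undeploy

Pointing docker build at Minikube's Docker daemon via minikube docker-env is one common way to make a locally built image available to the cluster without pushing it to a registry.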

For testing the deployment code, these steps are required:
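As a rough illustration (the exact steps in our setup may differ), a local test run based on the Makefile targets sketched above could look like this:

make startMinikube   # launch the local Kubernetes cluster
make build           # build the Kafka connector Docker image
make deploy          # create all PODs, including the local broker and ZooKeeper
kubectl get pods     # wait until all PODs reach the Running state
make undeploy        # tear the deployment down again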

SHORT DEVELOPMENT CYCLES

Having a local Kubernetes cluster with our own Kafka cluster allowed us to debug real-world problems like network bandwidth limitations faster.

In one case we had finished our development code and our deployment code and assumed our solution would work in the cloud the same way it worked locally. Unfortunately, the connection between the Kafka connector and the Kafka brokers is much slower in the cloud than locally, because locally we could avoid any real network traffic. So in the cloud we stumbled upon timeout issues (classic!) when loading data from the Kafka connector into the Kafka broker: there the connection between the two was roughly 10x slower, causing the flush of a single file to exceed a five-minute time limit and killing the Kafka connector task.

With our local Kubernetes cluster we were able to simulate this real-world network limitation and test our code locally, resulting in faster optimization cycles for our Kafka connector configuration and more resilient code for handling timeouts.
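To give an idea of the knobs involved, the snippet below shows the kind of settings that typically matter for such flush timeouts with a file-writing sink (e.g. an S3- or HDFS-style connector) and its Connect worker; the properties and values are illustrative assumptions, not the exact configuration we ended up with:

# Hypothetical sink connector settings (values are illustrative)
flush.size=10000              # commit smaller files so a single flush finishes faster
rotate.interval.ms=60000      # time-based file rotation, here after one minute

# Hypothetical worker-level overrides (values are illustrative)
offset.flush.timeout.ms=60000           # allow more time for flushing outstanding data and offsets
consumer.max.poll.interval.ms=600000    # give a slow flush more headroom before the task is considered dead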

RESULT

We were able to unify all the different approaches on Kubernetes and can now develop and test the deployment code locally.

Developing the deployment code locally saved us a lot of time, because the feedback cycles were much shorter:

A side effect was that our way of deployment became cloud-agnostic, i.e. we could easily switch to the Google Cloud or Microsoft Azure platform as long as a Kubernetes cluster is available there.

In the future we want to use Kubernetes extensively, not only for all our Kafka connectors but also for Kafka Streams. We would also like to migrate all our existing Kafka connectors from AWS ECS to Kubernetes.

If you are interested in this topic or want to participate in any way, feel free to take a look at our job offerings and apply for team Lambda. We are hiring!