NETWAYS Blog

Application Management in Kubernetes

Working with Kubernetes means working with YAML. A lot of YAML. Some might say too much YAML. Your standard deployment usually consists of a workload resource, such as a Deployment or DaemonSet, in order to get some nice little Pods running. Remember, a Pod consists of one or more Containers and a Container is just a Linux process with some glitter. In addition, we might need a ConfigMap in order to store some configuration for our workload.

Now that our Pods are running we need some infrastructure around them. A Service resource will give us stable name resolution for the IPs of our Pods (called Endpoints). We should also add some Network Policies, which control traffic flow (think Firewall).
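
As a rough illustration of such a Network Policy, the following sketch allows incoming TCP traffic on port 80 only from Pods that carry a certain label. The policy name and the `role: frontend` label are assumptions made for this example; the `app.kubernetes.io/name: Foobar` selector matches the Service examples used further down.

```yaml
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: foobar-allow-frontend  # hypothetical name
spec:
  # The Pods this policy applies to
  podSelector:
    matchLabels:
      app.kubernetes.io/name: Foobar
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Only Pods with this (hypothetical) label may connect
        - podSelector:
            matchLabels:
              role: frontend
      ports:
        - protocol: TCP
          port: 80
```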

Finally, we might have a Secret containing a TLS key/certificate and an Ingress resource, so that the IngressController knows where to find our workload. Remember, an IngressController is a Reverse Proxy.
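
To give an idea of how the Secret and the Ingress fit together, here is a minimal sketch. The hostname, the Ingress name and the Secret name are assumptions made for this example; the backend points at the `foobar-service` Service used in the examples further down.

```yaml
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: foobar-ingress  # hypothetical name
spec:
  tls:
    - hosts:
        - foobar.example.com  # hypothetical hostname
      # Secret of type kubernetes.io/tls that holds the key and certificate
      secretName: foobar-tls
  rules:
    - host: foobar.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: foobar-service
                port:
                  number: 8080
```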

So that is a lot of YAML we just wrote. A minimal example can easily have more than 200 lines [No citation needed]. Now multiply that by the number of environments we are running (e.g. development, staging, production) et voilà, even more YAML. In order to reduce this repetition some tools have emerged in the Kubernetes ecosystem.

This article will give an overview of some of these tools. The focus will be on tools that use templates, overlays or functions to generate YAML. Tooling that leverages Custom Resources and Controllers is outside the scope of this article, but will be briefly mentioned in the conclusion. Remember, Controllers are just applications that watch the state of the cluster and make changes if the desired state and the current state differ.

Kustomize

One of the quickest ways to minimize repetition is “Kustomize”, since it is built into kubectl. Unlike other solutions, Kustomize does not use templates or a custom DSL. Instead it merges and overlays YAML to produce an output we can send to the Kubernetes API. Meaning, it is YAML all the way down.

In this and all other examples we will create a Service resource with a custom port and a namespace, just to keep it simple. To get started with kustomize we need a base YAML manifest, overlays containing the changes that are to be applied on the base and a kustomization.yml in which we describe what will happen:

ls -l

base/  # Contains the YAML with the Kubernetes Resources
overlays/  # Contains the YAML with which we customize the base Resources
kustomization.yml  # Describes what we want to customize

cat base/service.yml
---
apiVersion: v1
kind: Service
metadata:
  name: foobar-service
spec:
  selector:
    app.kubernetes.io/name: Foobar
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 80

cat base/kustomization.yml
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
commonLabels:
  app: foobar
resources:
  - service.yml

In `overlays/` we now define what we want to change. The metadata/name of the resource is required so that kustomize knows which resource to customize. In this example we change the port from 8080 to 7777 and add a namespace.

cat overlays/dev/kustomization.yml
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
  - ../../base
patchesStrategicMerge:
  - port.yml
namespace: foobar-dev

cat overlays/dev/port.yml
---
apiVersion: v1
kind: Service
metadata:
  name: foobar-service
spec:
  ports:
    - protocol: TCP
      port: 7777
      targetPort: 80

Using `kubectl kustomize` we can now generate the final full manifest with a specific overlay:

kubectl kustomize overlays/dev
...

kubectl kustomize overlays/prod

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: foobar
  name: foobar-service
  namespace: foobar-prod
spec:
  ports:
  - port: 9999
    protocol: TCP
    targetPort: 80
  selector:
    app: foobar
    app.kubernetes.io/name: Foobar

`patchesStrategicMerge` is one of many built-in generators and transformers that can be used to modify the base resources; an extensive list can be found in the documentation. Kustomize offers a quick and easy way to structure YAML manifests, and being built into kubectl lowers the barrier of entry. You can, for example, ship the kustomize code directly with your application.
However, it can be somewhat cumbersome to work with in larger codebases and is not really suited for distribution.
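
The prod overlay behind the `kubectl kustomize overlays/prod` output is not shown above. A minimal sketch of what it might look like follows; the file path is an assumption. The sketch uses the `resources` and `patches` fields, which newer kustomize versions prefer over the deprecated `bases` and `patchesStrategicMerge`, and a JSON patch that replaces the port of the first list entry explicitly.

```yaml
# overlays/prod/kustomization.yml (hypothetical, not part of the original example)
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
namespace: foobar-prod
patches:
  # Select the resource to patch by kind and name
  - target:
      kind: Service
      name: foobar-service
    # JSON patch: replace the port of the first entry in spec.ports
    patch: |-
      - op: replace
        path: /spec/ports/0/port
        value: 9999
```

Addressing the list entry by index keeps the result unambiguous, at the cost of depending on the order of the ports in the base manifest.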

Helm

Helm uses a template language to generate YAML that can then be sent to the Kubernetes API, either by Helm itself or by kubectl. It is widely used in the Kubernetes ecosystem: in the CNCF Cloud Native Survey 2020, 63% of the 1,324 respondents named Helm as their preferred method for packaging.

In order to use Helm we need its CLI and a Helm Chart, which we either create ourselves or take from an existing source. A Helm Chart consists of one or more templates and a Chart.yaml (containing metadata). It can be packaged and hosted for others to download, or be used locally.
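
For orientation, a Chart.yaml for the example used in this section might look like the following sketch. The description and the version numbers are assumptions made for illustration.

```yaml
# foobar/Chart.yaml
---
apiVersion: v2       # Chart API version used by Helm 3
name: foobar
description: A Helm chart for the foobar Service  # hypothetical description
type: application
version: 0.1.0       # Version of the chart itself (hypothetical)
appVersion: "1.0.0"  # Version of the packaged application (hypothetical)
```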

The templates use the Go Template Language, which offers various options to customize data, such as variables, pipelines and functions. For example, we can define variables such as “{{ .Values.service.port }}” that are filled in by the user. The user does this by creating a YAML file with values for the template. A values.yaml file serves as a place for defaults.

cat templates/service.yaml

---
apiVersion: v1
kind: Service
metadata:
  name: {{ include "foobar.fullname" . }}
  labels:
    {{- include "foobar.labels" . | nindent 4 }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: http
      protocol: TCP
      name: http
  selector:
    {{- include "foobar.selectorLabels" . | nindent 4 }}


cat values.yaml  # The defaults
---
service:
  type: ClusterIP
  port: 8080

Custom value files can now be created in order to change what is rendered:

cat prod.yaml  # A custom values file
---
service:
  port: 9999

The Helm CLI can then be used to either render the final YAML manifest or directly talk to the Kubernetes API (creating the resources).

# helm template name-of-the-release path-to-helm-code -f path-to-values-file
helm template foobar-release foobar/ -f foobar/prod.yaml

# Now we can change the port depending on our value file
helm template foobar-release foobar/ -f foobar/prod.yaml | grep port:
   - port: 9999
helm template foobar-release foobar/ -f foobar/dev.yaml | grep port:
   - port: 7777

# Directly create the resources in the cluster
helm install foobar-release foobar/ -f foobar/prod.yaml
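
The `foobar/dev.yaml` values file used in the commands above is not shown. Judging by the rendered port, it presumably looks like this sketch:

```yaml
# foobar/dev.yaml (assumed content, derived from the rendered output above)
---
service:
  port: 7777
```

Keys that are not set in a custom values file fall back to the defaults from values.yaml.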

Getting started with Helm is usually quite simple and rewarding: it quickly reduces the amount of YAML and offers options for packaging one's applications. In my experience, things get tricky once the parameterized templates start growing. Trying to cover every option and edge case while maintaining sanity becomes a balancing act. Nonetheless, when working with Kubernetes you are likely to run into Helm, maybe even without knowing it.

Terraform

Terraform, a tool to write Infrastructure as Code, can also be used to deploy Kubernetes resources.

Terraform uses a DSL named HashiCorp Configuration Language (HCL) to describe the desired infrastructure; this code is then applied with the Terraform CLI. Because it stores the state of what is already deployed, Terraform knows what to change and when, in a declarative way. It uses “Providers” to bundle the logic to talk to a specific API (such as Hetzner, AWS, or Kubernetes).

In this example we will use a very simple Terraform project structure, although a more complex structure is possible. In a central Terraform file we will describe the resource we want to create together with its variables:

provider "kubernetes" {
# Here we could specify how to connect to a cluster
}

# Default variables
variable "port" {
  type        = string
  description = "Port of the Service"
  default     = "8080"
}

# The resources template
resource "kubernetes_service" "foobar" {
  metadata {
    name = "foobar-example"
  }
  spec {
    selector = {
      app = "Foobar"
    }
    session_affinity = "ClientIP"
    port {
      port        = var.port
      target_port = 80
    }

    type = "ClusterIP"
  }
}

The previously defined variables can be used to change the values depending on our needs. Within tfvars files we can create different sets of variables for different deployments of the resource:

# tfvars contain the actual values that we want
cat terraform/prod.tfvars
port = "9999"

cat terraform/dev.tfvars
port = "7777"

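We can now preview the changes with `terraform plan` or talk to the Kubernetes API to directly create the resources with `terraform apply`. Note that, unlike the other tools in this article, Terraform does not render YAML manifests.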

terraform plan -var-file=dev.tfvars

  # kubernetes_service.foobar will be created
  + resource "kubernetes_service" "foobar" {

terraform plan -var-file=prod.tfvars

  # kubernetes_service.foobar will be created
  + resource "kubernetes_service" "foobar" {

Terraform offers a nice solution when you are already using Terraform to manage infrastructure, since you get a one-stop shop for deploying infrastructure and the applications on top. Introducing Terraform into the tool-chain might be a hurdle though, since tools like Helm might be better suited when it comes to working with Kubernetes.

kpt

kpt uses predefined functions to modify YAML manifests, instead of templates or YAML generators/overlays. These functions (e.g. setters and substitutions) are then applied in a declarative way to a base YAML manifest. All this is combined in a kpt Package, containing a Kptfile (metadata and a declaration of which functions are applied) and YAML manifests of Kubernetes resources.

Kptfiles are written in YAML and are similar to Kubernetes resources. Within the Kptfile we declare a pipeline that contains the functions to be applied, for example:

pipeline:
  mutators:
    - image: gcr.io/kpt-fn/set-labels:v0.1
      configMap:
        app: foobar
  validators:
    - image: gcr.io/kpt-fn/kubeval:v0.3

As we can see, the functions can have different types (mutators, validators, etc.) and are shipped inside an OCI Image. This OCI Image contains the code to execute the described function. For example, “set-labels” is a Golang tool which adds a set of labels to resources. You could also write your own.

ls foobar/

Kptfile  # Contains metadata and functions to apply
service.yaml # Kubernetes Manifest

cat Kptfile
---
apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: foobar-example
pipeline:
  mutators:
    - image: gcr.io/kpt-fn/apply-setters:v0.2.0
      configMap:
        # Value for the "kpt-set" comment in service.yaml (example value)
        foobar-service-port: 9999

cat service.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: foobar-service
  labels:
    app: nginx
  annotations:
    internal.kpt.dev/upstream-identifier: '|Service|default|foobar-service'
spec:
  selector:
    app.kubernetes.io/name: Foobar
  ports:
    - protocol: TCP
      port: 8080 # kpt-set: ${foobar-service-port}
      targetPort: 80

Once we are done with our pipeline definition our kpt package is ready to deploy. We can directly talk to the Kubernetes API and deploy or just render the YAML.

kpt fn render --output unwrap
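
The running example of this article also sets a namespace. With kpt this could be done by adding a second mutator to the pipeline, as in the following sketch. The `set-namespace` function and its image tag are assumptions based on the public kpt function catalog, and the values are chosen to match the dev examples of the other sections.

```yaml
# Kptfile (sketch with an additional, assumed set-namespace mutator)
---
apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: foobar-example
pipeline:
  mutators:
    # Fills in the value for the "kpt-set" comment in service.yaml
    - image: gcr.io/kpt-fn/apply-setters:v0.2.0
      configMap:
        foobar-service-port: 7777
    # Sets metadata.namespace on the resources of the package
    - image: gcr.io/kpt-fn/set-namespace:v0.4.1
      configMap:
        namespace: foobar-dev
```

The mutators run in the order in which they are listed.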

Packaging in kpt is based on Git forking. Once we are done with our package we can host it on a git server and users fork the package to use it. Users can then define their own pipelines to customize the manifests.

Note that kpt is still in development and subject to many major changes (the examples here are done with version 1.0.0-beta.24). This is currently my main issue with kpt, otherwise I think it is a very neat solution. Having the option to download and fully extend existing packages is very interesting, all while simply working with YAML manifests.

If you want to learn more about kpt, check out the “Kubernetes Podcast from Google” Episode 99 (https://kubernetespodcast.com/episode/099-kpt/)

Conclusion

The Application Management tools shown here use either templates, generators/overlays or functions to generate YAML. Another approach for managing applications is to use Custom Resources in Kubernetes that are managed by a custom Controller. Examples of such tools would be Acorn (https://acorn.io/) or Ketch (https://www.theketch.io/).

Personally, I do not think that these CRD based tools are an optimal solution.

The complexity they introduce might be fine if the benefit is worth it. However, my real issue is with adding an abstraction for the API. Meaning, you never learn to work with the API. While yes, developers might want to focus on the Application, not the Infrastructure, it is still very likely that you will have to know how the API works (e.g. for monitoring or debugging). When learning how to write/render correct Kubernetes manifests for your application, you also learn how to work with Kubernetes resources.

Furthermore, some of the CRD based tools might have limitations when creating resources (e.g. the Ketch version at the time of writing requires you to create PVCs). Thus, depending on the solution you still need to learn how to write YAML.

That being said, it is true that template or DSL based tools have a similar downside. It can be quite tricky to cover all use-cases in parameterized templates, which can therefore be difficult to maintain. I recommend this design-proposal for further reading on the topic:

https://github.com/kubernetes/design-proposals-archive/blob/main/architecture/declarative-application-management.md#parameterization-pitfalls

The application management ecosystem is still a very active and changing part of Kubernetes. We will probably see a couple of solutions come and go until it stabilizes, if ever. Helm will probably stick around due to its current popularity and the momentum this carries.

I personally will keep an eye on kpt, which seems a very nice balance between complexity and modularity, while still working with YAML. Oh how I love YAML⸮

Markus Opolka
Consultant

After his training as an IT specialist (Fachinformatiker), Markus worked for several years as a system administrator while completing a master's degree in linguistics at FAU. He has been a consultant at NETWAYS since 2022, where he focuses on containers, Kubernetes, Puppet and Ansible. Outside of work you can find him on his bike, on the sofa or on GitHub.

NETWAYS Introduces: Markus Opolka

Name: Markus Opolka

Age: 34

Position at NETWAYS: Consultant

At NETWAYS since: August 2022

How did you come to NETWAYS and what exactly are your responsibilities?

I knew early in my career that I wanted to work with or around open source. Being able to take a look behind the scenes of the software you use every day is exciting, even if you cannot or do not want to participate.

After many years in the IT industry, and being based in Nuremberg, I was of course familiar with NETWAYS, and it seemed like the ideal place to keep working with open software.

I currently work here as a consultant and consult all day long on container-based solutions, GitLab, Kubernetes and Prometheus. I am also responsible for these topics as a trainer.

What do you enjoy most about your work?

What I find very exciting (which is also kind of fun) is how many different IT environments you get to see. Every use case looks a little different, and with every solution you learn something new. What I really enjoy, though, is that this also gives you the chance to contribute bug fixes and features to open source projects. If you have already solved a problem for a customer, why not pass that solution on to the project? Then everyone benefits.

And is there anything better than an accepted pull request in a well-known open source project? If so, I don't even want to know!

What do you do when you are not at NETWAYS?

To relax, I like to put on a podcast or a series. There is so much of both by now, though, that it is hard to keep up. Every now and then I also try to write some code at home. Together with good music on the headphones, that is quite meditative.

Other than that, I tend to spend more time thinking about what I could build in “Cities: Skylines” than actually building it. One day I will surely manage to design the perfect, bicycle-friendly city!

What does the future hold for you?

Hopefully many exciting projects. In the longer term I would like to write more code, and maybe learn another obscure programming language (insert “Hello World in Brainf*ck” here).

Ansible Continuous Deployment without AWX/Tower/AAP

Why Ansible?

Ansible is a configuration management tool to automate tasks in your IT infrastructure. It offers a rather low barrier of entry, when compared to other tools. A local Ansible installation (i.e. on your machine) with SSH access to the infrastructure you want to manage is sufficient for getting started. Meaning, it requires no substantial additions to existing infrastructure (e.g. management servers or agents to install). Ansible also ships with an extensive standard library and has a large selection of modules to extend functionality.

Why Continuous Deployment?

Once a simple Ansible setup is up & running and things start to scale to more contributors, servers or services, it is usually necessary to automate the integration of code changes. By creating one or more central Ansible repositories, we create a single source of truth for our infrastructure. We shift to continuous integration, start testing and verifying changes to the code base.

The next logical step is then to automate the deployment of this single source of truth, to make sure changes are applied in a timely and consistent manner. Infrastructure code that is not deployed on a regular basis becomes riskier to deploy with each passing day: errors are easier to trace back to recent code changes when they are discovered promptly, and we all know that people make undocumented hand-crafted changes that are then overwritten, and all goes up in flames. Thus we want shorter, more frequent cycles and consistent deployments to avoid our infrastructure code becoming stale and risky.

Why not AWX/Tower/AAP?

AWX (aka. Tower, now Ansible Automation Platform) aims to provide a continuous deployment experience for Ansible. Quote:

Ansible Tower(formerly ‘AWX’) is a web-based solution that makes Ansible even more easy to use for IT teams of all kinds.

It offers a wide array of features for all your ‘Ansible at scale’ needs; however, it comes with some strings attached. Namely, it involves management overhead for smaller environments, as it introduces yet another tool to install, learn, update and manage throughout its life cycle. Not only that, but from version 18.0 onward the preferred way to install AWX is the AWX (Kubernetes) Operator. Meaning, preferably, we would need a Kubernetes instance lying around somewhere. Of course, there is always the option to use “unorchestrated” Containers as an alternative, but that comes with its own obstacles.

Installation and management aside, there is also Red Hat’s upstream first approach to consider. Meaning, AWX is the upstream project of Ansible Tower and thus it might not be as ‘stable’. Furthermore, Red Hat does not recommend AWX for production environments. Quote:

The AWX team currently plans to release new builds approximately every 2 weeks. The AWX team will flag certain builds as “stable” at their discretion. Note that the term “stable” does not imply fitness for production usage or any kind of warranty whatsoever.

Obviously, there are alternatives to AWX/Ansible Tower. Rundeck allows for predefined workflows; these jobs can then be triggered from a Web GUI, API, CLI, or by schedule, and it works not just with Ansible. Semaphore offers a simple UI for Ansible to manage projects (environments, inventories, repositories, etc.) and includes an API for launching tasks. Puppet aficionados may already know Foreman, which is a great and battle-tested tool for provisioning machines. You can use the “Foreman Remote Execution” to run your Playbooks and use Ansible callbacks to register new machines in Foreman. Here are some recommended videos on this topic:

– FOSDEM 2020, Foreman meets Ansible: https://www.youtube.com/watch?v=PQYCiJlnpHM
– OSCamp 2019, Ansible automation for Foreman (hosts): https://www.youtube.com/watch?v=Lt0MksAIYuQ

That being said, the premise was to avoid substantially extending any existing infrastructure. Any of the mentioned tools need at least an external database service (e.g. MariaDB, MySQL or PostgreSQL). With that in mind, this article will now describe alternative solutions for continuous deployment without AWX/Ansible Tower. It will show examples using the GitLab CI, however, the presented solutions should be adaptable to various CI/CD solutions.

Ansible Continuous Deployment via the Pipeline

For this article, we will assume a central Ansible Repository on an existing GitLab Server with some GitLab CI Pipeline already in place. Meaning, we might also have some experience with CI jobs in Containers.

Many (if not all) CI/CD solutions feature isolated jobs within Containers, which enables us to quickly spin up predefined execution environments for these jobs (e.g. pre-installed with various tools for testing). Furthermore, it is possible to use specific machines for specific jobs, or place certain machines in different network zones (e.g. a node that triggers something in the production environment could be isolated from the rest).

Given this setup we will now explore two scenarios for Ansible Continuous Deployment via pipeline jobs: one based on SSH and the other based on HTTP (Webhooks).

The example Ansible repository follows a standard pattern and is safely stored in a git repository:

git clone git@git.my-example-company.com:ansible/ansible-configuration.git
cd ansible-configuration/

ls -l
ansible.cfg
playbooks/
roles/
inventory/
collections/
site.yml
requirements.yml

SSH

Since the basis for all Ansible deployments is SSH we will leverage this protocol to deploy our code. Fundamentally, there are two options to achieve this:

– Connect from a pipeline job to a central machine with Ansible already installed, download the code changes there and trigger a playbook
– Run an Ansible playbook directly in a pipeline job (i.e. a Container)

For this example we will generate a specific SSH Keypair that is then used in the pipeline. The public key needs to be added to the `authorized_keys` of any machines we want to connect to. Secrets such as the SSH private key can be managed directly in GitLab (CI Variables) or be stored in an external secret management tool (e.g. Hashicorp Vault). Don’t hardcode secrets in the Ansible code base or CI configuration.

# -t keytype (preferably use ed25519 whenever possible)
# -f output file
# -N passphrase
# -C comment

ssh-keygen -t ed25519 -f ansible-deployment -N '' -C 'Ansible-Deployment-Key'

Option A: via an Ansible machine

In this scenario, we connect from a CI job in the pipeline to a machine with Ansible already installed. On this machine we will clone the Ansible configuration and trigger a playbook. This article will refer to this machine as the ‘central Ansible node’. Obviously, a more complex infrastructure might need more of these machines (e.g. one per network zone).

First, we need to copy the previously generated SSH public key onto the central Ansible node, so that we can connect from the GitLab CI job. Second, we require a working Ansible setup on this node. Please note that a detailed installation process will not be explained in this article, since the focus lies on the CI/CD part. We assume that this node has a dedicated user for Ansible that is able to successfully run the Ansible code.

# Copy public key for deployment on the central Ansible node
# (the trailing colon copies the file into the remote home directory)
scp ansible-deployment.pub ansible@central-ansible-node.local:
ssh ansible@central-ansible-node.local

# Authorize the public key for outside connections
cat 'ansible-deployment.pub' >> ~/.ssh/authorized_keys

# Install Ansible
pip3 install --user ansible # or ansible==version
# Further setup like inventory creation or dependency installation happens here...

At this point we assume that we can connect to our infrastructure and run Ansible playbooks at our leisure. Next we will create a GitLab CI job which does the following:

  • Retrieve the previously generated SSH private key from our secrets, so that we can connect to the central Ansible node
  • Connect to the central Ansible node and create a temporary directory to isolate each pipeline job
  • Clone the repository into this directory. We will use GitLab’s CI job tokens for this
  • Run a playbook via SSH on the central Ansible node

cat .gitlab-ci.yml
---
stages:
  - deploy

variables:
  CENTRAL_ANSIBLE_NODE: central-ansible-node.local
  # Or you can provide a ssh_known_hosts file
  ANSIBLE_HOST_KEY_CHECKING: "False"

deploy-ansible:
  # An Alpine image is used here since busybox ships no SSH client
  image: docker.io/alpine:latest
  stage: deploy
  before_script:
    - apk add --no-cache openssh-client
    - mkdir -p ~/.ssh
    # SSH_KNOWN_HOSTS is a CI variable to make sure we connect to the correct node
    - echo "$SSH_KNOWN_HOSTS" > ~/.ssh/known_hosts
    # The SSH private key is a CI variable (quoted to preserve its line breaks)
    - echo "$SSH_PRIVATE_KEY" > id_ed25519
    - chmod 400 id_ed25519
  script:
    # Connect as the dedicated "ansible" user whose authorized_keys holds the public key
    - TMPDIR=$(ssh -i id_ed25519 ansible@$CENTRAL_ANSIBLE_NODE "mktemp -d")
    - ssh -i id_ed25519 ansible@$CENTRAL_ANSIBLE_NODE "git clone https://gitlab-ci-token:${CI_JOB_TOKEN}@git.my-example-company.com/ansible/ansible-configuration.git $TMPDIR"
    - ssh -i id_ed25519 ansible@$CENTRAL_ANSIBLE_NODE "ansible-playbook $TMPDIR/site.yml"

This basic example can be extended in many ways. For example, CI variables could be used to control which Ansible playbook is executed, change which hosts or tags are included. Furthermore GitLab can also trigger jobs on a schedule. Some of the benefits of this approach are that it is rather easy to set up since it mirrors the local execution workflow, plus the deployment can be debugged and triggered on the central Ansible node.
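
As a sketch of these extensions, the job could read the playbook, the host limit and the tags from CI variables, and additionally run for scheduled pipelines. The variable names `ANSIBLE_PLAYBOOK`, `ANSIBLE_LIMIT` and `ANSIBLE_TAGS` are assumptions made for this example; `CI_PIPELINE_SOURCE`, `CI_COMMIT_BRANCH` and `CI_DEFAULT_BRANCH` are predefined GitLab variables. The image, the key setup and the clone steps are the same as in the job above and are omitted here.

```yaml
variables:
  # Defaults that can be overridden per pipeline or per schedule
  ANSIBLE_PLAYBOOK: site.yml
  ANSIBLE_LIMIT: all
  ANSIBLE_TAGS: all

deploy-ansible:
  stage: deploy
  rules:
    # Run for scheduled pipelines ...
    - if: $CI_PIPELINE_SOURCE == "schedule"
    # ... and for changes on the default branch
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  script:
    # mktemp and git clone steps as in the job above
    - ssh -i id_ed25519 ansible@$CENTRAL_ANSIBLE_NODE "ansible-playbook --limit $ANSIBLE_LIMIT --tags $ANSIBLE_TAGS $TMPDIR/$ANSIBLE_PLAYBOOK"
```

With these defaults the job behaves like the basic example, since `all` matches every host and every tag.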

However, we now have a central Ansible node to manage and we might need several in different network zones. Additionally the `mktemp` solution is a bit hacky and might need a garbage collection job (e.g. `tmpreaper`). The next solution will alleviate some of these issues.

Option B: directly via a Pipeline

In this scenario, Ansible is executed directly in the CI pipeline job (i.e. a Container). It is recommended to use a custom pre-built Ansible Container Image, to make the jobs faster and more consistent. This Image may contain a specific Ansible version and further tools required for the given code. The Image can be stored in the GitLab Container Registry. Building and storing Container Images is outside the scope of this article. Here’s a small example of what it might look like:

cat Dockerfile.ansible.example

FROM docker.io/python:3-alpine
# Ansible needs an SSH client to reach the managed hosts
RUN apk add --no-cache openssh-client
RUN pip install --no-cache-dir ansible
# ... Install further tools or infrastructure specifics here
# The image will be stored at registry.my-example-company.com/ansible/ansible-configuration/runner:latest

cat .gitlab-ci.yml
---
stages:
  - deploy

variables:
  # Or you can provide a ssh_known_hosts file
  ANSIBLE_HOST_KEY_CHECKING: "False"

deploy-ansible:
  image: registry.my-example-company.com/ansible/ansible-configuration/runner:latest
  stage: deploy
  before_script:
    # The SSH private key is a CI variable (quoted to preserve its line breaks)
    - echo "$SSH_PRIVATE_KEY" > id_ed25519
    - chmod 400 id_ed25519
  script:
    - ansible-playbook --private-key id_ed25519 site.yml

This removes the need for a central Ansible node and the need for external garbage collection, since these CI jobs are ephemeral by default. That being said, if we have a more complex network setup we might need runners in these zones and a way to control which job is executed where.
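
One way to control which job is executed where is GitLab's `tags` keyword, which restricts a job to runners registered with a matching tag. The following sketch assumes a runner inside a network zone that was registered with the hypothetical tag `dmz`, and an inventory group of the same name.

```yaml
deploy-ansible-dmz:
  image: registry.my-example-company.com/ansible/ansible-configuration/runner:latest
  stage: deploy
  # Only runners registered with this tag pick up the job
  tags:
    - dmz  # hypothetical runner tag
  before_script:
    - echo "$SSH_PRIVATE_KEY" > id_ed25519
    - chmod 400 id_ed25519
  script:
    # Restrict the run to the hosts of this zone (hypothetical inventory group)
    - ansible-playbook --private-key id_ed25519 --limit dmz site.yml
```

A similar job with a different tag and limit could then be added for each zone.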

HTTP (Webhooks)

In this scenario, we set up another central Ansible node that will run the playbooks; however, there won’t be an SSH connection from the CI job. Instead we will trigger a webhook on this central Ansible node. While this scenario is more complex, it offers some benefits when compared to the previously discussed options.

Since there are several ways to implement incoming webhooks, we will not look at a specific implementation but discuss the concept. Interestingly enough, a webhook-based feature is currently in developer preview to be provided by Ansible: Event-Driven Ansible provides a webhook service that can trigger Playbooks.

In this example we have a service providing webhooks running on central-ansible-node.local on port 8080. This service is configured to run Ansible with various options which we can pass via an HTTP POST request. This request will contain certain data that controls the Ansible playbook.

cat trigger-site-yaml.json
{
  "token": "$WEBHOOK_TOKEN",
  "playbook": "site.yml",
  "limit": "staging"
}

cat .gitlab-ci.yml
---
stages:
  - deploy

variables:
  CENTRAL_ANSIBLE_NODE: central-ansible-node.local:8080

deploy-ansible:
  image: docker.io/alpine:latest
  stage: deploy
  before_script:
    - apk add curl gettext
  script:
    # Replace the $WEBHOOK_TOKEN placeholder in the file with the real value from the CI variables
    - envsubst < trigger-site-yaml.json > trigger-site-yaml.run.json
    - curl -X POST -H "Content-Type:application/json" -d @trigger-site-yaml.run.json $CENTRAL_ANSIBLE_NODE

From a security standpoint we remove the need for reachable SSH ports: the central Ansible node now just accepts HTTP (or specific HTTP methods) secured by Tokens. Furthermore, there now is a layer between our CI jobs and the Ansible playbooks which can be used to validate requests.

That being said, this extra layer could also be seen as a hurdle that might break. And besides the central Ansible node we now need to manage a service that provides these webhooks. However, in the future Event-Driven Ansible might alleviate some of these issues.
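
For illustration, a rulebook for Event-Driven Ansible that listens on a webhook and runs a playbook might look roughly like the following sketch. It assumes the `ansible.eda.webhook` source plugin and the `run_playbook` action as described in the Event-Driven Ansible documentation. The rule names and the condition are assumptions made for this example, and token validation is left out.

```yaml
---
- name: Listen for deployment webhooks  # hypothetical ruleset name
  hosts: all
  sources:
    # Opens an HTTP listener that turns incoming POST requests into events
    - ansible.eda.webhook:
        host: 0.0.0.0
        port: 8080
  rules:
    - name: Deploy the site playbook  # hypothetical rule name
      # The JSON body of the request is available as event.payload
      condition: event.payload.playbook == "site.yml"
      action:
        run_playbook:
          name: site.yml
```

With a rulebook like this, the decision about which playbook may be triggered lives on the central Ansible node and not in the CI job.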

Conclusions

Deploying Ansible is quite flexible due to its simple operational model based on SSH. As we have seen, there are some low-effort alternatives to AWX/Tower that can be applied in various use cases. However, at some point there is a maintainability tradeoff. Meaning, even though AWX/Tower might not appear as stable or is sometimes tricky to operate, once an environment is large enough it might be a better option than custom creations. Probably not a satisfying conclusion for an article named “without AWX/Tower”, I agree.

Foreman presents an interesting alternative due to its myriad of other features that you get with an installation. Finally, Event-Driven Ansible could be a very promising webhook-based solution when it comes to automated deployments. Starting simple and then pivoting to a more complex system is always an option.
