DEV stories: Icinga Core trainees in the making

DEV stories: Icinga Core trainees in the making

When my dev leads approached me with the idea to guide a trainee in the Icinga core topic, I was like … wow, sounds interesting and finally a chance to share my knowledge.

But where should I start and how can it be organized with my ongoing projects?

 

Prepare for the unexpected

My view on the code and how things are organized changed quite a bit since then. You cannot expect things, nor should you throw everything you know into the pool. While working on Icinga 2.11, I’ve collected ideas and issues for moving Henrik into these topics. In addition to that, we’ve improved on ticket and documentation quality, including technical concepts and much more.

My colleagues know me as the “Where is the test protocol?” PR reviewer. Also, reliable configuration and steps on reproducing the issues and problems are highly encouraged. Why? The past has proven that little to zero content in ticket makes debugging and problem analysis really hard. Knowledge transfer is an investment in the future of both NETWAYS and Icinga.

Some say, that documentation would replace their job. My mission is to document everything for my colleagues to make their life easier. Later on, they contribute to fixing bugs and implement new features while I’m moving into project management, architecture and future trainees. They learn what I know, especially “fresh” trainees can be challenged to learn new things and don’t necessarily need to change habits.

 

Clear instructions?

At the beginning, yes. Henrik started with the C++ basics, a really old book from my studies in 2002. C++11 is a thing here, still, the real “old fashioned” basics with short examples and feedback workshops prove the rule. Later on, we went for an online course and our own requirements.

Since Icinga 2 is a complex tool with an even more complex source code, I decided to not immediately throw Henrik into it. Instead, we had the chance to work with the Tinkerforge weather station. This follows the evaluation from our Startupdays and new product inside the NETWAYS shop. The instructions were simple, but not so detailed:

  • Put the components together and learn about the main functionality. This is where the “learn by playing” feeling helps a lot.
  • Explore the online documentation and learn how to use the API bindings to program the Tinkerforge bricks and sensors.
  • Use the existing check_tinkerforge plugin written in Python to see how it works
  • Write C++ code which talks to the API and fetches sensor data

Documentation, a blog post and keeping sales updated in an RT ticket was also part of the project. Having learned about the requirements, totally new environment and communication with multiple teams, this paves the way for future development projects.

 

Freedom

In order to debug and analyse problems or implement new features, we need to first understand the overall functionality. Starting a new project allows for own code, experiences, feedback, refactoring and what not. Icinga 2 as core has grown since early 2012, so it is key to understand the components and how everything is put together.

Where to start? Yep, visit the official Icinga trainings for a sound base. Then start with some Icinga cluster scenarios, with just pointing to the docs. This takes a while to understand, so Henrik was granted two weeks to fully install, test and prepare his findings.

With the freedom provided, and the lessons learned about documentation and feedback, I was surprised with a Powerpoint presentation on the Icinga cluster exercises. Essentially we discussed everything in the main area in our new NETWAYS office. A big flat screen and the chance that colleagues stop by and listen or even add to the discussion. Henrik was so inspired to write a blogpost on TLS.

 

Focus on knowledge

In the latest session, I decided to prove things and did throw a lot of Icinga DSL exercises at Henrik, also with the main question – what’s a DSL anyways?

Many things in the Icinga DSL are hidden gems, with the base parts documented, but missing the bits on how to build them together. From my experience, you cannot explain them in one shot, specific user and customer questions or debugging sessions enforce you to put them together. At the point when lambda functions with callbacks were on the horizon, a 5 hour drive through the DSL ended. Can you explain the following snippet? 😘

object HostGroup "hg1" { assign where host.check_command == "dummy" }
object HostGroup "hg2" {
  assign where true
//  assign where host.name in Cities
 }

object Host "runtime" {
  check_command = "dummy"
  check_interval = 5s
  retry_interval = 5s

  vars.dummy_text = {{
    var mygroup = "hg2"
    var mylog = "henrik"
  //  var nodes = get_objects(Host).filter(node => mygroup in node.groups)

    f = function (node) use(mygroup, mylog) {
      log(LogCritical, "Filterfunc", mylog+node.name )
      return mygroup in node.groups
    }
    var nodes = get_objects(Host).filter(f)

    var nodenames = nodes.map(n => n.name)
    return nodenames.join(",")
   // return Json.encode(nodes.map(n => n.name))
  }}

}

We also did some live coding in the DSL, this is now a new howto on the Icinga community channels: “DSL: Count check plugin usage from service checks“. Maybe we’ll offer an Icinga DSL workshop in the future. This is where I want our trainees become an active part, since it also involves programming knowledge and building the Icinga architecture.

 

Code?

Henrik’s first PR was an isolated request by myself, with executing a check in-memory instead of forking a plugin process. We had drawn lots of pictures already how check execution generally works, including the macro resolver. The first PR approved and merged. What a feeling.

We didn’t stop there – our NETWAYS trainees are working together with creating PRs all over the Icinga project. Henrik had the chance to review a PR from Alex, and also merge it. Slowly granting responsibility and trust is key.

Thanks to trainees asking about this, Icinga 2 now also got a style guide. This includes modern programming techniques such as “auto”, lambda functions and function doc headers shown below.

/**
 * Main interface for notification type to string representation.
 *
 * @param type Notification type enum (int)
 * @return Type as string. Returns empty if not found.
 */
String Notification::NotificationTypeToString(NotificationType type)
{
	auto typeMap = Notification::m_TypeFilterMap;

	auto it = std::find_if(typeMap.begin(), typeMap.end(),
		[&type](const std::pair<String, int>& p) {
			return p.second == type;
	});

	if (it == typeMap.end())
		return Empty;

	return it->first;
}

 

 

Learn and improve

There are many more things in Icinga: The config compiler itself with AST expressions, the newly written network stack including the REST API parts, feature integration with Graphite or Elastic and even more. We’ll cover these topics with future exercises and workshops.

While Henrik is in school, I’m working on Icinga 2.11 with our core team. Thus far, the new release offers improved docs for future trainees and developers:

This also includes evaluating new technologies, writing unit tests and planning code rewrites and/or improvements. Here’s some ideas for future pair programming sessions:

  • Boost.DateTime instead of using C-ish APIs for date and time manipulation. This blocks other ideas with timezones for TimePeriods, etc.
  • DSL methods to print values and retrieve external data
  • Metric enhancements and status endpoints

 

Trainees rock your world

Treat them as colleagues, listen to their questions and see them “grow up”. I admit it, I am sometimes really tired in the evening after talking all day long. On the other day, it makes me smile to see a ready-to-merge pull request or a presentation with own ideas inspired by an old senior dev. This makes me a better person, every day.

I’m looking forward to September with our two new DEV trainees joining our adventure. We are always searching for passionate developers, so why not immediately dive into the above with us? 🙂 Promise, it will be fun with #lifeatnetways and #drageekeksi ❤️

Michael Friedrich
Michael Friedrich
Senior Developer

Michael ist seit vielen Jahren Icinga-Entwickler und hat sich Ende 2012 in das Abenteuer NETWAYS gewagt. Ein Umzug von Wien nach Nürnberg mit der Vorliebe, österreichische Köstlichkeiten zu importieren - so mancher Kollege verzweifelt an den süchtig machenden Dragee-Keksi und der Linzer Torte. Oder schlicht am österreichischen Dialekt der gerne mit Thomas im Büro intensiviert wird ("Jo eh."). Wenn sich Michael mal nicht in der Community helfend meldet, arbeitet er am nächsten LEGO-Projekt oder geniesst...

Modern C++ programming: Coroutines with Boost

(c) https://xkcd.com/303/

We’re rewriting our network stack in Icinga 2.11 in order to to eliminate bugs with timeouts, connection problems, improve the overall performance. Last but not least, we want to use modern library code instead of many thousands of lines of custom written code. More details can be found in this GitHub issue.

From a developer’s point of view, we’ve evaluated different libraries and frameworks before deciding on a possible solution. Alex created several PoCs and already did a deep-dive into several Boost libraries and modern application programming. This really is a challenge for me, keeping up with the new standards and possibilities. Always learning, always improving, so I had a read on the weekend in “Boost C++ Application Development Cookbook – Second Edition“.

One of things which are quite resource consuming in Icinga 2 Core is multi threading with locks, waits and context switching. The more threads you spawn and manage, the more work needs to be done in the Kernel, especially on (embedded) hardware with a single CPU core. Jean already shared insights how Go solves this with Goroutines, now I am looking into Coroutines in C++.

 

Coroutine – what’s that?

Typically, a function in a thread runs, waits for locks, and later returns, freeing the locked resource. What if such a function could be suspended at certain points, and continue once there’s resources available again? The benefit would also be that wait times for locks are reduced.

Boost Coroutine as library provides this functionality. Whenever a function is suspended, its frame is put onto the stack. At a later point, it is then resumed. In the background, the Kernel is not needed for context switching as only stack pointers are stored. This is done with Boost’s Context library which uses hardware registers, and is not portable. Some architectures don’t support it yet (like Sparc).

Boost.Context is a foundational library that provides a sort of cooperative multitasking on a single thread. By providing an abstraction of the current execution state in the current thread, including the stack (with local variables) and stack pointer, all registers and CPU flags, and the instruction pointer, a execution context represents a specific point in the application’s execution path. This is useful for building higher-level abstractions, like coroutinescooperative threads (userland threads) or an equivalent to C# keyword yield in C++.

callcc()/continuation provides the means to suspend the current execution path and to transfer execution control, thereby permitting another context to run on the current thread. This state full transfer mechanism enables a context to suspend execution from within nested functions and, later, to resume from where it was suspended. While the execution path represented by a continuation only runs on a single thread, it can be migrated to another thread at any given time.

context switch between threads requires system calls (involving the OS kernel), which can cost more than thousand CPU cycles on x86 CPUs. By contrast, transferring control vias callcc()/continuation requires only few CPU cycles because it does not involve system calls as it is done within a single thread.

TL;DR – in the way we write our code, we can suspend function calls and free resources for other functions requiring it, without typical thread context switches enforced by the Kernel. A more deep-dive into Coroutines, await and concurrency can be found in this presentation and this blog post.

 

A simple Example

$ vim coroutine.cpp

#include <boost/coroutine/all.hpp>
#include <iostream>

using namespace boost::coroutines;

void coro(coroutine::push_type &yield)
{
        std::cout << "[coro]: Helloooooooooo" << std::endl;
        /* Suspend here, wait for resume. */
        yield();
        std::cout << "[coro]: Just awesome, this coroutine " << std::endl;
}

int main()
{
        coroutine::pull_type resume{coro};
        /* coro is called once, and returns here. */

        std::cout << "[main]: ....... " << std::endl; //flush here

        /* Now resume the coro. */
        resume();

        std::cout << "[main]: here at NETWAYS! :)" << std::endl;
}

 

Build it

On macOS, you can install Boost like this, Linux and Windows require some more effort listed in the Icinga development docs). You’ll also need CMake and g++/clang as build tool.

brew install ccache boost cmake 

Add the following CMakeLists.txt file into the same directory:

$ vim CMakeLists.txt

cmake_minimum_required(VERSION 2.8.8)
set(BOOST_MIN_VERSION "1.66.0")

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++0x")

find_package(Boost ${BOOST_MIN_VERSION} COMPONENTS context coroutine date_time thread system program_options regex REQUIRED)

# Boost.Coroutine2 (the successor of Boost.Coroutine)
# (1) doesn't even exist in old Boost versions and
# (2) isn't supported by ASIO, yet.
add_definitions(-DBOOST_COROUTINES_NO_DEPRECATION_WARNING)

link_directories(${Boost_LIBRARY_DIRS})
include_directories(${Boost_INCLUDE_DIRS})

set(base_DEPS ${CMAKE_DL_LIBS} ${Boost_LIBRARIES})

set(base_SOURCES
  coroutine.cpp
)

add_executable(coroutine
        ${base_SOURCES}
)

target_link_libraries(coroutine ${base_DEPS})

set_target_properties(
        coroutine PROPERTIES
        FOLDER Bin
        OUTPUT_NAME boost-coroutine
)

Next, run CMake to check for the requirements and invoke make to build the project afters.

cmake .
make

 

Run and understand the program

$ ./boost-coroutine
[coro]: Helloooooooooo
[main]: .......
[coro]: Just awesome, this coroutine
[main]: here at NETWAYS! :)

Now, what exactly happened here? The Boost coroutine library allows us to specify the “push_type” where this functions should be suspended, after reaching this point, a subsequent call to “yield()” is required to resume this function.

void coro(coroutine::push_type &yield)

Up until “yield()”, the function logs the first line to stdout.

The first call happens inside the “main()” function, by specifying the pull_type and directly calling the function as coroutine. The pull_type called “resume()” (free form naming!) must then be explicitly invoked in order to resume the coroutine.

coroutine::pull_type resume{coro};

After the first line is logged from the coroutine, it stops before “yield()”. The main function logs the second line.

[coro]: Helloooooooooo
[main]: .......

Now comes the fun part – let’s resume the coroutine. It doesn’t start again, but the function’s progress is stored as stack pointer, targeting “yield()”. Exactly this resume function is called with “resume()”.

        /* Now resume the coro. */
        resume();

That being said, there’s more to log inside the coroutine.

[coro]: Just awesome, this coroutine

After that, it reaches the end and returns to the main function. That one logs the last line and terminates.

[main]: here at NETWAYS! :)

Without a coroutine, such synchronisation between functions and threads would need waits, condition variables and lock guards.

 

Icinga and Coroutines

With Boost ASIO, the spawn() method wraps coroutines on a higher level and hides the strand required. This is used in the current code and binds a function into its scope. We’re using lambda functions available with C++11 in most locations.

The following example implements the server side of our API waiting for new connections. An endless loop listens for incoming connections with “server->async_accept()”.

Then comes the tricky part:

  • 2.9 and before spawned a thread for each connection. Lots of threads, context switches and memory leaks with stalled connections.
  • 2.10 implemented a thread pool, managing the resources. Handling the client including asynchronous TLS handshakes are slower, and still many context switches ahead between multiple connections until everything stalls.
  • 2.11 spawns a coroutine which handles the client connection. The yield_context is required to suspend/resume the function inside.

 

void ApiListener::ListenerCoroutineProc(boost::asio::yield_context yc, const std::shared_ptr& server, const std::shared_ptr& sslContext)
{
	namespace asio = boost::asio;

	auto& io (server->get_io_service());

	for (;;) {
		try {
			auto sslConn (std::make_shared(io, *sslContext));

			server->async_accept(sslConn->lowest_layer(), yc);

			asio::spawn(io, [this, sslConn](asio::yield_context yc) { NewClientHandler(yc, sslConn, String(), RoleServer); });
		} catch (const std::exception& ex) {
			Log(LogCritical, "ApiListener")
				<< "Cannot accept new connection: " << DiagnosticInformation(ex, false);
		}
	}
}

The client handling is done in “NewClientHandlerInternal()” which follows this flow:

  • Asynchronous TLS handshake using the yield context (Boost does context switches for us), Boost ASIO internally suspends functions.
    • TLS Shutdown if needed (again, yield_context handled by Boost ASIO)
  • JSON-RPC client
    • Send hello message (and use the context for coroutines)

And again, this is the IOBoundWork done here. For the more CPU hungry tasks, we’re using a different CpuBoundWork pool which again spawns another coroutine. For JSON-RPC clients this mainly affects syncing runtime objects, config files and the replay logs.

Generally speaking, we’ve replaced our custom thread pool and message queues for IO handling with the power of Boost ASIO, Coroutines and Context thus far.

 

What’s next?

After finalizing the implementation, testing and benchmarks are on the schedule – snapshot packages are already available for you at packages.icinga.com. Coroutines will certainly help embedded devices with low CPU power to run even faster with many network connections.

Boost ASIO is not yet compatible with Coroutine2. Once it is, the next shift in modernizing our code is planned. Up until then, there are more Boost features available with the move from 1.53 to 1.66. Our developers are hard at work with implementing bug fixes, features and learning all the good things.

There’s many cool things under the hood with Icinga 2 Core. If you want to learn more and become a future maintainer, join our adventure! 🙂

Michael Friedrich
Michael Friedrich
Senior Developer

Michael ist seit vielen Jahren Icinga-Entwickler und hat sich Ende 2012 in das Abenteuer NETWAYS gewagt. Ein Umzug von Wien nach Nürnberg mit der Vorliebe, österreichische Köstlichkeiten zu importieren - so mancher Kollege verzweifelt an den süchtig machenden Dragee-Keksi und der Linzer Torte. Oder schlicht am österreichischen Dialekt der gerne mit Thomas im Büro intensiviert wird ("Jo eh."). Wenn sich Michael mal nicht in der Community helfend meldet, arbeitet er am nächsten LEGO-Projekt oder geniesst...

CGo. Halb C – halb Go.

In meinem letzten Blog-Post, The way to Go, habe ich bereits einige Vorteile der Programmiersprache Go aufgezeigt. Und als ob es nicht schon genug wäre, dass man beim Umstieg auf Go nicht gleich das Kind mit dem Bade ausschütten muss: Ein einzelnes Programm muss nicht mal komplett in Go geschrieben sein…

Multilingual unterwegs

Ich arbeite für einen deutschen IT-Dienstleister, aber beim recherchieren von Lösungen für Kundenprobleme muss ich die russischen Quellen nicht erst ins Deutsche übersetzen, sondern kann direkt mit ihnen arbeiten. Genau so kann bspw. ein Go-Programm direkt von C-Bibliotheken Gebrauch machen – und umgekehrt. Vorausgesetzt, die Schnittstellen beschränken sich auf den kleinsten gemeinsamen Nenner.

Was wäre das auch wirtschaftlich für ein Fass ohne Boden wenn sämtliche benötigte Bibliotheken erst in Go neu geschrieben werden müssten. Umso mehr wundert es mich, dass so manche Datenbank-Treiber doch komplett neu geschrieben wurden…

Good programmers know what to write.
Great ones know what to rewrite (and reuse).

Eric Steven Raymond, The Cathedral and the Bazaar

Multilingual in der Praxis

Ich bin zwar kein Fernsehkoch, aber habe trotzdem schon mal was vorbereitet – eine extrem abgespeckte Version von go-linux-sensors:

package go_linux_sensors

/*
#include <stdio.h>
#include <string.h>
#include <sensors/sensors.h>
#include <sensors/error.h>

typedef const char cchar;

#cgo LDFLAGS: -lsensors
*/
import "C"

import (
	"reflect"
	"unsafe"
)

type Error struct {
	errnr C.int
}

func (e Error) Error() string {
	return cStr2str(C.sensors_strerror(e.errnr))
}

func Init() error {
	if err := C.sensors_init((*C.FILE)(nil)); err != C.int(0) {
		return Error{err}
	}

	return nil
}

func Cleanup() {
	C.sensors_cleanup()
}

func GetLibsensorsVersion() string {
	return cStr2str(C.libsensors_version)
}

func cStr2str(cStr *C.cchar) string {
	slen := int(C.strlen(cStr))

	bridge := reflect.SliceHeader{
		Data: uintptr(unsafe.Pointer(cStr)),
		Len:  slen,
		Cap:  slen,
	}

	return string(*(*[]byte)(unsafe.Pointer(&bridge)))
}

Diese schlägt eine Brücke zu libsensors, um die Hardware-Sensoren auszulesen.

Direkt nach dem Package folgt auch schon der Import des ominösen Pakets “C”. Dieses enthält sämtliche C-Standard-Datentypen und alle Datentypen und Funktionen, die über Header-Dateien im Kommentar direkt darüber eingebunden wurden. Ja, richtig, in diesem “magischen” Kommentar lässt sich nahezu beliebiger C-Code unterbringen, um auf jenen im Go-Code zuzugreifen. Des weiteren lassen sich die tatsächlichen Bibliotheken mit “#cgo LDFLAGS:” automatisch linken. Bei höheren Mengen an Code/Includes bevorzuge ich persönlich eine extra Header-Datei statt einem langen Kommentar.

Etwas weiter unten ruft Init() auch schon sensors_init() auf – als wäre es eine gewöhnliche Go-Funktion im Paket C. Und auch die Datentypen lassen sich verwenden als seien es die eigenen – wovon Error fleißig Gebrauch macht.

Lediglich cStr2str() macht einen etwas unorthodoxen Eindruck… aber hey – es funktioniert. 😉

Jeder gesparte Euro zählt…

… und jede wiederverwendete Zeile Code. Wir bei NETWAYS wissen das zu schätzen und arbeiten dementsprechend effizient – an euren Projekten!

Alexander Klimov
Alexander Klimov
Developer

Alexander hat 2017 seine Ausbildung zum Developer bei NETWAYS erfolgreich abgeschlossen. Als leidenschaftlicher Programmierer und begeisterter Anhänger der Idee freier Software, hat er sich dabei innerhalb kürzester Zeit in die Herzen seiner Kollegen im Development geschlichen. Wäre nicht ausgerechnet Gandhi sein Vorbild, würde er von dort aus daran arbeiten, seinen geheimen Plan, erst die Abteilung und dann die Weltherrschaft an sich zu reißen, zu realisieren - tut er aber nicht. Stattdessen beschreitet er mit der...

On giving up and trying an IDE

I dislike IDEs, at least I tell myself and others that. A 200 line long .vimrc gives a lot more street cred than clicking on a colored icon and selecting some profile that mostly fits ones workflow. So does typing out breakpoints in GDB compared to just clicking left of a line. Despite those very good reasons I went ahead and gave Goland and CLion, two JetBrains products for Go and C/C++ respectively, a chance. The following details my experiences with a kind of software I never seen much use for, the problems I ran into, and how it changed my workflow.

Installation and Configuration

A picture of my IDE wouldn’t do much good, they all look the same. So here’s a baby seal.
Source: Ville Miettinen from Helsinki, Finland


First step is always the installation. With JetBrains products being mostly proprietary, there are no repositories for easy installation and updating. But for the first time I had something to put in /opt. When going through the initial configuration wizard one plugin caught my eye: “IdeaVim”. Of course I decided to install and activate said plugin but quickly had to realize it does not work the same simply running vim in a window.
This “Vim emulation plug-in for IDEs based on the IntelliJ platform” sadly does for one not offer the full Vim experience and the key bindings often clash with those of the IDE. Further I started getting bothered by having to manage the Vim modes when wanting to write code.
I sometimes miss the ability to easily edit and move text the way Vim allows, the time I spend using the mouse to mark and copy text using the mouse are seconds of my life wasted I’ll never get back!
Except for the underlying compiler and language specific things both IDEs work the same, identical layout, window options, and most plugins are shared, so I won’t talk about the tool chain too much. But it needs to be said that Goland just works, possibly because building a Go project seems simpler than a C++ one. Getting CLion to work with CMake is tiresome, the only way to edit directives is by giving them to the command like you would on the shell. This would not bother me as much if CMake didn’t already have a great GUI.

Coding and Debugging

Yet I wouldn’t be still using those programs if there weren’t upsides. The biggest being the overview over the whole project, easily finding function declarations and splitting windows as needed. These are things Vim can be made to do, but it does not work as seamless as it does in the IntelliJ world. It made me realize how little time is spent the actual typing of code, most of it is reading code, drawing things and staring at a prototype until your eyes bleed confusion (sometimes code is not well commented). The debuggers, again specifically the one of Goland, work great! Sometimes I have to talk to GDB directly since there are many functions but too few buttons, but the typical case of setting a breakpoint and stepping through to find some misplaced condition is simple and easy.

Alright, here it is.


There are a few features I have not found a use for yet e.g. code generators and I still manage my git repositories from the shell. The automatic formatting is cool, again especially in Go where there is one standard and one tool for it. Other than that I run into a few bugs now and then, one that proved to be quite a hassle is the search/search and replace sometimes killing my entire window manager. Were it free software, there’d be a bug report. But for now I work around it. Maybe I’ll drop CLion but I doubt I’ll be writing any Go code in Vim anytime soon.
If you think you have found the perfect IDE or just want to share Vim tips, meet me at the OSMC in November!

Diana Flach
Diana Flach
Developer

Geboren und aufgewachsen in Bamberg kam Diana, nach einem Ausflug an die Uni, als Azubi zu NETWAYS. Dort sitzt sie seit 2014 im Icinga 2 Core Entwicklungsteam.

The way to Go

Lange Zeit waren die Auftragnehmer der Raumfahrt große Rüstungskonzerne mit eingefahrenen Strukturen und dem entsprechenden Produkten. Die Raketen waren bspw. nicht gerade dazu gedacht, sie wieder zu verwenden. Wahrscheinlich konnte sich kaum einer der Auftraggeber vorstellen, dass das auch ganz anders geht. Und dann kam Elon Musk und hat “mal eben” SpaceX auf die Beine gestellt… und dann gingen viele Dinge auf einmal viel besser. So ähnlich auch in unserer Branche…
Lange Zeit gab es zwar maschinennahe Programmiersprachen, aber diese waren umständlich in der Handhabung – insbesondere im Hinblick auf die parallele Ausführung mehrerer Aufgaben. Die konstante Größe der Thread-Stacks limitierte zusätzlich die Anzahl der Threads, so dass bspw. das in C++ geschriebene Icinga 2 aktuell die E/A auf einige wenige Threads verteilen muss. Seit 2009 gibt es immerhin NodeJS, das gut und gerne viele E/A-Aufgaben parallel ausführt, aber auch nur diese – für Rechenoperationen steht nur ein Thread zur Verfügung. Zudem sind die Typen und Funktionen dynamisch und damit nicht so maschinennah und performant wie bspw. in C++. Und da saßen die Programmierer bis 2012 zwischen diesen zwei Stühlen. Und dann hat Google 2012 die erste stabile Version von Go veröffentlicht… und damit gingen viele Dinge auf einmal viel besser. So auch mittlerweile bei NETWAYS

Und was macht dieses Go jetzt besser als alle anderen?

Wie mein Kollege Florian sagen würde: “So einiges.” Aber Scherz beiseite…
Go ist maschinennah – d.h. die Typen und Funktionen sind allesamt statisch und werden wie auch bei bspw. C++ im voraus in Maschinencode umgewandelt – mehr Performance geht nicht.
Go ist einfach (obwohl es maschinennah ist). Die Datentypen sind zwar statisch, also explizit, aber deren Angabe ist nur so explizit wie nötig:

type IcingaStatus struct {
   Name, Description string
}
var IcingaStatusSet = map[uint8]IcingaStatus{
   0: {"OK",       "Alles im grünen Bereich"},
   1: {"WARNING",  "Die Ruhe vor dem Sturm"},
   2: {"CRITICAL", "Sämtliche Infrastruktur im Eimer"},
   3: {"UNKNOWN",  "Mein Name ist Hase, ich weiß von nichts"},
}

Im gerade gezeigten Beispiel muss der Datentyp der Map-Variable nur einmal angegeben werden. Weder die Typen der enthaltenen Werte, noch deren Felder müssen angegeben werden – sie werden vom Typ der Map abgeleitet. Wer befürchtet, den Überblick zu verlieren, kann auf die IDE GoLand zurückgreifen:

Go ist relativ sicher vor Unfällen (obwohl es maschinennah ist). Bei Zugriff auf eine Stelle eines Arrays, die gar nicht existiert oder unzulässiger Umwandlung von Zeiger-Datentypen wirft Go einen Fehler, um Schäden durch Programmierfehler abzuwenden:

type Laptop struct {
    DvdDrive uint32
}
func (l *Laptop) DoSomethingUseful() {
}
type SmartPhone struct {
    SimSlot uint16
}
func (s *SmartPhone) DoSomethingUseful() {
}
type Computer interface {
    DoSomethingUseful()
}
func main() {
    var computers = []Computer{&Laptop{}, &Laptop{}, &Laptop{}}
    _ = computers[3]
    var computer Computer = &Laptop{}
    _ = computer.(*SmartPhone)
}

Go erledigt von sich aus E/A-Aufgaben effizient (obwohl es nicht NodeJS ist). Aufgaben werden in Go nicht über Threads parallelisiert, sondern über sog. Go-Routinen (das gleiche in grün). Diese werden von Go selbst auf die eigentlichen Threads verteilt. Wenn eine Go-Routine eine blockierende E/A-Operation ausführt, wird diese transparent im Hintergrund vollzogen und eine andere Go-Routine beansprucht währenddessen den Thread.
Der Himmel ist die Grenze der Parallelisierung dank Scheduler und dynamischer Stack-Größe. Die o.g. Verteilung von Go-Routinen auf Threads verantwortet der sog. Scheduler von Go. Dies führt dazu, dass “zu” viele parallele Aufgaben sich und dem Rest des Systems nicht im Weg stehen. Zudem beansprucht jede Go-Routine nur soviel RAM wie sie auch wirklich braucht, d.h. eigentlich kann ein Programmierer so viele Go-Routinen starten wie er lustig ist (Beispiel). “Eigentlich” ist genau das richtige Stichwort, denn trotzdem sollte jeder Einzelfall für sich betrachtet werden. Ansonsten macht das OS irgendwann git push --feierabend (Beispiel).
Go geht einfach (daher kommt wahrscheinlich auch der Name). Im Gegensatz zu etablierten maschinennahen Sprachen muss ich mich nicht darum kümmern, dass libfoobar23.dll an der richtigen Stelle in der korrekten Version vorliegt. Das Ergebnis eines Go-Kompiliervorgangs ist eine Binary, die nichtmal gegen libc gelinked ist:

root@576214afd7e6:/# cat example.go
package main
func main() {
}
root@576214afd7e6:/# go build -o example example.go
root@576214afd7e6:/# ./example
root@576214afd7e6:/# ldd ./example
not a dynamic executable
root@576214afd7e6:/#

Alles kann, nichts muss. Go muss ja nicht von heute auf Morgen in sämtlichen Applikationen Anwendung finden. Man kann auch mit einem einzigen Programm anfangen, das nicht heute, jetzt und eigentlich schon vorgestern fertig sein muss. Und selbst wenn nicht alles beim ersten Mal klappt, bieten wir Ihnen gerne maßgeschneidertes Consulting an.

Alexander Klimov
Alexander Klimov
Developer

Alexander hat 2017 seine Ausbildung zum Developer bei NETWAYS erfolgreich abgeschlossen. Als leidenschaftlicher Programmierer und begeisterter Anhänger der Idee freier Software, hat er sich dabei innerhalb kürzester Zeit in die Herzen seiner Kollegen im Development geschlichen. Wäre nicht ausgerechnet Gandhi sein Vorbild, würde er von dort aus daran arbeiten, seinen geheimen Plan, erst die Abteilung und dann die Weltherrschaft an sich zu reißen, zu realisieren - tut er aber nicht. Stattdessen beschreitet er mit der...

The Icinga Config Compiler: An Overview

The Icinga configuration format was designed to be easy to use for novice users while at the same time providing more advanced features to expert users. At first glance it might look like a simple key-value configuration format:

object Host "example.icinga.com" {
	import "generic-host"
	address = "203.0.113.17"
	address6 = "2001:db8::17"
}

However, it features quite a bit of functionality that elevates it to the level of scripting languages: variables, functions, control flow (if, for, while) and a whole lot more.

Icinga’s scripting language is used in several places:

  • configuration files
  • API filter expressions
  • auto-generated files used for keeping track of modified attributes

In this post I’d like to show how some of the config machinery works internally.

The vast majority of the config compiler is contained in the lib/config directory. It weighs in at about 4.5% of the entire code base (3059 out of 68851 lines of code as of July 2018). The compiler is made up of three major parts:

Lexer

The lexer (lib/config/config_lexer.ll, based on flex) takes the configuration source code in text form and breaks it up into tokens. In doing so the lexer recognizes all the basic building blocks that make up the language:

  • keywords (e.g. “object”, “for”, “break”) and operators (e.g. >, ==, in)
  • identifiers
  • literals (numbers, strings, booleans)
  • comments

However, it has no understanding of how these tokens fit together syntactically. For that it forwards them to the parser. All in all the lexer is actually quite boring.

Parser/AST

The parser (lib/config/config_parser.yy, based on GNU Bison) takes the tokens from the lexer and tries to figure out whether they represent a valid program. In order to do so it has production rules which define the language’s syntax. As an example here’s one of those rules for “for” loops:

| T_FOR '(' identifier T_FOLLOWS identifier T_IN rterm ')'
{
        BeginFlowControlBlock(context, FlowControlContinue | FlowControlBreak, true);
}
rterm_scope_require_side_effect
{
        EndFlowControlBlock(context);
        $$ = new ForExpression(*$3, *$5, std::unique_ptr($7), std::unique_ptr($10), @$);
        delete $3;
        delete $5;
}

Here’s a list of some of the terms used in the production rule example:

Symbol Description
T_FOR Literal text “for”.
identifier A valid identifier
T_FOLLOWS Literal text “=>”.
T_IN Literal text “in”.
BeginFlowControlBlock, EndFlowControlBlock These functions enable the use of certain flow control statements which would otherwise not be allowed in code blocks. In this case the “continue” and “break” keywords can be used in the loop’s body.
rterm_scope_require_side_effect A code block for which Icinga can’t prove that the last statement doesn’t modify the program state.
An example for a side-effect-free code block would be { 3 } because Icinga can prove that executing its last statement has no effect.

After matching the lexer tokens against its production rules the parser continues by constructing an abstract syntax tree (AST). The AST is an executable representation of the script’s code. Each node in the tree corresponds to an operation which Icinga can perform (e.g. “multiply two numbers”, “create an object”, “retrieve a variable’s value”). Here’s an example of an AST for the expression “2 + 3 * 5”:

Note how the parser supports operator precedence by placing the AddExpression AST node on top of the MultiplyExpression node.

Icinga’s AST supports a total of 52 different AST node types. Here are some of the more interesting ones:

Node Type Description
ArrayExpression An array definition, e.g. [ varName, "3", true ]. Its interior values are also AST nodes.
BreakpointExpression Spawns the script debugger console if Icinga is started with the -X command-line option.
ImportExpression Corresponds to the import keyword which can be used to import another object or template.
LogicalOrExpression Implements the || operator. This is one of the AST node types which don’t necessarily evaluate all of its interior AST nodes, e.g. for true || func() the function call never happens.
SetExpression This is essentially the = operator in its various forms, e.g. host_name = "localhost"

On their own the AST nodes just describe the semantical structure of the script. They don’t actually do anything in terms of performing any real actions.

VM

Icinga contains a virtual machine (in the language sense) for executing AST expressions (mostly lib/config/expression.cpp and lib/config/vmops.hpp). Given a reference to an AST node – such as root node from the example in the previous section – it attempts to evaluate that node. What that means exactly depends on the kind of AST node:

The LiteralExpression class represents bare values (e.g. strings, numbers and boolean literals). Evaluating a LiteralExpression AST node merely yields the value that is stored inside of that node, i.e. no calculation of any kind takes place. On the other hand, the AddExpression and MultiplyExpression AST nodes each have references to two other AST nodes. These are the operands which are used when asking an AddExpression or MultiplyExpression AST node for “their” value.

Some AST nodes require additional context in order to run. For example a script function needs a way to access its arguments. The VM provides these – and a way to store local variables for the duration of the script’s execution – through an instance of the StackFrame (lib/base/scriptframe.hpp) class.

Future Considerations

All in all the Icinga scripting language is actually fairly simple – at least when compared to other more sophisticated scripting engines like V8. In particular Icinga does not implement any kind of optimization.
A first step would be to get rid of the AST and implement a bytecode interpreter. This would most likely result in a significant performance boost – because it allows us to use the CPU cache much more effectively than with thousands, if not hundreds of thousands AST nodes scattered around the address space. It would also decrease memory usage both directly and indirectly by avoiding memory fragmentation.
However, for now the config compiler seems to be doing its job just fine and is probably one of the most solid parts of the Icinga codebase.