On giving up and trying an IDE

I dislike IDEs, at least I tell myself and others that. A 200 line long .vimrc gives a lot more street cred than clicking on a colored icon and selecting some profile that mostly fits ones workflow. So does typing out breakpoints in GDB compared to just clicking left of a line. Despite those very good reasons I went ahead and gave Goland and CLion, two JetBrains products for Go and C/C++ respectively, a chance. The following details my experiences with a kind of software I never seen much use for, the problems I ran into, and how it changed my workflow.

Installation and Configuration

A picture of my IDE wouldn’t do much good, they all look the same. So here’s a baby seal.
Source: Ville Miettinen from Helsinki, Finland

First step is always the installation. With JetBrains products being mostly proprietary, there are no repositories for easy installation and updating. But for the first time I had something to put in /opt. When going through the initial configuration wizard one plugin caught my eye: “IdeaVim”. Of course I decided to install and activate said plugin but quickly had to realize it does not work the same simply running vim in a window.
This “Vim emulation plug-in for IDEs based on the IntelliJ platform” sadly does for one not offer the full Vim experience and the key bindings often clash with those of the IDE. Further I started getting bothered by having to manage the Vim modes when wanting to write code.
I sometimes miss the ability to easily edit and move text the way Vim allows, the time I spend using the mouse to mark and copy text using the mouse are seconds of my life wasted I’ll never get back!
Except for the underlying compiler and language specific things both IDEs work the same, identical layout, window options, and most plugins are shared, so I won’t talk about the tool chain too much. But it needs to be said that Goland just works, possibly because building a Go project seems simpler than a C++ one. Getting CLion to work with CMake is tiresome, the only way to edit directives is by giving them to the command like you would on the shell. This would not bother me as much if CMake didn’t already have a great GUI.

Coding and Debugging

Yet I wouldn’t be still using those programs if there weren’t upsides. The biggest being the overview over the whole project, easily finding function declarations and splitting windows as needed. These are things Vim can be made to do, but it does not work as seamless as it does in the IntelliJ world. It made me realize how little time is spent the actual typing of code, most of it is reading code, drawing things and staring at a prototype until your eyes bleed confusion (sometimes code is not well commented). The debuggers, again specifically the one of Goland, work great! Sometimes I have to talk to GDB directly since there are many functions but too few buttons, but the typical case of setting a breakpoint and stepping through to find some misplaced condition is simple and easy.

Alright, here it is.

There are a few features I have not found a use for yet e.g. code generators and I still manage my git repositories from the shell. The automatic formatting is cool, again especially in Go where there is one standard and one tool for it. Other than that I run into a few bugs now and then, one that proved to be quite a hassle is the search/search and replace sometimes killing my entire window manager. Were it free software, there’d be a bug report. But for now I work around it. Maybe I’ll drop CLion but I doubt I’ll be writing any Go code in Vim anytime soon.
If you think you have found the perfect IDE or just want to share Vim tips, meet me at the OSMC in November!

The Walrus Has Landed – Structured Logging in Go

Walrus pup at Kamogawa Seaworld, Japan. Photo by Max Smith.

Logs let us understand which combination over time lead to a specific event, we can further analyze them to monitor trends or have them print enough information to narrow down a problem to a single line of code. They are the written history of the IT world. Did I lay it on too thick? Probably. But the bottom line is, logs can be very useful. But before you can read logs, one has to write log messages, assign severities and decide which information is worth print and which is not.
An attempt to make logging easier for the programmer, more readable for the user and better formatted for log processors is structured logging. This concept has been around for a while but ‘got cool’ only recently with the advent of large scale logging processors like ElasticSearch or Graylog. Let’s go by example, the following are two log messages, the first being good old lines of text and the second structured logging

[2018-07-31T16:25:07+02:00] - Information/Api: New connection ("GET") from "" to endpoint "/v1/get-the-cookie" with token "aa76dfdf1023dfa567".
time="2018-07-31T16:25:07+02:00" level=Info msg="New connection" context=Api request_type=GET remote= endpoint=/v1/get-the-cookie authtoken="aa76dfdf1023dfa567

It’s easy to see how the structured logging message is easier to comprehend for a logging processor. And while you can grep use grep on both the same way to get a quick overview in this case, the key=’value’ scheme of the second line will always have that advantage while string logs may get convoluted – we all had to chain multiple instances of `grep` and `grep -i` to get rid of false positives at some point, no?
Now let’s look at the code, since I promised it would be easier for developers as well. In this case I’ll be using logrus, a structured logging library for Go. Here is code how these two lines could look like:

log.Information("Api", "New connection (\"%s\") from \"%s\" to endpoint \"%s\" with token \"%s\", conn.type, conn.remote, conn.path, conn.token)

Simple enough. Timestamps are automatic, the severity part of the function call and the context is one of the parameters. Now the same with logrus:

		"context":      "Api",
		"request_type": conn.type,
		"remote":       conn.remote,
		"endpoint":     conn.path,
		"authtoken":    conn.token
	}).Info("New connection")

This looks like a lot more to write and less simple at that! Was I lying? Of course not. The big advantage of using logrus are temporary loggers with default fields:

apiLogger := log.WithFields(log.Fields{
		"context":      "Api",
		"request_type": conn.type,
		"remote":       conn.remote,
		"endpoint":     conn.path,
		"authtoken":    conn.token
apiLogger.Info("New connection")

Any future logging within the Api context can use this `apiLogger`. Letting us shape the log messages with exactly the information we need, without too much copy and paste or having to think about the type of object we want to log:

apiLogger.WithField("data",conn.data).Debug("Submitted Data")
		"error": "Too much data",
		"error_id": 5,
	}).Warn("Connectivity issue")
apiLogger.Info("Connection terminated")
.... Outputs:
time="2018-07-31T16:25:17+02:00" level=Info msg="New connection" context=Api request_type=POST remote= endpoint=/v1/give-me-cookies authtoken=aa76dfdf1023dfa567
time="2018-07-31T16:25:09+02:00" level=Debug msg="Submitted Data" context=Api request_type=POST remote= endpoint=/v1/give-me-cookies authtoken=aa76dfdf1023dfa567 submitted_data="{some json blob}"
time="2018-07-31T16:25:17+02:00" level=Warn msg="Connectivity issue" context=Api request_type=POST remote= endpoint=/v1/give-me-cookies authtoken=aa76dfdf1023dfa567 error="Too much data" error_id=5
time="2018-07-31T16:25:24+02:00" level=Info msg="Connection terminated" context=Api request_type=POST remote= endpoint=/v1/give-me-cookies authtoken=aa76dfdf1023dfa567

It’s easy to see how having to write log messages the usual way would be quite tiresome and prone to missing some information, be it by oversight or misjudged importance. Stay tuned for future updates on our journey into the world of Golang and hopefully we’ll see you at OSMC.

A personal Linux backup solution

Having personal backups is a must, but what can you do if you don’t have a mac that runs timemachine?
My first instinct was using the tool of choice of a friend: duplicity. It’s free software, does encryption and incremental backups, what more could you want? But duplicity does not offer a very user experience. The docs are work in progress, the –help is a bit of a mess and the man page is too verbose for a quick start. Obviously I have little problem reading and learning before using a tool, which is why I gave up and looked for a different one.
Restic does all what I want and duplicity can, but it has a good documentation, bash completion and other optional bonuses for making usage and, in turn, my life much easier.  It makes sense to think about what to backup before thinking about the right tool. I only want to backup from ~, I don’t care about `/etc` or other places with config or data, it would be no use to me when someone was to throw this laptop down a bridge. So just how much is lying around in my home directory?

$ du -h -d1 /home/jflach | grep "G"
1.9G	/home/jflach/i2
4.3G	/home/jflach/.ccache
20G	/home/jflach/git
108G	/home/jflach/vmware
6.6G	/home/jflach/.cache
20G	/home/jflach/Documents
1.2G	/home/jflach/.thunderbird
3.3G	/home/jflach/Downloads
5.6G	/home/jflach/.vagrant.d
171G	/home/jflach

Luckily I have no folder with an upper case “G” in the name and I can see that over 50% are used up by vmare/. I don’t really need to backup my virtual machines, it’d be annoying to lose them but no reason to panic. `.ccache/`, `.cache/` and `Downloads` are completely irrelevant, bringing the total down to just above 50GB.
Back to restic. Creating a new local backup repository is easy:

$ restic init --repo /tmp/backup
enter password for new backend:
enter password again:
created restic backend 929fc4f740 at /tmp/backup
Please note that knowledge of your password is required to access
the repository. Losing your password means that your data is
irrecoverably lost.

Now the for the actual backup, I have a file containing the excluded directories:

$ cat ~/.config/restic.exclude

And the command is simply:

$ restic -r /tmp/backup backup ~ --exclude-file=.config/restic.exclude
enter password for repository:
password is correct
scan [/home/jflach]
scanned 10123 directories, 64039 files in 0:00
[11:07] 100.00%  76.166 MiB/s  49.612 GiB / 49.612 GiB  74162 / 74162 items  0 errors  ETA 0:00
duration: 11:07, 76.12MiB/s
snapshot dd45c515 saved

It took eleven minutes on my machine to encrypt and compress about 50GiB of data. Restic starts a few threads and voraciously consumes CPU time and memory as it runs. Get yourself a fresh cup of coffee, working is no fun while the tool runs.
All that’s now left to do is to copy the directory to some external server or hard drive. Restic offers support for common sync tools like sftp, google cloud or rclone, whatever you use it will be your job to automate and define its behavior.

The quest for su on Windows

“How to run cmd as different user?” I type into Google. When I search for full sentences instead of just keywords I must be very desperate. Violet links everywhere, I feel like I have tried everything and nothing works. What I want is something like “su”, temporarily changing the current user sounds like a very important thing to be able to do when one administrates Windows. So why is it not that simple?
There is a command called “RunAs”, sounds good, right? Just give the user and command you want to use and you are set! Just make sure you have the right permissions. But problems start with the username since Windows is localized. Sometimes English names work, sometimes you need to use “NT-AUTORITÄT\Netzwekdienst” instead of “NT AUTHORITY\Network Service”. RunAs is one of the latter. Except all of this does not matter since you need a password and system accounts tend to not have one. Bummer.

Notice the differences in the output between the English and German username

Without knowledge of Windows inner workings, I blame the lack of quality free material on the topic, I’m left to googling around until I find a 10 year old blog post or forum entry which solves the exact problem I’m having. And I did find something, it’s not even that old. PsExec is a tool which does exactly what I want! So I install the package, edit my path, run the program… and it just works! With English usernames even. (After I used the -accepteula flag, because for some reason it would not work without when running the first time).

It just works!

Working with Windows often feels to me like piloting a military submarine, not because it’s so advanced, but because I often have no idea what I am doing, the manual is in Russian and clicking the wrong button may or may not make a large strip of land uninhabitable for decades. Trial and error seems to be the way to go for most problems and that’s frustrating. So I hope if somebody else finds themselves in the unknown waters of the Windows user system, this blog post can help.

Flapping in Icinga 2.8.0

The author viewing the code for the first time

Flapping detection is a feature many monitoring suites offer. It is mainly used to detect unfortunately chosen thresholds, but can also help in detecting network issues or similar. In most cases two thresholds are used, high and low. If the flapping value, which is the percentage of state changes over a set time, gets higher than the high threshold, it is considered flapping. It will then stay flapping until the value drops below the low threshold.
Naturally Icinga 2 had such a feature, just that it implemented a different approach and didn’t work. For 2.8.0 we decided it was time to finally fix flapping, so I went to investigate. As I said the flapping was working differently from Icinga 1, Shinken, etc. Instead of two thresholds there was just one, instead of one flapping value there were two and they change based on the time since the last check. Broken down it looks like this:

positive; //value for state changes
negate; //value for stable changes
FLAPPING_INTERVAL; //Compile time constant to smoothen the values
OnCheckResult() {
  if (positive + negative > FLAPPING_INTERVAL) {
    pct = (positive + negative - FLAPPING_INTERVAL) / FLAPPING_INTERVAL;
    positive -= pct * positive;
    negative -= pct * negative;
  weight = now - timeOfLastCheck;
  if (stateChange)
    positive += weight;
    negative += weight;
IsFlapping() {
  return 100 * positive / (negative + positive);

The idea was to have the two flapping values (positive & negative) increase one or the other with every checkresult. Positive for state changes and negative for results which were not state changes, by the time since the last check result. The first problem which arises here, while in most cases the check interval is relatively stable, after a prolonged Icinga outage one of the values could be extremely inflated. Another problem is the degradation of the result, in my tests it took 17 consecutive stable results for a flapping value to calm down.
After some tweaking here and there, I decided it would be wisest to go with the old and proven style Icinga 1 was using. Save the last 20 checkresults, count the state changes and divide them by 20. I took inspiration in the way Shinken handles flapping and added weight to the sate changes, with the most recent one having a value of 1.2 and the 20th (oldest) one of 0.8. The issue of possibly wasting memory on saving the state changes could be resolved by using one integer as a bit array. This way we are actually using slightly less memory now \o/

The above example would then have a value of 39.1%, flapping in the case of default thresholds. More details on the usage and calculation of flapping in Icinga 2 can be found in the documentation once version 2.8.0 is released.

A naturally grown .vimrc

Vim is pretty great, honestly. But since I started using vim after growing tired of nano, a lot in my .vimrc changed… To the point where colleagues who use vim on their machine rather use nano on mine before trying to wrap their head around my workflow. Every .vimrc is different and a lot of them exist out there, over twelve thousand repositories of shared vim configurations on GitHub alone and many more on private laptops and computers.
Generally creating a .vimrc works like creating a Makefile: Copy and paste from different sources with small changes until you have an approximation of the desired result and then need to decide whether you want to go the extra mile of dealing with your configurations’ and Vims’ quirks to get it working properly or you just leave it be. This is followed by years of incremental tweaking until you have a Vim environment which works perfectly but you don’t know why. At least that’s how I do it ¯\_(ツ)_/¯
So I took a dive into my own to see what kind of madness lurks down there…

set nocompatible
set t_Co=16
set shiftwidth=4
set tabstop=4
set autoindent
set smartindent
filetype plugin on
filetype indent on

So far, so good. Nothing special. Other than that indentation is set and overwritten three times, this has not lead to any problems (yet). The next lines are a bit more interesting:

call pathogen#infect()
call pathogen#helptags()
syntax on
set background=dark " dark | light "
colorscheme solarized

pathogen is a runtime manipulator which allows you to add additional plugins to Vim easier, in my case these are vim-solarized, a popular colourscheme, and vim-fugitive, a plugin that adds git commands within Vim. #infect() loads these plugins and #helptags() generates documentation. The following lines make use of solarized and add syntax highlighting.

set nu
set listchars=eol:$,tab:>-,trail:.,extends:>,precedes:<,nbsp:_
set list
let &colorcolumn="121"
set splitbelow
set splitright
set viminfo='20,<1000

This controls numbering and control characters, colorcolumn adds a ugly line over the 121’st char to keep up with coding styles. split tells vim where to add new views when splitting (I rarely use these) and viminfo sets the copy buffer to up to 1000 lines.
Now we get to the really interesting bits, key remaps!

Artist rendition of the authors .vimrc

"See :help map-commands
"n        = Normal only
"v        = Visual only
"i        = Insert only
"         = Normal+Insert+Select only
" nore    = disable recusiveness
"     map = Recursive map
"split window switching
nnoremap <C-J> <C-W><C-J>
nnoremap <C-K> <C-W><C-K>
nnoremap <C-L> <C-W><C-L>
nnoremap <C-H> <C-W><C-H>
"Make Up and Down Arrows move half a screen
"instead of 1 line
noremap <Up> <C-U>
noremap <Down> <C-D>
"BUFFERS Used to be F6,F7. Now used by flake
set hidden
noremap :bprevious
noremap :bnext
"Because buffers change Flake to use F3 instead of F7
"autocmd FileType python map :call Flake8()
"Smart Home key
noremap <expr> <Home> (col('.') == matchend(getline('.'), '^\s*')+1 ? '0' : '^')
imap <Home> <C-o><Home>

Switching windows can be hassle without the first four remaps, but again, I rarely use split windows. Mapping the up and down arrows to jump half a screen instead just one line I added to stop myself from using them instead of j and k, I have grown quite used to it, so it gets to stay ^_^
Buffer remaps are a must! Because I do use them a lot. There have been problems with Flake, a python lint tool, which I tried to avoid by remapping flakes key… Didn’t work out, so I got rid of Flake and call pylint when required instead.
The smart home key makes the <home>-key jump to the first non-whitespace char in a line instead of the begining of the line, quite handy!

"Access commandwindow with ctrl+f or :cedit
"For dorks
map q: :q
command Wsudo w !sudo tee % > /dev/null
"Jump to next line > 120 chars
command Warnl :/\%>120v./+
"For easy copy-paste
command Cpm :set nonu paste cc=  nolist
"Catch regular caps failure
command WQ :wq

These are just a few quality of life improvements, I tend to mistype :q as q: and :wq as :WQ. The :Wsudo command lets me edit read-only files and :Cpm is for easy mouse copy and paste. :Warnl jumps to the next line with more than 120 chars, this again is to check for style problems.
Alright, that’s it. My current .vimrc. There were a few commented out lines I omitted because I have no clue what I was thinking at the time anymore, but I hope there were a few little bits someone else might find useful to feed their own beast of a .vimrc with.
Image sources:
Vim logo from vim.sexy
Shoggoth by twitter.com/nottsuo

Redis vs MongoDB as message queue and config proxy

Benchmarks and comparisons for these two NoSQL databases exist a plenty. But in all the blogs and whitepapers I found the use-case was quite different from ours, we were trying to find a tool to queue Icinga 2 events and speed up config updates. The queue would use Redis Pub/Sub or MongoDBs capped collections + tailable cursors. From an implementation standpoint MongoDB looks better, it offers good libraries for most programming languages and the document-based approach allows for filtering based on individual attributes while values in Redis are just opaque blobs. Redis Pub/Sub model also has the problem that it loses its queued items when restarted.
From these naive benchmarks (PHP scripts run with time), it looks like MongoDB likes to take their time with creating objects.
Inserting 250.000 objects:

  • Redis: 11.1s
  • MongoDB: InsertOne: 53.3s
  • MongoDB: InsertMany: 25.4s

Inserting 250.000 objects into list capped to 50.000

  • Redis: 12s
  • MongoDB: InsertOne: 57.6s

Redis speed advantage comes from keeping created objects in memory, while MongoDB writes them to the hard drive right away. This is a huge advantage for Redis, as we rotate our events from the queue, writing them to disk would be a waste of io time. But sadly MongoDB does not offer the possibility of keeping the data volatile in memory.
Another concern of ours is security, Redis does not offer any security features and for many application this is not a problem, but in our case, this means we have to finds ways to make it secure. There have been many attempts to do so, e.g using tools like stunnel, but it’s still an additional topic that would need to be tackled. MongoDB on the other hand offers ACLs on document level and native SSL, which is great! Yet is has been rightfully criticized for its insecure default configurations, which resulted in a flood of hacked MongoDB instances earlier this year.
Conclusion: We will have to investigate further. Especially integration and security will need a closer look. MongoDBs performance seems lackluster, but may just be enough as our tests may have been much bigger than what we will be met with in a real setup.

Monitoring Powershell scripts with Icinga 2

The need to to monitor arbitrary Powershell scripts comes up now and then and often there are some workarounds or alternatives, NSClient for example, named. However in order to have something I can link refer people to when the topic comes up again, I’ll try to provide a quick and simple to adapt solution. Keep in mind that this assumes you have Icinga 2 up and running on your Windows host, Powershell installed and are reasonably sane.
First the the check script I used for demonstration purposes in this case, all it does is check whether a process is running and returning OK or CRITICAL based on that.

if ($args.Length -lt 1) {
	Write-Output "Script requires one argument (Process)"
$state = Get-Process $proc -ErrorAction SilentlyContinue
if ($state) {
	Write-Output "PROCESS OK '$proc' is running"
} else {
	Write-Output "PROCESS CRITICAL '$proc' is not running"

Safe it as check_proccess.ps1 somewhere you can find it again. In this case I put next to the other check plugins.
The following are the check_command object and Service apply. And as it turns out it’s not that easy of a task as I thought, it’s mostly Windows fault really… Getting the exit code of a script from Powershell returned to icinga2 required some trickery (credit goes to NSClient for that one). The result is a bit of weird CheckCommand, which can and should be improved.

object CheckCommand "powershell" {
    import "plugin-check-command"
    command = [ "cmd" ]
    arguments = {
        "Weird command" = {
            value = "/c echo $powershell_script$ $powershell_args$ ; exit ($$lastexitcode) | powershell.exe -command -"
            description = "This is needed because powershell would not tell us the exit code otherwise"
            skip_key = true
apply Service "check_powerpoint" {
    import "generic-service"
    check_command = "powershell"
    vars.powershell_script = PluginDir + "\check_process.ps1"
    vars.powershell_args = "POWERPNT"
    assign where host.vars.os == "Windows"

Der Icinga Buildserver, Version 3

Letztes verbliebene Bild des alten Icinga Jenkins

Der Icinga-Buildserver, erreichbar unter https://build.icinga.com, läuft in dieser Form jetzt etwa ein Jahr. Doch gibt es noch immer ein paar Probleme die mit diesem Setup bestehen: So ist das Hinzufügen neuer Jobs noch etwas umständlich, das provisionieren dauert länger als uns lieb ist und besonders übersichtlich ist die Konfiguration auch nicht. Um diese Probleme anzugehen haben wir uns noch einmal mit Puppet und Jenkins auseinandergesetzt.
Wie vorher verwenden wir ein jenkins puppet-modul, nur diesmal haben wir es mit einem speziellen icinga-jenkins Modul erweitert. Dieses Modul erlaubt es uns spezielle pipeline-jobs mit geringem Konfigurationsaufwand zu erstellen. So ist das unterstehende Beispiel alles was zu konfigurieren ist um eine komplette Pipeline zu erstellen. Selbst der spezielle Umgang mir RPM und Deb ist zu großen Teilen vereinheitlicht und funktioniert für alle Projekte gleich.

    control_repo: https://github.com/Icinga/icinga-packaging.git
    control_branch: snapshot
      'debian-jessie': {}

Die Pipeline erstellt dabei nicht nur einen, sondern gleich vier Jobs: “source”, “binary”, “test” und “publish”. Diese verarbeiten die specfiles, bauen das Paket, testen es und veröffentlichen es auf https://packages.icinga.com
In Produktion ist unser Modul noch nicht, aber um den testen und konfigurieren zu können haben wir Vagrant Boxen gebaut. Mit Hilfe derer bauen wir zur Zeit das icinga-jenkins Modul weiter aus um den bestehenden Buildserver komplett mit den neuen Pipelinejobs abbilden zu können. Wir hoffen unseren Buildprozess damit noch einfacher für Entwickler zu machen und dank der neuen Testsphase für Pakete Problemen in Zukunft besser vorzubeugen zu können

Monitoring Windows with Icinga 2 and NSCP

Monitoring Windows systems can be quite a hassle without the right tools. While Icinga 2 comes with a few check plugins for Windows, sadly they only allow for the most basic monitoring. A tool which is often used for this reason is NSClient++ (NSCP). This little extendable GPL software uses its own checks and even comes with a fully functioning web interface.

The monitoring view of the NSCP Web interface

The monitoring view of the NSCP Web interface

But there are problems with this solution in that it does not work with Icinga 2 out of the box and a few tweaks are necessary. To ease working with these two tools I have written a new Icinga 2 Windows Plugin “check_nscp”, which works like check_nt only that is is less restrictive in its parameters and allows for any NSCP Query to be used.
The following is an example Icinga 2 config which uses check_nscp to monitor the memory used by the Icinga 2 process.

object CheckCommand "nscp-memory-proc" {
    import "nscp-windows"
    vars.nscp_win_query = "check_process"
    vars.nscp_win_args = [
apply Service "icinga-mem" {
	import "generic-service"
	check_command = "nscp-memory-proc"
	vars.nscp_memory_proc_proccess = "icinga2.exe"
	vars.nscp_memory_proc_warn = "70m"
	vars.nscp_memory_proc_crit = "200m"
	assign where host.name == NodeName

This requires the CheckCommand for check_nscp which will be released together with check_nscp in one if the coming Icinga 2 Versions.