Here you can find all videos and slides of the OSDC 2014:

Jordan Sissel | Find Happiness in your Logs

Got logs? With so much technology powering your business, you need tools to help you identify problems and analyze past behavior. The Apache 2.0-licensed Elasticsearch ELK stack is here to help you process, store, and visualize any kind of logging data, in real time, from any source imaginable!

Log management seems so boring. Log rotation, retention policy, grep, yuck! What are your servers doing? Did last night's upgrade break anything? How are your users interacting with your products? Why did the site go down last weekend?

Get ready to turn your log pains into awesome visual insights and more!

BAM! Elasticsearch ELK! ELK stands for Elasticsearch, Logstash, and Kibana. All three are lovely open source projects that, together, give you and your business log management superpowers.

This talk will primarily be done in three parts: open source and community, technology, and use cases.

  • The first part will introduce each project and its success as open source software, most notably through supportive and open communities.
  • The second part will discuss each project and the problems it solves.
  • The third (and most exciting!) part will highlight a variety of use cases and problems that real humans are using Elasticsearch ELK to solve. Live demos of some use cases will be provided.

Attendees will leave the presentation totally full of excitement about this toolset and bursting with fresh ideas about how to tackle their sour logging problems.

Lennart Koopmann | Log Analysis with Graylog2

Graylog2 is a free and open source log analysis tool that allows you to perform searches, analyse the data, build dashboards and set alarms using the streams system. Typical use cases range from debugging platform problems & monitoring exception counts to displaying average pizza delivery time per state on a dashboard.

In this talk I will go through the architecture of Graylog2, what you can do with it and how to get your data into it.
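
One common way to get data into Graylog2 is its GELF message format sent over UDP. The following is a minimal sketch of that idea, not the method shown in the talk: the function names `build_gelf_message` and `send_gelf_udp` are hypothetical, and the server address and port 12201 are assumptions that depend on how the GELF UDP input is configured.

```python
import json
import socket
import time


def build_gelf_message(host, short_message, level=6, **extra):
    """Build an uncompressed GELF 1.1 payload as UTF-8 encoded JSON.

    Hypothetical helper for illustration. `level` is a syslog severity
    (6 = informational). Extra fields get the underscore prefix that GELF
    requires for user-defined fields.
    """
    message = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,
    }
    for key, value in extra.items():
        if key == "id":
            # GELF does not allow an additional field named "_id".
            raise ValueError("GELF does not allow the additional field '_id'")
        message[f"_{key}"] = value
    return json.dumps(message).encode("utf-8")


def send_gelf_udp(payload, server="127.0.0.1", port=12201):
    """Send one payload as a single UDP datagram (no chunking, no compression).

    The server and port are assumptions; adjust them to your GELF UDP input.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (server, port))
```

A call such as `send_gelf_udp(build_gelf_message("web01", "disk almost full", level=4, mount="/var"))` would emit one message with a custom `_mount` field. Large messages would need chunking or compression, which this sketch leaves out.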

Devdas Bhagat | Graphite: Graphs for the modern age

Graphite is a timeseries data charting package, similar to MRTG and Cacti. This talk will cover Graphite from the basics to how it was scaled to millions of datapoints per second.

Gergely Nagy | Monitoring with syslog-ng, Riemann and Kibana

In any data center, one will have a lot of machines, and even more applications, plenty of them legacy applications with little to no built-in monitoring capabilities. But even when monitoring is built in, quite often, it just provides basic building blocks. 

In this talk, it will be shown how to tie a syslog-ng based logging solution to the Riemann monitoring system, and use Kibana to make sense of both logging and monitoring data. The presentation will suggest solutions for extracting data from various applications, ways to transform them into useful metrics, and will - of course - also touch on the subject of what exactly useful metrics are to begin with. A live demo of all things discussed will be shown at the end.

Schlomo Schapiro | Test Driven Infrastructure

Common wisdom has it that the test effort should be related to the risk of a change. However, the reality is different: Developers build elaborate automated test chains to test every single commit of their application. Admins regularly “test” changes on the live platform in production. But which change carries a higher risk of taking the live platform down?
What about the software that runs at the “lower levels” of your platform, e.g. systems automation, provisioning, proxy configuration, mail server configuration, database systems etc.? An outage of any of those systems can have a financial impact that is as severe as a bug in the “main” software!
One of the biggest lessons that any Ops person can learn from a Dev person is Test Driven Development. Easy to say, difficult to apply: that is my personal experience with the TDD challenge.
This talk throws some light on recent developments at ImmobilienScout24 that help us to develop the core of our infrastructure services with a test driven approach:

  • How to do unit tests, integration tests and systems tests for infrastructure services?
  • How to automatically verify Proxy, DNS, Postfix configurations before deploying them on live servers?
  • How to test “dangerous” services like our PXE boot environment or the automated SAN mounting scripts?
  • How to add a little bit of test coverage to everything we do?
  • Test Driven: First write a failing test and then the code that fixes it.
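
To make the test-first idea concrete, here is a small sketch in Python of verifying a configuration file before it is deployed. It is an illustration under stated assumptions, not the tooling used at ImmobilienScout24: `find_config_errors` is a hypothetical helper, and it only understands a simple `key = value` format. The two test functions would be written first and fail until the checker exists.

```python
def find_config_errors(text, required_keys):
    """Return a list of problems found in a 'key = value' style config.

    Hypothetical helper for illustration; real Postfix, DNS, or proxy
    configurations need a format-specific checker.
    """
    errors = []
    seen = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=" not in line:
            errors.append(f"line {lineno}: expected 'key = value'")
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if not value:
            errors.append(f"line {lineno}: empty value for '{key}'")
        seen[key] = value
    for key in required_keys:
        if key not in seen:
            errors.append(f"missing required key '{key}'")
    return errors


# Tests written before the implementation (test-first style).
def test_valid_config_has_no_errors():
    config = "myhostname = mail.example.org\nrelayhost = relay.example.org\n"
    assert find_config_errors(config, ["myhostname"]) == []


def test_missing_key_is_reported():
    errors = find_config_errors("relayhost = relay.example.org\n", ["myhostname"])
    assert errors == ["missing required key 'myhostname'"]
```

A test runner such as pytest would pick up the two `test_` functions. In a build pipeline, a non-empty error list would stop the deployment before the configuration reaches a live server.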

The tools that we use are Bash, Python, Unit Test frameworks and Teamcity for build and test automation.
See for more about this topic.

Mike Adolphs | How we run Support at GitHub

Operations people often strive for the perfect solution. But is that really effective? To answer that question for yourself, you’ll have to spend some time doing support! Here at GitHub developers help out with support on a regular basis, leading to a better experience for our customers. Some even work on both: half development, half support. I'm going to show you why I think support is awesome and how it benefits your work!

Fernando Hönig | New Data Center Service Model: Cloud + DevOps

With this presentation we would like to show how the world is changing with regard to application deployment and infrastructure build models. After this presentation you will be able to improve the quality and velocity of software releases and synchronize development and staging environments with the production environment using configuration management tools such as Chef; collect application performance metrics (APM) to view the impact of code changes with application monitoring tools such as New Relic, statsD, Graphite, or Cloud Monitoring; build workflows to automate routine maintenance tasks using workflow automation tools such as Rundeck and Jenkins; aggregate logs from all devices to identify patterns and spot anomalies using log aggregation tools such as logstash; and manage caching needs with tools such as Memcache, Varnish and more. Multi-server environments are now provisioned in minutes instead of the hours it previously took without automation tools.

Christian Kniep | Understand your data center by overlaying multiple information layers

Today's data center managers are burdened by a lack of aligned information across multiple layers. Work-flow events like 'job starts', aligned with performance metrics and events extracted from log facilities, are low-hanging fruit that is on the verge of becoming usable thanks to open-source software like Graphite, StatsD, logstash and the like.

This talk aims to show off the benefits of merging multiple layers of information within an InfiniBand cluster by using use-cases for level 1/2/3 personnel.

Tobias Schwab | Continuous Delivery with Docker

Docker took the ops world by storm in 2013. Based on the same technology that powers Heroku (container virtualization), docker makes it easy to create private and data center agnostic PaaS architectures.
Container images created with docker contain the full application stack and enable rapid deployments and fast auto-scaling without any external dependencies at deploy time. They allow running the exact same configuration of OS, package dependencies, application code and configuration files in all environments and on all servers.

In this talk I want to present how we implement continuous delivery of a Ruby on Rails application using docker. I will give a short introduction to docker and talk about best practices for production usage. Other topics which will be covered in the docker context are:

  • image distribution with private registries
  • multi docker host orchestration
  • configuration management
  • logging and metrics
  • load balancing and failover

Martin Gerhard Loschwitz | What's next for Ceph?

Ceph has recently gained considerable momentum as a possible replacement for conventional storage technologies. Every new Ceph release brings a number of important improvements and interesting features such as Erasure Coding and Multi-Site replication. Work is under way to make CephFS, the POSIX-compatible Ceph file system, ready for enterprise usage, and the number of companies using Ceph is steadily increasing. More than enough reasons to take a closer look at recent Ceph developments: What's hot and boiling, and which features do the Ceph developers have on their list for implementation next?

Thomas Schend | Introduction to the Synnefo open source cloud stack

This talk introduces you to Synnefo, an open source, scalable and production ready cloud stack. Its VM management layer is Google Ganeti, which is essentially a cluster manager and delivers integrated management of compute, network and storage. It also runs on off-the-shelf hardware and delivers live migration with or without shared storage.

Synnefo is the orchestration and presentation layer, which talks to Ganeti via an API. To users it presents a simple web UI, and it also exposes an OpenStack-compatible API for automation, called Cyclades.

Flexible, L2-isolated networking is also possible.

It also provides a Dropbox-like storage service called Pithos, which features a sync client for different platforms and a web UI.
Synnefo offers all the benefits of an Amazon-like cloud but is geared towards persistent virtual machines. It is also well suited as a replacement for the traditional virtualization stack and is easy to set up and use.

Yves Fauser | OpenStack Networking (Neutron) - Overview of networking challenges and solutions in OpenStack

This talk will give you an overview of OpenStack Networking. We will first go through a little bit of theory on the challenges that traditional networking has in OpenStack, and in cloud environments in general. We will then explore the options given to us by the OpenStack community and ecosystem. After this we will go into more implementation details of open source implementations of programmatic overlays, traditional bridging, and some of the commercially available plugins.

Fabrizio Manfredi | Data replication

Data replication is a crucial component for distributed services deployed in a multi-data-center environment. The replication scheme needs to be carefully evaluated before it is implemented; a wrong design or misuse in most cases ends in a major service outage.

To understand replication, you need to understand the algorithms behind it. For this reason the session will start by explaining the algorithms most commonly used to deal with the constraints of the CAP theorem (Consistency, Availability and Partition Tolerance), such as consistent hashing, vector clocks, gossip protocols, Paxos and Raft.
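
Of the algorithms named above, consistent hashing is compact enough to sketch in a few lines. The Python below is a minimal illustration, not code from the talk: the class name `ConsistentHashRing`, the use of SHA-256, and the default of 100 virtual nodes per server are all assumptions chosen for the example.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Map keys to nodes so that adding or removing a node moves few keys."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual nodes per physical node (assumed value)
        self._positions = []      # sorted hash positions on the ring
        self._owner = {}          # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # Any well-distributed hash works; SHA-256 is an arbitrary choice here.
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            position = self._hash(f"{node}#{i}")
            self._owner[position] = node
            bisect.insort(self._positions, position)

    def remove_node(self, node):
        for i in range(self.replicas):
            position = self._hash(f"{node}#{i}")
            del self._owner[position]
            self._positions.remove(position)

    def get_node(self, key):
        if not self._positions:
            raise LookupError("ring is empty")
        # Walk clockwise to the first position after the key's hash,
        # wrapping around to the start of the ring if necessary.
        index = bisect.bisect(self._positions, self._hash(key)) % len(self._positions)
        return self._owner[self._positions[index]]
```

The property that matters for replication design is that removing a node only reassigns the keys that node owned; keys owned by the surviving nodes stay where they are.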

The second part of the talk will analyze how products on the market implement replication (replication in action), with their advantages and disadvantages. It will cover distributed filesystems (Ceph, Tahoe, XtreemFS, ...), distributed databases (DB replication primitives and external tools like Tungsten), NoSQL (Riak, Cassandra, MongoDB, CouchDB) and frameworks for in-house solutions (beardb, open replication, ...). The talk will also show evaluation methods and a testing process for identifying the best solution for your environment.

Colin Charles | Automated MySQL failover with MHA: getting started & moving past its quirks

With the MySQL master-slave replication topology, it is nice to have automated failover in the event the master fails rather than a manual process. In this talk, we go through the tool MHA, talk about the potential pitfalls and gotchas of automatic failovers, how you can use regular MySQL replication (either asynchronous or semi-synchronous) to achieve high availability and more. We also cover virtual IP failover with the integration of Pacemaker + Corosync. MySQL 5.6 and greater include global transaction IDs (GTIDs) and a new set of failover tools, and we discuss how this compares to what MHA provides.

Nat Morris | Open Network Install Environment

ONIE defines an open source “install environment” that runs on the management subsystem of a network switch, utilizing facilities in a Linux/BusyBox environment. This environment allows end-users and channel partners to install the target network OS as part of data center provisioning, in the same way that servers are provisioned.

ONIE enables switch hardware suppliers, distributors and resellers to manage their operations based on a small number of hardware SKUs. This in turn creates economies of scale in manufacturing, distribution, stocking, and RMA enabling a thriving ecosystem of both network hardware and operating system alternatives.

Michael Renner | Secure encryption in a wiretapped future

Since the beginning of publications by Edward Snowden last year, many of the presumably exaggerated threat models in cryptography have become reality. When operating sensitive services it's more likely than not that communication data will be tapped at large carriers as well as internet exchanges and stored indefinitely - this calls for strong and forward-secure encryption.

On the other hand we're faced with the problem that much of the software we're using in the datacenter today is not very secure when it comes to default encryption settings. On top of that, most developers and system administrators are not very fluent in the basic workings of encryption systems.

The talk will give an introduction to SSL/TLS and explain how to check for weaknesses in existing services with tools like nmap, sslscan and sslyze. For common daemons like apache, nginx, exim, postfix and dovecot best practice on improving cryptographic strength will be discussed.
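
As a rough complement to scanners like sslscan and sslyze, Python's standard `ssl` module can report what a single connection to a service negotiates. The sketch below is an assumption-laden illustration, not a replacement for those tools: `is_forward_secret` and `probe_tls` are hypothetical names, the check relies on OpenSSL-style cipher names, and it only sees the one cipher suite that this client and the server agree on rather than every suite the server offers.

```python
import socket
import ssl


def is_forward_secret(cipher_name, protocol):
    """Heuristic: does the negotiated cipher suite use an ephemeral key exchange?

    Assumes OpenSSL-style names such as 'ECDHE-RSA-AES128-GCM-SHA256'.
    All TLS 1.3 suites use ephemeral key exchange.
    """
    return protocol == "TLSv1.3" or cipher_name.startswith(("ECDHE-", "DHE-"))


def probe_tls(host, port=443, timeout=5.0):
    """Connect once and report the negotiated protocol version and cipher."""
    context = ssl.create_default_context()  # verifies the server certificate
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            name, _, bits = tls.cipher()
            protocol = tls.version()
            return {
                "protocol": protocol,
                "cipher": name,
                "bits": bits,
                "forward_secret": is_forward_secret(name, protocol),
            }
```

Calling `probe_tls` against a host you operate returns a small dictionary. A `forward_secret` value of `False` would be a hint to review that daemon's cipher configuration with one of the dedicated scanners.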

Christopher Kunz | Software defined networking in an open-source compute cloud

Big networking vendors have discovered network virtualization for themselves. However, not only hardware appliances, but also open-source solutions have various means of virtualising networks. When hosting an IaaS cloud, you are faced with the challenge of isolating VMs, implementing private internal networks, billing and accounting, firewalls and shaping. And all these challenges should not affect the rest of your (non-virtualized) network. Using OpenVSwitch, you can tackle many of these tasks. In this session, we show you the caveats, but also the exciting possibilities of open-source network virtualization in practical examples.

Jochen Lillich | Dynamic Infrastructure Orchestration

Getting Configuration Management in place is a big step in the direction of infrastructure automation. Chef, Puppet and Co. replace error-prone manual changes with periodic system convergence runs controlled by a central database. Even with Puppet’s exported resources and Chef’s search capabilities, the weakness of this approach is that it is rather static. In situations where we need to propagate information quickly, handle failure detection, or tolerate network partitions, other tools might offer better solutions.

In this talk, I’m going to present some of these alternatives (e.g. serf, etcd) and how they can be used to allow for more dynamic configuration changes.

Sebastian Harl | SysDB the system management and inventory collection service

“System DataBase” (SysDB) is a multi-backend system management and inventory collection service. It may be used to (continuously) collect information about your systems from various backends (inventory services, monitoring services, etc. like Nagios, Puppet, collectd) and provides a unique interface to access the information independent of the active backends. This is done by storing and mapping the backend objects to generic objects and correlating the attributes to create a single hierarchical view of your infrastructure. This way, all important information about your systems is accessible from a central location, allowing for use-cases such as central dashboards, cross-linking monitoring or inventory information, identifying missing pieces in your system configuration, and much more.

This talk provides an overview of SysDB and its features as well as sample use-cases. The project is still in an early development stage but already usable. The talk also covers future directions and further integration with existing services.

Christian Patsch | System Orchestration with Capistrano and Puppet

Although Puppet has been on the market for several years, there are still many different ways to implement the technology in data center environments. In contrast to the centralized puppetmaster approach, some scenarios may call for a decentralized solution due to technical or organizational requirements.

Using Puppet together with the remote server administration tool Capistrano is one way to fulfill these, and examples from an implemented project with these tools will demonstrate which advantages can be achieved and which obstacles have to be considered. A comparison with other proven approaches will deliver recommendations for future installations.

Jan-Piet Mens | Configuration Management with Ansible

Ansible is a simple configuration management and command execution framework for push and pull deployments for Unix/Linux systems using an existing SSH infrastructure. It's particularly easy to deploy because neither does it require an agent on managed nodes (a newish implementation of Python suffices) nor does it require a complex PKI. We show you how to quickly get started using Ansible for ad-hoc tasks, discuss some of its modules and introduce you to Ansible's playbooks and variables. We show you how to run Ansible as a normal user (non-root), how to configure inventory data, and give you sundry tips on using Ansible effectively. If you prefer a pull-based setup, we show you how to implement that as well. We'll discuss roles, use of variables and lookup plugins.

Ole Michaelis & Sönke Rümpler | Make it SOLID - Software Architecture for System Administrators

When you start with Chef or Puppet as a System Administrator, you will soon face problems where you are not sure what the best solution is in terms of software architecture.

We will give you a brief overview of well known and battle tested general software patterns, which also apply to infrastructure management code. In addition, we’ll show antipatterns and best practices.

Jonathan Clarke | Rudder

As a Configuration Management [CM] "champion", trying to gain traction in your environment can be challenging when the level of expertise necessary is in short supply. We built Rudder so that the CM champion would not need to clone themselves. Instead, he or she is able to use a tool to manage configuration data, expose key parameters to the rest of their team, reduce complexity of configuration changes, and put in place role-based workflow for change control.

Rudder is an open source configuration management solution, using lightweight agents (based on CFEngine) controlled via a central management point. Using Rudder, I will show how this approach enables the team to fully participate in the practice of Configuration Management, keep track of changes and history, exploit change access / control, and facilitate knowledge sharing (sharing intentions in design via desired configuration state, maintaining a record of preferred configurations) without the intervention of the CM champion.

Andreas Schmidt | Testing server infrastructure with serverspec

Companies that focus on cloud infrastructures for both developing and running their applications are likely to benefit most from test driven infrastructure tools such as configuration management and their spec-oriented testing counterparts.

However many enterprises have not moved to the cloud yet. 

Often limited by contracts, regulations or security considerations, they too need to test the infrastructure that service providers built for them.

The talk shows approaches to infrastructure testing and demonstrates the use of serverspec (