
NETWAYS Blog

OSMC 2023 – Making your Kubernetes-based log collection reliable & durable with Vector

Last year’s OSMC offered a great program of talks. It was a pleasure to see familiar and new faces and to hear about new developments in monitoring and the surrounding ecosystem. Maksim Nabokikh gave the talk “Making your Kubernetes-based log collection reliable & durable with Vector”, which we will take a closer look at today. Many thanks to Maksim for his talk and the insights he shared.

 

Why is a reliable and durable logging solution important?

Nowadays, more and more digital processes need to be tracked, not only in the event of an error, but often also for legal reasons. A solution for durable and reliable log management is therefore becoming increasingly important. In Kubernetes, this challenge can be solved with Vector.

 

What is it all about?

Vector is an efficient open source tool for building log collection pipelines: it collects logs, transforms them and then ships them to their destination.
In his talk, Maksim showed how Vector can be used to collect and process pod logs and node service logs.

Slide: Logs in Kubernetes (kubernetes.io/docs/concepts/cluster-administration/logging/) – pod logs and node service logs are collected from files, events via the Kubernetes API.

Data can be collected from various sources such as files, the Kubernetes API, sockets and many more. It is then transformed using remap, filter and aggregate steps, before being shipped to one of over 50 different destinations for permanent storage.

Slide: Vector’s architecture – collect (file, K8s, socket, …), transform (remap, filter, aggregate, …), send (52 sinks in total).
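To make the collect, transform and send stages a bit more tangible, here is a minimal conceptual sketch in Python. It is not a Vector configuration (Vector pipelines are defined declaratively in TOML or YAML, with the remap transform written in VRL); the log file, field names and filter rule are invented purely for illustration.

```python
import json

def collect(path):
    """Collect: read raw log lines from a file, like a file source would."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            yield line.rstrip("\n")

def transform(lines):
    """Transform: remap each line into a structured event and filter out debug noise."""
    for line in lines:
        try:
            event = json.loads(line)      # remap: parse the raw line as JSON
        except json.JSONDecodeError:
            event = {"message": line}     # fall back to a plain message field
        event.setdefault("level", "info")
        if event["level"] == "debug":     # filter: drop debug events
            continue
        yield event

def send(events):
    """Send: hand the events over to a sink; here we simply print them."""
    for event in events:
        print(json.dumps(event))

if __name__ == "__main__":
    # Hypothetical input file; a real setup would tail the pod log files on each node.
    send(transform(collect("app.log")))
```

In Vector itself, the same flow would be expressed as a source, one or more transforms and a sink in its configuration file.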

Conclusion

Maksim gave many insights into everyday logging problems and how they can be solved with Vector.
His talk includes real-life examples together with possible solutions.

Just in case I’ve piqued your interest, be sure to watch his great talk in full length on YouTube.

Stay tuned for this year’s edition of OSMC. Mark your calendars for November 19 – 21 and join us in Nuremberg. The Call for Papers is already open. Be sure to submit your proposal by August 15.

Marc Zimmermann
Manager SaaS

Marc dropped by NETWAYS in 2021 and was promptly drafted. His journey into the world of computing began back in his youth, at first mostly with Windows and DOS, until a friend told him about this "Linux" thing. How shall we put it: today Marc works with Linux and all sorts of other things from the IT world, both professionally and privately. On the side he tinkers with model aircraft and tries to bring them home in one piece after flying.

OSMC 2024 | Discover our First Workshops

Enhance your conference experience at OSMC with our pre-conference training courses, scheduled for the first day, November 19. With a limited number of participants, these sessions provide individual support and effective training. Check out our first confirmed conference workshops.

 

Observing and Securing Kubernetes Workloads

Kubernetes is becoming very common, and companies everywhere are rushing to adopt it. However, security is often ignored or postponed, which can seriously threaten the applications running on Kubernetes.

In this workshop, Daniel Bodky will look at common ways Kubernetes can be attacked and how to prevent or fix these issues from the beginning. We will use Cilium, a popular tool for managing container networks, along with Hubble and Tetragon, to secure our clusters’ networks and runtime environments. Through easy, hands-on sessions, we will learn to secure internal traffic and manage file access and process lifecycles. This workshop is suitable for people with basic Kubernetes knowledge and an interest in cloud-native security.

Find out more about Daniel’s Kubernetes workshop!

 

Unlock the Full Potential of Git and GitLab in One Day!

Are you ready to improve your development workflow with Git and GitLab? Join our one-day workshop to learn best practices and advanced features for efficient project management.

Feu Mourek will cover Git fundamentals, including working directory, staging area, repository, and advanced branching strategies. You’ll learn to write effective commit messages, handle remotes, and resolve merge conflicts. They’ll also explore GitLab’s Web IDE, issue boards, and graphs to manage your workflow, track progress, and handle releases smoothly. You’ll master CI/CD by creating pipelines, using templates, and incorporating variables for flexible workflows.
This workshop is ideal for developers and administrators with basic Linux knowledge. By the end, you’ll confidently use GitLab’s powerful features and tackle any Git challenges.

Find out more about Feu’s GitLab workshop!

 

Save your Seat!

To ensure a personal and interactive workshop experience, we have limited the number of tickets available. Please note that our workshops can only be booked in addition to an OSMC conference ticket. Be sure to reserve your seat before they’re all taken!

Katja Kotschenreuther
Manager Marketing

Katja has been part of the marketing team since October 2020. As Manager Marketing she takes care of the marketing for the conferences stackconf and OSMC, the DevOpsDays Berlin, the Open Source Camps, and our trainings. In her free time she loves to travel, craft and bake, and in summer she also tends her far-too-large vegetable garden.

Officially Opening the Call for Papers for OSMC 2024!

We are pleased to announce that you can now submit your ideas for presentations at OSMC 2024! The event will take place over three days, from November 19 to 21, in Nuremberg. Now is your chance to share your knowledge with our monitoring community! Check out this blog post for all the details you need to know to become a speaker.

 

Let’s Talk About…

…open source monitoring solutions! That’s the main topic of our conference. But there is much more you can talk about. Here are some ideas to inspire you.

We’re looking for talks that take an in-depth perspective on technical topics. Whether it’s about new topics, open source projects, technical background or the latest developments, we want to hear about it! We also look forward to presentations on new features, tutorials, real-life stories, best practices and what’s next in the world of monitoring.

Looking for more ideas? Just check out the presentations from past OSMC events.

 

Presentation Formats

How much do you want to say? How long do you want to speak? At OSMC, you can choose from three different presentation formats: Ignite Talk, 30-minute talk or 45-minute talk.

Choose the type that suits you best!

 

Submission Deadline

We’re accepting your presentation ideas until August 15. Don’t wait – submit your talk now!

If you have any questions about the Open Source Monitoring Conference, you can contact our events team at any time.

Katja Kotschenreuther
Manager Marketing

Katja has been part of the marketing team since October 2020. As Manager Marketing she takes care of the marketing for the conferences stackconf and OSMC, the DevOpsDays Berlin, the Open Source Camps, and our trainings. In her free time she loves to travel, craft and bake, and in summer she also tends her far-too-large vegetable garden.

OSMC 2023 | Experiments with OpenSearch and AI

Last year’s Open Source Monitoring Conference (OSMC) was a great experience. It was a pleasure to meet attendees from around the world and participate in interesting talks about the current and future state of the monitoring field.

Personally, this was my first time attending OSMC, and I was impressed by the organization, the diverse range of talks covering various aspects of monitoring, and the number of attendees that made this year’s event so special.

If you were unable to attend the conference, we are covering some of the talks presented by the numerous specialists.
This blog post is dedicated to last year’s Gold Sponsor Eliatra and their wonderful speakers Leanne Lacey-Byrne and Jochen Kressin.

Could we enhance accessibility to technology by utilising large language models?

This question may arise when considering the implementation of artificial intelligence in a search engine such as OpenSearch, which handles large data structures and complex operational middleware.

This idea can be seen as the starting point for Eliatra’s experiments and the findings that are the focus of this talk.

 

Working with OpenSearch Queries

OpenSearch deals with large amounts of data, so it is important to retrieve data efficiently and reproducibly.
To meet this need, OpenSearch provides a query DSL which enables users to create advanced filters that define how data is retrieved.
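As a small example of what such a query can look like, the following Python snippet uses the opensearch-py client to run a boolean filter query. The host, index name and field names are assumptions made purely for illustration; even this comparatively simple filter already involves several nested levels.

```python
from opensearchpy import OpenSearch

# Hypothetical local cluster and index; adjust host, credentials and index name as needed.
client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

query = {
    "query": {
        "bool": {
            "must": [
                {"match": {"message": "timeout"}}              # full-text match on the message field
            ],
            "filter": [
                {"term": {"level": "error"}},                  # exact match on the log level
                {"range": {"@timestamp": {"gte": "now-1h"}}},  # only events from the last hour
            ],
        }
    },
    "size": 10,
}

response = client.search(index="app-logs", body=query)
for hit in response["hits"]["hits"]:
    print(hit["_source"])
```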

In practice, such queries can become very long, which makes working with them increasingly complex.

What if there were a way to generate such queries by simply providing the data schema to an LLM (large language model) together with a precise description of the data to query? This would greatly reduce the human workload and would certainly be less time-consuming.

 

Can ChatGPT be the Solution?

As a proof of concept, Leanne decided to test ChatGPT’s effectiveness in real-world scenarios, using ChatGPT’s LLM together with Elasticsearch rather than OpenSearch, because more information about Elasticsearch was available during ChatGPT’s training.

The data used for the tests were the Kibana sample data sets.

Leanne’s approach was to give the LLM a general data mapping, similar to the one returned by the Elasticsearch API, and then ask it a humanised question about which data it should return. With that in mind, the proof of concept is considered a success if the returned answers consist of valid search queries with a low failure rate.
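A minimal sketch of that idea, assuming the OpenAI Python client and a chat model; the model name, the stripped-down mapping (loosely based on the Kibana sample e-commerce data) and the prompt wording are my own illustration, not the exact setup used in the talk.

```python
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# A stripped-down index mapping, similar in spirit to what the Elasticsearch mapping API returns.
mapping = {
    "properties": {
        "customer_full_name": {"type": "text"},
        "taxful_total_price": {"type": "float"},
        "order_date": {"type": "date"},
    }
}

question = "Return the ten most recent orders with a total price above 100."

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any chat-capable model could be used here
    messages=[
        {"role": "system",
         "content": "You translate natural-language questions into Elasticsearch query DSL. "
                    "Answer with a single JSON query body and nothing else."},
        {"role": "user",
         "content": f"Mapping:\n{json.dumps(mapping)}\n\nQuestion: {question}"},
    ],
)

generated_query = response.choices[0].message.content
print(generated_query)  # the generated DSL still has to be validated before it is executed
```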

 

Performance Analysis

Elasticsearch Queries generated by ChatGPT (Result Overview)

Source: slideshare.net (slide 14)

As we can see, the generated queries achieved only 33% overall correctness, and even this level was only reached by feeding the LLM a number of sample mappings together with queries that were manually written for them.

Now, this accuracy could be further improved by providing more information about the mapping structures and by submitting a large number of sample mappings and queries to the ChatGPT instance.
However, this would require considerably more effort to compile and provide the sample data sets, and prompts that deviate from the trained samples would still have a high chance of failing.
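Feeding the model sample mappings and hand-written queries is essentially few-shot prompting: the examples are added as extra messages ahead of the real request. A hedged sketch of how the message list from the previous snippet could be built up (the sample mapping and query are again invented):

```python
# Two demonstration messages: a sample mapping plus question, and the query we would expect back.
few_shot_examples = [
    {"role": "user",
     "content": 'Mapping:\n{"properties": {"level": {"type": "keyword"}}}\n\n'
                "Question: Count all error events."},
    {"role": "assistant",
     "content": '{"size": 0, "track_total_hits": true, "query": {"term": {"level": "error"}}}'},
]

# The demonstrations sit between the system message and the real request, so the model sees
# the expected input and output shape before it answers the actual question.
messages = (
    [{"role": "system",
      "content": "You translate natural-language questions into Elasticsearch query DSL."}]
    + few_shot_examples
    + [{"role": "user", "content": "Mapping: ...\n\nQuestion: ..."}]  # real mapping and question go here
)
```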

 

Vector Search: An Abstract Approach

Is there a better solution to this problem? Jochen presents another approach that falls under the category of semantic search.
Large language models can handle various inputs, and the type of input used can significantly impact the results produced by such a model.
With this in mind, we can transform our input information into vectors using transformers.
Transformers are trained models that each process a specific type of input, for example video, audio or text.
They generate n-dimensional vectors that can be stored in a vector database.
Illustration of how vector transformers are used

Source: slideshare.net (slide 20)

When searching a vector-based database, one frequently used algorithm for generating result sets is the ‘k-NN index’
(k-nearest-neighbour index). This algorithm compares stored vectors for similarity and provides an approximation of their relevance to one another.
For instance, pictures of cats can be stored in a vector database: the transformer translates each input into a numeric, vectorised format.
The vector database then compares the transformed search input to the stored vectors using the k-NN algorithm and returns the vectors that fit the input best.
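As a hedged sketch of how this can look in OpenSearch, the snippet below creates a small k-NN index, stores a few text embeddings produced by a sentence-transformers model and runs a k-NN query. The index name, model choice and example documents are assumptions for illustration only.

```python
from opensearchpy import OpenSearch, helpers
from sentence_transformers import SentenceTransformer

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])
model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model; it outputs 384-dimensional vectors

# Create an index with the k-NN plugin enabled and a knn_vector field matching the embedding size.
client.indices.create(
    index="doc-embeddings",
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {"properties": {
            "text": {"type": "text"},
            "embedding": {"type": "knn_vector", "dimension": 384},
        }},
    },
)

documents = ["a cat sleeping on a sofa", "a dog chasing a ball", "a kitten playing with yarn"]
helpers.bulk(client, (
    {"_index": "doc-embeddings", "text": text, "embedding": model.encode(text).tolist()}
    for text in documents
))
client.indices.refresh(index="doc-embeddings")

# Transform the search input the same way and let the k-NN index return the nearest neighbours.
query_vector = model.encode("pictures of cats").tolist()
result = client.search(index="doc-embeddings", body={
    "size": 2,
    "query": {"knn": {"embedding": {"vector": query_vector, "k": 2}}},
})
for hit in result["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["text"])
```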

 

Are Vectors the Jack of all Trades?

There are some drawbacks to the aforementioned approach. Firstly, the quality of the output heavily depends on how well the transformer suits the inputs provided.
Additionally, this method requires significantly more processing power, which in a dense and highly populated environment could become the bottleneck of the whole approach.
It is also difficult to optimize and refine existing models when they only output abstract vectors and behave as black boxes.
What if we could combine the benefits of both approaches, using lexical and vectorized search?

 

Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) was first mentioned in a 2020 paper by Meta. The paper explains how LLMs can be combined with external data sources to improve search results.
In contrast to plain LLM approaches, this overcomes the problem of stagnating or frozen models. Typically, models are pre-trained on a specific set of data.
However, the information provided by this training data can quickly become obsolete, and there may be a need for a model that also incorporates current developments, the latest technology and currently available information.
Augmented generation involves executing a prompt against an information database, which can be of any type (such as the vector database used in the examples above).
The result set is combined with contextual information, for example the latest data available on the Internet or some other external source, like a flight plan database.
This combined set can then be used as a prompt for another large language model, which produces the final result for the initial prompt.
In conclusion, multiple LLMs can be chained together, each playing to its own strengths and given access to current data sources, which in turn can generate more accurate and up-to-date answers to user prompts.
Overview of the RAG (Retrieval Augmented Generation)

Source: slideshare.net (slide 36)
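A minimal sketch of that retrieve, augment and generate flow, reusing the hypothetical k-NN index and the OpenAI client from the earlier snippets; the retrieval source, prompt wording and model name are again my own assumptions rather than the setup shown in the talk.

```python
from openai import OpenAI
from opensearchpy import OpenSearch
from sentence_transformers import SentenceTransformer

search = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])
embedder = SentenceTransformer("all-MiniLM-L6-v2")
llm = OpenAI()

def rag_answer(question: str, k: int = 3) -> str:
    # 1. Retrieve: embed the question and fetch the k nearest documents from the vector index.
    vector = embedder.encode(question).tolist()
    hits = search.search(index="doc-embeddings", body={
        "size": k,
        "query": {"knn": {"embedding": {"vector": vector, "k": k}}},
    })["hits"]["hits"]
    context = "\n".join(hit["_source"]["text"] for hit in hits)

    # 2. Augment: combine the retrieved context with the original question in a single prompt.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 3. Generate: let a language model produce the final answer from the augmented prompt.
    response = llm.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(rag_answer("What are the cats in the collection doing?"))
```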

Noé Costa
Developer

Noé emigrated from Switzerland to Germany and has been supporting the Icinga team as a developer in the area of web development since October 2023. He contributes to the further development of Icinga's web modules and is very interested in the field of monitoring and its future. Outside of work he enjoys cooking, spends time with his partner, broadens his knowledge in various fields and occasionally plays computer games with acquaintances from all over the world.

OSMC 2024 is Calling for Sponsors

What about positioning your brand in a focused environment of international IT monitoring professionals? Discover why OSMC is the perfect spot for it.

 

Meet your Target Audience

Sponsoring the Open Source Monitoring Conference is a fantastic opportunity to promote your brand.
Raise corporate awareness, meet potential business partners, and grow your business with lead generation. Network with the constantly growing Open Source community and establish ties with promising IT professionals for talent recruitment. Connect to a diverse and international audience including renowned IT specialists, Systems Administrators, Systems Engineers, Linux Engineers and SREs.

 

Your Sponsorship Opportunities

Our sponsorship packages are available in a variety of budgets and engagement preferences: Platinum, Gold, Silver, and Bronze.
From an individual booth, speaking opportunities and lead scanning to social media and logo promotion, there are plenty of options to choose from.
We additionally offer some Add-Ons which can be booked separately. Use this unique chance to get even more out of it. Sponsor the Dinner & Drinks event, the Networking Lounge or the Welcome Reception.

Download the sponsor prospectus for full details and pricing.

We look forward to hearing from you!

 

Early Bird Alert

Our Early Bird ticket sale is already running. Make sure to save your seat at the best price by May 31.
Our discounted tickets are selling fast, so grab yours now before they’re gone!

 

Save the Date

OSMC 2024 is taking place from November 19 – 21, 2024 in Nuremberg. Mark your calendars and be part of the 18th edition of the event!

Katja Kotschenreuther
Manager Marketing

Katja has been part of the marketing team since October 2020. As Manager Marketing she takes care of the marketing for the conferences stackconf and OSMC, the DevOpsDays Berlin, the Open Source Camps, and our trainings. In her free time she loves to travel, craft and bake, and in summer she also tends her far-too-large vegetable garden.