Yonas Habteab, Autor bei NETWAYS GmbH

Wow, it was an outstanding stackconf 2022! Many thanks to all the organizers, speakers and participants who made it possible throughout three days full of Open Source Infrastructure Love. I am truly blessed to have been a part of this amazing event!

For all of you who would like to catch up on the specialist lectures of this special event, all the talks are available for you on YouTube. My blogpost today is all about Ricardo Castro and his talk ”Network Service Mesh”. Due to personal circumstances, he was unfortunately prevented from being present on site and sharing his expertise with us in person. Nevertheless, he was able to record his talk in advance and send it to us. So, I am going to recap his talk for you in this blog and give some insights into my findings.

Why Network Service Mesh?

Microservice architectures build applications as a collection of loosely coupled services. In this type of architecture, the goal is for the services to be fine-grained. The main adjective for teams is that the services are independent of other services. By building loosely coupled services, the types of dependencies and associated complexities are eliminated. Reliably managing network connectivity on distributed systems brings its own set of challenges. Things like service discovery, load balancing, fault tolerance, metrics collection, and security are some examples of issues that arise with distributed systems.

In short, how do you facilitate workload collaboration to create an application on the fly that communicates regardless of where those workloads are running? Ricardo is going to answer this and plenty of other questions for us below, so prick up your ears!

Traditional Service Meshes

A service mesh is a connective tissue between all of your services. It adds additional capabilities such as traffic control, service discovery, load balancing, resilience, observability, security, etc. A service network allows applications to offload these capabilities from application-level libraries. A service mesh typically consists of two components, the data plane and the control plane. The service mesh data plane is made up of business services along side the proxies, which are responsible for all the traffic. The service mesh control plane comprises a set of services providing administrative functions necessary for the control of the service mesh. There are many of the service mesh implementations that can help you in many ways. Here you can see a few of them:

Service meshes connectivity work well within a runtime domain, but often workloads need to interoperate with other services that are outside the runtime domain to provide full functionality. But what are runtime and connectivity domains?

A runtime domain is essentially a compute domain that runs workloads. Traditionally, there is exactly one connectivity domain in a runtime domain because connectivity domains are coupled with runtime domains. This means that only workloads that are within the runtime domain can be part of the connectivity domain. To truly take advantage of this approach, microservice components must at least be interdependent. Application service networks work well within the connectivity domain and their target Layer 7 such as https. They can help cloud-native workloads achieve loose coupling through functionality such as service discovery and fault tolerance. Ricardo has explored some of the use cases where traditional service meshes fail and where network service meshes can mitigate these challenges.

Where Do L7 Service Meshes Fall Short?

The first example is the Multi/Hybrid Cloud center. In this scenario, you have multiple distinct and independent cloud clusters that can be public, private or on-premises. The workloads in each of these clusters must have the ability to communicate with each other. In short, you have a number of workloads running in different connectivity domains that need to be connected together.

Another example is the Multi-Corp/Extra-Net scenario. In this scenario, workloads from different clusters also need to interconnect independent of the connectivity domains they are running in. The main difference between this and the hybrid cloud scenario is that you now have different domains of administrative control. This means that different organizations are managing their own runtime and connectivity domains. The workloads in each of these domains have to work with each other in some fashion.

Service mesh balkanization is also another example of how traditional Layer 7 service meshes fall short. For example, traditional service meshes work well within the same Kubernetes cluster. When local service meshes need to collaborate, they are typically connected via Gateways. These are usually static Layer 7 routes that are very difficult to scale and maintain. However, traditional service meshes cannot guarantee reliable networking in such a scenario because they assume there is a reliable Layer 3 underneath them, which is usually the case within the same cluster, but cannot be guaranteed beyond that. Ricardo goes on to examine how network service meshes have tackled these challenges.

Network Service Mesh (NSM)

Before we dig in further here’s an excellent description, as Ricardo did, of what network service mesh actually is.

Network Service Mesh is a hybrid/multi-cloud IP service mesh that enables Layer 3 (L3), zero trusts, pure network service connectivity, security and observability. It works with your existing K8s Container Network Interface (CNI), provides pure workload granularity, requires no changes to K8s and no changes to your workloads. Network Service Mesh tackles all of the above mentioned challenges by enabling individual workloads to connect securely, regardless of where they are running. Your runtime domain provides connectivity between clusters and requires no intervention. Before we dive into the key concepts of network service meshes, here are some use cases where NSM can be very useful. Basically, these are the same examples where traditional service meshes fall short:

A common Layer 3 domain that allows Databases (DB) running in multiple clusters, clouds, etc. to communicate with each other, such as DB replication.
A single Layer 7 service mesh such as Kuma, Linkerd, etc., that connects workloads running in multiple clusters, clouds, etc.
A single workload connecting to multiple Layer 7 service meshes.
Workloads from multiple organizations connecting to a single collaborative service mesh that is accessible across organizations.

NSM Key Concepts

In a network service mesh, a network service is a collection of features that are applied to the network traffic. These functions provide connectivity, security and observability. A network service client is a workload that requires a connection to the network service by specifying the name of the network service. Clients are independently authenticated by Spiffe ID and must be authorized to connect to the requested network service. In addition to the network service name, a client may declare a set of key value pairs called labels. These labels can be used by the network service to select the proper endpoint. Or they can be used by the endpoint itself to influence the way it provides service to the client.

A network service mesh endpoint is the entry point to the network service for the client. The network service is identified by its name and carries a payload. Network services are registered in the network service register. In short, an endpoint can be a Pod running on the same or a different K8s cluster, an aspect of the physical network or anything else to which packets can be delivered for processing. A client and an endpoint are connected by a virtual wire (vWire). A vWire acts as a virtual connectivity link between clients and an endpoint. A client may also request the same network service multiple times and therefore multiple vWires can lead to the requested endpoint.

Network Service Mesh API

The Network Service Mesh API includes the Request, Close and Monitor functionality. A client can send a Request-GRPC call to the Network Service Mesh to establish a virtual circuit between the client and the network service. For closing a virtual wire between a client and the network service, a client can send a Close-GRPC call to the NSM. A vWire between a client and a network service always has a limited expiration time, so the client usually sends a new Request message to refresh its vWire. A client can also send a Monitor GRPC call to the NSM to obtain information about the status of a vWire it has to a Network Service. If a vWire exceeds its expiration time without being refreshed, the NSM cleans up that vWire.

Network Service Mesh Registry

Like any other mesh, the Network Service Mesh has a Registry in which network services and network service endpoints are registered. A network service endpoint provides one or more network services. It registers a list of network services by name in the registry, which it annotates, and the destination labels that it advertises for each network service. Optionally, a mesh service can specify a list of matches that allows matching with the source identifiers of the client sending its request to the destination labels advertising when it is registered by the endpoint. You can find a detailed description in the documentation or on YouTube, where Ricardo explains it very well in person.

Well, how do you put it all together?

At a higher level, a Network Service Mesh registers one or more network services within the NSM. It registers that list by name and the destination labels that advertising each network service. A client can send a request to the network service to establish a vWire connectivity between the client and the network service. It can also close the connection, and since the vWire has an expiration time, a client can monitor its vWire state. The interesting thing about the NSM is that it allows workloads to request services regardless of where they are running. It is up to the NSM to create the necessary vWires and enable communication between workloads in a secure manner.

Conclusion

In general, Ricardo’s talk covers the issues that Network Service Mesh addresses and its key concepts, however there is much more to talk about, as he told us. He recommends you to look at Floating Inter-domain, where a client requests a network service from any domain in the network service registry, regardless of where it is running. It can also be valuable to take a look at some of the advanced features, as he encourages us to do. The Network Service Mesh Match process, which selects candidate endpoints for specific Network Service Meshes, can be used to implement a variety of advanced features such as Composition, Selective Composition, Topologically Aware Endpoint Selection, and Topologically Aware Scale from Zero. Also, another important note Ricardo shared with us is that Network Service Mesh has been part of the CNCF project since 2019 and is currently in the Sandbox maturity level. If you don’t know what CNCF is, it’s definitely worth to take a look at the referenced documentation.

It’s a very interesting and informative topic and I could learn a lot from Ricardo. I have never dealt with it before, but now it is quite simple to keep track after watching this great talk. I hope I was able to somehow reflect what Ricardo was talking about throughout the text. As I mentioned at the beginning, the recorded video of this lecture and all the other stackconf talks are available on YouTube! Enjoy it!

Take a look at our conference website to learn more about stackconf, check out the archives and register for our newsletter to stay tuned!

Yonas Habteab

Developer

Yonas hat 2022 seine Ausbildung zum Fachinformatiker für Anwendungsentwicklung bei NETWAYS erfolgreich abgeschlossen. Er hat Spaß daran, sich stets Wissen über das Programmieren anzueignen und widmet sich leidenschaftlich seiner Tätigkeit bei Icinga. Wenn er mal nicht am Programmieren ist, kickt er gern mit seinen Freunden Fußball oder geht in der Stadt spazieren und genießt ruhige Abende vor dem Fernseher.

Lies mehr von Yonas und triff unser Team

This entry is part 34 of 63 in the series NETWAYS stellt sich vor

Name: Yonas Habteab

Alter: 20

Position bei NETWAYS: Junior Developer

Ausbildung: Fachinformatiker im Bereich Anwendungsentwicklung

Bei NETWAYS seit: September 2019

Wie bist du zu NETWAYS gekommen und was genau gehört zu Deinem Aufgabenbereich?

Bevor ich 2015 nach Deutschland gekommen bin, wusste ich nicht einmal, dass das Internet überhaupt existiert, geschweige denn, dass es den Beruf des Fachinformatikers gibt. In der Schule erfuhr ich, dass man als Computerbegeisterter gut für den Beruf des Informatikers geeignet sei. Ab dem Zeitpunkt wusste ich, was für einen Beruf ich erlernen will. Ich hatte aber ein Problem, denn früher wohnte ich in einem kleinen Dorf, in dem es so gut wie kein geeignetes Unternehmen – bei dem man seine Ausbildung als Informatiker machen kann – gab und so bin ich schließlich in Nürnberg gelandet. Als ich dann im Internet nach einer Ausbildung recherchierte, stieß ich auf die Anzeige von NETWAYS und bewarb mich ohne zu zögern. Ein paar Tage später wurde ich dann zu einem Vorstellungsgespräch eingeladen und erhielt einen Tag darauf die Zusage.

Mein Aufgabenbereich ist sehr abwechslungsreich. Im Moment habe ich das Vergnügen, mich mit Hilfe von Eric in Icinga einzuarbeiten.

Was macht Dir an Deiner Arbeit am meisten Spaß?

Die Erfolge, die ich durch meine Ausbildung erziele, und vor allem die Glückserlebnisse, die ich habe, wenn ein von mir programmiertes Programm einwandfrei läuft.

Welche größeren, besonders interessanten Projekte stehen künftig an?

Im Moment bin ich dabei, ein Testmodul für Icinga Web 2 zu programmieren. Das Hauptziel ist die Implementierung einer unkompliziert programmierten Drag & Drop- und Resize-Funktion. So wie ich mich kenne, wird das Testmodul noch einige Zeit beanspruchen. Ich stehe jeden Morgen mit einem Lächeln im Gesicht auf, weil ich weiß, dass ich an Icinga, einer der bedeutendsten Software der Welt, arbeite.

Was machst Du, wenn Du mal nicht bei NETWAYS bist?

Unter der Woche sitze ich fast immer am Computer, schaue Tutorials oder mache einfach weiter, womit ich bei NETWAYS angefangen habe. Wenn draußen schönes Wetter ist, dann spiele ich mit meinen Freunden Fußball oder sitze mit ihnen vor dem Fernseher.

Wie geht es in Zukunft bei Dir weiter?

Ich bin sehr froh, dass ich die Probezeit bestanden habe und hoffe jetzt, dass ich mit einem gutem Notendurchschnitt meine Ausbildung abschließe und hier bei NETWAYS übernommen werde. Mein Traum ist es eines Tages ein sehr guter Entwickler zu werden und Icinga auch zukünftig unterstützen zu können.

Yonas Habteab

Developer

Lies mehr von Yonas und triff unser Team