Once upon a time, when I started learning about Kubernetes, I thought it was a one-stop shop for managing containers that are distributed over many servers. But the more I learn about the topic, the more I realize that while Kubernetes offers a lot of functionality, there are many things it doesn’t do out of the box, and for which third-party products are used. In recent weeks, I’ve come across the terms Istio and “Service Mesh” a lot, so it was time to have a closer look at what this actually is and which problems a service mesh solves in a Kubernetes cluster.
Setting the Scene
So let’s say you have a big Kubernetes cluster where lots of pods are deployed for a number of different services. In a microservice architecture, a service or application is split up into many microservice pods, with many instances running across many worker nodes, so it’s obvious that lots of pods have to communicate with each other. But not every pod needs to be able to communicate with every other pod. The pods implementing an ingress load balancer for HTTP requests, for example, need to be able to communicate with the microservice pods that implement different parts of an application. But the load balancer has no need to communicate with the pods that handle the database. And the microservice pods that implement different parts of the service often do not need to communicate with each other either. Let’s go one step further and say that a Kubernetes cluster hosts many services that have nothing to do with each other. That means that the pods of the different services do not have to communicate with each other at all. In a default Kubernetes cluster, however, all pods can communicate with each other, which is unnecessary at best and a security issue at worst.
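To put a number on that gap, here is a small Python sketch that compares what a default cluster permits with what the example above would actually need. The pod names and the `needed` set are made up for illustration and are not taken from any real cluster.

```python
from itertools import permutations

# Hypothetical pod groups, named for illustration only.
pods = ["ingress-lb", "orders", "payments", "database"]

# Default behaviour as described above: every pod may reach every other pod.
default_allowed = set(permutations(pods, 2))

# Connections the example application actually needs (an assumption of this sketch).
needed = {
    ("ingress-lb", "orders"),
    ("ingress-lb", "payments"),
    ("orders", "database"),
    ("payments", "database"),
}

# Everything that is open by default but not required.
unneeded = default_allowed - needed

print(f"{len(default_allowed)} paths open by default, "
      f"{len(needed)} needed, {len(unneeded)} unnecessary")
```

Even with only four pods, most of the open paths serve no purpose, and the gap grows quickly as more pods and unrelated services are added.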
And Now to the Service Mesh
And this is where Istio, Linkerd, Open Service Mesh and others come in and implement a concept referred to as a Kubernetes service mesh. Istio, which I’ve looked at, adds a ‘sidecar’ container to each pod in the cluster, which then handles all communication to and from that pod. In other words, this ‘sidecar’ is a communication proxy. So when one pod communicates with another pod, the traffic goes from the microservice container to its Istio ‘sidecar’ container, from there through the Kubernetes cluster to the Istio ‘sidecar’ container in the destination pod, and from there to the actual microservice container. This is what is called the ‘data plane’. The other part of the service mesh is the ‘control plane’, which manages the ‘sidecars’, i.e. it controls the rules for which pods are allowed to talk to which other pods. A nice benefit: since all communication goes through the ‘sidecars’, it’s an ideal place to log and monitor requests between the pods in the cluster. This is a topic for a further post, so I’ll leave it at that for the moment.
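The path described above can be sketched as a toy model in Python. This is not how Istio is implemented and it does not use any Istio API; the `ControlPlane` and `Sidecar` classes, the pod names and the choice to check the rule at the destination sidecar are all assumptions made purely to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class ControlPlane:
    """Toy control plane: holds the rules for which pod may call which."""
    allowed: set = field(default_factory=set)

    def allow(self, source: str, destination: str) -> None:
        self.allowed.add((source, destination))

    def is_allowed(self, source: str, destination: str) -> bool:
        return (source, destination) in self.allowed


@dataclass
class Sidecar:
    """Toy sidecar proxy: all traffic to and from its pod passes through it."""
    pod: str
    control_plane: ControlPlane
    log: list = field(default_factory=list)

    def send(self, destination: "Sidecar", message: str) -> str:
        # Outbound hop: app container -> own sidecar -> destination sidecar.
        self.log.append(f"out {self.pod} -> {destination.pod}")
        return destination.receive(self.pod, message)

    def receive(self, source: str, message: str) -> str:
        # Inbound hop: check the rule before handing the request
        # to the application container in this pod.
        if not self.control_plane.is_allowed(source, self.pod):
            self.log.append(f"denied {source} -> {self.pod}")
            return "denied"
        self.log.append(f"in {source} -> {self.pod}")
        return f"{self.pod} handled: {message}"


control_plane = ControlPlane()
control_plane.allow("ingress-lb", "orders")

ingress = Sidecar("ingress-lb", control_plane)
orders = Sidecar("orders", control_plane)
database = Sidecar("database", control_plane)

print(ingress.send(orders, "GET /orders"))   # allowed by the rule above
print(ingress.send(database, "SELECT 1"))    # no rule, so it is denied
print(database.log)                          # the sidecar also recorded the attempt
```

Because every request passes through a sidecar on both ends, the same objects that enforce the rules also end up holding a log of who tried to talk to whom, which is the monitoring benefit mentioned above.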
Further Resources
For more details, have a look at this Wikipedia page and this 15-minute Istio intro by Nana on YouTube.
You seem to be focussing on the service mesh solving how to regulate which applications are able to communicate with which other applications. However, a service mesh is solving more general problems of container deployments in a dynamic environment (BTW not only containers / K8s – also valid for VMs): *Finding* the other application in the first place; providing load-balancing / scaling and fail-over / redundancy for application-to-application communication; and enabling encryption in transit between applications, without the applications themselves having to worry about that. Especially with encryption in transit between applications, the mesh becomes the obvious place to monitor / probe inter-application traffic too.
Hi, many thanks for the additional details! Indeed, I’ve only looked at the ‘communication’ part. Particularly, I need to have a closer look at how the “discovery” part works!
It’s a different solution for a service mesh, but I find the learning material for Consul to be really clear on the principles behind these things. See here, for example (the embedded video is also good): https://learn.hashicorp.com/tutorials/consul/service-mesh