How TomTom achieves Kubernetes multi-tenancy with Capsule
With the availability of well-architected and well-orchestrated cloud-managed control planes for Kubernetes (e.g. AKS on Azure, EKS on AWS, …) it has become very straightforward to deploy a Kubernetes cluster to run a workload. At TomTom, each engineering team has historically had the autonomy to operate, explore and modernise the services they own, and Kubernetes has recently become the de facto standard solution for running workloads in the company.
This level of autonomy helped teams innovate quickly, but it also created a scenario of cluster sprawl, where most clusters are used to run only a specific workload and are treated as ephemeral.
Whilst launching a Kubernetes cluster is an easy task, managing, operating and updating Kubernetes requires focus and team capacity. To address these challenges, Developer Experience introduced a new managed Kubernetes platform with a few main goals:
- remove the additional effort and distraction from engineering teams that have to run and operate their own Kubernetes infrastructure
- accelerate the time to production, by offering a well-architected and ready-to-use platform based on Kubernetes
- consolidate and optimize compute usage, by bin-packing workloads together when possible, even across different engineering teams
Among all the challenges such a platform generates, one of the most interesting ones was: how to achieve multi-tenancy in Kubernetes?
The Kubernetes documentation itself recognises that the multi-tenancy concept can't be mapped simply onto a Kubernetes Namespace. Engineers need more flexibility, hence several implementations have been proposed to address this challenge.
In this article we take a deeper dive into the capabilities of the technology we adopted: Capsule from Clastix (now a CNCF Incubator product).
Capsule in TomTom
Capsule is a policy-based framework, a sort of policy engine on steroids for Kubernetes. It also "does one thing well", in line with the well-known and appreciated Unix principle. It introduces the concept of a Tenant in Kubernetes without reinventing Kubernetes semantics: the new construct is built by leveraging upstream Kubernetes capabilities, so that a user is granted a "slice of compute" in a specific Kubernetes cluster.
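To give an idea of the semantics, a minimal Tenant (a sketch based on the capsule.clastix.io/v1beta2 CRD, with illustrative names and quota values) could look like this:
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: team-a                    # illustrative tenant name
spec:
  owners:
    - kind: Group
      name: team-a-owners         # e.g. an identity-provider group entitled to the tenancy
  resourceQuotas:
    scope: Tenant
    items:
      - hard:
          requests.cpu: "16"      # the "slice of compute" granted to the tenant
          requests.memory: 64Gi
          limits.cpu: "32"
          limits.memory: 128Gi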
On top of the native Kubernetes capabilities, Capsule adds some additional and important features to properly define tenant boundaries and to protect tenants from each other when using shared cluster-level resources.
Specifically, I would like to emphasize some of the nice features and design decisions Capsule offers that convinced us to adopt it.
1. Developer experience
Developer experience is paramount at TomTom, and we chose Capsule with this in mind. Capsule indeed enables engineers to share clusters without impacting their experience, and without requiring a deep level of Kubernetes knowledge.
Additionally, Capsule provides a nice interface, namely Capsule Proxy, to connect to the Kubernetes API. Capsule Proxy is an (almost) pass-through proxy for the Kubernetes API server that takes care of intercepting and filtering the calls, so that a tenant can run the usual kubectl commands without receiving errors due to lack of privileges when trying to access cluster-level resources.
The most common use-case is the Namespace resource: with Capsule, the tenant is able to list namespaces (even though it's a cluster-level resource) but will only see the namespaces owned by its own tenancy. The same goes for other resources, like Nodes, StorageClasses and IngressClasses.
Capsule Proxy can be exposed via Ingress resources and can leverage all the other Kubernetes capabilities such as allow lists and cert-manager.
With this we are able to offer a seamless experience for logging into a Kubernetes control plane (via Capsule Proxy) with a KubeConfig that doesn't contain any key material. Endpoints are exposed with valid public certificates (no need for a CA chain in the KubeConfig) and authentication happens via the Azure kubelogin exec plugin.
This is the simplest KubeConfig you might see today:
apiVersion: v1
kind: Config
clusters:
- cluster:
    server: https://capsule-proxy.***.example.com
  name: sp-cra-gchiesa
contexts:
- context:
    cluster: sp-cra-gchiesa
    user: tomtom-user
    namespace: sp-cra-gchiesa-default
  name: sp-cra-gchiesa
current-context: sp-cra-gchiesa
users:
- name: tomtom-user
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - get-token
      - --login
      - azurecli
      - --server-id
      - 6dae42f8-4368-4678-94ff-3960e28e3630 # <-- well-known server-id for the Azure AKS service
      command: kubelogin
2. Governance and resource ownership attribution
While we were evaluating soft multi-tenancy technologies, we noticed they typically introduce different semantics, or additional complexity by creating a separate control plane for each tenant. This results in a set of resources that needs to be synced with the central control plane, and adds more complexity on the observability side for platform teams.
Additionally, with a dedicated control plane per tenant, you open up the possibility for tenants to deploy anything they want. At TomTom we have a set of standards and best practices, and we want to enforce strong governance over the technologies used in the platform. So, while we foster experimentation and innovation, we want to ensure that the platform remains properly governed.
With Capsule, all tenants share the same control plane, and each tenant only has access to the capabilities (Custom Resource Definitions) already available in the cluster. This allows better control of what is deployed in each Kubernetes cluster, and makes it possible to reuse shared services to support multiple tenants where possible.
We also have a defined process to request additional capabilities: when a tenant needs a specific operator or additional CRDs, the platform team can roll out the changes in minutes, or the tenant itself can propose them by opening pull requests on the platform repositories.
When many namespaces, services and workloads are deployed by multiple tenants, it's extremely important to be able to identify ownership of them. We leverage Capsule's additionalMetadata feature to enrich every important resource created by tenants with the information we want to ensure is always attached to it. For example, we enforce labeling pods and services with the tenant that owns them, the related cost center (so we can offer a nice breakdown with Kubernetes cost management tools), the Slack channel for contacting the team with priority when coordination is needed, and more.
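As a sketch, the Tenant spec carries this metadata via its namespaceOptions and serviceOptions sections (the label keys and values here are illustrative, not our actual conventions):
spec:
  namespaceOptions:
    additionalMetadata:
      labels:
        tomtom.example/tenant: team-a           # illustrative label keys
        tomtom.example/cost-center: "CC-1234"
      annotations:
        tomtom.example/slack-channel: "#team-a"
  serviceOptions:
    additionalMetadata:
      labels:
        tomtom.example/tenant: team-a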
3. Enhanced capabilities
We leverage other capabilities available in Capsule that are not Kubernetes native but still very useful.
Ingress protection: each tenant can only create Ingresses with hostnames matching a pre-defined pattern. Since we run ExternalDNS as a shared service in each cluster, it is important that different tenants don't conflict with each other over the same hostnames.
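For instance, a Tenant can be constrained to a per-tenant DNS zone with a fragment like the following (the regex and domain are illustrative):
spec:
  ingressOptions:
    allowedHostnames:
      allowedRegex: "^[a-z0-9-]+\\.team-a\\.example\\.com$"   # only hostnames in the tenant's own zone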
Distributing resources to all tenants: each tenancy can retrieve some internal cluster metadata via ConfigMaps. These are made available by the platform team by leveraging the GlobalTenantResource capability in Capsule.
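A sketch of such a distribution looks roughly like this (the tenant selector and ConfigMap content are illustrative; refer to the Capsule documentation for the full GlobalTenantResource spec):
apiVersion: capsule.clastix.io/v1beta2
kind: GlobalTenantResource
metadata:
  name: cluster-metadata
spec:
  tenantSelector: {}                  # empty selector targets all tenants (illustrative)
  resyncPeriod: 60s
  resources:
    - rawItems:
        - apiVersion: v1
          kind: ConfigMap
          metadata:
            name: cluster-metadata
          data:
            clusterName: cluster-1    # illustrative cluster metadata
            tier: playground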
4. GitOps ready
Capsule can be driven via a small set of well-defined Kubernetes CRDs, which represent the tenancy boundaries in a declarative way. The Capsule manager, deployed in each cluster in the platform, takes care of reconfiguring the tenancies as soon as a change is applied to the Tenant custom resource objects.
How do we use Capsule?
In our platform, all configuration is delivered in a GitOps fashion. The platform is multi-tier, so we maintain configuration at tier level and specific per-cluster configuration at cluster level. Capsule fits this scenario perfectly. We structured our Git repository so that each cluster has knowledge of its tenants and continuously reconciles their configuration.
See the example repository layout below for distributing tier and cluster configuration.
.
├── CODEOWNERS
├── README.md
├── charts
│   ├── [...] <-- charts that are maintained by Platform team to aggregate and parametrise tier, cluster or tenancy customisations
└── tiers
    ├── playground
    │   ├── config
    │   │   ├── capsule.yml
    │   │   └── other-manifests-tier-level-config.yml
    │   └── [azure subscription/aws account]
    │       ├── [cluster-1]
    │       │   ├── argo-projects
    │       │   │   └── argo-project-gchiesa.yml
    │       │   ├── config # cluster level config, if needed
    │       │   └── tenants
    │       │       └── tenant-gchiesa.yml <-- Capsule Tenancy deployment
    │       └── [cluster-n]
    │           ├── argo-projects
    │           │   ├── argo-project-team-a.yml
    │           │   └── argo-project-team-b.yml
    │           └── tenants
    │               ├── tenant-team-a.yml
    │               └── tenant-team-b.yml
    ├── prod-tier-1
    │   ├── config
    │   │   ├── capsule.yml
    │   │   └── other-manifests-tier-level-config.yml
    │   └── [azure subscription/aws account]
    │       └── [cluster-1]
    │           ├── argo-projects
    │           │   └── argo-project-team-c.yml
    │           ├── config
    │           │   └── manifests-for-cluster-level-config.yaml
    │           └── tenants
    │               └── tenant-team-c.yml
    ├── [...]
    └── prod-tier-n
NOTE: in the snippets above and below you will also see "Argo Project" mentioned. This is because, as part of the tenancy, we also offer an ArgoCD project to the tenant, so that they can deploy their applications in a GitOps fashion.
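As an illustration, the AppProject we create per tenant is conceptually similar to the following sketch (the repository URL and names are placeholders):
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: tenant-team-a                # one Argo CD project per tenant (placeholder name)
  namespace: argocd
spec:
  sourceRepos:
    - https://github.com/example-org/team-a-deployments.git   # placeholder repository
  destinations:
    - server: https://kubernetes.default.svc
      namespace: "team-a-*"          # constrain deployments to the tenant's namespaces
  clusterResourceWhitelist: []       # no cluster-scoped resources for tenants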
Additionally, we want to ensure we can roll out updates to all Capsule tenants, and we also want to be able to "bootstrap" a tenant with some additional resources. An example is the Namespace: we allocate a default namespace for each tenant.
To achieve this we deploy Capsule manifests via Helm.
Each tenant creation results in two Helm installations: a tenant Helm release for allocating the tenancy, and a tenant-post-install Helm release where we allocate the additional resources we want to make available to the newly created tenant.
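The values passed to these charts are conceptually similar to the following sketch (the key names are illustrative, not the actual chart interface):
# illustrative values for the "tenant" and "tenant-post-install" releases of a single tenancy
tenantName: team-a
defaultNamespace: team-a-default
owners:
  - kind: Group
    name: team-a-owners
additionalMetadata:
  costCenter: "CC-1234"
  slackChannel: "#team-a"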
See the example repository layout below, which implements the charts for the Capsule tenancy and the Capsule tenant post-install.
.
├── argocd-project
│   ├── Chart.yaml
│   ├── templates
│   │   ├── _helpers.tpl
│   │   ├── appproject.yaml
│   │   ├── repository.yaml
│   │   └── root-app.yaml
│   └── values.yaml
├── tenant
│   ├── Chart.yaml
│   ├── templates
│   │   ├── _helpers.tpl
│   │   └── tenant.yaml
│   └── values.yaml
└── tenant-post-install
    ├── Chart.yaml
    ├── templates
    │   ├── _helpers.tpl
    │   └── default-namespace.yaml
    └── values.yaml
This is particularly convenient, for example, whenever we want to enrich the tenants' information (e.g. additionalMetadata): we can just update the tenant Helm chart (bumping its version) and the reconciliation process will update all the tenants with the new information.
Example of a live production tenant manifest
The following YAML is an example of a Tenant resource in one of our clusters.
The interesting aspect is how we introduce traceability of the operations via additionalMetadata that will be associated with the entities created by the tenant (namespaces and services in this case).
The platform team is able to understand who created it (we use a self-service, API-first approach for provisioning tenants, so you can see the API version and invocation-id of the operation), and note the use of network policies to ensure each tenant is isolated but still reachable from the shared monitoring platform available in each cluster.
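A simplified sketch of such a manifest, with placeholder values for the owner group, the traceability annotations and the monitoring namespace, is shown below:
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: sp-cra-gchiesa
spec:
  owners:
    - kind: Group
      name: sp-cra-gchiesa-owners                    # placeholder owner group
  namespaceOptions:
    additionalMetadata:
      annotations:
        # traceability of the self-service provisioning operation (illustrative keys)
        provisioning.example.com/created-by: gchiesa
        provisioning.example.com/api-version: "v1"
        provisioning.example.com/invocation-id: "0b1c2d3e"
  networkPolicies:
    items:
      - podSelector: {}
        policyTypes:
          - Ingress
        ingress:
          - from:
              # allow traffic from the tenant's own namespaces...
              - namespaceSelector:
                  matchLabels:
                    capsule.clastix.io/tenant: sp-cra-gchiesa
              # ...and from the shared monitoring platform (placeholder namespace name)
              - namespaceSelector:
                  matchLabels:
                    kubernetes.io/metadata.name: monitoring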
Conclusion
Platform Engineering is the new way of managing infrastructure: optimising its usage and relieving engineering teams of the burden of running and operating its complexity. Capsule is a great tool that helps us achieve our goals, and we are happy to contribute back to the community with our experience.