Operationalizing Cloud Provider Managed Kubernetes
Cloud provider managed Kubernetes services such as EKS, AKS and GKE are a great way to get started with Kubernetes. With a few clicks you get a fully provisioned cluster, typically in minutes. In addition, the cloud provider ensures that your Kubernetes control plane stays up and running, and performs any necessary maintenance and upgrades.
But once you have Kubernetes running, there are several things to think about before you onboard your teams and applications onto the cluster. In this post, I will cover five things you need to consider to operationalize your managed Kubernetes clusters:
- Application Visibility and Management – With several application pods running across different namespaces, it quickly becomes impossible to determine application and environment boundaries, or even determine which teams own which pods and containers. With Kubernetes, applications are no longer tied to infrastructure. Components from multiple applications will be running on the same node. Organizing your applications into logical environments lets you easily find applications and determine aggregate state and availability. You will also need to configure alerts to notify you of any application issues, based on the type of environment, and drill down into application resources (pods, deployments etc.) to quickly identify any problems and remediate the issue.
- Policy-Based Compliance – Depending on your industry, it is very likely that your clusters and your applications need to meet various regulatory and compliance requirements. In such environments, a flexible policy framework is necessary to ensure that applications deployed in the cluster are always in compliance. Kubernetes security policies can be used to ensure that applications are not running in privileged mode and cannot mount certain host volumes. Image provenance policies can be used to ensure that applications only use images from specific registries, and other policies can enforce the use of certain labels on pods. The ability to periodically generate CIS benchmark reports for your clusters can help determine whether you are in compliance and following best practices. Automated cluster and workload backups for key components are also a key compliance requirement and can help you recover from failures more quickly.
- Governance – Kubernetes allows users to create several types of resources, so ensuring users have granular privileges based on their roles is necessary. For example, cluster-wide resources such as storage classes should only be created by cluster administrators, and developers should not be able to modify resource quotas in their namespace. Enforcing access control across multiple teams and applications can be challenging and should be automated. It is also important to get visibility into the changes made by users. Detailed audit trails of all the changes made to a cluster or an application can help identify misconfigurations and speed up recovery from failures.
- Dynamic Resource Management – If you are using a managed Kubernetes service, sharing a few Kubernetes clusters across teams will provide higher resource utilization and result in cost savings. To enable multi-tenancy, Kubernetes provides a very flexible construct – namespaces. But things can go horribly wrong in multi-tenant clusters if resource quotas and limits are not set. Setting resource quotas ensures that no team or application can use more than its share of resources, which reduces the chance of other applications being starved of resources. For a cluster admin, manually managing resource quotas across multiple teams and hundreds of namespaces is not feasible. Automating this task can eliminate human error and ensure that resources are allocated appropriately.
- Cost Visibility and Allocation – When using public cloud resources, cost visibility is a key requirement, especially for multi-tenant clusters. In a shared cluster, it is very difficult to estimate resource usage per team and per application. The ability to identify and control which team or application is consuming the most cluster resources helps establish accountability and ensures fair cost allocation. Cost visibility, combined with resource quota management, helps ensure that cluster resources are allocated fairly across the teams that are being charged for them.
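To make the last two points more concrete, here is a minimal Python sketch of what automating quotas and cost allocation could look like. It is an illustration under stated assumptions, not a reference implementation: the `TeamShare` type, the quota name `team-quota`, the namespace names and the proportional cost split are all hypothetical choices made for this example. The manifest uses the standard Kubernetes `ResourceQuota` kind with `requests.cpu` and `requests.memory` limits, and it is only built as a dictionary here; applying it to a cluster is left out.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class TeamShare:
    """Hypothetical description of one team's slice of a shared cluster."""
    namespace: str    # one namespace per team (assumed naming scheme)
    cpu_cores: int    # total CPU cores the team's pods may request
    memory_gib: int   # total memory the team's pods may request, in GiB


def resource_quota_manifest(share: TeamShare) -> dict:
    """Build a Kubernetes ResourceQuota manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        # "team-quota" is an illustrative name, not a convention.
        "metadata": {"name": "team-quota", "namespace": share.namespace},
        "spec": {
            "hard": {
                "requests.cpu": str(share.cpu_cores),
                "requests.memory": f"{share.memory_gib}Gi",
            }
        },
    }


def allocate_cost(total_cost: float, usage_by_team: Dict[str, float]) -> Dict[str, float]:
    """Split a cluster bill across teams in proportion to measured usage.

    Proportional allocation is an assumption of this sketch; other
    schemes (for example, charging by quota) are equally possible.
    """
    total_usage = sum(usage_by_team.values())
    if total_usage == 0:
        return {team: 0.0 for team in usage_by_team}
    return {
        team: round(total_cost * used / total_usage, 2)
        for team, used in usage_by_team.items()
    }


if __name__ == "__main__":
    shares = [TeamShare("team-a", 8, 32), TeamShare("team-b", 4, 16)]
    for share in shares:
        print(resource_quota_manifest(share))
    # Usage figures (e.g. CPU core-hours) would come from your metrics system.
    print(allocate_cost(1000.0, {"team-a": 30.0, "team-b": 10.0}))
```

Generating the manifests from a single list of team shares means the quota for every namespace comes from one reviewed source, rather than being edited by hand namespace by namespace.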
To summarize, using a managed Kubernetes service is a great way to get started. But if you are planning to operationalize Kubernetes for multiple teams and multiple applications, you may want to consider the capabilities above, as they will not only smooth the onboarding of teams onto your clusters but also help with compliance, governance and resource management. Besides these capabilities, you will also need to think about monitoring, logging and security, if they are not provided by the cloud provider. We will tackle these in other posts.