Baseline architecture for an Azure Kubernetes Service (AKS) cluster

ALIF Consulting
Mar 24, 2022

Updated: Jul 10, 2024

In this reference architecture, we'll build a baseline infrastructure that deploys an Azure Kubernetes Service (AKS) cluster. This article includes recommendations for networking, security, identity, management, and monitoring of the cluster based on an organization's business requirements.

This architecture uses a hub-spoke network topology. The hub and spoke(s) are deployed in separate virtual networks connected through peering. Some advantages of this topology are:

Segregated management. It gives you a way to apply governance and limit the blast radius. It also supports the concept of a landing zone with a separation of duties.
It minimizes direct exposure of Azure resources to the public internet.
Organizations often operate with regional hub-spoke topologies. Hub-spoke network topologies can be expanded in the future to provide workload isolation.
All web applications should require a web application firewall (WAF) service to help govern HTTP traffic flows.
A natural choice for workloads that span multiple subscriptions.
It makes the architecture extensible. To accommodate new features or workloads, new spokes can be added instead of redesigning the network topology.
Certain resources, such as a firewall and DNS, can be shared across networks.

Inside of Kubernetes

Container image reference

In addition to the workload, the cluster might contain several other images, such as the ingress controller. Some of those images may reside in public registries. Consider these points when pulling them into your cluster.

Make sure the cluster is authenticated to pull the image.
If you are using a public image, consider importing it into a container registry that aligns with your SLO. Otherwise, the image might be subject to unexpected availability problems, which can cause operational issues if the image isn't available when you need it. Here are some benefits of using your own container registry instead of a public registry:
1. You can block unauthorized access to your images.
2. You won't have public-facing dependencies.
3. You can access image pull logs to monitor activities and triage connectivity issues.
4. You can take advantage of integrated container scanning and image compliance.

One option is Azure Container Registry (ACR).

Pull images only from authorized registries. You can enforce this restriction through Azure Policy. In this reference implementation, the cluster pulls images only from the ACR instance that is deployed as part of the architecture.
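To make the "authorized registries" rule concrete, here is a minimal sketch of the kind of check such a restriction performs. It is not how Azure Policy is implemented. The function names and the registry name contosoacr.azurecr.io are hypothetical, and the parsing follows the common Docker convention that a first path component counts as a registry host only if it contains a dot or a colon, or is "localhost".

```python
def registry_of(image: str) -> str:
    """Return the registry host of an image reference.

    Assumption: Docker-style references. If the first path component does
    not look like a host, the image is treated as coming from Docker Hub.
    """
    first, sep, _rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first.lower()
    return "docker.io"


def is_allowed(image: str, allowed_registries: set[str]) -> bool:
    """True if the image would be pulled from one of the allowed registries."""
    return registry_of(image) in allowed_registries


# Hypothetical ACR login server, used only for illustration.
ALLOWED = {"contosoacr.azurecr.io"}

if __name__ == "__main__":
    for image in ("contosoacr.azurecr.io/app/web:1.0", "nginx:1.25"):
        verdict = "allowed" if is_allowed(image, ALLOWED) else "blocked"
        print(f"{image}: {verdict}")
```

A check like this is also handy in a build pipeline, where it can flag manifests that still reference public images before they reach the cluster.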

Configure compute for the base cluster

In AKS, each node pool maps to a virtual machine scale set. Nodes are VMs in each node pool. Consider using a smaller VM size for the system node pool to minimize costs. This reference implementation deploys the system node pool with three DS2_v2 nodes. That size is sufficient to meet the expected load of the system pods. The OS disk is 512 GB.

For the user node pool, here are some considerations:

Choose larger node sizes so that each node can run as many pods as its configured maximum allows. Larger nodes minimize the footprint of services that run on every node, such as monitoring and logging.
Deploy at least two nodes. That way, the workload will have a high availability pattern with two replicas. With AKS, you can change the node count without recreating the cluster.
Actual node sizes for your workload will depend on the requirements determined by the design team. Based on the business requirements, we've chosen DS4_v2 for the production workload. To lower costs, one could drop the size to DS3_v2, which is the minimum recommendation.
When planning capacity for your cluster, assume that your workload can consume up to 80% of each node; the remaining 20% is reserved for AKS services.
Set the maximum pods per node based on your capacity planning. If you are trying to establish a capacity baseline, start with a value of 30. Adjust that value based on the requirements of the workload, the node size, and your IP constraints.
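The capacity guidance above can be turned into quick arithmetic. The sketch below is only an estimate under stated assumptions: the vCPU and memory figures are the published Dv2-series sizes (verify them against current Azure documentation), the 80% share comes from the planning rule above, and the IP estimate assumes a networking model in which each node and each pod takes one address from the subnet. It ignores extra nodes created during upgrades. All names in the code are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSize:
    name: str
    vcpus: int
    memory_gib: float


# Published Dv2-series sizes; treat as assumptions and verify before use.
DS3_V2 = NodeSize("Standard_DS3_v2", vcpus=4, memory_gib=14)
DS4_V2 = NodeSize("Standard_DS4_v2", vcpus=8, memory_gib=28)

# Planning rule from the text: the workload may use up to 80% of each node.
WORKLOAD_SHARE = 0.8


def workload_capacity(size: NodeSize, node_count: int) -> tuple[float, float]:
    """Return (vCPUs, GiB of memory) available to the workload in a pool."""
    vcpus = size.vcpus * node_count * WORKLOAD_SHARE
    memory = size.memory_gib * node_count * WORKLOAD_SHARE
    return vcpus, memory


def ips_needed(node_count: int, max_pods_per_node: int = 30) -> int:
    """Estimate subnet addresses: one per node plus one per potential pod.

    Assumption: every pod gets its own subnet IP address.
    """
    return node_count * (max_pods_per_node + 1)


if __name__ == "__main__":
    for size in (DS3_V2, DS4_V2):
        vcpus, memory = workload_capacity(size, node_count=2)
        print(f"{size.name} x2: {vcpus:.1f} vCPU, {memory:.1f} GiB for the workload")
    print("Subnet addresses for 2 nodes at 30 pods each:", ips_needed(2))
```

Under the one-address-per-pod assumption, two nodes at the baseline of 30 pods per node need 62 addresses, which is why the maximum pods setting and the subnet size have to be planned together.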

Associate Kubernetes RBAC to Azure Active Directory

Kubernetes supports role-based access control (RBAC) through:

A set of permissions. Defined by a Role object, or by a ClusterRole object for cluster-wide permissions.
Bindings that assign the users and groups who are allowed to perform the actions. Defined by a RoleBinding or ClusterRoleBinding object.

Kubernetes has some built-in roles such as cluster-admin, edit, view, and so on. Bind those roles to Azure Active Directory users and groups to use the enterprise directory to manage access.
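As an illustration of binding a built-in role to a directory group, the sketch below builds a ClusterRoleBinding manifest and prints it as JSON, which kubectl accepts as well as YAML. The binding name and the all-zero group object ID are placeholders, and the helper function is hypothetical. It assumes the cluster is integrated with Azure Active Directory so that groups are identified by their object ID.

```python
import json


def cluster_role_binding(binding_name: str, role_name: str, group_object_id: str) -> dict:
    """Build a ClusterRoleBinding that grants a ClusterRole to one group."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": binding_name},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": role_name,  # a built-in role such as "view" or "edit"
        },
        "subjects": [
            {
                "apiGroup": "rbac.authorization.k8s.io",
                "kind": "Group",
                # Assumption: the directory group's object ID is the group name.
                "name": group_object_id,
            }
        ],
    }


if __name__ == "__main__":
    manifest = cluster_role_binding(
        binding_name="cluster-viewers",  # placeholder name
        role_name="view",
        group_object_id="00000000-0000-0000-0000-000000000000",  # placeholder ID
    )
    print(json.dumps(manifest, indent=2))
```

Binding roles to groups rather than to individual users keeps access management in the enterprise directory: membership changes there take effect without touching the cluster.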

Use Azure RBAC for Kubernetes Authorization

Instead of using Kubernetes native RBAC (ClusterRoleBindings and RoleBindings) for authorization with integrated AAD authentication, another option is to use Azure RBAC and Azure role assignments to enforce authorization checks on the cluster. These role assignments can even be added to the subscription or resource group scopes so that all clusters under the scope inherit a consistent set of role assignments with respect to who has permission to access the objects on the Kubernetes cluster.

Secure the network flow

Network flow, in this context, can be categorized as:

Ingress traffic. From the client to the workload running in the cluster.
Egress traffic. From a pod or node in the cluster to an external service.
Pod-to-pod traffic. Communication between pods. This traffic includes communication between the ingress controller and the workload. Also, if your workload is composed of multiple applications deployed to the cluster, communication between those applications would fall into this category.
Management traffic. Traffic that goes between the client and the Kubernetes API server.

Ingress traffic flow

The architecture only accepts TLS-encrypted requests from the client. TLS v1.2 is the minimum allowed version, with a restricted set of ciphers. Strict server name indication (SNI) is enabled. End-to-end TLS is set up through Application Gateway by using two different TLS certificates, as shown in this image.

Node and pod scalability

With increasing demand, Kubernetes can scale out by adding more pods to existing nodes through horizontal pod autoscaling (HPA). When additional pods can no longer be scheduled, the number of nodes must be increased through AKS cluster autoscaling. A complete scaling solution must have ways to scale both pod replicas and the node count in the cluster.

There are two approaches: autoscaling and manual scaling.
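For the autoscaling approach, it helps to see how the horizontal pod autoscaler chooses a replica count. The sketch below follows the formula in the Kubernetes documentation, desired = ceil(current replicas × current metric / target metric), with the documented default tolerance of 0.1. It is a simplification: the real controller also handles missing metrics, pod readiness, and stabilization windows. The function name and the default replica bounds are hypothetical.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Simplified HPA calculation for a single metric."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Skip scaling when the metric is close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Keep the result within the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Four replicas averaging 90% CPU against a 60% target.
    print(desired_replicas(current_replicas=4, current_metric=90, target_metric=60))
```

When the desired replica count no longer fits on the existing nodes, the extra pods stay pending, and that is the signal the cluster autoscaler uses to add nodes.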

High Availability

Availability Zone - If your SLA requires a higher uptime, protect against loss in a zone. You can use availability zones if the region supports them. Both the control plane components and the nodes in the node pools are then able to spread across zones. If an entire zone is unavailable, a node in another zone within the region is still available. Each node pool maps to a separate virtual machine scale set, which manages node instances and scalability. Scale set operations and configuration are managed by the AKS service.

Multiple regions - Enabling availability zones won't be enough if the entire region goes down. To have higher availability, run multiple AKS clusters in different regions.
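To reason about how many nodes survive a zone outage, the following sketch assumes that nodes are spread as evenly as possible across zones. Scale sets aim for this, but it is not guaranteed at every moment. The function names are hypothetical and the default of three zones is an assumption.

```python
def nodes_per_zone(node_count: int, zone_count: int = 3) -> list[int]:
    """Distribute nodes across zones as evenly as possible."""
    if zone_count <= 0:
        raise ValueError("zone_count must be positive")
    base, extra = divmod(node_count, zone_count)
    # The first `extra` zones each receive one additional node.
    return [base + (1 if i < extra else 0) for i in range(zone_count)]


def worst_case_survivors(node_count: int, zone_count: int = 3) -> int:
    """Nodes left if the most heavily populated zone becomes unavailable."""
    return node_count - max(nodes_per_zone(node_count, zone_count))


if __name__ == "__main__":
    for count in (2, 3, 5):
        print(
            f"{count} nodes -> per zone {nodes_per_zone(count)}, "
            f"worst case after one zone loss: {worst_case_survivors(count)}"
        )
```

Under that assumption, a pool of three nodes keeps two after a zone loss, while a pool of two can drop to a single node. This is worth weighing against the replica count the workload needs.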

Disaster Recovery

In case of failure in the primary region, you should be able to quickly create a new instance in another region. Here are some recommendations:

Use paired regions.
A stateless workload can be replicated efficiently. If you need to store state in the cluster (not recommended), make sure you back up the data frequently to the paired region.
Integrate the recovery strategy, such as replicating to another region, as part of the DevOps pipeline to meet your Service Level Objectives (SLO).
When provisioning each Azure service, choose features that support disaster recovery. For example, in this architecture, Azure Container Registry is enabled for geo-replication. If a region goes down, you can still pull images from the replicated region.

Business continuity decisions

To maintain business continuity, define the Service Level Agreement for the infrastructure and your application.

Cluster nodes:

To meet the minimum level of availability for workloads, a node pool needs multiple nodes. If a node goes down, another node in the same node pool can continue running the application. For reliability, three nodes are recommended for the system node pool. For the user node pool, start with no fewer than two nodes; if you need higher availability, provision more nodes.