Deploying an AKS Cluster with Workload Identity using Terraform

In this blog post, we'll dive deep into how to deploy an Azure Kubernetes Service (AKS) cluster with Workload Identity using Terraform. Workload Identity allows Kubernetes pods to access Azure resources securely. By the end of this guide, you'll be able to view and watch resources created in a resource group, all from inside a pod.

Why Workload Identity?

Workload Identity is a crucial feature for modern cloud-native applications, especially when deployed within Kubernetes clusters on Azure (AKS). It serves as a bridge for securely accessing Azure resources, offering a seamless authentication mechanism that leverages the underlying Kubernetes service accounts. Here are the key reasons why Workload Identity is indispensable:

Secure and Scalable Azure Resource Access

Workload Identity eliminates the traditional need for managing secrets or keys to access Azure resources. This approach significantly reduces the security risks associated with secret management, such as accidental exposure or unauthorized access. By binding Azure identities to Kubernetes service accounts, Workload Identity ensures that only the pods that require access to Azure resources can obtain the necessary tokens, thereby following the principle of least privilege.

Simplifying Authentication

With Workload Identity, the complexity of authentication is abstracted away. Applications running in your Kubernetes pods do not need to manage authentication tokens; instead, they automatically receive an identity bound to the pod's service account. This simplification is crucial for developers, as it allows them to focus on building their applications without worrying about the underlying authentication mechanisms.

Beyond Azure Key Vault: A Broader Application

One of the motivations behind this blog post and the accompanying Terraform sample is to clarify the use of Workload Identity beyond just accessing Azure Key Vault. While many online resources focus on Key Vault as the primary use case, Workload Identity's applications are much broader.

In reality, Workload Identity can facilitate secure access to a wide range of Azure services, including but not limited to Azure Storage, Azure SQL Database, and Azure Service Bus. This versatility makes it a powerful tool in the arsenal of any developer or DevOps professional working with Azure and Kubernetes.

Addressing Documentation Gaps

The steps to configure and deploy Workload Identity for AKS are not always clear or well-documented, especially when considering the diverse range of Azure resources that could be accessed. This blog post, along with the detailed Terraform sample, aims to fill that gap. By providing a clear, step-by-step guide and a real-world example, this resource intends to make the setup process more accessible and understandable for everyone, regardless of their familiarity with Workload Identity or Terraform.

Through this approach, we aim to showcase the potential of Workload Identity beyond Azure Key Vault: the sample uses it to list the resources in a resource group. This broader perspective empowers developers and IT professionals to leverage Workload Identity in a variety of scenarios, enhancing the security and efficiency of their cloud-native applications on Azure.

Prerequisites

Before we start, ensure you have the following:

  • An Azure account, with the Azure CLI installed and authenticated:
az login
az account set -s "<subID>"
  • Terraform installed on your machine.
  • kubectl installed.
  • Access to the TerraformSamples GitHub repository which contains all the necessary Terraform configuration files for this deployment.

With that being said, let us get started.

1- Clone the repository

# Clone the repository
git clone git@github.com:mohatb/TerraformSamples.git

# Navigate to the terraform configuration directory
cd TerraformSamples/08-devops-terraform-aks-workload_identity/terraform/

2- Modify the variables in variables.tf

variables.tf

This file defines all the variables that our Terraform configuration uses. These variables allow us to customize and reuse the configuration for different environments or needs. You must change the value of "subscription_id_for_sample_pod" to your own subscription ID and the value of "resource_group_for_sample_pod" to the resource group that we will use for our sample. You can keep the default values for the other variables.

3- After you have customized the variables, you can continue with the following steps.

# Initialize Terraform so it downloads the required providers and modules
terraform init

# Show what actions Terraform would take to apply the current configuration
terraform plan

If everything went well and you did not encounter any errors, you should see output similar to the following at this stage.

[Image: terraform plan command]

You can now run terraform apply to proceed.

# Create or update the infrastructure according to the Terraform configuration files in the current directory
terraform apply -auto-approve

After a few minutes, you should see an output that indicates our deployment was successful, similar to the following:

[Image: results of terraform apply command]

What have we created so far? Let us break down our Terraform configuration files.

  • Resource Group

resourceGroup.tf

We will use this resource group to hold our AKS cluster and other resources. Later, our sample Go application will also list its resources from inside the pod as a demonstration.

  • AKS Cluster:

k8s-control-plane.tf

This file defines the AKS cluster, including its name, location, resource group, and Kubernetes version. It enables Workload Identity and OIDC issuer authentication, essential for integrating Azure services securely.

  workload_identity_enabled = true
  oidc_issuer_enabled = true

  • User-Assigned Managed Identity:

managedIdentity.tf

This step creates a user-assigned managed identity and grants it the necessary permissions. In our scenario, we give the identity the "Contributor" IAM role, which enables it to read and write resources in the resource group. This way, our application can use this identity to access the resources in that group. For a pod that only lists resources, the built-in "Reader" role would be sufficient. In other scenarios, such as using workload identity to read or write a blob in a storage container, the user-assigned identity can be given a narrower role, such as "Storage Blob Data Contributor" or "Storage Queue Data Contributor".

  • Kubernetes Resources:

kubernetes.tf

This file contains the Kubernetes provider configuration and deploys the service account for Workload Identity and a sample pod. The sample pod uses an image designed to securely access Azure resources: it lists and watches resources created in a resource group and lists blobs created inside a storage container. (The application source code for the image is also available in the repository, and we explain it later.)

Note: The sample pod you deploy has two environment variables. The sample code uses the identity to list the resources in a resource group, so it needs a resource group name and a subscription ID. You already set these values in variables.tf in step 2.

[Images: kubernetes.tf, variables.tf]

Verifying the Deployment:

After deploying your AKS cluster with Workload Identity, it's crucial to verify that everything is set up correctly. Follow these steps to confirm the successful deployment of your resources and the functionality of your setup.

  1. Retrieve AKS Credentials: To interact with your AKS cluster, you first need to retrieve the credentials and configure kubectl to use them. Run the following command in your terminal:
az aks get-credentials -n gitops -g gitops-rg

This command fetches the credentials for your AKS cluster named gitops in the gitops-rg resource group and configures kubectl to use them.

  2. List Deployed Pods: Next, verify that the pod access-azure-resources has been successfully deployed by listing the pods in the current namespace:
kubectl get pods

You should see access-azure-resources listed among the deployed pods.

  3. Check Pod Logs: Finally, to confirm that your pod has successfully accessed Azure resources using Workload Identity, check its logs:
kubectl logs access-azure-resources

In the logs, you should see entries similar to the following, indicating successful access to specified Azure resources:

[Image: logs from sample pod]

These log entries signify that your pod access-azure-resources was able to list resources within your Azure resource group, demonstrating successful deployment and configuration of Workload Identity within your AKS cluster.

If you see the above information in your pod's logs, congratulations! You have successfully deployed an AKS cluster with Workload Identity and verified its ability to securely access Azure resources.

Workflow Explanation:

In our sample, the pod named access-azure-resources is configured to list resources within the gitops-rg resource group on Azure. This process leverages Workload Identity to securely authenticate and authorize access to Azure resources. The pod uses a Kubernetes service account named workload-identity-sa, which is bound to an Azure user-assigned managed identity called user-assigned-workload-identity. When the pod is created, the Workload Identity mutating webhook injects a projected service account token and the related environment variables into it. When the application needs to access Azure resources, the Azure Identity SDK exchanges that service account token with Azure AD for an access token issued to the bound user-assigned managed identity. This token is then used to authenticate against Azure services, allowing the pod to list resources in the specified resource group without needing to manage secrets or keys explicitly.
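The binding between the service account and the managed identity relies on a federated identity credential whose subject has the form system:serviceaccount:<namespace>:<service-account-name>. The following is a minimal Go sketch of building and checking that subject. The helper names federatedSubject and parseSubject are hypothetical, and the namespace "default" is an assumption, since this post does not state which namespace the sample uses.

```go
package main

import (
	"fmt"
	"strings"
)

// federatedSubject returns the subject string that Kubernetes places in the
// "sub" claim of a service account token. The federated identity credential
// on the managed identity must list exactly this subject.
func federatedSubject(namespace, serviceAccount string) string {
	return fmt.Sprintf("system:serviceaccount:%s:%s", namespace, serviceAccount)
}

// parseSubject splits a subject back into namespace and service account name.
func parseSubject(sub string) (namespace, serviceAccount string, err error) {
	parts := strings.Split(sub, ":")
	if len(parts) != 4 || parts[0] != "system" || parts[1] != "serviceaccount" {
		return "", "", fmt.Errorf("not a service account subject: %q", sub)
	}
	return parts[2], parts[3], nil
}

func main() {
	// "default" is an assumed namespace; "workload-identity-sa" is the
	// service account name used in this post.
	sub := federatedSubject("default", "workload-identity-sa")
	fmt.Println(sub)

	ns, sa, err := parseSubject(sub)
	fmt.Println(ns, sa, err)
}
```

If the subject configured on the Azure side does not match the namespace and service account of the pod exactly, the token exchange is rejected, so this string is a useful first thing to check when authentication fails.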

Don't go yet! 😄 What about the application?

DevOps engineers and developers may be interested in the application that powers our Terraform sample. You'll find the source code in the TerraformSamples repository under the 08/application/ directory. The directory structure is as follows:

TerraformSamples/08/application/
├── dockerfile
├── go.mod
├── go.sum
└── main.go

This structure includes the dockerfile for containerizing the application, go.mod and go.sum for managing dependencies, and main.go, which contains the application's code.

Pod Manifest

apiVersion: v1
kind: Pod
metadata:
  name: access-azure-resources
  labels:
    azure.workload.identity/use: "true"
spec:
  containers:
  - name: workloadidentity
    image: mohatb/workloadidentitytest:latest
    env:
    - name: AZURE_SUBSCRIPTION_ID
      value: "<subscription ID>"
    - name: AZURE_RESOURCE_GROUP
      value: "<resource group>"
  serviceAccountName: workload-identity-sa
  volumes:
  - name: test-token
    projected:
      sources:
      - serviceAccountToken:
          path: test-token
          expirationSeconds: 3600
          audience: test

A few important things about the pod manifest:

1- Labels:

To enable a pod to use Workload Identity, it must carry the label azure.workload.identity/use: "true". This label signals the Workload Identity webhook to process the pod. Upon detecting this label, the webhook injects the environment variables AZURE_CLIENT_ID and AZURE_TENANT_ID into the pod. These variables are used during the authentication process, which is explained in detail later in this post.

azure.workload.identity/use: "true"

2- Environment Variables:

We are setting two environment variables that have no connection to workload identity authentication. These variables are for our sample pod that displays resources. We need to indicate which resource group the pod should use to show its resources, and we are getting that information from environment variables in the application. (This mainly depends on your application).

    env:
    - name: AZURE_SUBSCRIPTION_ID
      value: "<subscription ID>"
    - name: AZURE_RESOURCE_GROUP
      value: "<resource group>"

3- Service Account:

This is the service account we created, which is bound to the user-assigned managed identity.

serviceAccountName: workload-identity-sa

4- Projected Volume:

Workload Identity uses a projected volume to securely authenticate Kubernetes workloads with Azure services. A mutating admission webhook projects a signed service account token into a volume named azure-identity-token inside the pod. The workload reads this token from /var/run/secrets/azure/tokens/azure-identity-token. The webhook also injects the environment variables AZURE_AUTHORITY_HOST, AZURE_CLIENT_ID, AZURE_TENANT_ID, and AZURE_FEDERATED_TOKEN_FILE, which the Azure Identity SDKs or MSAL use to authenticate with the projected token. Note that the test-token volume declared in our manifest is a separate projected token with the audience test; no container in the manifest mounts it, so it is not the token used for Workload Identity authentication.

  volumes:
  - name: test-token
    projected:
      sources:
      - serviceAccountToken:
          path: test-token
          expirationSeconds: 3600
          audience: test
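To see what the projected token contains, its payload can be decoded from inside the pod. The sketch below is a debugging aid under stated assumptions, not part of the sample application: decodeClaims and fakeToken are hypothetical helpers, the issuer URL is a placeholder, and the signature is deliberately not verified. Outside a pod, where AZURE_FEDERATED_TOKEN_FILE is not set, it falls back to a locally built, unsigned token.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"strings"
)

// decodeClaims extracts the payload of a JWT without verifying its
// signature. It is meant for inspection and debugging only.
func decodeClaims(token string) (map[string]any, error) {
	parts := strings.Split(strings.TrimSpace(token), ".")
	if len(parts) != 3 {
		return nil, errors.New("token is not a three-part JWT")
	}
	// JWT segments use unpadded base64url encoding.
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return nil, fmt.Errorf("decode payload: %w", err)
	}
	var claims map[string]any
	if err := json.Unmarshal(payload, &claims); err != nil {
		return nil, fmt.Errorf("parse payload: %w", err)
	}
	return claims, nil
}

// fakeToken builds an unsigned, JWT-shaped string for local experiments.
func fakeToken(claims map[string]any) string {
	enc := func(v any) string {
		b, _ := json.Marshal(v)
		return base64.RawURLEncoding.EncodeToString(b)
	}
	return enc(map[string]string{"alg": "none"}) + "." + enc(claims) + ".sig"
}

func main() {
	// Fallback token for running outside a pod (all values are placeholders).
	token := fakeToken(map[string]any{
		"iss": "https://example.invalid/oidc-issuer/",
		"sub": "system:serviceaccount:default:workload-identity-sa",
		"aud": []string{"api://AzureADTokenExchange"},
	})
	// Inside a pod, the webhook sets this variable to the token file path.
	if path := os.Getenv("AZURE_FEDERATED_TOKEN_FILE"); path != "" {
		raw, err := os.ReadFile(path)
		if err != nil {
			fmt.Fprintln(os.Stderr, "read token:", err)
			return
		}
		token = string(raw)
	}
	claims, err := decodeClaims(token)
	if err != nil {
		fmt.Fprintln(os.Stderr, "decode token:", err)
		return
	}
	fmt.Println("iss:", claims["iss"])
	fmt.Println("sub:", claims["sub"])
	fmt.Println("aud:", claims["aud"])
}
```

The iss claim should correspond to the cluster's OIDC issuer and the sub claim to the pod's service account; these are the values Azure AD compares against the federated identity credential.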

dockerfile Overview

The Dockerfile builds a simple container for running our Go application. It starts from the golang:1.20 base image, which provides the Go toolchain. It first copies the go.mod and go.sum files and downloads dependencies, so that Docker can cache that layer and speed up rebuilds. Then it copies the application code into the container and sets the command to run the application. Because the image keeps the full Go toolchain and uses go run, the application is compiled when the container starts; this is convenient for a sample, while a multi-stage build that ships only the compiled binary would give a smaller image.

FROM golang:1.20

WORKDIR /app

# Copy the module files first and download dependencies for caching
COPY go.mod go.sum ./
RUN go mod download

# Now copy the rest of your application's code
COPY . .

CMD ["go", "run", "main.go"]

Application Overview (main.go)

The main.go file contains a Go application that demonstrates how a pod in AKS can use Workload Identity to securely access Azure resources. Here's a breakdown of its functionality:

Environment Variables:

The application reads the subscription ID and the resource group name from the AZURE_SUBSCRIPTION_ID and AZURE_RESOURCE_GROUP environment variables. This is the resource group our application accesses during the sample.
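As a variation on this step, the two variables can be loaded through one function that reports every missing variable at once and is easy to test. This is a sketch only: the config type and the loadConfig helper are hypothetical names, and the sample application itself performs the checks inline in main.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// config holds the two settings the sample application needs.
type config struct {
	SubscriptionID string
	ResourceGroup  string
}

// loadConfig reads the settings through getenv (normally os.Getenv) and
// returns a single error naming all missing variables.
func loadConfig(getenv func(string) string) (config, error) {
	cfg := config{
		SubscriptionID: getenv("AZURE_SUBSCRIPTION_ID"),
		ResourceGroup:  getenv("AZURE_RESOURCE_GROUP"),
	}
	var missing []string
	if cfg.SubscriptionID == "" {
		missing = append(missing, "AZURE_SUBSCRIPTION_ID")
	}
	if cfg.ResourceGroup == "" {
		missing = append(missing, "AZURE_RESOURCE_GROUP")
	}
	if len(missing) > 0 {
		return config{}, fmt.Errorf("missing environment variables: %s", strings.Join(missing, ", "))
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("subscription=%s resourceGroup=%s\n", cfg.SubscriptionID, cfg.ResourceGroup)
}
```

Passing the lookup function as a parameter keeps the logic independent of the process environment, so it can be exercised with a plain map in tests.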

Azure Identity and authentication:

In our application, we use azidentity.DefaultAzureCredential rather than azidentity.ManagedIdentityCredential to authenticate against Azure services. The reason is that azidentity.ManagedIdentityCredential would use the cluster's managed identity instead of the identity linked to the workload identity service account. Below is the authentication flow for azidentity.DefaultAzureCredential:

  • Credential Initialization: When the application initializes the azidentity.NewDefaultAzureCredential, this credential chain tries to find a suitable method to authenticate the application with Azure. In the context of an AKS environment with Workload Identity enabled, the credential looks for available authentication tokens within the environment.
  • Token Discovery: NewDefaultAzureCredential will detect that the application is running in a Kubernetes environment and attempt to use the token available in the projected volume (which is defined in the pod manifest). This detection is facilitated by the presence of environment variables like AZURE_CLIENT_ID, AZURE_TENANT_ID, and the presence of the token file, which are automatically set up when using Workload Identity by the workload identity webhook when the pod has the label azure.workload.identity/use: "true".
  • Token Use for Azure AD Authentication: The discovered token is then used to authenticate with Azure AD. Since the token is issued to the Kubernetes Service Account and this Kubernetes Service Account is bound to an Azure AD identity, Azure AD recognizes the token as representing the Azure AD identity. Successful authentication returns an OAuth 2.0 token that the application can use to access Azure services.
  • Accessing Azure Resources: With the OAuth 2.0 token, the application can now access Azure resources that the Azure AD identity has permissions to.
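The token exchange in the steps above is an OAuth 2.0 client credentials request in which the projected service account token is sent as a client assertion. The sketch below only builds that request, to show which values go where; it is an illustration and not the SDK's actual code, which handles this internally. The buildTokenRequest helper is a hypothetical name, and the tenant ID, client ID, and token are placeholders.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// assertionType marks the client assertion as a JWT bearer token.
const assertionType = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"

// buildTokenRequest returns the token endpoint and form body for exchanging
// a federated (service account) token for an Azure AD access token.
func buildTokenRequest(authorityHost, tenantID, clientID, federatedToken, scope string) (string, url.Values) {
	// AZURE_AUTHORITY_HOST usually ends with a slash, so trim it first.
	endpoint := strings.TrimRight(authorityHost, "/") + "/" + tenantID + "/oauth2/v2.0/token"

	form := url.Values{}
	form.Set("grant_type", "client_credentials")
	form.Set("client_id", clientID)                  // from AZURE_CLIENT_ID
	form.Set("client_assertion_type", assertionType) // fixed value
	form.Set("client_assertion", federatedToken)     // contents of AZURE_FEDERATED_TOKEN_FILE
	form.Set("scope", scope)
	return endpoint, form
}

func main() {
	endpoint, form := buildTokenRequest(
		"https://login.microsoftonline.com/", // typical AZURE_AUTHORITY_HOST value
		"<tenant-id>",
		"<client-id>",
		"<projected-token>",
		"https://management.azure.com/.default", // scope for Azure Resource Manager
	)
	fmt.Println("POST", endpoint)
	fmt.Println(form.Encode())
}
```

No client secret appears anywhere in the request: the signed service account token takes its place, which is why nothing secret has to be stored in the pod.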

Resource Listing:

Using the Azure SDK for Go, specifically the armresources package, the application lists all resources in the specified resource group. It polls for resources every 10 seconds in an endless loop, displaying their names and types. This demonstrates near-real-time monitoring of Azure resources from within a Kubernetes pod.
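One possible refinement of the polling loop is to drive it with a context and a ticker so that it can stop cleanly, for example when Kubernetes terminates the pod. The following is a sketch of that idea using only the standard library; poll, runDemo, and errDone are hypothetical names, and the listing call is replaced by a counter so the example runs anywhere.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// poll calls fn immediately and then once per interval. It returns when fn
// returns an error or when ctx is cancelled.
func poll(ctx context.Context, interval time.Duration, fn func(context.Context) error) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if err := fn(ctx); err != nil {
			return err
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			// Time for the next pass.
		}
	}
}

// errDone is a sentinel used by the demo to end the loop.
var errDone = errors.New("done")

// runDemo polls until fn has run maxPasses times and reports the pass count.
func runDemo(maxPasses int) (int, error) {
	passes := 0
	err := poll(context.Background(), time.Millisecond, func(context.Context) error {
		passes++ // a real application would list resources here
		if passes >= maxPasses {
			return errDone
		}
		return nil
	})
	return passes, err
}

func main() {
	passes, err := runDemo(3)
	fmt.Println(passes, err)
}
```

In a pod, the context could come from signal.NotifyContext so that a termination signal ends the loop between passes instead of interrupting it mid-request.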

Resource Tracking:

A simple in-memory map tracks which resources have been logged to avoid duplicate entries. This ensures that the output remains clear and focused on newly detected resources.
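The sample keys its map by resource name. As a variation, the sketch below keys it by resource ID, so that two resources of different types that share a name are both reported, and it skips nil fields instead of dereferencing them. The resource struct is a local stand-in that mirrors the pointer fields of the SDK's resource type, and tracker, unseen, and ptr are hypothetical names.

```go
package main

import "fmt"

// resource is a local stand-in with the same pointer-typed fields the
// sample reads from the SDK's list response.
type resource struct {
	ID   *string
	Name *string
	Type *string
}

// deref returns the pointed-to string, or "" for a nil pointer.
func deref(s *string) string {
	if s == nil {
		return ""
	}
	return *s
}

// tracker remembers which resource IDs have already been reported.
type tracker struct {
	seen map[string]struct{}
}

func newTracker() *tracker {
	return &tracker{seen: make(map[string]struct{})}
}

// unseen returns the resources in batch that were not reported before and
// marks them as seen. Entries without an ID are skipped.
func (t *tracker) unseen(batch []resource) []resource {
	var fresh []resource
	for _, r := range batch {
		id := deref(r.ID)
		if id == "" {
			continue
		}
		if _, ok := t.seen[id]; ok {
			continue
		}
		t.seen[id] = struct{}{}
		fresh = append(fresh, r)
	}
	return fresh
}

// ptr is a small helper for building sample data.
func ptr(s string) *string { return &s }

func main() {
	tr := newTracker()
	batch := []resource{
		{ID: ptr("/example/a"), Name: ptr("same-name"), Type: ptr("TypeA")},
		{ID: ptr("/example/b"), Name: ptr("same-name"), Type: ptr("TypeB")},
	}
	for _, r := range tr.unseen(batch) {
		fmt.Printf("Resource Name: %s, Type: %s\n", deref(r.Name), deref(r.Type))
	}
	fmt.Println("new on second pass:", len(tr.unseen(batch)))
}
```

Like the map in the sample, this tracker only grows; it never forgets deleted resources, which is acceptable for a short-lived demonstration.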

Application Full code:

package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"time"

	"github.com/Azure/azure-sdk-for-go/sdk/azidentity"
	"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/resources/armresources"
	"k8s.io/klog"
)

var (
	subscriptionID    string
	resourceGroupName string
)

func main() {
	subscriptionID = os.Getenv("AZURE_SUBSCRIPTION_ID")
	resourceGroupName = os.Getenv("AZURE_RESOURCE_GROUP")
	if len(subscriptionID) == 0 {
		log.Fatal("AZURE_SUBSCRIPTION_ID is not set.")
	}
	if len(resourceGroupName) == 0 {
		log.Fatal("AZURE_RESOURCE_GROUP is not set.")
	}

	cred, err := azidentity.NewDefaultAzureCredential(nil)
	if err != nil {
		klog.Fatal(err)
	}
	ctx := context.Background()

	resourcesClient, err := armresources.NewClient(subscriptionID, cred, nil)
	if err != nil {
		log.Fatalf("Failed to create resources client: %v", err)
	}

	seenResources := make(map[string]bool) // Map to track seen resource names

	for {
		pager := resourcesClient.NewListByResourceGroupPager(resourceGroupName, nil)
		for pager.More() {
			resp, err := pager.NextPage(ctx)
			if err != nil {
				log.Fatalf("Failed to list resources: %v", err)
			}
			for _, resource := range resp.ResourceListResult.Value {
				// Check if the resource has already been printed
				if _, seen := seenResources[*resource.Name]; !seen {
					fmt.Printf("Resource Name: %s, Type: %s\n", *resource.Name, *resource.Type)
					seenResources[*resource.Name] = true // Mark this resource as seen
				}
			}
		}
		time.Sleep(10 * time.Second) // Sleep for 10 seconds before checking for new resources
	}
}

Summary:

In this guide, we explored the deployment of an Azure Kubernetes Service (AKS) cluster with Workload Identity using Terraform, showcasing a modern, secure approach to accessing Azure resources from Kubernetes pods. By leveraging Kubernetes service accounts for Azure resource access, Workload Identity removes the need for manual secret management, enhancing both security and scalability. Through a detailed Terraform sample, we demonstrated how Workload Identity extends beyond Azure Key Vault, enabling secure interactions with a broad spectrum of Azure services such as Azure Storage, Azure SQL Database, and Azure Service Bus. This post aimed to fill documentation gaps by providing a step-by-step deployment guide and a way to verify that the setup works, so that developers and IT professionals can use Workload Identity efficiently for their cloud-native applications on Azure.