Organizations that operate in highly sensitive data domains often have to validate the use of FIPS (Federal Information Processing Standards) compliant cryptography across every technology they adopt. This post shows how to enable FIPS on Azure Kubernetes Service, along with a brief overview of what FIPS is and where it applies. FIPS 140 is a standard that defines minimum security requirements for cryptographic modules in information technology products and systems, expressed as a series of security levels. As your organization migrates to containerized applications across cloud platforms, you will likely run into the question: how much security does the provider actually enable for organizations, and what exactly do they provide?
Azure Kubernetes Service supports FIPS 140-2 for both Linux and Windows node pools. You can enable it when the cluster is created, or segment part of your cluster by adding an additional node pool with FIPS-compliant nodes. From a practitioner's standpoint, be aware that FIPS 140-3 has superseded FIPS 140-2. The Cryptographic Module Validation Program (CMVP), run jointly by NIST and the Canadian Centre for Cyber Security, provides a searchable list of validated cryptographic modules; it is shown below to demonstrate that Microsoft appears on this list.
Azure Kubernetes Service FIPS-Enabled Node Pools
AKS offers a simple way to ensure you are running nodes that are FIPS 140-2 compliant: you can pass a single parameter from the command line, as sketched below, or you can go the more elaborate route like I do and drive it through Terraform as Infrastructure as Code.
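If you just need the CLI route, a minimal sketch looks like the following; the node pool name fips is an arbitrary example, and the flag name assumes the current az aks CLI, so verify it against az aks nodepool add --help before relying on it.
# Sketch: add a FIPS-enabled Linux node pool to an existing cluster
az aks nodepool add \
  --resource-group aks-chaos-mesh-rg \
  --cluster-name aks-chaos-mesh-aks \
  --name fips \
  --enable-fips-image
The Terraform equivalent, which is what the rest of this post uses, is the aks.tf module call below.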
module "aks" {
source = "Azure/aks/azurerm"
version = "7.5.0"
resource_group_name = azurerm_resource_group.aks.name
kubernetes_version = var.kubernetes_version
orchestrator_version = var.kubernetes_version
prefix = "aks-chaos-mesh"
network_plugin = "kubenet"
vnet_subnet_id = lookup(module.aks-vnet.vnet_subnets_name_id, "subnet0")
os_disk_size_gb = 50
sku_tier = "Standard" # defaults to Free
private_cluster_enabled = false
rbac_aad = var.rbac_aad
role_based_access_control_enabled = var.role_based_access_control_enabled
http_application_routing_enabled = false
enable_auto_scaling = true
enable_host_encryption = false
log_analytics_workspace_enabled = false
agents_min_count = 1
agents_max_count = 3
agents_count = null
agents_max_pods = 100
agents_pool_name = "system"
agents_availability_zones = ["1", "2"]
agents_type = "VirtualMachineScaleSets"
agents_size = var.agents_size
workload_identity_enabled = true
oidc_issuer_enabled = true
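# Request a FIPS-enabled node image for the default (system) node pool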
default_node_pool_fips_enabled = true
agents_labels = {
"nodepool" : "defaultnodepool"
}
agents_tags = {
"Agent" : "defaultnodepoolagent"
}
ingress_application_gateway_enabled = false
network_policy = "calico"
net_profile_dns_service_ip = "10.0.0.10"
net_profile_service_cidr = "10.0.0.0/16"
key_vault_secrets_provider_enabled = true
secret_rotation_enabled = true
secret_rotation_interval = "3m"
depends_on = [module.aks-vnet]
}
Of course I won't leave you hanging: the block above is our aks.tf file, and the main.tf it depends on is below.
terraform {
required_version = ">=1.3"
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = ">= 3.0, < 4.0"
}
kubectl = {
source = "gavinbunney/kubectl"
version = "1.14.0"
}
helm = {
source = "hashicorp/helm"
version = "2.10.1"
}
}
}
provider "azurerm" {
features {}
}
provider "kubectl" {
config_path = "~/.kube/config"
}
provider "helm" {
kubernetes {
config_path = "~/.kube/config"
}
}
resource "azurerm_resource_group" "aks" {
name = "aks-chaos-mesh-rg"
location = "East US"
}
# Creates a VNet and subnet; the subnet name-to-ID map is what aks.tf looks up. See https://registry.terraform.io/modules/Azure/subnets/azurerm/latest
module "aks-vnet" {
source = "Azure/subnets/azurerm"
version = "1.0.0"
resource_group_name = azurerm_resource_group.aks.name
subnets = {
subnet0 = {
address_prefixes = ["10.52.0.0/24"]
}
}
virtual_network_address_space = ["10.52.0.0/16"]
virtual_network_location = var.region
virtual_network_name = "aks-chaos-vnet"
}
module "aks-vnet2" {
source = "Azure/subnets/azurerm"
version = "1.0.0"
resource_group_name = azurerm_resource_group.aks.name
subnets = {
subnet0 = {
address_prefixes = ["10.0.0.0/24"]
}
}
virtual_network_address_space = ["10.0.0.0/16"]
virtual_network_location = var.region
virtual_network_name = "aks-chaos-vnet2"
}
Additionally, it would be incomplete if I didn't include variable.tf, listed below.
#variable.tf
variable "region" {
type = string
default = "eastus"
}
variable "agents_size" {
default = "Standard_D2s_v3"
description = "The default virtual machine size for the Kubernetes agents"
type = string
}
variable "kubernetes_version" {
description = "Specify which Kubernetes release to use. The default used is the latest Kubernetes version available in the region"
type = string
default = null
}
variable "os_sku" {
type = string
default = null
description = "(Optional) Specifies the OS SKU used by the agent pool. Possible values include: `Ubuntu`, `CBLMariner`, `Mariner`, `Windows2019`, `Windows2022`. If not specified, the default is `Ubuntu` if OSType=Linux or `Windows2019` if OSType=Windows. And the default Windows OSSKU will be changed to `Windows2022` after Windows2019 is deprecated. Changing this forces a new resource to be created."
}
variable "pod_subnet_id" {
type = string
default = null
description = "(Optional) The ID of the Subnet where the pods in the default Node Pool should exist. Changing this forces a new resource to be created."
}
variable "private_cluster_enabled" {
type = bool
default = false
description = "If true cluster API server will be exposed only on internal IP address and available only in cluster vnet."
}
variable "private_cluster_public_fqdn_enabled" {
type = bool
default = false
description = "(Optional) Specifies whether a Public FQDN for this Private Cluster should be added. Defaults to `false`."
}
variable "private_dns_zone_id" {
type = string
default = null
description = "(Optional) Either the ID of Private DNS Zone which should be delegated to this Cluster, `System` to have AKS manage this or `None`. In case of `None` you will need to bring your own DNS server and set up resolving, otherwise cluster will have issues after provisioning. Changing this forces a new resource to be created."
}
variable "public_network_access_enabled" {
type = bool
default = true
description = "(Optional) Whether public network access is allowed for this Kubernetes Cluster. Defaults to `true`. Changing this forces a new resource to be created."
nullable = false
}
variable "public_ssh_key" {
type = string
default = ""
description = "A custom ssh key to control access to the AKS cluster. Changing this forces a new resource to be created."
}
variable "rbac_aad" {
type = bool
default = false
description = "(Optional) Is Azure Active Directory integration enabled?"
nullable = false
}
variable "role_based_access_control_enabled" {
type = bool
default = false
description = "Enable Role Based Access Control."
nullable = false
}
variable "sku_tier" {
type = string
default = "Free"
description = "The SKU Tier that should be used for this Kubernetes Cluster. Possible values are `Free` and `Standard`"
validation {
condition = contains(["Free", "Standard"], var.sku_tier)
error_message = "The SKU Tier must be either `Free` or `Standard`. `Paid` is no longer supported since AzureRM provider v3.51.0."
}
}
variable "tags" {
type = map(string)
default = {}
description = "Any tags that should be present on the AKS cluster resources"
}
variable "aks_virtual_network" {
type = string
default = "vnet-aks"
description = "virtual network name"
}
variable "aks_vnet_address_space" {
description = "Specifies the address prefix of the AKS subnet"
default = ["10.0.0.0/16"]
type = list(string)
}
variable "subnet_delegation" {
type = map(list(object({
name = string
service_delegation = object({
name = string
actions = optional(list(string))
})
})))
default = {}
description = "`service_delegation` blocks for `azurerm_subnet` resource, subnet names as keys, list of delegation blocks as value, more details about delegation block could be found at the [document](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/subnet#delegation)."
nullable = false
}
variable "subnet_enforce_private_link_endpoint_network_policies" {
type = map(bool)
default = {}
description = "A map with key (string) `subnet name`, value (bool) `true` or `false` to indicate enable or disable network policies for the private link endpoint on the subnet. Default value is false."
}
variable "subnet_names" {
type = list(string)
default = ["subnet1"]
description = "A list of public subnets inside the vNet."
}
variable "subnet_prefixes" {
type = list(string)
default = ["10.0.1.0/24"]
description = "The address prefix to use for the subnet."
}
variable "default_node_pool_subnet_name" {
description = "Specifies the name of the subnet that hosts the default node pool"
default = "SystemSubnet"
type = string
}
variable "default_node_pool_subnet_address_prefix" {
description = "Specifies the address prefix of the subnet that hosts the default node pool"
default = ["10.0.0.0/20"]
type = list(string)
}
variable "subnet_service_endpoints" {
type = map(list(string))
default = {}
description = "A map with key (string) `subnet name`, value (list(string)) to indicate enabled service endpoints on the subnet. Default value is []."
}
variable "use_for_each" {
type = bool
default = true
}
variable "api_server_authorized_ip_ranges" {
type = set(string)
default = null
description = "(Optional) The IP ranges to allow for incoming traffic to the server nodes."
}
variable "api_server_subnet_id" {
type = string
default = null
description = "(Optional) The ID of the Subnet where the API server endpoint is delegated to."
}
A few items of note from our code: we aren't declaring who can reach the API server, and that access should be limited on a need-to-know basis since the control plane governs every node. For this quick spin-up I've added the parameter api_server_authorized_ip_ranges = ["xx.xx.xx.xx/xx"] for my own use when I'm doing a quick demo, to keep access through this path to a minimum. In production, however, you should use private link endpoints along with a bastion host in a peered virtual network for this access.
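If the cluster already exists and you just want to tighten the authorized ranges, the same setting can be changed from the CLI; this is a hedged sketch (the placeholder range is illustrative, and the flag assumes the current az aks CLI):
az aks update \
  --resource-group aks-chaos-mesh-rg \
  --name aks-chaos-mesh-aks \
  --api-server-authorized-ip-ranges xx.xx.xx.xx/xx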
Let's run the following to see what Terraform intends to create.
terraform init
terraform plan
We can see that the node pool picks up the parameter we passed through the module: the plan shows the FIPS setting under the default node pool. This tells the API request that we need a FIPS-enabled image.
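If you'd rather confirm this from the terminal than scroll the whole plan, a quick filter works (assuming a POSIX shell):
terraform plan -no-color | grep -i fips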
Once you're satisfied with the parameters, run terraform apply (ideally after you've scanned your configuration, before anything moves toward production). Since we're only demoing this, I'm going to tear the deployment down once the demonstration is over.
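The apply and the eventual teardown are just the standard Terraform commands:
terraform apply
# when the demo is finished
terraform destroy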
Validating FIPS on Azure Kubernetes Service
After we've applied our configuration, we can access the cluster by pulling its credentials.
az account set --subscription <id>
az aks get-credentials --resource-group aks-chaos-mesh-rg --name aks-chaos-mesh-aks --overwrite-existing
Once we've authenticated, let's check the nodes we have running:
kubectl get nodes -o wide
We can see the reported kernel version is 5.4.0-1121-azure-fips. We can validate this further by accessing the node itself, which we'll do shortly by starting a debug session. First, though, from the auditor's perspective, we can confirm the setting at the AKS API level before we actually dive into the node:
az aks show --resource-group aks-chaos-mesh-rg --name aks-chaos-mesh-aks --query="agentPoolProfiles[].{Name:name, enableFips:enableFips}" -o table
To debug the node, the only change you'll need in this command is your node name:
kubectl debug node/aks-system-65776705-vmss000000 -it --image=mcr.microsoft.com/dotnet/runtime-deps:6.0
You'll see a prompt as the debug container starts on the node, and from that shell we are now effectively on the aks-system node.
cat /proc/sys/crypto/fips_enabled
This reads the kernel's FIPS flag: a response of 1 signifies that FIPS mode is enabled, while 0 means it is not.
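While we're still in the debug session, the kernel release itself is another quick sanity check; on the FIPS node image we saw earlier, the name carries a -fips suffix:
uname -r
# e.g. 5.4.0-1121-azure-fips on this node pool
exit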
Additionally, our FIPS-enabled node pools get the node label kubernetes.azure.com/fips_enabled=true, which we can use to target workloads onto those nodes. In our case we only have one node pool, so this isn't much of a challenge.
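For example, to see which nodes carry the label (or to pick nodes for a workload's nodeSelector), the label can be queried directly; a quick sketch, assuming the label name above:
kubectl get nodes -l kubernetes.azure.com/fips_enabled=true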
If we run kubectl describe node <node> we can see the label among the node's metadata; the output is quite long, so I used a wider view for this.
Summary
Federal Information Processing Standards vary in levels and are notably an area of focus for organizations operating with sensitive data that isn't considered classified; they define minimum security requirements for cryptographic modules, and this post shows how you can bring them into operating Kubernetes securely. While it doesn't cover every other security concern we could address, it's a good starting point, and if Azure Kubernetes Service is your backyard you can reference this link. I'm going to cover more of this implementation in later posts; for instance, if you require the most secure workloads, with data in use being encrypted, consider confidential computing images, which are supported in AKS as well. Reference architecture for areas such as PCI-DSS is covered extensively in the Architecture Center, such as this link.