Azure Kubernetes Service has a new offering that enhances observability: Advanced Container Networking Services. In a nutshell, this is a suite of services for observability in your Kubernetes cluster, providing visibility through the Hubble UI and native integration with Azure Monitor and Grafana, or you can Bring Your Own. As of today, June 1st, 2024, this service is in preview, but it becomes a paid offering in the next couple of days, on June 4th.
Some of its feature capabilities are noted below.
- Node-level metrics: node-level visibility for understanding your traffic volume, dropped packets, and number of connections.
- Hubble Metrics (DNS and pod-level metrics): Layer 4/Layer 7 packet flows, which you will see later in a visual.
- Hubble Flow Logs: flow logs can answer latency questions and help you troubleshoot a specific path if you need to dive further.
The node-level metrics for Cilium, which is the data plane this cluster will be provisioned with, include support for the following metrics.
Pod-Level Metrics (Hubble Metrics)
Current Limitations
- Pod-level metrics are currently only available on Linux
- Cilium data plane is supported starting with Kubernetes 1.29
- Metric labels may have subtle differences between Cilium and non-Cilium clusters.
- Cilium data plane does not currently support DNS Metrics.
It's also worth noting, under Scale, that Azure Managed Prometheus and Grafana impose service-specific scale limitations.
Getting Started
To get started with this, I'm going to run the following configuration:
- AKS 1.29 at a minimum
- Azure Grafana + Managed Prometheus
- Azure Monitor
First, start by adding or updating the aks-preview extension in the Azure CLI.
az extension add --name aks-preview
az extension update --name aks-preview
Then we have to register the provider feature before we can use it, so we'll call the following. Notice the feature falls under the Microsoft.ContainerService namespace.
az feature register --namespace "Microsoft.ContainerService" --name "AdvancedNetworkingPreview"
This will show Registering if you don't already have the feature registered. Once it's done, you can run the same command with register changed to show to confirm the state.
az feature show --namespace "Microsoft.ContainerService" --name "AdvancedNetworkingPreview"
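Once the state reports Registered, refresh the Microsoft.ContainerService resource provider so the change propagates; this is the standard follow-up step for AKS preview features.
az provider register --namespace Microsoft.ContainerService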
While that is registering, we can provision the cluster by scripting the following. I've condensed it into one script for easy creation.
#!/bin/bash
# Set environment variables for the resource group name and location. Make sure to replace the placeholders with your own values.
export RESOURCE_GROUP="<resource-group-name>"
export LOCATION="<azure-region>"
# Create a resource group
az group create --name $RESOURCE_GROUP --location $LOCATION
# Set an environment variable for the AKS cluster name. Make sure to replace the placeholder with your own value.
export CLUSTER_NAME="<aks-cluster-name>"
# Create an AKS cluster
az aks create \
--name $CLUSTER_NAME \
--resource-group $RESOURCE_GROUP \
--api-server-authorized-ip-ranges "x.x.x.x/32" \
--generate-ssh-keys \
--location eastus \
--max-pods 250 \
--network-plugin azure \
--network-plugin-mode overlay \
--network-dataplane cilium \
--node-count 2 \
--pod-cidr 192.168.0.0/16 \
--kubernetes-version 1.29 \
--enable-advanced-network-observability
Once your cluster is up and running, grab the credentials by running:
az aks get-credentials --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
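As an optional sanity check, you can confirm the nodes are ready and that the Cilium data plane pods are running. The k8s-app=cilium label below is my assumption for how the Cilium agent pods are labeled; adjust if your cluster differs.
# Quick check that the cluster and Cilium data plane are up
kubectl get nodes -o wide
# Label is an assumption for the Cilium agent DaemonSet pods
kubectl get pods -n kube-system -l k8s-app=cilium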
Since we are creating Azure Managed Prometheus and Grafana, we'll handle that in another script.
nano grafana.sh
#!/bin/bash
# Set environment variables for the Azure Monitor workspace name and resource group. Make sure to replace the placeholders with your own values.
export AZURE_MONITOR_NAME="azure-east-aks"
export RESOURCE_GROUP="aks-demo-networking"
# Create Azure monitor resource
az resource create --resource-group $RESOURCE_GROUP --namespace microsoft.monitor --resource-type accounts --name $AZURE_MONITOR_NAME --location eastus --properties '{}'
# Creating Grafana
export GRAFANA_NAME="aks-grafana"
# Create the instance
az grafana create \
--name $GRAFANA_NAME \
--resource-group $RESOURCE_GROUP
Then make it executable with chmod +x grafana.sh and run ./grafana.sh.
After these commands have finished creating everything, we can grab the Grafana and Azure Monitor resource IDs into variables with the following, taken from the documentation.
# Ensure RESOURCE_GROUP, GRAFANA_NAME, and AZURE_MONITOR_NAME are still exported
grafanaId=$(az grafana show \
--name $GRAFANA_NAME \
--resource-group $RESOURCE_GROUP \
--query id \
--output tsv)
azuremonitorId=$(az resource show \
--resource-group $RESOURCE_GROUP \
--name $AZURE_MONITOR_NAME \
--resource-type "Microsoft.Monitor/accounts" \
--query id \
--output tsv)
Then we have to link Azure Monitor and Grafana to the AKS Cluster.
az aks update --name $CLUSTER_NAME \
--resource-group $RESOURCE_GROUP \
--enable-azure-monitor-metrics \
--azure-monitor-workspace-resource-id $azuremonitorId \
--grafana-resource-id $grafanaId
If you run into an issue similar to the one shown below, you'll have to register Microsoft.AlertsManagement. In the portal this is under Subscriptions -> Settings -> Resource providers -> search for "Microsoft.AlertsManagement".
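If you prefer to stay in the terminal, the same registration can be done with the standard az provider commands:
az provider register --namespace Microsoft.AlertsManagement
az provider show --namespace Microsoft.AlertsManagement --query registrationState --output tsv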
To validate that the link is in place, we can check the cluster and see the Azure Monitor agent, the metrics components, and more.
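As a rough sketch (assuming the default managed Prometheus setup, where the agent pods are prefixed with ama-), something like the following shows both the cluster's monitor profile and the agent pods:
# Confirm the Azure Monitor metrics profile is enabled on the cluster
az aks show --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP --query azureMonitorProfile
# Look for the Azure Monitor agent (ama-*) pods
kubectl get pods -n kube-system | grep ama-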
Now you can navigate to your Grafana instance in the UI and start to uncover the Advanced Networking dashboards backed by Prometheus.
This assumes the instance is publicly accessible; in a production environment it would likely be segmented behind Private Link or a firewall. We authenticate with our Entra ID credentials.
Select the Endpoint then navigate to Dashboards.
Under the Azure Managed Prometheus folder, look for the dashboards from this enablement, shown in the visual below.
Selecting the first one shows what is going on at the cluster level; this snapshot has some nice metrics.
Exploring the Pod Flows (Workload) dashboard, I can see the connections at the pod level, as shown.
The pod namespace metrics can show you granular connections between namespaces.
Installing Hubble CLI
For the rest of this you'll need the Hubble CLI; I've used the installation steps from the upstream documentation below.
HUBBLE_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/hubble/master/stable.txt)
HUBBLE_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then HUBBLE_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-linux-${HUBBLE_ARCH}.tar.gz{,.sha256sum}
sha256sum --check hubble-linux-${HUBBLE_ARCH}.tar.gz.sha256sum
sudo tar xzvfC hubble-linux-${HUBBLE_ARCH}.tar.gz /usr/local/bin
rm hubble-linux-${HUBBLE_ARCH}.tar.gz{,.sha256sum}
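Optionally, you can confirm the CLI landed on your path before moving on; hubble version simply prints the client version.
hubble version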
Then let’s run the following command.
kubectl get pods -o wide -n kube-system -l k8s-app=hubble-relay
Run a port-forward and open another terminal as well.
kubectl port-forward -n kube-system svc/hubble-relay --address 127.0.0.1 4245:443
We still need to run a few more commands to get access to Hubble.
Create a script as shown below; I've used install.sh for this file.
#!/usr/bin/env bash
set -euo pipefail
set -x
# Directory where certificates will be stored
CERT_DIR="$(pwd)/.certs"
mkdir -p "$CERT_DIR"
declare -A CERT_FILES=(
  ["tls.crt"]="tls-client-cert-file"
  ["tls.key"]="tls-client-key-file"
  ["ca.crt"]="tls-ca-cert-files"
)

for FILE in "${!CERT_FILES[@]}"; do
  KEY="${CERT_FILES[$FILE]}"
  JSONPATH="{.data['${FILE//./\\.}']}"

  # Retrieve the secret and decode it
  kubectl get secret hubble-relay-client-certs -n kube-system -o jsonpath="${JSONPATH}" | base64 -d > "$CERT_DIR/$FILE"

  # Set the appropriate hubble CLI config
  hubble config set "$KEY" "$CERT_DIR/$FILE"
done
hubble config set tls true
hubble config set tls-server-name instance.hubble-relay.cilium.io
Make it executable with chmod +x install.sh and run ./install.sh.
Once that has run, let's check the secrets in our cluster.
kubectl get secrets -n kube-system | grep hubble-
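With the hubble-relay port-forward from earlier still running in the other terminal, you can verify the CLI can reach the relay. hubble status and hubble observe are standard subcommands, and their defaults line up with the 127.0.0.1:4245 forward set up above.
hubble status
hubble observe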
We will need to add another file, hubble-ui.yaml, for accessing the UI.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: hubble-ui
  namespace: kube-system
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: hubble-ui
  labels:
    app.kubernetes.io/part-of: retina
rules:
  - apiGroups:
      - networking.k8s.io
    resources:
      - networkpolicies
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - componentstatuses
      - endpoints
      - namespaces
      - nodes
      - pods
      - services
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - apiextensions.k8s.io
    resources:
      - customresourcedefinitions
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - cilium.io
    resources:
      - "*"
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hubble-ui
  labels:
    app.kubernetes.io/part-of: retina
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: hubble-ui
subjects:
  - kind: ServiceAccount
    name: hubble-ui
    namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: hubble-ui-nginx
  namespace: kube-system
data:
  nginx.conf: |
    server {
        listen 8081;
        server_name localhost;
        root /app;
        index index.html;
        client_max_body_size 1G;

        location / {
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;

            # CORS
            add_header Access-Control-Allow-Methods "GET, POST, PUT, HEAD, DELETE, OPTIONS";
            add_header Access-Control-Allow-Origin *;
            add_header Access-Control-Max-Age 1728000;
            add_header Access-Control-Expose-Headers content-length,grpc-status,grpc-message;
            add_header Access-Control-Allow-Headers range,keep-alive,user-agent,cache-control,content-type,content-transfer-encoding,x-accept-content-transfer-encoding,x-accept-response-streaming,x-user-agent,x-grpc-web,grpc-timeout;
            if ($request_method = OPTIONS) {
                return 204;
            }
            # /CORS

            location /api {
                proxy_http_version 1.1;
                proxy_pass_request_headers on;
                proxy_hide_header Access-Control-Allow-Origin;
                proxy_pass http://127.0.0.1:8090;
            }
            location / {
                try_files $uri $uri/ /index.html /index.html;
            }

            # Liveness probe
            location /healthz {
                access_log off;
                add_header Content-Type text/plain;
                return 200 'ok';
            }
        }
    }
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: hubble-ui
  namespace: kube-system
  labels:
    k8s-app: hubble-ui
    app.kubernetes.io/name: hubble-ui
    app.kubernetes.io/part-of: retina
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: hubble-ui
  template:
    metadata:
      labels:
        k8s-app: hubble-ui
        app.kubernetes.io/name: hubble-ui
        app.kubernetes.io/part-of: retina
    spec:
      serviceAccount: hubble-ui
      serviceAccountName: hubble-ui
      automountServiceAccountToken: true
      containers:
        - name: frontend
          image: mcr.microsoft.com/oss/cilium/hubble-ui:v0.12.2
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 8081
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8081
          readinessProbe:
            httpGet:
              path: /
              port: 8081
          resources: {}
          volumeMounts:
            - name: hubble-ui-nginx-conf
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: nginx.conf
            - name: tmp-dir
              mountPath: /tmp
          terminationMessagePolicy: FallbackToLogsOnError
          securityContext: {}
        - name: backend
          image: mcr.microsoft.com/oss/cilium/hubble-ui-backend:v0.12.2
          imagePullPolicy: Always
          env:
            - name: EVENTS_SERVER_PORT
              value: "8090"
            - name: FLOWS_API_ADDR
              value: "hubble-relay:443"
            - name: TLS_TO_RELAY_ENABLED
              value: "true"
            - name: TLS_RELAY_SERVER_NAME
              value: ui.hubble-relay.cilium.io
            - name: TLS_RELAY_CA_CERT_FILES
              value: /var/lib/hubble-ui/certs/hubble-relay-ca.crt
            - name: TLS_RELAY_CLIENT_CERT_FILE
              value: /var/lib/hubble-ui/certs/client.crt
            - name: TLS_RELAY_CLIENT_KEY_FILE
              value: /var/lib/hubble-ui/certs/client.key
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8090
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8090
          ports:
            - name: grpc
              containerPort: 8090
          resources: {}
          volumeMounts:
            - name: hubble-ui-client-certs
              mountPath: /var/lib/hubble-ui/certs
              readOnly: true
          terminationMessagePolicy: FallbackToLogsOnError
          securityContext: {}
      nodeSelector:
        kubernetes.io/os: linux
      volumes:
        - configMap:
            defaultMode: 420
            name: hubble-ui-nginx
          name: hubble-ui-nginx-conf
        - emptyDir: {}
          name: tmp-dir
        - name: hubble-ui-client-certs
          projected:
            defaultMode: 0400
            sources:
              - secret:
                  name: hubble-relay-client-certs
                  items:
                    - key: tls.crt
                      path: client.crt
                    - key: tls.key
                      path: client.key
                    - key: ca.crt
                      path: hubble-relay-ca.crt
---
kind: Service
apiVersion: v1
metadata:
  name: hubble-ui
  namespace: kube-system
  labels:
    k8s-app: hubble-ui
    app.kubernetes.io/name: hubble-ui
    app.kubernetes.io/part-of: retina
spec:
  type: ClusterIP
  selector:
    k8s-app: hubble-ui
  ports:
    - name: http
      port: 80
      targetPort: 8081
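With the manifest saved as hubble-ui.yaml, apply it to the cluster:
kubectl apply -f hubble-ui.yaml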
Once the hubble-ui pods are up and running, we can run the following.
kubectl port-forward -n kube-system svc/hubble-ui 12000:80
Navigate to http://localhost:12000/
I clicked on gatekeeper-system to show the following visual.
I can also navigate to the kube-system namespace to get a larger view of operations in that area.
Once you've explored this, feel free to delete the resources as needed, or keep them around for the visuals; just know that pricing charges go into effect soon.
Summary
Tear down resources by running the following.
az group delete --name aks-demo-networking
This will destroy the cluster and all of our components; the great thing about keeping everything in one resource group is how quick it is to manage and clean up.
This announcement appears to spearhead continued support for Cilium and tighter integration with Azure Managed Grafana. It's also welcome to see continued support for Bring Your Own Prometheus if you want more control. Check out Azure Advanced Networking Capabilities for AKS if you're trying to get more native integration with Azure Monitor.