Kubernetes Cluster Report: aks-181225-test-uks
Powered by KubeBuddy
Generated on: April 15, 2026 10:21:10 UTC
Created by 🌐 KubeDeck.io
Documentation 📄 KubeBuddy.io

Cluster Overview

Cluster Name: aks-181225-test-uks

Cluster Health Score

Score: 81 / 100

This score is calculated from key checks across nodes, workloads, security, and configuration best practices. A higher score means fewer issues and better adherence to Kubernetes standards.

API Server Health

Latency (p99): 5.00 ms

Liveness: livez check passed
[+]ping ok
[+]log ok
[+]etcd ok
[+]poststarthook/start-apiserver-admission-initializer ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/priority-and-fairness-config-consumer ok
[+]poststarthook/priority-and-fairness-filter ok
[+]poststarthook/storage-object-count-tracker-hook ok
[+]poststarthook/start-apiextensions-informers ok
[+]poststarthook/start-apiextensions-controllers ok
[+]poststarthook/crd-informer-synced ok
[+]poststarthook/start-system-namespaces-controller ok
[+]poststarthook/start-cluster-authentication-info-controller ok
[+]poststarthook/start-kube-apiserver-identity-lease-controller ok
[+]poststarthook/start-kube-apiserver-identity-lease-garbage-collector ok
[+]poststarthook/start-legacy-token-tracking-controller ok
[+]poststarthook/start-service-ip-repair-controllers ok
[+]poststarthook/rbac/bootstrap-roles ok
[+]poststarthook/scheduling/bootstrap-system-priority-classes ok
[+]poststarthook/priority-and-fairness-config-producer ok
[+]poststarthook/bootstrap-controller ok
[+]poststarthook/start-kubernetes-service-cidr-controller ok
[+]poststarthook/aggregator-reload-proxy-client-cert ok
[+]poststarthook/start-kube-aggregator-informers ok
[+]poststarthook/apiservice-status-local-available-controller ok
[+]poststarthook/apiservice-status-remote-available-controller ok
[+]poststarthook/apiservice-registration-controller ok
[+]poststarthook/apiservice-discovery-controller ok
[+]poststarthook/kube-apiserver-autoregistration ok
[+]autoregister-completion ok
[+]poststarthook/apiservice-openapi-controller ok
[+]poststarthook/apiservice-openapiv3-controller ok
livez check passed
Readiness: readyz check passed
[+]ping ok
[+]log ok
[+]etcd ok
[+]etcd-readiness ok
[+]informer-sync ok
[+]poststarthook/start-apiserver-admission-initializer ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/priority-and-fairness-config-consumer ok
[+]poststarthook/priority-and-fairness-filter ok
[+]poststarthook/storage-object-count-tracker-hook ok
[+]poststarthook/start-apiextensions-informers ok
[+]poststarthook/start-apiextensions-controllers ok
[+]poststarthook/crd-informer-synced ok
[+]poststarthook/start-system-namespaces-controller ok
[+]poststarthook/start-cluster-authentication-info-controller ok
[+]poststarthook/start-kube-apiserver-identity-lease-controller ok
[+]poststarthook/start-kube-apiserver-identity-lease-garbage-collector ok
[+]poststarthook/start-legacy-token-tracking-controller ok
[+]poststarthook/start-service-ip-repair-controllers ok
[+]poststarthook/rbac/bootstrap-roles ok
[+]poststarthook/scheduling/bootstrap-system-priority-classes ok
[+]poststarthook/priority-and-fairness-config-producer ok
[+]poststarthook/bootstrap-controller ok
[+]poststarthook/start-kubernetes-service-cidr-controller ok
[+]poststarthook/aggregator-reload-proxy-client-cert ok
[+]poststarthook/start-kube-aggregator-informers ok
[+]poststarthook/apiservice-status-local-available-controller ok
[+]poststarthook/apiservice-status-remote-available-controller ok
[+]poststarthook/apiservice-registration-controller ok
[+]poststarthook/apiservice-discovery-controller ok
[+]poststarthook/kube-apiserver-autoregistration ok
[+]autoregister-completion ok
[+]poststarthook/apiservice-openapi-controller ok
[+]poststarthook/apiservice-openapiv3-controller ok
[+]shutdown ok
readyz check passed

Passed / Failed Checks

0/0 Passed

This shows the number of health checks that passed out of the total checks performed across the cluster. A higher pass rate indicates better overall cluster health.

Top 5 Improvements

These are the five checks whose remediation will yield the most immediate benefit to your overall Cluster Health Score. Each card shows the cluster score points you’ll recover by fixing it.

  • RBAC002 (+4.62 pts): RBAC Overexposure
  • SEC003 (+4.62 pts): Pods Running as Root
  • WRK007 (+3.2 pts): Missing Readiness and Liveness Probes
  • WRK002 (+2.4 pts): Deployment Missing Replicas
  • SEC006 (+2.4 pts): Pods Missing Secure Defaults

Issue Summary

This section shows how many checks failed at each severity level in the last run, with the failed checks listed under each level.

Critical: 7 checks failed
  • NET001 Services Without Endpoints (Networking)
  • NET002 Publicly Accessible Services (Networking)
  • POD007 Container images do not use latest tag (Pods)
  • RBAC002 RBAC Overexposure (Security)
  • SEC003 Pods Running as Root (Security)
  • SEC011 Containers Running as UID 0 (Security)
  • SEC014 Untrusted Image Registries (Security)

Warning: 18 checks failed
  • CFG002 Duplicate ConfigMap Names (Configuration Hygiene)
  • EVENT001 Grouped Warning Events (Kubernetes Events)
  • EVENT002 Full Warning Event Log (Kubernetes Events)
  • NET004 Namespace Missing Network Policy (Networking)
  • NODE003 Max Pods per Node (Nodes)
  • NS002 Missing or Weak ResourceQuotas (Namespaces)
  • NS003 Missing LimitRanges (Namespaces)
  • POD004 Pending Pods (Pods)
  • POD008 Automounting API Credentials Enabled in Pods (Pods)
  • RBAC003 Orphaned ServiceAccounts (Security)
  • SC002 AKS Azure In-Tree Storage Provisioners (Storage)
  • SC004 StorageClass Prevents Volume Expansion (Storage)
  • SEC006 Pods Missing Secure Defaults (Security)
  • SEC009 Missing Capabilities Drop (Security)
  • SEC015 Pods Using Default ServiceAccount (Security)
  • SEC020 Seccomp Profile Not Configured (Security)
  • WRK002 Deployment Missing Replicas (Workloads)
  • WRK007 Missing Readiness and Liveness Probes (Workloads)

Info: 3 checks failed
  • NS001 Empty Namespaces (Namespaces)
  • RBAC004 Orphaned and Ineffective Roles (Security)
  • SEC007 Missing Pod Security Admission Labels (Security)

Rightsizing at a Glance

Node Insights

🖥️ Underutilized Nodes: 0
🔥 Saturated Nodes: 0
✅ Right-sized Nodes: 3

Pod Actions

⚙️ CPU Request Changes: 1
🧠 Memory Request Changes: 1
🛡️ Memory Limit Changes: 1
🚫 CPU Limit Removals: 0

Impact Summary

🚀 High Impact: 0
📈 Medium Impact: 0
🧩 Low Impact: 1

Excluded Namespaces

These namespaces are excluded from analysis and reporting:

aks-istio-system, calico-system, coredns, gatekeeper-system, kube-flannel, kube-node-lease, kube-public, kube-system, local-path-storage, tigera-operator

Cluster Summary

Cluster Name: aks-181225-test-uks

Kubernetes Version: v1.33.6

Cluster is running an outdated version: v1.33.6 (Latest: v1.35.3)

Cluster Metrics Summary

Summary of metrics including node and pod counts, warnings, and issues.

🚀 Nodes: 3 | 🟩 Healthy: 3 | 🟥 Issues: 0
📦 Pods: 67 | 🟩 Running: 63 | 🟥 Failed: 0
🔄 Restarts: 22 | 🟨 Warnings: 5 | 🟥 Critical: 0
⏳ Pending Pods: 4 | 🟡 Waiting: 4
⚠️ Stuck Pods: 4 | ❌ Stuck: 4
📉 Job Failures: 0 | 🔴 Failed: 0

Pod Distribution

Average, min, and max pods per node and total node count.

Avg: 21.0 | Max: 42 | Min: 10 | Total Nodes: 3

Cluster Health Metrics (Last 24h)

24-hour Prometheus averages and charts for cluster CPU and memory usage.

Avg CPU: 5.23%

Avg Memory: 19.58%

Cluster CPU Usage (%)

Historical CPU metrics from Prometheus, averaged over the last 24 hours.

Cluster Memory Usage (%)

Historical memory metrics from Prometheus, averaged over the last 24 hours.

Cluster Events

Errors: 4

Warnings: 4

Node Conditions & Resources

NODE001 - Node Readiness and Conditions

Detects nodes that are not in Ready state or reporting other warning conditions.

✅ All Nodes are healthy.

Findings
Node | Status | Issues
aks-systempool-39088964-vmss00000k | ✅ Healthy | None
aks-systempool-39088964-vmss00000l | ✅ Healthy | None
aks-systempool-39088964-vmss00000m | ✅ Healthy | None

NODE002 - Node Resource Pressure (Last 24h)

Detects nodes under high CPU, memory, or disk pressure.

Data source: Prometheus (24h average)

✅ All Nodes are healthy.

Findings
Node | CPU Status | CPU % | CPU Used | CPU Total | Mem Status | Mem % | Mem Used | Mem Total | Disk % | Disk Status
aks-systempool-39088964-vmss00000k | ✅ Normal | 7.80% | 301 mC | 3860 mC | ✅ Normal | 32.9% | 4888 Mi | 14846 Mi | 21.55% | ✅ Normal
aks-systempool-39088964-vmss00000l | ✅ Normal | 3.64% | 140 mC | 3860 mC | ✅ Normal | 12.7% | 1887 Mi | 14850 Mi | 20.53% | ✅ Normal
aks-systempool-39088964-vmss00000m | ✅ Normal | 4.26% | 164 mC | 3860 mC | ✅ Normal | 13.1% | 1943 Mi | 14846 Mi | 20.49% | ✅ Normal

NODE003 - Max Pods per Node

Alerts when any node is running too many pods according to configured thresholds.

⚠️ Total Nodes with Issues: 1

Recommended Actions
  • Run kubectl get pods -o wide --all-namespaces and group by .spec.nodeName to see pod distribution.
  • Use kubectl describe node <node-name> to inspect allocatable pods and taints.
  • Consider tuning the kubelet’s --max-pods flag if you need higher density.
  • Scale out your node pool or add additional nodes to balance the load.
Findings
Node | Pod Count | Capacity | Percentage | Threshold | Status
aks-systempool-39088964-vmss00000k | 42 | 50 | 84.00% | 80% | Warning

PROM005 - Overcommitted CPU (Prometheus)

Checks if CPU requests on nodes exceed allocatable capacity over the last 24 hours.

✅ All Nodes are healthy.

PROM006 - Node Sizing Insights (Prometheus)

Uses Prometheus p95 CPU and memory usage over a fixed 7-day window to highlight underutilized or saturated nodes and suggest sizing actions.

✅ All Nodes are healthy.

📅 Insufficient Prometheus history for sizing. Required: 7 days, available: 5.75 days.

Recommended Actions

Node Sizing Guidance

  • Focus on sustained p95 trends, not short spikes.
  • Sizing window is fixed to 7 days for stable, lower-cost query execution.
  • Nodes flagged as underutilized are candidates for smaller SKUs or scale-in.
  • Nodes flagged as saturated likely need larger SKUs, scale-out, or workload rebalancing.
  • Validate with workload requests/limits and HPA/VPA behavior before applying changes.
Findings
Status | Required Days | Available Days | Message
Insufficient Prometheus history | 7 | 5.8 | Node sizing recommendations are withheld until at least 7 days of Prometheus history is available.

Node: aks-systempool-39088964-vmss00000k

OS: Microsoft Azure Linux 3.0
Kernel: 6.6.126.1-1.azl3
Kubelet: v1.33.6
Runtime: containerd://2.0.0

CPU: 7.80%
Memory: 32.93%
Disk: 21.55%

Charts: CPU Usage (%), Memory Usage (%), Disk Usage (%)

Node: aks-systempool-39088964-vmss00000l

OS: Microsoft Azure Linux 3.0
Kernel: 6.6.126.1-1.azl3
Kubelet: v1.33.6
Runtime: containerd://2.0.0

CPU: 3.64%
Memory: 12.71%
Disk: 20.53%

Charts: CPU Usage (%), Memory Usage (%), Disk Usage (%)

Node: aks-systempool-39088964-vmss00000m

OS: Microsoft Azure Linux 3.0
Kernel: 6.6.126.1-1.azl3
Kubelet: v1.33.6
Runtime: containerd://2.0.0

CPU: 4.26%
Memory: 13.09%
Disk: 20.49%

Charts: CPU Usage (%), Memory Usage (%), Disk Usage (%)

Namespaces

NS001 - Empty Namespaces

Finds namespaces with no running pods.

⚠️ Total Namespaces with Issues: 1

Recommended Actions
  • Check if any other resources (PVCs, Secrets) exist before deleting.
  • Use kubectl get all -n <namespace> to inspect.
  • Clean up empty namespaces to reduce clutter.
Findings
Namespace | Resource | Value | Message
default | namespace/default | ⚠️ Partial | No pods, but other resources exist

NS002 - Missing or Weak ResourceQuotas

Detects namespaces with missing or incomplete ResourceQuota definitions.

⚠️ Total ResourceQuotas with Issues: 2

Recommended Actions
  • Define limits using ResourceQuota for pods, memory, and CPU.
  • Quotas help avoid over-provisioning and noisy-neighbor issues.
  • Review quotas using kubectl describe quota -n <namespace>.
Findings
Namespace | Resource | Value | Message
azure-store | namespace/azure-store | ❌ No ResourceQuota |
default | namespace/default | ❌ No ResourceQuota |
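
As a starting point for the flagged namespaces, a ResourceQuota along the following lines caps aggregate requests, limits, and pod count. This is a minimal sketch: the object name and every number are placeholder assumptions, not values derived from this report.

```yaml
# Hypothetical quota for the azure-store namespace; tune every value to your workloads.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: azure-store-quota   # assumed name
  namespace: azure-store
spec:
  hard:
    pods: "20"              # maximum number of pods in the namespace
    requests.cpu: "2"       # sum of CPU requests across all pods
    requests.memory: 4Gi    # sum of memory requests across all pods
    limits.cpu: "4"         # sum of CPU limits across all pods
    limits.memory: 8Gi      # sum of memory limits across all pods
```

Once a quota constrains requests or limits, pods that omit those values are rejected at admission, so a quota like this is usually paired with a LimitRange that supplies defaults.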

NS003 - Missing LimitRanges

Detects namespaces without a defined LimitRange.

⚠️ Total LimitRanges with Issues: 2

Recommended Actions
  • LimitRanges define default and max values for CPU/memory.
  • They prevent pods from using unlimited resources.
  • Create a LimitRange from a manifest with kubectl apply -f, and inspect existing ones with kubectl describe limitrange -n <namespace>.
Findings
Namespace | Resource | Value | Message
azure-store | namespace/azure-store | ❌ No LimitRange |
default | namespace/default | ❌ No LimitRange |
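
A LimitRange for one of the flagged namespaces might look like the sketch below. The name and all quantities are illustrative assumptions; the report does not prescribe specific values.

```yaml
# Hypothetical per-container defaults and ceilings for the azure-store namespace.
apiVersion: v1
kind: LimitRange
metadata:
  name: azure-store-limits   # assumed name
  namespace: azure-store
spec:
  limits:
    - type: Container
      defaultRequest:        # applied when a container sets no request
        cpu: 100m
        memory: 128Mi
      default:               # applied when a container sets no limit
        cpu: 500m
        memory: 256Mi
      max:                   # upper bound any single container may declare
        cpu: "1"
        memory: 1Gi
```

Defaults are injected at admission time, so they affect only pods created after the LimitRange exists.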

NS004 - Pods in Default Namespace

Flags pods running in the default namespace.

✅ All Pods are healthy.

Recommended Actions
  • Use kubectl get pods -n default to list them.
  • Re-deploy your workloads into a custom namespace:
      kubectl create namespace my-app
      kubectl -n my-app apply -f your-manifests.yaml

Workloads

WRK001 - DaemonSets Not Fully Running

Detects DaemonSets that have fewer ready pods than desired.

✅ All Workloads are healthy.

Recommended Actions
  • Run kubectl describe ds <name> -n <namespace> to check for scheduling issues.
  • Check node taints and conditions.
  • Ensure resource requests are not too high for nodes.

WRK002 - Deployment Missing Replicas

Detects Deployments where available replicas are less than desired.

⚠️ Total Workloads with Issues: 4

Recommended Actions
  • Run kubectl describe deployment <name> -n <namespace> to view status.
  • Check for failed pods using kubectl get pods -n <namespace>.
  • Review rollout and events for delays or crashes.
Findings
Namespace | Resource | Value | Message
azure-store | deployment/order-service | 0/1 | Deployment has fewer available replicas than desired.
azure-store | deployment/product-service | 0/1 | Deployment has fewer available replicas than desired.
azure-store | deployment/rabbitmq | 0/1 | Deployment has fewer available replicas than desired.
azure-store | deployment/store-front | 0/1 | Deployment has fewer available replicas than desired.

WRK003 - StatefulSet Incomplete Rollout

Detects StatefulSets with fewer ready replicas than desired.

✅ All Workloads are healthy.

Recommended Actions
  • Run kubectl describe sts <name> -n <namespace> to view rollout and events.
  • Check pod logs and PersistentVolumeClaim bindings.
  • Confirm storage class availability and node scheduling constraints.

WRK004 - HPA Misconfiguration or Inactivity

Checks for HPAs with missing targets, metrics, or inactive scaling.

✅ All Workloads are healthy.

Recommended Actions
  • Check if the target workload exists using kubectl get deploy,sts -n <namespace>.
  • Use kubectl describe hpa <name> -n <namespace> to inspect HPA status and events.
  • Ensure metrics-server is running and the target exposes the required metrics.

WRK005 - Missing Resource Requests

Checks that every container has explicit CPU and memory requests.

✅ All Workloads are healthy.

Recommended Actions
  • Add resources.requests.cpu and resources.requests.memory to every container.
  • Review both workload and initContainers with kubectl get deploy,statefulset,daemonset -A -o yaml.
  • Apply any missing fields, then rerun KubeBuddy to confirm.

WRK006 - PDB Coverage and Effectiveness

Detects missing or weak PodDisruptionBudgets.

✅ All Workloads are healthy.

Recommended Actions
  • Set minAvailable to a safe minimum (not 0).
  • Avoid setting maxUnavailable to 1 or 100%.
  • Make sure PDBs match actual workloads via label selectors.

WRK007 - Missing Readiness and Liveness Probes

Detects containers without readiness or liveness probes.

⚠️ Total Workloads with Issues: 4

Recommended Actions
  • Readiness probes indicate when a container is ready to receive traffic.
  • Liveness probes detect if a container is stuck or dead.
  • Use httpGet, tcpSocket, or exec probes for most apps.
  • Docs: Health probes in Kubernetes
Findings
Namespace | Resource | Value | Message
azure-store | deployment/order-service | order-service | readiness, liveness missing
azure-store | deployment/product-service | product-service | readiness, liveness missing
azure-store | deployment/rabbitmq | rabbitmq | readiness, liveness missing
azure-store | deployment/store-front | store-front | readiness, liveness missing
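
For one of the flagged Deployments, adding both probes could look roughly like this. The report does not state the container port, health endpoint, or pod labels, so the port 8080, the /health path, the app label, and all timing values below are assumptions to be replaced with the application's real settings.

```yaml
# Sketch of store-front with HTTP probes; port, path, labels, and timings are assumed.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: store-front
  namespace: azure-store
spec:
  replicas: 1
  selector:
    matchLabels:
      app: store-front            # assumed label
  template:
    metadata:
      labels:
        app: store-front
    spec:
      containers:
        - name: store-front
          # Image as reported; POD007 separately recommends pinning a specific tag.
          image: ghcr.io/azure-samples/aks-store-demo/store-front:latest
          ports:
            - containerPort: 8080 # assumed port
          readinessProbe:         # gates Service traffic until the app responds
            httpGet:
              path: /health       # assumed endpoint
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:          # restarts the container if it stops responding
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
            failureThreshold: 3
```

For a workload without an HTTP endpoint, such as a message broker, a tcpSocket or exec probe may be a better fit than httpGet.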

WRK008 - Deployment Selector Without Matching Pods

Detects Deployments whose selectors do not match any existing pods.

✅ All Workloads are healthy.

Recommended Actions
  • Check that Deployment's spec.selector.matchLabels matches the pod template's labels.
  • Fix any label mismatches to allow pods to be created.

WRK009 - Deployment, Pod, and Service Label Consistency

Validates that deployments, pods, and services use aligned labels and selectors.

✅ All Workloads are healthy.

Recommended Actions
  • Deployment spec.selector.matchLabels must match the Pod template metadata.labels.
  • Services should have spec.selector that targets the same labels used by the Deployment and Pods.
  • Use kubectl get deployment,svc,pod -o yaml to compare values and fix mismatches.

WRK010 - HPA Metrics Without Matching Resource Requests

Detects HPAs that scale on CPU or memory metrics when target containers lack matching requests.

✅ All Workloads are healthy.

Recommended Actions
  • Add resources.requests.cpu and/or resources.requests.memory for HPA target containers.
  • Use consistent requests across replicas to avoid unstable scaling behavior.
  • After updates, validate HPA behavior with kubectl describe hpa.
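
To illustrate why the requests matter, here is a sketch of a CPU-based HPA. All names and thresholds are hypothetical. A Utilization target is evaluated as a percentage of each container's CPU request, so the target Deployment's containers need resources.requests.cpu for this metric to be computable.

```yaml
# Hypothetical HPA; it assumes a Deployment named example-app whose containers set CPU requests.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-app
  namespace: example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the requested CPU, averaged across pods
```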

WRK011 - VPA Update Mode and Declarative Resource Conflict Risk

Flags VPAs in Auto/Recreate mode that may conflict with declarative resource ownership or HPAs.

✅ All Workloads are healthy.

Recommended Actions
  • If GitOps/Helm controls requests, consider VPA updateMode: Off or Initial.
  • Avoid overlapping HPA (CPU/memory) and VPA ownership without clear boundaries.
  • Document which controller owns requests per workload.
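
A recommendation-only VPA, which leaves request ownership with GitOps or Helm, might be declared as follows. This is a sketch with hypothetical names, and it assumes the Vertical Pod Autoscaler CRDs are installed in the cluster.

```yaml
# Hypothetical VPA in recommendation-only mode; it never evicts or mutates pods.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example-app-vpa
  namespace: example
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app
  updatePolicy:
    updateMode: "Off"   # quoted so YAML does not parse it as a boolean
```

With this mode the recommendations appear in the object's status, where they can be reviewed and copied into the source manifests by hand.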

WRK012 - PodDisruptionBudget Adequacy for Replicated Workloads

Validates that replicated workloads have matching PDBs with sensible settings.

✅ All Workloads are healthy.

Recommended Actions
  • Ensure replicated workloads (2+ replicas) have a matching PDB.
  • Avoid minAvailable equal to replica count for normal maintenance windows.
  • Use pragmatic budgets (for example maxUnavailable: 1 for many workloads).
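
The "pragmatic budget" mentioned in the last recommendation could be expressed like this. The names and the app label are hypothetical; the selector must match the labels on the workload's pods.

```yaml
# Hypothetical PDB allowing one pod of a replicated workload to be evicted at a time.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-app-pdb
  namespace: example
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: example-app   # assumed pod label
```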

WRK013 - CrashLoopBackOff and OOMKilled Guardrail

Flags pods with CrashLoopBackOff, OOMKilled state, or high restart counts.

✅ All Workloads are healthy.

Recommended Actions
  • Investigate container logs and termination reasons for recurring restarts.
  • Increase memory requests/limits when OOMKilled events are observed.
  • Apply sizing changes gradually and validate SLO/error rates.

WRK014 - Missing Memory Limits

Checks that every container has an explicit memory limit.

✅ All Workloads are healthy.

Recommended Actions
  • Add resources.limits.memory to every application and init container.
  • Set the limit high enough for normal peaks, then tune requests separately.
  • Review workloads with kubectl get deploy,statefulset,daemonset -A -o yaml to confirm the source manifests carry the limit.
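
In a pod template, explicit requests and a memory limit for both application and init containers might look like the fragment below. Container names, images, and quantities are illustrative assumptions only.

```yaml
# Fragment of a pod template (spec.template.spec of a Deployment); all values are assumed.
initContainers:
  - name: init-config
    image: registry.example.com/init-config:1.0.0   # hypothetical image
    resources:
      requests:
        cpu: 50m
        memory: 32Mi
      limits:
        memory: 64Mi
containers:
  - name: example-app
    image: registry.example.com/example-app:1.0.0   # hypothetical image
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        memory: 256Mi   # set above the request to leave room for normal peaks
```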

WRK015 - Replicated Workloads Missing Spread Constraints

Detects replicated workloads that define neither anti-affinity nor topology spread constraints.

✅ All Workloads are healthy.

Recommended Actions
  • Add topologySpreadConstraints or affinity.podAntiAffinity to each workload with multiple replicas.
  • Prefer distribution across nodes and zones using stable labels such as topology.kubernetes.io/zone and kubernetes.io/hostname.
  • Update the source Deployment, StatefulSet, or Helm values so the spreading rule is maintained on future releases.
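
A spread rule using the two topology labels named in the recommendations could be sketched as follows. The app label is an assumption and must match the workload's pod labels; the choice of a soft zone rule and a hard node rule is one possible trade-off, not the only one.

```yaml
# Fragment of a pod template (spec.template.spec); the app label is assumed.
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway   # soft preference across zones
    labelSelector:
      matchLabels:
        app: example-app
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: DoNotSchedule    # hard requirement across nodes
    labelSelector:
      matchLabels:
        app: example-app
```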

Pods

POD001 - Pods with High Restarts

Detects pods that have restarted more than the configured thresholds.

✅ All Pods are healthy.

Recommended Actions
  • Use kubectl logs <pod> -n <namespace> to view logs and identify crash causes.
  • Run kubectl describe pod <pod> -n <namespace> to check events and probe failures.
  • Verify readiness and liveness probes are configured properly.
  • Check for missing config, secrets, or volume mounts.
  • Adjust resource requests/limits to avoid OOM kills.

POD002 - Long Running Pods

Flags pods that have been running longer than configured thresholds.

✅ All Pods are healthy.

Recommended Actions
  • Pods with extended uptime may indicate skipped rolling updates.
  • Use kubectl rollout status deployment/<name> -n <namespace> to inspect deployment progress.
  • Restart pods when config changes are missed or memory use drifts.
  • Check if the workload is intended to be static or ephemeral.

POD003 - Failed Pods

Detects pods in a failed phase, typically due to startup errors, crashes, or misconfiguration.

✅ All Pods are healthy.

Recommended Actions
  • Check the pod events with kubectl describe pod <pod> -n <ns>
  • Review logs using kubectl logs <pod> -n <ns>
  • Validate container specs, resource limits, and init containers
  • Check node availability or taints

POD004 - Pending Pods

Detects pods stuck in a 'Pending' state due to scheduling or resource issues.

⚠️ Total Pods with Issues: 4

Recommended Actions
  • Run kubectl describe pod <pod> -n <namespace> to check scheduling events
  • Check if nodes meet the pod's resource requests and tolerations
  • Look for unresolved PVCs, Secrets, or ConfigMaps
  • Check cluster-wide CPU and memory availability
Findings
Namespace | Resource | Value | Message
azure-store | pod/order-service-65cc8855c-ghk9m | Pending | Some pods are stuck in Pending. These workloads are not running and are waiting on cluster conditions.
azure-store | pod/product-service-77ff9f6fd6-rzcxj | Pending | Some pods are stuck in Pending. These workloads are not running and are waiting on cluster conditions.
azure-store | pod/rabbitmq-5dcdf9484-kvgw7 | Pending | Some pods are stuck in Pending. These workloads are not running and are waiting on cluster conditions.
azure-store | pod/store-front-698cc8c565-f5hp5 | Pending | Some pods are stuck in Pending. These workloads are not running and are waiting on cluster conditions.

POD005 - CrashLoopBackOff Pods

Identifies pods stuck in a CrashLoopBackOff state due to repeated container crashes.

✅ All Pods are healthy.

Recommended Actions
  • Run kubectl logs <pod-name> -n <namespace> to see error output
  • Describe the pod for events and messages: kubectl describe pod <pod> -n <ns>
  • Check init containers, config errors, and resource limits

POD006 - Leftover Debug Pods

Detects pods created by kubectl debug that have not been cleaned up.

✅ All Pods are healthy.

Recommended Actions
  • Run kubectl delete pod <pod> -n <namespace> to remove them
  • Ensure automation or users clean up after using kubectl debug

POD007 - Container images do not use latest tag

Flags containers using latest or no explicit tag.

⚠️ Total Pods with Issues: 4

Recommended Actions

🛠️ Use Specific Image Tags

  • Don't use the :latest tag or leave the image tag blank.
  • Why: It can pull different images on each deploy, leading to drift.
  • Fix: Tag images explicitly (e.g., :v1.2.3) and update the pod spec.
  • Docs: Kubernetes Image Tagging
Findings
Namespace | Resource | Value | Message
azure-store | pod/order-service-65cc8855c-ghk9m | ghcr.io/azure-samples/aks-store-demo/order-service:latest | Container order-service: Image uses latest tag
azure-store | pod/order-service-65cc8855c-ghk9m | busybox | Container wait-for-rabbitmq: Image omits explicit tag
azure-store | pod/product-service-77ff9f6fd6-rzcxj | ghcr.io/azure-samples/aks-store-demo/product-service:latest | Container product-service: Image uses latest tag
azure-store | pod/store-front-698cc8c565-f5hp5 | ghcr.io/azure-samples/aks-store-demo/store-front:latest | Container store-front: Image uses latest tag

POD008 - Automounting API Credentials Enabled in Pods

Flags pods that do not explicitly disable service account token automounting.

⚠️ Total Pods with Issues: 4

Recommended Actions

🛠️ Disable Automounting API Credentials

  • Add automountServiceAccountToken: false to the Pod's spec.
  • Set it in the owning workload's pod template, for example with kubectl edit deployment <name> -n <namespace>; most fields of a running Pod cannot be changed in place.
  • Verify if the application needs API access (e.g., for controllers).
  • Use RBAC to limit ServiceAccount permissions if access is required.
Findings
Namespace | Resource | Value | Message
azure-store | pod/order-service-65cc8855c-ghk9m | <nil> | Pod automounts API credentials
azure-store | pod/product-service-77ff9f6fd6-rzcxj | <nil> | Pod automounts API credentials
azure-store | pod/rabbitmq-5dcdf9484-kvgw7 | <nil> | Pod automounts API credentials
azure-store | pod/store-front-698cc8c565-f5hp5 | <nil> | Pod automounts API credentials
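
For workloads that do not call the Kubernetes API, the setting can be added to the pod template as in this fragment. It is a sketch showing only the relevant field; whether each flagged workload actually needs API access is an assumption to verify first.

```yaml
# Fragment of a Deployment manifest; only the field relevant to POD008 is shown.
spec:
  template:
    spec:
      automountServiceAccountToken: false   # no API token is mounted into the pod
```

The same field can also be set on a ServiceAccount object to change the default for every pod that uses it; the pod-level value takes precedence when both are set.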

PROM001 - High CPU Pods (Prometheus)

Checks for pods with sustained high CPU usage over the last 24 hours using Prometheus metrics.

✅ All Pods are healthy.

Recommended Actions

🛠️ Investigate High CPU Pods

  • Use kubectl top pod to see real-time CPU usage.
  • Review app code or HPA settings for misbehaving containers.
  • Consider raising CPU requests/limits or scaling out.

PROM002 - High Memory Usage Pods (Prometheus)

Detects pods with high memory usage over the last 24 hours based on Prometheus metrics.

✅ All Pods are healthy.

Recommended Actions

🛠️ Investigate High Memory Pods

  • Use kubectl top pod to review memory usage.
  • Adjust resources.limits.memory appropriately.

PROM003 - High Network Receive Rate (Prometheus)

Detects pods receiving large amounts of network traffic over the last 24 hours.

✅ All Pods are healthy.

Recommended Actions

🛠️ Investigate Network Receive Rate

  • Use the Prometheus UI to inspect per-pod receive rates; kubectl top pod reports only CPU and memory.
  • Inspect service ingress patterns.

PROM007 - Pod Sizing Insights (Prometheus)

Generates per-container CPU and memory sizing recommendations from fixed 7-day p95 Prometheus usage.

✅ All Pods are healthy.

📅 Insufficient Prometheus history for sizing. Required: 7 days, available: 5.75 days.

Recommended Actions

Pod Sizing Guidance

  • Set CPU and memory requests from p95 usage with safety headroom.
  • Default CPU limits recommendation is none to avoid unnecessary CPU throttling and latency spikes.
  • Keep memory limits set above memory request to control OOM blast radius.
  • Validate against SLOs and roll out gradually.
Findings
Status | Required Days | Available Days | Message
Insufficient Prometheus history | 7 | 5.8 | Pod sizing recommendations are withheld until at least 7 days of Prometheus history is available.

Jobs

JOB001 - Stuck Kubernetes Jobs

Finds Jobs that have started but not completed within the threshold.

✅ All Jobs are healthy.

Recommended Actions
  • Check pod status for the job using kubectl describe job <job-name> -n <namespace>.
  • Verify resources and restart policies.
  • Check logs with kubectl logs job/<job-name> -n <namespace>.

JOB002 - Failed Kubernetes Jobs

Detects jobs with failures and no successful completions.

✅ All Jobs are healthy.

Recommended Actions
  • Inspect job with kubectl describe job <job-name> -n <namespace>.
  • Check logs for errors using kubectl logs job/<job-name> -n <namespace>.
  • Review pod events and resource limits.

Networking

NET001 - Services Without Endpoints

Identifies services that have no backing endpoints.

⚠️ Total Networking with Issues: 4

Recommended Actions

🔍 Services Without Endpoints

  • Verify that your service has a valid selector.
  • Check if pods exist and are ready in the same namespace.
  • Use kubectl describe svc <name> and kubectl get endpointslices -n <namespace> -l kubernetes.io/service-name=<name>.
  • Restart affected pods or fix labels as needed.
Findings
Namespace | Resource | Value | Message
azure-store | service/order-service | | No endpoints or endpoint slices
azure-store | service/product-service | | No endpoints or endpoint slices
azure-store | service/rabbitmq | | No endpoints or endpoint slices
azure-store | service/store-front | | No endpoints or endpoint slices

NET002 - Publicly Accessible Services

Detects services of type LoadBalancer or NodePort that may be publicly exposed.

⚠️ Total Networking with Issues: 1

Recommended Actions

🌐 Secure Exposed Services

  • Use internal IP ranges or private LoadBalancers where possible.
  • Restrict NodePort usage or protect with firewall rules.
  • Disable external exposure for internal-only services.
  • Consider network policies or service mesh for access control.
Findings
Namespace | Resource | Value | Message
azure-store | service/store-front | LoadBalancer | Exposed via external IP: 131.145.120.106
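
If the flagged service is meant to be reachable only from inside the virtual network, one option on AKS is the internal load balancer annotation. This is a sketch: the selector label and port numbers are assumptions, since the report does not list them, and a storefront that must serve the public internet would keep its external IP.

```yaml
# Hypothetical internal-only variant of the store-front Service on AKS.
apiVersion: v1
kind: Service
metadata:
  name: store-front
  namespace: azure-store
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: store-front      # assumed pod label
  ports:
    - port: 80
      targetPort: 8080    # assumed container port
```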

NET003 - Ingress Health Validation

Validates ingress classes, TLS secrets, and backend service references.

✅ All Networking are healthy.

Recommended Actions

🌐 Ingress Health Remediation

  • Add spec.ingressClassName or annotations if missing.
  • Validate all backend services and ports exist.
  • Fix missing TLS secrets or use valid ones.
  • Avoid duplicate host/path combinations.
  • Use only valid pathTypes: Exact, Prefix, or ImplementationSpecific.

NET004 - Namespace Missing Network Policy

Flags namespaces that do not define any NetworkPolicy.

⚠️ Total Networking with Issues: 1

Recommended Actions
  • Apply a default deny-all NetworkPolicy for ingress and egress.
  • Use additional policies to allow traffic between required pods/services.
Findings
Namespace | Resource | Value | Message
azure-store | namespace/azure-store | | No NetworkPolicy in active namespace
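
A default deny-all policy for the flagged namespace can be written as below. The policy name is an assumption. It takes effect only if the cluster runs a network plugin that enforces NetworkPolicy.

```yaml
# Denies all ingress and egress for every pod in azure-store until allow rules are added.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all   # assumed name
  namespace: azure-store
spec:
  podSelector: {}          # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
```

Because this also blocks DNS lookups, it is normally applied together with allow policies for DNS and for the traffic the services need between each other.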

NET005 - Ingress Host/Path Conflicts

Detects duplicate host/path combinations across ingresses in the same namespace.

✅ All Networking are healthy.

Recommended Actions

🚫 Resolve Ingress Conflicts

  • Ensure that each unique host and path combination is defined in only one Ingress resource.
  • Use specific hostnames instead of broad wildcards where possible to prevent unintended conflicts.
  • Review your Ingress definitions for overlapping rules and consolidate or adjust as necessary.
  • Test routing after making changes to confirm correct behavior.

NET006 - Ingress Using Wildcard Hosts

Detects ingress rules that use wildcard hosts.

✅ All Networking are healthy.

Recommended Actions

⭐ Review Wildcard Ingresses

  • Evaluate if a wildcard host is truly necessary for the application's routing requirements.
  • Where possible, replace wildcards with specific hostnames to limit unintentional exposure.
  • Ensure that security policies and firewalls are in place to control access to wildcard-enabled Ingresses.

NET007 - Service TargetPort Mismatch

Detects services whose targetPort does not exist on backing pods.

✅ All Networking are healthy.

Recommended Actions

🎯 Fix Service TargetPort Mismatches

  • Verify the targetPort in your Service definition. It should either be a numerical port or a named port.
  • Check the containerPorts in the Pods selected by the Service.
  • Ensure the Service's `targetPort` matches either a `containerPort` number or a port `name` declared in the Pod's `ports` list.
  • A common fix is to ensure consistent naming conventions or directly use port numbers.
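
As an illustration of the named-port approach described above, the sketch below pairs a Pod and a Service. All names, the image, and the port numbers are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                       # hypothetical
  labels:
    app: web
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0.0   # placeholder image
      ports:
        - name: http              # the name the Service refers to
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: http            # resolves to containerPort 8080 via the port name
```

Referencing the port by name lets the container port number change without touching the Service, as long as the name stays the same.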

NET008 - ExternalName Service to Internal IP
Identifies ExternalName services that point to internal IP addresses.

✅ All Networking are healthy.

Recommended Actions

🔄 Review ExternalName to Internal IP

  • ExternalName services are primarily for CNAME-like redirection to external DNS names.
  • If routing to an internal IP address, consider if a standard `Service` with manually created `EndpointSlice` or a `Service` with `type: ClusterIP` backed by pods is more appropriate.
  • Ensure this configuration is intentional and does not bypass intended network segmentation or security policies.

NET009 - Overly Permissive Network Policy
Identifies NetworkPolicies with empty rules or broad all-IP blocks.

✅ All Networking are healthy.

Recommended Actions

🔐 Restrict Overly Permissive Network Policies

  • Ensure `policyTypes` are paired with explicit `ingress` and `egress` rules that define allowed traffic.
  • Avoid catch-all rule entries such as `ingress: [{}]` or `egress: [{}]`, which allow all traffic of that type. (Listing a type in `policyTypes` with no rules at all does the opposite and denies all traffic of that type.)
  • Limit the use of `ipBlock: 0.0.0.0/0`. Instead, define specific CIDR ranges for necessary external communication.
  • Adopt a "deny-by-default" approach and explicitly allow only required communication.

NET010 - Network Policy Overly Permissive IPBlock
Flags NetworkPolicies that allow 0.0.0.0/0 through ipBlock rules.

✅ All Networking are healthy.

Recommended Actions

🚫 Restrict '0.0.0.0/0' in Network Policies

  • Avoid using `ipBlock: 0.0.0.0/0` in NetworkPolicies unless absolutely required for specific, well-understood use cases (e.g., public internet access).
  • Identify the precise CIDR ranges or specific IP addresses that need to be allowed.
  • For egress, if public internet access is needed, consider egress gateways or more restrictive network policies to control outbound traffic.
  • An unintended 0.0.0.0/0 rule is a critical security exposure.

NET011 - Network Policy Missing PolicyTypes
Detects NetworkPolicies that do not explicitly define policyTypes.

✅ All Networking are healthy.

Recommended Actions

📝 Define Network PolicyTypes

  • Always explicitly define `policyTypes` in your NetworkPolicy, such as `policyTypes: [Ingress]` or `policyTypes: [Ingress, Egress]`.
  • This clearly indicates whether the policy applies to inbound, outbound, or both types of traffic.
  • It avoids relying on the inferred default (Ingress is always assumed; Egress is added only when the policy contains egress rules), which is easy to misread during review.
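
A sketch of a policy that follows the NET009 to NET011 guidance: `policyTypes` is explicit and egress is limited to one specific CIDR instead of 0.0.0.0/0. The namespace, labels, and the documentation-range CIDR 203.0.113.0/24 are placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-egress-to-partner     # hypothetical name
  namespace: example              # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: api                    # assumed label
  policyTypes:
    - Egress                      # stated explicitly, not inferred
  egress:
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24  # placeholder for the real external range
      ports:
        - protocol: TCP
          port: 443
```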

NET012 - Pod HostNetwork Usage
Identifies pods configured with hostNetwork true.

✅ All Networking are healthy.

Recommended Actions

⚠️ Avoid HostNetwork Usage

  • Using `hostNetwork: true` is a security risk as it grants the pod direct access to the node's network stack.
  • This bypasses many Kubernetes network security features and network policies.
  • Only use `hostNetwork` for specific, highly privileged use cases (e.g., CNI plugins, network observability tools) and limit access via RBAC and Pod Security Standards.
  • For typical applications, rely on ClusterIP, NodePort, or LoadBalancer services for exposure.

NET013 - Ingress Present Without Gateway API Adoption
Detects clusters still using Ingress without any Gateway API resources.

✅ All Networking are healthy.

Recommended Actions

🚦 Begin Gateway API Migration

  • Create or select a GatewayClass supported by your controller.
  • Define one or more Gateway resources for north-south traffic entry.
  • Migrate Ingress rules incrementally to HTTPRoute and validate behavior.
  • Run both models in parallel during transition where supported.
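
The migration steps above might start from a pair of resources like the following. This is a sketch under assumptions: the GatewayClass name depends on the installed controller, and the namespace, hostname, and backend Service are hypothetical.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: web-gateway               # hypothetical
  namespace: example
spec:
  gatewayClassName: example-class # must match a GatewayClass your controller provides
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: Same              # only routes in this namespace may attach
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: web-route                 # hypothetical
  namespace: example
spec:
  parentRefs:
    - name: web-gateway           # binds the route to the Gateway above
  hostnames:
    - "shop.example.com"          # placeholder hostname
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: web               # existing Service, assumed
          port: 80
```

After applying, the route's `status.parents` conditions show whether the Gateway accepted it, which is the same signal NET014 checks.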

NET014 - HTTPRoute Missing or Unaccepted Parent
Detects HTTPRoutes with missing parentRefs or no accepted parent Gateway.

✅ All Networking are healthy.

Recommended Actions

🧭 Fix HTTPRoute Parent Binding

  • Set spec.parentRefs to an existing Gateway.
  • Check route status conditions and Gateway listener compatibility.
  • Verify namespace permissions and allowedRoutes policy.

NET015 - Gateways Without Attached HTTPRoutes
Detects Gateway resources that have no attached HTTPRoutes.

✅ All Networking are healthy.

Recommended Actions

🧹 Clean Up or Attach Routes

  • Attach one or more HTTPRoute resources to each active Gateway.
  • Delete unused Gateways to avoid confusion and stale entry points.
  • Confirm listener and route host/path alignment.

NET016 - Gateway API Readiness Conditions
Detects Gateway resources that are not accepted or programmed.

✅ All Networking are healthy.

Recommended Actions

🚦 Validate Gateway Readiness

  • Check GatewayClass and Gateway status.conditions for Accepted and Programmed.
  • Verify the Gateway controller deployment is healthy and watching the relevant classes.
  • Fix listener, address, or controller configuration issues before cutover.

NET017 - Gateway TLS Secret and Cross-Namespace ReferenceGrant Validation
Validates Gateway certificateRefs against existing Secrets and ReferenceGrants.

✅ All Networking are healthy.

Recommended Actions

🔐 Fix Gateway TLS References

  • Verify each certificateRef points to an existing Secret.
  • For cross-namespace refs, create a ReferenceGrant in the Secret namespace.
  • Re-check Gateway listener status after grant/secret updates.
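
For the cross-namespace case, a ReferenceGrant might look like the sketch below. It lives in the namespace that holds the Secret and names the namespace the Gateway comes from. All names here are hypothetical.

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-gateway-to-tls-secret   # hypothetical
  namespace: certs                    # namespace that holds the Secret
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: Gateway
      namespace: example              # namespace of the referencing Gateway
  to:
    - group: ""                       # core API group, where Secret lives
      kind: Secret
      name: shop-tls                  # optional; omit to allow any Secret in this namespace
```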

NET018 - Duplicate Service Selectors
Detects multiple Services in the same namespace with identical selectors.

✅ All Networking are healthy.

Recommended Actions

🎯 Use Unique Service Selectors

  • Review Services in the same namespace that select the exact same pod label set.
  • Split selectors so each Service represents a distinct routing contract, or consolidate duplicate Services where appropriate.
  • Update the source manifest or Helm chart so the selector change persists across releases.

Storage

PV001 - Orphaned Persistent Volumes
Detects PersistentVolumes that are not bound to any PVC.

✅ All Storage are healthy.

Recommended Actions

🗑️ Clean Up Orphaned PVs

  • Audit: Verify the PV is truly unneeded using kubectl describe pv <name>.
  • Delete: Remove unneeded PVs with kubectl delete pv <name>.
  • Caution: Ensure no future PVC will bind to it before deletion.

PVC001 - Unused Persistent Volume Claims
Detects PVCs not attached to any pod.

✅ All Storage are healthy.

Recommended Actions

💾 Clean Up Unused PVCs

  • Audit: Confirm the PVC is not needed using kubectl describe pvc <name> -n <namespace>.
  • Delete: Remove PVCs no longer required with kubectl delete pvc <name> -n <namespace>.
  • Prevent: Automate cleanup for stale environments or ephemeral workloads.

PVC002 - PVCs Using Default StorageClass
Detects PVCs that do not explicitly specify storageClassName.

✅ All Storage are healthy.

Recommended Actions

✍️ Specify StorageClass for PVCs

  • Edit: Add storageClassName: <your-storage-class-name> to the PVC spec.
  • Consistency: Ensure consistent storage provisioning across environments.
  • Awareness: Understand which StorageClass is truly being used.
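
A PVC with an explicit StorageClass could look like this sketch. The claim name, namespace, and size are hypothetical, and `managed-csi` is assumed to exist in the cluster (it is one of the classes AKS normally ships); confirm with kubectl get storageclass before using it.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data                      # hypothetical
  namespace: example              # hypothetical
spec:
  storageClassName: managed-csi   # explicit class instead of the cluster default
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```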

PVC003 - ReadWriteMany PVCs on Incompatible Storage
Detects ReadWriteMany PVCs backed by likely block-storage provisioners.

✅ All Storage are healthy.

Recommended Actions

⚠️ Review ReadWriteMany PVCs

  • Verify: Confirm if the storage backend truly supports concurrent writes.
  • Adjust: If not, change PVC access mode to ReadWriteOnce.
  • Migrate: For shared data, use appropriate shared file storage solutions.

PVC004 - Unbound Persistent Volume Claims
Detects PersistentVolumeClaims that remain Pending.

✅ All Storage are healthy.

Recommended Actions

🚫 Troubleshoot Unbound PVCs

  • Describe PVC: Use kubectl describe pvc <name> -n <namespace> to see events and reasons for Pending.
  • Check StorageClass: Ensure the specified StorageClass exists and is correctly configured.
  • Review Provisioner: Verify the storage provisioner is running and healthy.

SC001 - Deprecated StorageClass Provisioners
Detects StorageClasses still using in-tree provisioners.

✅ All Storage are healthy.

Recommended Actions

🔄 Migrate Deprecated StorageClasses

  • Identify: Pinpoint PVCs using the deprecated StorageClass.
  • Create: Define a new StorageClass with the appropriate CSI driver.
  • Migrate: Follow the migration path for your specific storage provider to move data.

SC002 - AKS Azure In-Tree Storage Provisioners
Detects Azure in-tree storage provisioners that are not AKS Automatic compatible.

⚠️ Total Storage with Issues: 1

Recommended Actions

🔄 Migrate Azure StorageClasses to CSI

  • Create replacement StorageClasses that use disk.csi.azure.com or file.csi.azure.com.
  • Move PVCs and workloads off the in-tree StorageClass before migrating to AKS Automatic.
  • Validate reclaim policies, SKU, and mount options during the migration.
Findings
Namespace | Resource | Value | Message
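
A replacement StorageClass on the Azure Disk CSI driver might be sketched as follows. The class name and SKU are assumptions; `allowVolumeExpansion: true` is included because the report's SC004 check looks for that field.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-csi-expandable    # hypothetical name
provisioner: disk.csi.azure.com   # CSI driver named in the recommendation above
parameters:
  skuName: StandardSSD_LRS        # assumed SKU; choose per workload
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```

Existing PVCs keep the StorageClass they were created with, so moving data to the new class is a separate migration step.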

SC003 - High Cluster Storage Usage (Prometheus)
Monitors overall used storage across the cluster.

✅ All Storage are healthy.

Recommended Actions

📊 Manage Storage Consumption

  • Identify: Use monitoring tools to find namespaces/pods consuming the most storage.
  • Clean Up: Delete old data, snapshots, or unused PVCs/PVs.
  • Scale: Plan for increasing storage capacity or optimizing storage allocation.

SC004 - StorageClass Prevents Volume Expansion
Identifies StorageClasses that do not permit volume expansion, which can limit dynamic scaling of stateful applications.

⚠️ Total Storage with Issues: 1

Recommended Actions

📈 Enable Volume Expansion

  • Assess: Determine if your applications need dynamic volume resizing.
  • Configure: Add or set allowVolumeExpansion: true in the StorageClass definition.
  • Backend Check: Ensure your storage backend supports online volume expansion.
Findings
Namespace | Resource | Value | Message
(cluster) | storageclass/default | true | StorageClass does not allow volume expansion.

Configuration Hygiene

CFG001 - Orphaned ConfigMaps
Detects ConfigMaps that are not referenced by workloads or related resources.

✅ All Configuration Hygiene are healthy.

Recommended Actions

🛠️ Clean Up Orphaned ConfigMaps

  • Verify: Check usage (kubectl describe cm <name> -n <namespace>).
  • Delete: kubectl delete cm <name> -n <namespace> if unused.
  • Automation: Schedule periodic scans.

CFG002 - Duplicate ConfigMap Names
Detects ConfigMaps with identical names across multiple namespaces.

⚠️ Total Configuration Hygiene with Issues: 2

Recommended Actions

🛠️ Fix Duplicate ConfigMap Names

  • Standardize: Use unique names or a naming convention that includes the environment or team name.
  • Audit: Periodically review ConfigMaps across namespaces for duplication.
  • Automation: Use policies or linting tools to catch duplicates pre-deploy.
Findings
Namespace | Resource | Value | Message
- | configmap/kube-root-ca.crt | - | Found in namespaces: azure-store, default
- | configmap/kube-root-ca.crt | - | Found in namespaces: azure-store, default

CFG003 - Large ConfigMaps
Finds ConfigMaps larger than 1 MiB.

✅ All Configuration Hygiene are healthy.

Recommended Actions

🛠️ Reduce ConfigMap Size

  • Refactor: Move large files or data to PersistentVolumes.
  • Split: Break up oversized ConfigMaps into smaller ones by function.
  • Review: Check for secrets or binary blobs mistakenly stored in ConfigMaps.

PROM004 - API Server High Latency (Prometheus)
Detects high latency in Kubernetes API server requests over the last 24 hours.

✅ All Configuration are healthy.

Recommended Actions

🛠️ Investigate API Server Latency

  • Check kube-apiserver logs.
  • Review etcd performance.

Security

RBAC001 - RBAC Misconfigurations
Detects invalid roleRefs, missing roles, orphaned service accounts, and incorrect subject namespaces.

✅ All Security are healthy.

Recommended Actions

🔐 RBAC Misconfiguration Fixes

  • Don't leave roleRef blank in bindings.
  • Use valid Roles/ClusterRoles that exist in the correct namespace.
  • Verify ServiceAccounts exist in the namespace specified.
  • Remove or correct subjects pointing to non-existent namespaces.

RBAC002 - RBAC Overexposure
Identifies dangerous RBAC grants such as cluster-admin and wildcard permissions.

⚠️ Total Security with Issues: 12

Recommended Actions

🔐 RBAC Hardening Tips

  • Avoid using cluster-admin directly in bindings.
  • Don’t assign Roles or ClusterRoles with wildcard verbs/resources/apiGroups.
  • Restrict access to sensitive resources like secrets or pods/exec.
  • Minimize privileges for default ServiceAccounts.
  • Document use of any built-in roles used in production.
Findings
Namespace | Resource | Value | Message
🌍 Cluster-Wide | clusterrolebinding/aks-cluster-admin-binding | User/clusterAdmin | cluster-admin binding (built-in)
🌍 Cluster-Wide | clusterrolebinding/aks-cluster-admin-binding | User/clusterUser | cluster-admin binding (built-in)
🌍 Cluster-Wide | clusterrolebinding/aks-cluster-admin-binding-aad | Group/c30f2960-28f8-49cc-9308-c1e741824c4f | cluster-admin binding (built-in)
🌍 Cluster-Wide | clusterrolebinding/aks-secretprovidersyncing-rolebinding | ServiceAccount/aks-secrets-store-csi-driver | Access to sensitive resources
🌍 Cluster-Wide | clusterrolebinding/aks-service-rolebinding | User/aks-support | Access to sensitive resources
🌍 Cluster-Wide | clusterrolebinding/ama-metrics-clusterrolebinding | ServiceAccount/ama-metrics-serviceaccount | Access to sensitive resources
🌍 Cluster-Wide | clusterrolebinding/cluster-admin | Group/system:masters | cluster-admin binding (built-in)
🌍 Cluster-Wide | clusterrolebinding/system:controller:clusterrole-aggregation-controller | ServiceAccount/clusterrole-aggregation-controller | Access to sensitive resources (built-in)
🌍 Cluster-Wide | clusterrolebinding/system:controller:legacy-service-account-token-cleaner | ServiceAccount/legacy-service-account-token-cleaner | Access to sensitive resources (built-in)
🌍 Cluster-Wide | clusterrolebinding/system:kube-controller-manager | User/system:kube-controller-manager | Access to sensitive resources (built-in)
🌍 Cluster-Wide | clusterrolebinding/system:kube-scheduler | User/system:kube-scheduler | Access to sensitive resources (built-in)
🌍 Cluster-Wide | clusterrolebinding/system:persistent-volume-binding | ServiceAccount/persistent-volume-binder | Access to sensitive resources (built-in)
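
Most of the bindings listed are built-in or AKS-managed, so the hardening tips apply mainly to grants you create yourself. As a contrast to cluster-admin or wildcard rules, a narrowly scoped grant might look like this sketch; the Role name, the ServiceAccount, and the choice of ConfigMaps as the resource are purely illustrative.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: configmap-reader          # hypothetical
  namespace: azure-store
rules:
  - apiGroups: [""]               # core API group
    resources: ["configmaps"]     # one named resource, no wildcards
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: order-service-configmap-reader   # hypothetical
  namespace: azure-store
subjects:
  - kind: ServiceAccount
    name: order-service           # assumed dedicated ServiceAccount
    namespace: azure-store
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: configmap-reader
```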

RBAC003 - Orphaned ServiceAccounts
Finds ServiceAccounts not used by pods or RBAC bindings.

⚠️ Total Security with Issues: 1

Recommended Actions

🧾 Remove Orphaned ServiceAccounts

  • Audit ServiceAccounts not referenced in RoleBindings, ClusterRoleBindings, or used by Pods.
  • Delete those not actively used to reduce attack surface.
  • Consider automating SA cleanup with CI/CD or policy enforcement.
Findings
Namespace | Resource | Value | Message
default | serviceaccount/default | default | ServiceAccount not used by pods or RBAC bindings

RBAC004 - Orphaned and Ineffective Roles
Flags roles and clusterroles that are unused or define no rules.

⚠️ Total Security with Issues: 3

Recommended Actions

🗂️ Clean up Unused or Ineffective RBAC

  • Remove RoleBindings or ClusterRoleBindings without subjects.
  • Prune Roles and ClusterRoles not referenced by any bindings.
  • Remove roles with no defined rules unless planned for future use.
Findings
Namespace | Resource | Value | Message
cluster-wide | clusterrolebinding/system:node | system:node | ClusterRoleBinding has no subjects
cluster-wide | clusterrole/aks-secretproviderclasses-admin-role | aks-secretproviderclasses-admin-role | Unused ClusterRole
cluster-wide | clusterrole/aks-secretproviderclasses-viewer-role | aks-secretproviderclasses-viewer-role | Unused ClusterRole

SEC001 - Orphaned Secrets
Detects Secrets not used by workloads or related resources.

✅ All Security are healthy.

Recommended Actions

🔐 Orphaned Secrets Cleanup

  • Remove Secrets not referenced in Pods, Deployments, StatefulSets, or Ingresses.
  • Audit Secret content before deletion to avoid removing active credentials.
  • Validate Custom Resources don’t indirectly depend on these Secrets.
  • Regularly prune Secrets as part of security hygiene.

SEC002 - Pods using hostPID or hostNetwork
Flags pods that share the host PID or network namespace, which can compromise isolation and node security.

✅ All Security are healthy.

Recommended Actions

Avoid Host-Level Sharing

  • Set hostPID: false and hostNetwork: false unless needed for special workloads.
  • Review security implications of namespace sharing with the host.
  • Restrict use of these settings to trusted namespaces and workloads.
  • Consider using Pod Security Admission or OPA/Gatekeeper policies to prevent usage cluster-wide (PodSecurityPolicy was removed in Kubernetes 1.25).

SEC003 - Pods Running as Root
Detects pods running with UID 0 or no explicit runAsUser setting, which defaults to root in many images.

⚠️ Total Security with Issues: 12

Recommended Actions

RunAsUser Hardening

  • Set runAsUser to a non-zero UID at pod or container level.
  • Avoid relying on container defaults and define securityContext explicitly.
  • Validate any custom base images that may default to root.
Findings
Namespace | Resource | Value | Message
azure-store | pod/order-service-65cc8855c-ghk9m | Not Set (Defaults to root) | Container order-service runs as root or has no runAsUser set
azure-store | pod/order-service-65cc8855c-ghk9m | Not Set (Defaults to root) | Container wait-for-rabbitmq runs as root or has no runAsUser set
azure-store | pod/order-service-65cc8855c-ghk9m | Not Set (Defaults to root) | Container runs as root or has no runAsUser set
azure-store | pod/product-service-77ff9f6fd6-rzcxj | Not Set (Defaults to root) | Container product-service runs as root or has no runAsUser set
azure-store | pod/product-service-77ff9f6fd6-rzcxj | Not Set (Defaults to root) | Container runs as root or has no runAsUser set
azure-store | pod/product-service-77ff9f6fd6-rzcxj | Not Set (Defaults to root) | Container runs as root or has no runAsUser set
azure-store | pod/rabbitmq-5dcdf9484-kvgw7 | Not Set (Defaults to root) | Container rabbitmq runs as root or has no runAsUser set
azure-store | pod/rabbitmq-5dcdf9484-kvgw7 | Not Set (Defaults to root) | Container runs as root or has no runAsUser set
azure-store | pod/rabbitmq-5dcdf9484-kvgw7 | Not Set (Defaults to root) | Container runs as root or has no runAsUser set
azure-store | pod/store-front-698cc8c565-f5hp5 | Not Set (Defaults to root) | Container store-front runs as root or has no runAsUser set
azure-store | pod/store-front-698cc8c565-f5hp5 | Not Set (Defaults to root) | Container runs as root or has no runAsUser set
azure-store | pod/store-front-698cc8c565-f5hp5 | Not Set (Defaults to root) | Container runs as root or has no runAsUser set

SEC004 - Privileged Containers
Detects containers running with privileged mode enabled.

✅ All Security are healthy.

Recommended Actions

Disable Privileged Containers

  • Remove securityContext.privileged: true from container specs.
  • Refactor workloads to avoid needing host-level access.
  • Enforce restrictions using Pod Security Admission or OPA/Gatekeeper (PodSecurityPolicy was removed in Kubernetes 1.25).
  • Limit use to dedicated namespaces with strict controls.

SEC005 - Pods Using hostIPC
Detects pods that enable hostIPC.

✅ All Security are healthy.

Recommended Actions

🔒 Disable hostIPC for Pods

  • Remove hostIPC: true from pod specs.
  • Review workloads that require inter-process communication with the host.
  • Use shared memory only through secure, scoped means.

SEC006 - Pods Missing Secure Defaults
Checks if pods are missing recommended securityContext fields such as runAsNonRoot, readOnlyRootFilesystem, or allowPrivilegeEscalation.

⚠️ Total Security with Issues: 4

Recommended Actions
  • Set securityContext.runAsNonRoot: true
  • Set securityContext.readOnlyRootFilesystem: true
  • Set securityContext.allowPrivilegeEscalation: false
Findings
Namespace | Resource | Value | Message
azure-store | pod/order-service-65cc8855c-ghk9m | Missing securityContext | Container order-service has no securityContext defined
azure-store | pod/product-service-77ff9f6fd6-rzcxj | Missing securityContext | Container product-service has no securityContext defined
azure-store | pod/rabbitmq-5dcdf9484-kvgw7 | Missing securityContext | Container rabbitmq has no securityContext defined
azure-store | pod/store-front-698cc8c565-f5hp5 | Missing securityContext | Container store-front has no securityContext defined
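
The securityContext settings recommended here, together with the non-root UID, capability drop, and seccomp settings the report checks elsewhere, can be combined in one pod spec. The sketch below uses a hypothetical pod name, image, and UID; whether a given image actually runs as a non-root user with a read-only root filesystem has to be tested per workload.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-example          # hypothetical
  namespace: azure-store
spec:
  securityContext:                # pod-level defaults for all containers
    runAsNonRoot: true
    runAsUser: 10001              # assumed non-zero UID
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/app:1.0.0   # placeholder image
      securityContext:            # container-level settings
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
```

Applications that write to local paths under a read-only root filesystem typically need a writable volume mounted at those paths.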

SEC007 - Missing Pod Security Admission Labels
Flags namespaces missing pod security admission enforce labels.

⚠️ Total Security with Issues: 2

Recommended Actions
  • Set pod-security.kubernetes.io/enforce=restricted on sensitive namespaces.
  • Optionally use enforce-version and audit labels.
Findings
Namespace | Resource | Value | Message
azure-store | namespace/azure-store | | No pod security labels
default | namespace/default | | No pod security labels
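
The labels could be set declaratively on the namespace, for example as sketched below. The `restricted` level follows the recommendation above; using `latest` as the version and adding `warn` alongside `audit` are assumptions.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: azure-store
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

With `enforce: restricted`, pods that run as root or lack the required securityContext fields are rejected at creation, so it may be safer to begin with only the `audit` and `warn` labels until the workloads in this namespace are hardened.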

SEC008 - Secrets in Environment Variables
Detects secrets exposed through environment variables.

✅ All Security are healthy.

Recommended Actions
  • Use secret volumes instead of env vars to reduce accidental exposure.
  • Avoid using valueFrom.secretKeyRef in env.
  • Limit permissions to read secrets.
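
Mounting a Secret as a read-only volume, rather than injecting it through `env`, might look like this sketch. The pod name, image, Secret name, and mount path are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secret-volume-example     # hypothetical
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0.0   # placeholder image
      volumeMounts:
        - name: db-credentials
          mountPath: /etc/secrets/db          # each Secret key becomes a file here
          readOnly: true
  volumes:
    - name: db-credentials
      secret:
        secretName: db-credentials            # assumed existing Secret
```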

SEC009 - Missing Capabilities Drop
Checks containers that do not drop all Linux capabilities via securityContext.capabilities.drop = ['ALL'].

⚠️ Total Security with Issues: 4

Recommended Actions
  • Set securityContext.capabilities.drop: ['ALL'] in container specs.
  • Allow only required capabilities via add list, if any.
Findings
Namespace | Resource | Value | Message
azure-store | pod/order-service-65cc8855c-ghk9m | | Container order-service does not drop ALL capabilities
azure-store | pod/product-service-77ff9f6fd6-rzcxj | | Container product-service does not drop ALL capabilities
azure-store | pod/rabbitmq-5dcdf9484-kvgw7 | | Container rabbitmq does not drop ALL capabilities
azure-store | pod/store-front-698cc8c565-f5hp5 | | Container store-front does not drop ALL capabilities

SEC010 - HostPath Volume Usage
Flags pods that use hostPath volumes, which mount parts of the host filesystem and bypass isolation.

✅ All Security are healthy.

Recommended Actions
  • Remove hostPath volumes unless needed for host-level access.
  • Consider alternatives like persistent volume claims or configMaps.

SEC011 - Containers Running as UID 0
Flags containers explicitly configured to run as UID 0.

⚠️ Total Security with Issues: 4

Recommended Actions
  • Set runAsUser to a non-root user ID.
  • Use runAsNonRoot: true for validation.
Findings
Namespace | Resource | Value | Message
azure-store | pod/order-service-65cc8855c-ghk9m | 0 | Container order-service runs as UID 0
azure-store | pod/product-service-77ff9f6fd6-rzcxj | 0 | Container product-service runs as UID 0
azure-store | pod/rabbitmq-5dcdf9484-kvgw7 | 0 | Container rabbitmq runs as UID 0
azure-store | pod/store-front-698cc8c565-f5hp5 | 0 | Container store-front runs as UID 0

SEC012 - Added Linux Capabilities
Flags containers that add extra Linux capabilities using securityContext.capabilities.add.

✅ All Security are healthy.

Recommended Actions
  • Review and remove unnecessary capabilities.
  • Default to dropping all, then selectively add only what is needed.

SEC013 - EmptyDir Volume Usage
EmptyDir volumes are ephemeral: they are deleted when the pod is removed from its node, although they survive container restarts. Use only if data persistence is not needed.

✅ All Security are healthy.

Recommended Actions
  • Audit use of EmptyDir volumes in production workloads.
  • Replace with PVCs or other managed storage if persistence is needed.

SEC014 - Untrusted Image Registries
Flags images that do not come from trusted registries.

⚠️ Total Security with Issues: 3

Recommended Actions
  • Use approved internal or vendor-verified registries.
  • Restrict image pull policies using Gatekeeper or admission plugins.
Findings
Namespace | Resource | Value | Message
azure-store | pod/order-service-65cc8855c-ghk9m | ghcr.io/azure-samples/aks-store-demo/order-service:latest | Image from untrusted registry in container order-service
azure-store | pod/product-service-77ff9f6fd6-rzcxj | ghcr.io/azure-samples/aks-store-demo/product-service:latest | Image from untrusted registry in container product-service
azure-store | pod/store-front-698cc8c565-f5hp5 | ghcr.io/azure-samples/aks-store-demo/store-front:latest | Image from untrusted registry in container store-front

SEC015 - Pods Using Default ServiceAccount
Flags pods using the default service account, which may have broad permissions.

⚠️ Total Security with Issues: 4

Recommended Actions
  • Create and bind a custom ServiceAccount per application.
  • Avoid using the default ServiceAccount unless absolutely necessary.
Findings
Namespace | Resource | Value | Message
azure-store | pod/order-service-65cc8855c-ghk9m | default | Pod uses default ServiceAccount
azure-store | pod/product-service-77ff9f6fd6-rzcxj | default | Pod uses default ServiceAccount
azure-store | pod/rabbitmq-5dcdf9484-kvgw7 | default | Pod uses default ServiceAccount
azure-store | pod/store-front-698cc8c565-f5hp5 | default | Pod uses default ServiceAccount
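
A dedicated ServiceAccount per application could be wired up as in the sketch below. The ServiceAccount name and the `app` label are assumptions; the image reference is the one reported for order-service. Disabling token automount is optional and assumes the application does not call the Kubernetes API.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: order-service             # hypothetical dedicated account
  namespace: azure-store
automountServiceAccountToken: false   # assumes the app needs no API access
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  namespace: azure-store
spec:
  selector:
    matchLabels:
      app: order-service          # assumed label
  template:
    metadata:
      labels:
        app: order-service
    spec:
      serviceAccountName: order-service   # replaces the implicit "default" account
      containers:
        - name: order-service
          image: ghcr.io/azure-samples/aks-store-demo/order-service:latest
```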

SEC016 - Unconfined Seccomp Profiles
Detects pods or containers explicitly using the Unconfined seccomp profile.

✅ All Security are healthy.

Recommended Actions
  • Use RuntimeDefault or a vetted Localhost seccomp profile.
  • Remove any pod- or container-level Unconfined seccomp setting.
  • Make the seccomp profile explicit in the workload spec so the policy is reviewable.

SEC017 - Non-Default ProcMount
Flags containers that set procMount to a non-default value.

✅ All Security are healthy.

Recommended Actions
  • Set securityContext.procMount: Default or omit the field.
  • Review debugging and observability agents that rely on custom proc mounts.

SEC018 - Automounting API Credentials Enabled in ServiceAccounts
Flags ServiceAccounts where automounting of API credentials is enabled, affecting associated Pods.

✅ All Security are healthy.

Recommended Actions

Disable Automounting in ServiceAccounts

  • Add automountServiceAccountToken: false to the ServiceAccount spec.
  • Edit with kubectl edit serviceaccount <sa-name> -n <namespace>.
  • Ensure Pods needing API access override this in their spec with automountServiceAccountToken: true.
  • Use RBAC to limit ServiceAccount permissions if access is required.

SEC019 - Unsupported AppArmor Values
Detects AppArmor annotations or profile types that are not permitted by baseline Pod Security Standards.

✅ All Security are healthy.

Recommended Actions
  • Allowed values are runtime/default or localhost/* in annotations, and RuntimeDefault or Localhost for structured profiles.
  • Remove legacy or custom profile names that AKS Automatic baseline policy would reject.

SEC020 - Seccomp Profile Not Configured
Detects pods and containers that do not explicitly configure a seccomp profile.

⚠️ Total Security with Issues: 5

Recommended Actions
  • Set securityContext.seccompProfile.type: RuntimeDefault for the pod or each container.
  • If you need a custom profile, use Localhost and ensure the profile exists on the node.
  • Doing this avoids AKS Automatic seccomp warnings and makes the security posture explicit.
Findings
Namespace | Resource | Value | Message
azure-store | pod/order-service-65cc8855c-ghk9m | | Container order-service has no explicit seccomp profile
azure-store | pod/order-service-65cc8855c-ghk9m | | Container wait-for-rabbitmq has no explicit seccomp profile
azure-store | pod/product-service-77ff9f6fd6-rzcxj | | Container product-service has no explicit seccomp profile
azure-store | pod/rabbitmq-5dcdf9484-kvgw7 | | Container rabbitmq has no explicit seccomp profile
azure-store | pod/store-front-698cc8c565-f5hp5 | | Container store-front has no explicit seccomp profile

SEC021 - Host Ports in Pod Specs
Detects containers that bind host ports directly on the node.

✅ All Security are healthy.

Recommended Actions
  • Remove hostPort from container port definitions.
  • Use a Service or Ingress for north-south access.
  • Reserve host networking only for platform workloads that truly require it.

SEC022 - Non-Existent Secret References
Flags pods referencing Secrets that do not exist. This may cause runtime failures.

✅ All Security are healthy.

Recommended Actions
  • Check envFrom, secretKeyRef, and volume.secret.secretName references.
  • Create missing Secrets or remove invalid references.

SEC023 - Disallowed Sysctls
Detects sysctls outside the Kubernetes baseline Pod Security Standards allowlist.

✅ All Security are healthy.

Recommended Actions
  • Keep only baseline-allowed sysctls such as safe net.ipv4.ip_local_port_range or kernel.shm_rmid_forced.
  • Move node-level kernel tuning into node or image configuration where possible.

Kubernetes Warning Events

EVENT001 - Grouped Warning Events
Groups recent Warning events by Reason and Message.

⚠️ Total Events with Issues: 1

Recommended Actions
  • Group similar warnings to spot patterns.
  • Use kubectl describe and logs to investigate.
Findings
Namespace | Resource | Value | Message
(cluster) | event-group/FailedScheduling | 4 | 0/3 nodes are available: 3 node(s) had untolerated taint {CriticalAddonsOnly: true}. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.

EVENT002 - Full Warning Event Log
Lists all recent Warning events in the cluster.

⚠️ Total Events with Issues: 4

Recommended Actions
  • Use kubectl describe to get full context.
  • Check logs for root cause.
Findings
Namespace | Resource | Value | Message
azure-store | events/order-service-65cc8855c-ghk9m.18a67e5d2136c2d5 | Warning | Warning events found in recent Kubernetes logs
azure-store | events/product-service-77ff9f6fd6-rzcxj.18a67e5d208fd3fe | Warning | Warning events found in recent Kubernetes logs
azure-store | events/rabbitmq-5dcdf9484-kvgw7.18a67e5d26f82c69 | Warning | Warning events found in recent Kubernetes logs
azure-store | events/store-front-698cc8c565-f5hp5.18a67e5d294b6844 | Warning | Warning events found in recent Kubernetes logs

AKS Best Practices Results

✅ Passed: 30

❌ Failed: 13

📊 Total Checks: 43

🎯 Score: 69.77%

⭐ Rating: D

Best Practices (7/15 failed)
ID | Check | Severity | Category | Status | Observed Value | Fail Message | Recommendation | URL
AKSBP001 | Allowed Container Images Policy Enforcement | High | Best Practices | ❌ FAIL | false | Container image restriction policies are not enforced, allowing deployment of images from any registry including public registries, untrusted sources, or images with known vulnerabilities. This significantly increases supply chain attack risks and compliance violations. | Deploy the Azure Policy initiative 'Kubernetes cluster pod security restricted standards' and configure specific allowed container registries. Use az policy assignment create to assign the policy and set enforcement to 'deny' mode for production environments. | Learn More
AKSBP002 | No Privileged Containers Policy Enforcement | High | Best Practices | ❌ FAIL | false | Privileged container policies are not enforced, allowing workloads to run with full root privileges, access host devices, mount host file systems, and potentially escape container boundaries. This creates severe security risks and violates least-privilege principles.
Enable the 'Do not allow privileged containers' Azure Policy definition in enforce mode.
Use Pod Security Standards with 'restricted' profile to block privileged containers and ensure security baseline compliance.
Learn More
AKSBP003 | Multiple Node Pools | Medium | Best Practices | ❌ FAIL | false | Single node pool configuration limits workload isolation, scaling flexibility, and security boundaries. All workloads share the same VM size, OS configuration, and scaling parameters, making it impossible to optimize for different application requirements or implement proper security zones.
Create separate node pools for different workload types using az aks nodepool add --resource-group <rg> --cluster-name <cluster> --name <pool-name>.
Use system pools for system pods, user pools for applications, and specialized pools (GPU, memory-optimized) for specific workloads.
Learn More
AKSBP008 | Auto Upgrade Channel Configured | Medium | Best Practices | ❌ FAIL | false | Automatic cluster upgrades are disabled, leaving the cluster exposed to unpatched vulnerabilities, unfixed bugs, and Kubernetes version support expiration. Manual upgrade management increases operational overhead and delays critical security updates.
Configure auto upgrade using az aks update --resource-group <rg> --name <cluster> --auto-upgrade-channel patch for security patches or 'stable' for minor version updates.
Use maintenance windows to control upgrade timing and minimize disruption.
Learn More
AKSBP009 | Node OS Upgrade Channel Configured | Medium | Best Practices | ❌ FAIL | false | Node OS automatic updates are disabled, leaving nodes running outdated OS versions with potential security vulnerabilities, missing security patches, and outdated system libraries. This increases the attack surface and compliance risks.
Enable node OS upgrade using az aks update --resource-group <rg> --name <cluster> --node-os-upgrade-channel NodeImage for automatic OS updates.
Use 'SecurityPatch' for security-only updates or configure maintenance windows for controlled updates.
Learn More
AKSBP014 | Use v5 or Newer SKU VMs for Node Pools | Medium | Best Practices | ❌ FAIL | 3 | Node pools are using older VM generations (v4 or earlier) that have reduced performance, lack modern security features, don't support ephemeral OS disks by default, and may experience more frequent maintenance events affecting availability and reliability.
Upgrade to v5 or newer VM SKUs using az aks nodepool add --vm-size Standard_D2s_v5 for new node pools.
v5 SKUs provide better performance, support ephemeral OS disks by default, and have improved reliability during maintenance events and upgrades.
Learn More
AKSBP015 | Deployment Safeguards Enabled | Medium | Best Practices | ❌ FAIL | false | Deployment Safeguards are disabled, allowing non-compliant workloads to be deployed without validation of Kubernetes best practices. This leads to deployments without resource requests/limits, missing health probes, no anti-affinity rules, and other configuration issues that impact reliability and cost.
Enable Deployment Safeguards using az aks update --resource-group <rg> --name <cluster> --safeguards-level Warning for alerting or 'Enforcement' to block non-compliant deployments.
This enforces best practices including resource requests, readiness/liveness probes, pod anti-affinity, and Pod Security Standards.
Learn More
AKSBP004 | Azure Linux as Host OS | High | Best Practices | ✅ PASS | No issues detected.
Migrate to Azure Linux by creating new node pools with az aks nodepool add --os-sku AzureLinux, then migrate workloads and delete old pools.
Note: In-place OS SKU changes are not supported, requiring node pool replacement.
Learn More
AKSBP005 | Ephemeral OS Disks Enabled | Medium | Best Practices | ✅ PASS | No issues detected.
Enable ephemeral OS disks using az aks nodepool add --os-disk-type Ephemeral for new pools or plan node pool replacement.
This provides faster disk I/O, lower latency, and reduced costs by using local VM storage instead of managed disks.
Learn More
AKSBP006 | Non-Ephemeral Disks with Adequate Size | Medium | Best Practices | ✅ PASS | No issues detected.
Increase OS disk size using az aks nodepool update --resource-group <rg> --cluster-name <cluster> --name <nodepool> --os-disk-size-gb 128 or higher.
Larger disks provide better IOPS performance and accommodate container image layers and temporary storage needs.
Learn More
AKSBP007 | System Node Pool Taint | High | Best Practices | ✅ PASS | No issues detected.
Apply system node pool taint using az aks nodepool update --resource-group <rg> --cluster-name <cluster> --name <system-pool> --node-taints CriticalAddonsOnly=true:NoSchedule.
This ensures only critical system pods run on system nodes, improving reliability and resource isolation.
Learn More
AKSBP010 | Customized MC_ Resource Group Name | Medium | Best Practices | ✅ PASS | No issues detected.
Use a custom node resource group name during cluster creation with az aks create --node-resource-group <custom-name>.
This cannot be changed after cluster creation, so plan accordingly for better resource organization and management.
Learn More
AKSBP011 | System Node Pool Has Minimum Two Nodes | High | Best Practices | ✅ PASS | No issues detected.
Scale system node pool to at least 2 nodes using az aks nodepool scale --resource-group <rg> --cluster-name <cluster> --name <system-pool> --node-count 2.
Configure cluster autoscaler with --min-count 2 to ensure resiliency against node failures and maintenance events.
Learn More
AKSBP012 | Node Pool Version Matches Control Plane | Medium | Best Practices | ✅ PASS | No issues detected.
Upgrade node pools to match control plane version using az aks nodepool upgrade --resource-group <rg> --cluster-name <cluster> --name <nodepool> --kubernetes-version <version>.
Plan coordinated upgrades to maintain version consistency and avoid compatibility issues.
Learn More
AKSBP013 | No B-Series VMs in Node Pools | High | Best Practices | ✅ PASS | No issues detected.
Replace B-series VMs with consistent performance SKUs like Standard_D2s_v5 or Standard_E2s_v5.
Create new node pool with az aks nodepool add --vm-size Standard_D2s_v5, migrate workloads using kubectl drain, then delete old pool with az aks nodepool delete.
Learn More
Disaster Recovery (0/2 failed)
ID | Check | Severity | Category | Status | Observed Value | Fail Message | Recommendation | URL
AKSDR001 | Agent Pools with Availability Zones | High | Disaster Recovery | ✅ PASS | No issues detected.
Deploy node pools across availability zones using az aks nodepool add --availability-zones 1 2 3 --resource-group <rg> --cluster-name <cluster> --name <pool>.
Ensure at least 3 zones are used for production workloads to achieve 99.95% SLA and protect against datacenter failures.
Learn More
AKSDR002 | Control Plane SLA | Medium | Disaster Recovery | ✅ PASS | No issues detected.
Upgrade to Standard tier using az aks update --resource-group <rg> --name <cluster> --tier Standard to get 99.95% uptime SLA, financially-backed availability guarantees, and improved support.
This is essential for production workloads requiring high availability.
Learn More
Identity & Access (0/7 failed)
ID | Check | Severity | Category | Status | Observed Value | Fail Message | Recommendation | URL
AKSIAM001 | RBAC Enabled | High | Identity & Access | ✅ PASS | No issues detected.
Enable RBAC during cluster creation using --enable-rbac or for existing clusters via Azure Portal.
Create RoleBindings and ClusterRoleBindings to assign appropriate permissions to users and service accounts based on the principle of least privilege.
Learn More
AKSIAM002 | Managed Identity | High | Identity & Access | ✅ PASS | No issues detected.
Create a user-assigned managed identity using az identity create --resource-group <rg> --name <identity-name> and associate it during cluster creation with --assign-identity <identity-resource-id>.
This eliminates the need to manage service principal credentials and provides better security.
Learn More
AKSIAM003 | Workload Identity Enabled | Medium | Identity & Access | ✅ PASS | No issues detected.
Enable Workload Identity using az aks update --resource-group <rg> --name <cluster> --enable-workload-identity (requires OIDC issuer).
Create Kubernetes service accounts and federate them with Azure managed identities for secure, token-based authentication to Azure services.
Learn More
AKSIAM004 | Managed Identity Used | High | Identity & Access | ✅ PASS | No issues detected.
Migrate from Service Principal to User-Assigned Managed Identity using az aks update --resource-group <rg> --name <cluster> --assign-identity <identity-resource-id>.
This provides automatic credential rotation and eliminates the need to manage client secrets.
Learn More
AKSIAM005 | AAD RBAC Authorization Integrated | High | Identity & Access | ✅ PASS | No issues detected.
Enable Azure RBAC for Kubernetes authorization using az aks update --resource-group <rg> --name <cluster> --enable-azure-rbac.
Assign built-in roles like 'Azure Kubernetes Service RBAC Reader/Writer/Admin' to users and groups for centralized access management through Azure AD.
Learn More
AKSIAM006 | AAD Managed Authentication Enabled | High | Identity & Access | ✅ PASS | No issues detected.
Enable Azure AD integration during cluster creation with --enable-aad --aad-admin-group-object-ids <group-id> or update existing cluster using az aks update --resource-group <rg> --name <cluster> --enable-aad.
Configure admin groups and integrate with conditional access policies.
Learn More
AKSIAM007 | Local Accounts Disabled | High | Identity & Access | ✅ PASS | No issues detected.
Disable local accounts using az aks update --resource-group <rg> --name <cluster> --disable-local-accounts.
This enforces authentication exclusively through Azure AD, eliminating certificate-based admin access and improving audit capabilities.
Learn More
Monitoring & Logging (0/2 failed)
ID | Check | Severity | Category | Status | Observed Value | Fail Message | Recommendation | URL
AKSMON001 | Azure Monitor | High | Monitoring & Logging | ✅ PASS | No issues detected.
Enable Azure Monitor Container Insights using az aks enable-addons --resource-group <rg> --name <cluster> --addons monitoring --workspace-resource-id <workspace-id> or through Azure Portal > Monitoring > Insights.
Configure log retention (90+ days) and set up alerts for container failures and resource usage.
Learn More
AKSMON002 | Managed Prometheus Enabled | High | Monitoring & Logging | ✅ PASS | No issues detected.
Enable managed Prometheus using az aks update --resource-group <rg> --name <cluster> --enable-azure-monitor-metrics or via Azure Portal > Monitoring > Insights.
Consider integrating with Azure Managed Grafana for advanced dashboards and setting up alerting rules for critical metrics.
Learn More
Networking (2/4 failed)
ID | Check | Severity | Category | Status | Observed Value | Fail Message | Recommendation | URL
AKSNET001 | Authorized IP Ranges Configured (Public Clusters) | High | Networking | ❌ FAIL | false | API server accepts connections from any internet IP address, creating a large attack surface for brute force attacks, credential stuffing, and vulnerability exploitation. This violates network security best practices and most compliance frameworks.
Configure authorized IP ranges using az aks update --resource-group <rg> --name <cluster> --api-server-authorized-ip-ranges <ip-ranges>.
Include management networks, CI/CD systems, and jump boxes using CIDR notation.
Alternatively, migrate to a private cluster for enhanced security.
Learn More
AKSNET003 | Web App Routing Enabled | Low | Networking | ❌ FAIL | false | Web App Routing add-on is disabled, requiring manual ingress controller management, DNS configuration, and SSL certificate handling. This increases operational overhead and may lead to inconsistent external access patterns and security configurations.
Enable Web App Routing using az aks enable-addons --resource-group <rg> --name <cluster> --addons web_application_routing.
Configure DNS zones and SSL certificates for automatic ingress management.
Consider using Application Gateway Ingress Controller (AGIC) for enterprise scenarios.
Learn More
AKSNET002 | Network Policy Check | Medium | Networking | ✅ PASS | No issues detected.
Enable network policy during cluster creation with --network-policy azure (Azure CNI) or --network-policy calico (kubenet).
Create NetworkPolicy resources to define ingress/egress rules for pods, implementing micro-segmentation and zero-trust networking principles.
Learn More
AKSNET004 | Azure CNI with Cilium Dataplane Recommended | Medium | Networking | ✅ PASS | No issues detected.
For new clusters, use --network-plugin azure --network-dataplane cilium --network-plugin-mode overlay for optimal performance.
Azure CNI powered by Cilium provides eBPF-based packet processing, better scalability, and advanced L3-L7 network policies.
Existing clusters should migrate by creating a new cluster with Cilium enabled.
Learn More
Resource Management (1/5 failed)
ID | Check | Severity | Category | Status | Observed Value | Fail Message | Recommendation | URL
AKSRES002 | AKS Built-in Cost Tooling Enabled | Medium | Resource Management | ❌ FAIL | false | Cost analysis and OpenCost integration are disabled, providing no visibility into per-namespace, per-workload, or per-application spending. This makes it impossible to allocate costs, identify expensive workloads, optimize resource usage, or implement chargeback policies for different teams.
Enable cost analysis using az aks update --resource-group <rg> --name <cluster> --enable-cost-analysis to track namespace and workload-level costs.
Use the cost insights to identify expensive workloads, optimize resource requests, and implement chargeback/showback policies.
Learn More
AKSRES001 | Cluster Autoscaler | Medium | Resource Management | ✅ PASS | No issues detected.
Enable Cluster Autoscaler using az aks update --resource-group <rg> --name <cluster> --enable-cluster-autoscaler --min-count <min> --max-count <max> on node pools.
Configure appropriate min/max node counts, scale-down parameters, and node pool priorities for optimal cost and performance balance.
Learn More
AKSRES003 | Vertical Pod Autoscaler (VPA) is enabled | Medium | Resource Management | ✅ PASS | No issues detected.
Enable VPA using az aks update --resource-group <rg> --name <cluster> --enable-vpa.
Deploy VPA objects with 'updateMode: Auto' or 'Off' for recommendations only.
Monitor VPA recommendations and adjust application resource requests/limits accordingly for better resource efficiency.
Learn More
AKSRES004 | KEDA (Event-Driven Autoscaling) Enabled | Low | Resource Management | ✅ PASS | No issues detected.
Enable KEDA using az aks update --resource-group <rg> --name <cluster> --enable-keda.
Deploy ScaledObject resources to define event sources (Azure Queue, Service Bus, Kafka, HTTP, etc.) and scaling behavior.
KEDA complements HPA by enabling scale-to-zero and event-driven scaling patterns.
Learn More
AKSRES005 | Node Auto-provisioning or Cluster Autoscaler Configured | High | Resource Management | ✅ PASS | No issues detected.
Enable Node Auto-provisioning using az aks update --resource-group <rg> --name <cluster> --node-provisioning-mode Auto for Karpenter-based dynamic provisioning.
Alternatively, enable Cluster Autoscaler with az aks update --enable-cluster-autoscaler.
NAP is recommended for modern workloads with diverse resource requirements.
Learn More
Security (3/8 failed)
ID | Check | Severity | Category | Status | Observed Value | Fail Message | Recommendation | URL
AKSSEC001 | Private Cluster | High | Security | ❌ FAIL | false | API server is publicly accessible from the internet, exposing your cluster to potential attacks, unauthorized access attempts, and compliance violations. This creates a significant security risk as attackers can attempt to exploit Kubernetes API vulnerabilities.
Configure as a private cluster using az aks create --enable-private-cluster or az aks update --enable-private-cluster for existing clusters.
This routes API server traffic through private endpoints within your VNet.
Configure private DNS zones and ensure network connectivity from management machines.
Learn More
AKSSEC006 | Image Cleaner Enabled | Medium | Security | ❌ FAIL | false | Image Cleaner is disabled, allowing stale and potentially vulnerable container images to accumulate on node disks. This increases storage costs, extends attack surface with outdated images containing known CVEs, and can impact node performance due to disk space consumption.
Enable Image Cleaner using az aks update --resource-group <rg> --name <cluster> --enable-image-cleaner.
Configure cleaning interval and retention policies to automatically remove unused container images and reduce attack surface.
Learn More
AKSSEC008 | Pod Security Admission Enabled | High | Security | ❌ FAIL | false | Pod Security Admission is not configured on this cluster, meaning there are no built-in Kubernetes security controls to prevent insecure pod configurations. Without PSA, pods can run with dangerous settings like privileged mode, host network access, or unsafe capabilities, increasing container escape risks.
Configure Pod Security Admission by setting pod security standards on namespaces.
Use kubectl label namespace <namespace> pod-security.kubernetes.io/enforce=restricted pod-security.kubernetes.io/audit=restricted pod-security.kubernetes.io/warn=restricted for production namespaces.
Consider 'baseline' for less restrictive environments.
This is separate from Azure Policy and provides Kubernetes-native security controls.
Learn More
AKSSEC002 | Azure Policy Add-on | Medium | Security | ✅ PASS | No issues detected.
Enable Azure Policy add-on using az aks enable-addons --resource-group <rg> --name <cluster> --addons azure-policy.
Deploy built-in policy initiatives like 'Kubernetes cluster pod security restricted standards' and create custom policies for your organization's requirements.
Learn More
AKSSEC003 | Defender for Containers | High | Security | ✅ PASS | No issues detected.
Enable Defender for Containers using az aks update --resource-group <rg> --name <cluster> --enable-defender or through Security Center in Azure Portal.
Configure vulnerability scanning, runtime threat detection, and compliance monitoring for comprehensive container security.
Learn More
AKSSEC004 | OIDC Issuer Enabled | Medium | Security | ✅ PASS | No issues detected.
Enable OIDC issuer using az aks update --resource-group <rg> --name <cluster> --enable-oidc-issuer.
This enables workload identity federation, allowing pods to authenticate to Azure services using service account tokens instead of secrets.
Learn More
AKSSEC005 | Azure Key Vault Integration | High | Security | ✅ PASS | No issues detected.
Enable Key Vault CSI driver using az aks enable-addons --resource-group <rg> --name <cluster> --addons azure-keyvault-secrets-provider.
Create SecretProviderClass resources to mount secrets, certificates, and keys from Azure Key Vault as volumes in pods.
Learn More
AKSSEC007 | Kubernetes Dashboard Disabled | High | Security | ✅ PASS | No issues detected.
Disable the Kubernetes dashboard using az aks disable-addons --addons kube-dashboard --resource-group <rg> --name <cluster>.
Use Azure Portal, kubectl, or other secure management tools instead.
If dashboard access is required, implement proper authentication and network restrictions.
Learn More

AKS Automatic Migration Readiness

AKS Automatic Migration Readiness: Not Ready
Not Ready - Fix blocker findings before migrating workloads to a new AKS Automatic cluster.

🚫 Blockers: 1

⚠️ Warnings: 3

✅ Aligned Checks: 8

This view is derived from existing Kubernetes and AKS shared checks and focuses on readiness for a new AKS Automatic cluster.

Fix Before Migration

ID | Check | Affected | Recommendation | Examples
POD007 | Container images do not use latest tag | 4 | Specify an explicit image tag (e.g., ':v1.2.3') on every container and initContainer to ensure consistent deployments. | pod/order-service-65cc8855c-ghk9m, pod/product-service-77ff9f6fd6-rzcxj, pod/store-front-698cc8c565-f5hp5
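
To illustrate what a check like POD007 looks for, the Python sketch below classifies an image reference as floating (tagged ':latest' or untagged) or pinned. It is a simplified illustration, not KubeBuddy's implementation. Treating an untagged image as equivalent to ':latest' is an assumption based on how container runtimes resolve a missing tag, and the image names in the example are hypothetical.

```python
def uses_floating_tag(image: str) -> bool:
    """Return True if the image reference resolves to ':latest'."""
    # A digest-pinned reference (name@sha256:...) is immutable.
    if "@" in image:
        return False

    # Look only at the last path component so that a registry port
    # (for example localhost:5000/app) is not mistaken for a tag.
    last_component = image.rsplit("/", 1)[-1]
    if ":" not in last_component:
        return True  # no tag given, so the runtime defaults to latest

    tag = last_component.rsplit(":", 1)[1]
    return tag == "latest"


# Hypothetical image references for illustration only.
for ref in ("example.io/store-front",
            "example.io/store-front:latest",
            "example.io/store-front:v1.2.3",
            "localhost:5000/order-service"):
    print(ref, uses_floating_tag(ref))
```

Only the ':v1.2.3' reference is reported as pinned here. The other three would need an explicit tag or digest to satisfy the recommendation above.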

Warnings

ID | Check | Affected | Recommendation | Examples
SEC003 | Pods Running as Root | 12 | Avoid running pods as root by explicitly setting runAsUser to a non-zero UID in pod or container securityContext. | pod/order-service-65cc8855c-ghk9m, pod/product-service-77ff9f6fd6-rzcxj, pod/rabbitmq-5dcdf9484-kvgw7, pod/store-front-698cc8c565-f5hp5
SEC020 | Seccomp Profile Not Configured | 5 | Set seccompProfile.type to RuntimeDefault or Localhost at the pod or container level. | pod/order-service-65cc8855c-ghk9m, pod/product-service-77ff9f6fd6-rzcxj, pod/rabbitmq-5dcdf9484-kvgw7, pod/store-front-698cc8c565-f5hp5
WRK007 | Missing Readiness and Liveness Probes | 4 | Add readiness and liveness probes to all containers to improve availability and fault detection. | deployment/order-service, deployment/product-service, deployment/rabbitmq, deployment/store-front
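
The three warnings above all concern fields in the pod spec. The Python sketch below shows one way such a spec, loaded as a plain dictionary, could be audited for them. It is a simplified approximation under stated assumptions: the function name and the sample specs are hypothetical, container-level settings are assumed to override pod-level ones, and the real checks may apply different rules (for example, considering runAsNonRoot or initContainers).

```python
from typing import Dict, List, Tuple


def audit_pod_spec(spec: Dict) -> List[Tuple[str, str]]:
    """Return (check_id, container_name) pairs for the three warnings."""
    findings = []
    pod_sc = spec.get("securityContext") or {}

    for container in spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        sc = container.get("securityContext") or {}

        # SEC003: runAsUser must be set to a non-zero UID.
        run_as_user = sc.get("runAsUser", pod_sc.get("runAsUser"))
        if run_as_user is None or run_as_user == 0:
            findings.append(("SEC003", name))

        # SEC020: seccompProfile.type must be RuntimeDefault or Localhost.
        seccomp = sc.get("seccompProfile") or pod_sc.get("seccompProfile") or {}
        if seccomp.get("type") not in ("RuntimeDefault", "Localhost"):
            findings.append(("SEC020", name))

        # WRK007: both probes must be present.
        if "readinessProbe" not in container or "livenessProbe" not in container:
            findings.append(("WRK007", name))

    return findings


# Hypothetical specs for illustration only.
bare = {"containers": [{"name": "app", "image": "example.io/app:v1.2.3"}]}
hardened = {
    "securityContext": {"runAsUser": 1000,
                        "seccompProfile": {"type": "RuntimeDefault"}},
    "containers": [{"name": "app",
                    "image": "example.io/app:v1.2.3",
                    "readinessProbe": {"httpGet": {"path": "/health", "port": 8080}},
                    "livenessProbe": {"httpGet": {"path": "/health", "port": 8080}}}],
}

print(audit_pod_spec(bare))
print(audit_pod_spec(hardened))
```

The bare spec yields one finding per check, while the hardened spec yields none, which is the state the recommendations above aim for.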