Monitoring Kubernetes cluster utilization and capacity (the poor man's way)

If you're running Kubernetes clusters at scale, it pays to have good monitoring in place. The tools I typically use in production, like Prometheus and Alertmanager, are extremely useful for monitoring critical metrics, like "is my cluster almost out of CPU or memory?"

But I also have a number of smaller clusters, and some of them, like my Raspberry Pi Dramble, have very few resources to spare for hosting monitoring internally. I still want to be able to say, at any given moment, "how much CPU or RAM is available inside the cluster? Can I fit more Pods in the cluster?"

So without further ado, I'm now using the following script, which is slightly adapted from a script found in the Kubernetes issue "Need simple kubectl command to see cluster resource usage":

Usage is pretty easy: make sure your kubeconfig is set up so that kubectl commands work against the cluster, then run:

$ ./k8s-resources.sh 
hostname1: 23% CPU, 16% memory
hostname2: 26% CPU, 16% memory
hostname3: 38% CPU, 22% memory
hostname4: 98% CPU, 66% memory
hostname5: 29% CPU, 18% memory
hostname6: 28% CPU, 16% memory
Average usage: 40% CPU, 25% memory.
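The last line of that output appears to be a plain integer-divided mean of the per-node percentages, which is why memory shows 25% rather than 26% (154 / 6 is about 25.67, and shell arithmetic truncates). Here is a minimal sketch that recomputes it from the per-node lines; `average_usage` is a hypothetical helper name I made up for illustration, not something from the script itself:

```shell
#!/bin/bash
# Sketch only: average_usage is a hypothetical helper, not part of the original script.
# Reads per-node lines such as "hostname1: 23% CPU, 16% memory" on stdin
# and prints the integer-divided mean of the CPU and memory percentages.
average_usage() {
  local re='^[^:]+: ([0-9]+)% CPU, ([0-9]+)% memory[[:space:]]*$'
  local count=0 cpu_total=0 mem_total=0 line
  while IFS= read -r line; do
    if [[ $line =~ $re ]]; then
      # 10# forces base 10 so a value like "08" is not read as octal.
      cpu_total=$((cpu_total + 10#${BASH_REMATCH[1]}))
      mem_total=$((mem_total + 10#${BASH_REMATCH[2]}))
      count=$((count + 1))
    fi
  done
  if [ "$count" -eq 0 ]; then
    echo "no node lines found" >&2
    return 1
  fi
  echo "Average usage: $((cpu_total / count))% CPU, $((mem_total / count))% memory."
}

# Demo with the sample output shown above.
average_usage <<'EOF'
hostname1: 23% CPU, 16% memory
hostname2: 26% CPU, 16% memory
hostname3: 38% CPU, 22% memory
hostname4: 98% CPU, 66% memory
hostname5: 29% CPU, 18% memory
hostname6: 28% CPU, 16% memory
EOF
# prints: Average usage: 40% CPU, 25% memory.
```

Keep in mind that an average like this can hide a hot node: hostname4 is at 98% CPU even though the cluster-wide figure is 40%.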

If I get some time I might make a few more modifications to allow more detailed stats. Also, there are a dozen or so other scripts and utilities you can run to get more detailed stats. But for my purposes, I am quite often setting up a small cluster, running a number of apps on it, then checking what kind of resource allocation pattern I'm getting. This helps tremendously in finding the optimal instance type on AWS, or whether I need more instances or could live with fewer.

Someday, hopefully, kubectl/Kubernetes will include a simpler way of finding this information. But for now, there are scripts like the one above!



Comments

Awesome! Thanks!
It's a shame there is no standard dashboard to see these things.
I've extended your script to print the absolute values too:

#!/bin/bash
#
# Monitor overall Kubernetes cluster utilization and capacity.
#
# Original source:
# https://github.com/kubernetes/kubernetes/issues/17512#issuecomment-367212930
#
# Tested with:
#   - AWS EKS v1.11.5
#
# Does not require any other dependencies to be installed in the cluster.
set -e
KUBECTL="kubectl"
NODES=$($KUBECTL get nodes --no-headers -o custom-columns=NAME:.metadata.name)
unitconvert(){
  sed '
      s/\([0-9][0-9]*\(\.[0-9]\+\)\?\)K/\1*1000/g;
      s/\([0-9][0-9]*\(\.[0-9]\+\)\?\)M/\1*1000000/g;
      s/\([0-9][0-9]*\(\.[0-9]\+\)\?\)G/\1*1000000000/g;
      s/\([0-9][0-9]*\(\.[0-9]\+\)\?\)T/\1*1000000000000/g;
      s/\([0-9][0-9]*\(\.[0-9]\+\)\?\)P/\1*1000000000000000/g;
      s/\([0-9][0-9]*\(\.[0-9]\+\)\?\)E/\1*1000000000000000000/g
  ' </dev/stdin | bc | sed 's/\..*$//' # Final sed to remove decimal point
}
function usage() {
  local node_count=0
  local total_percent_cpu=0
  local total_percent_mem=0
  local total_abs_cpu=0
  local total_abs_mem=0
  local -r nodes="$*"
  for n in $nodes; do
    local requests=$($KUBECTL describe node $n | grep -A3 -E "\\s\sRequests" | tail -n2)
    # echo "$requests"
    local abs_cpu=$(echo $requests | awk -F "[()% im]*" '{print $2}')
    local percent_cpu=$(echo $requests | awk -F "[()%]" '{print $2}')
    local abs_mem=$(echo $requests | awk -F "[()% i]*" '{print $7}' | unitconvert)
    local percent_mem=$(echo $requests | awk -F "[()%]" '{print $8}')
    echo "$n: ${abs_cpu}m ${percent_cpu}% CPU, $((abs_mem / 1000000))Mi ${percent_mem}% memory"
    node_count=$((node_count + 1))
    total_percent_cpu=$((total_percent_cpu + percent_cpu))
    total_percent_mem=$((total_percent_mem + percent_mem))
    total_abs_cpu=$((total_abs_cpu + abs_cpu))
    total_abs_mem=$((total_abs_mem + abs_mem))
  done
  local -r avg_percent_cpu=$((total_percent_cpu / node_count))
  local -r avg_percent_mem=$((total_percent_mem / node_count))
  local -r avg_abs_cpu=$((total_abs_cpu / node_count))
  local -r avg_abs_mem=$((total_abs_mem / node_count))
  echo "Average usage: ${avg_abs_cpu}m ${avg_percent_cpu}% CPU, $((avg_abs_mem / 1000000))Mi ${avg_percent_mem}% memory."
}
usage $NODES
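
One caveat on the memory numbers: as far as I can tell, the awk field separator in the script above strips the trailing "i" from values like 1244Mi, so unitconvert treats Ki/Mi/Gi as powers of 1000. That round-trips fine for Mi values (they are divided by 1000000 again before printing), but a node reporting Ki or Gi would come out slightly off. Here is a rough sketch of a converter that keeps Kubernetes' binary and decimal suffixes apart. `to_bytes` is a hypothetical name, and it only handles whole-number quantities up to T/Ti:

```shell
#!/bin/bash
# Sketch only: to_bytes is a hypothetical helper, not part of the script above.
# Converts an integer Kubernetes memory quantity (e.g. 1244Mi, 2Gi, 500M) to bytes.
to_bytes() {
  local qty=$1
  local num=${qty%%[!0-9]*}    # leading digits
  local suffix=${qty#"$num"}   # whatever follows the digits
  local factor
  if [ -z "$num" ]; then
    echo "unsupported quantity: $qty" >&2
    return 1
  fi
  case "$suffix" in
    "") factor=1 ;;
    k)  factor=1000 ;;
    M)  factor=$((1000 ** 2)) ;;
    G)  factor=$((1000 ** 3)) ;;
    T)  factor=$((1000 ** 4)) ;;
    Ki) factor=1024 ;;
    Mi) factor=$((1024 ** 2)) ;;
    Gi) factor=$((1024 ** 3)) ;;
    Ti) factor=$((1024 ** 4)) ;;
    *)  echo "unsupported quantity: $qty" >&2; return 1 ;;
  esac
  # 10# forces base 10 so leading zeros are not read as octal.
  echo $((10#$num * factor))
}

to_bytes 1244Mi   # prints 1304428544
to_bytes 500M     # prints 500000000
```

Dividing the result by 1048576 instead of 1000000 would then give a true Mi figure for the per-node and average lines.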
