Overview

CruiseKube operates as a closed-loop system through a set of periodic background tasks. Each task has a clearly defined responsibility and can be enabled or disabled independently.

  1. Create Stats Task: Builds persistent, workload-level CPU and memory statistics from Kubernetes state and Prometheus metrics. These stats form the foundation for all optimization decisions and are stored for reuse.
  2. Apply Recommendation Task: Generates and applies CPU and memory recommendations to workloads in a controlled, incremental manner. This is the core task responsible for actually right-sizing workloads.
  3. Fetch Metrics Task: Fetches metrics from the cluster and exposes them as Prometheus metrics.
  4. Node Load Monitoring Task: Monitors the CPU load on nodes and isolates nodes that are overloaded.

Together, these tasks allow CruiseKube to continuously optimize resources without relying on manual tuning or reactive scaling.

Components

flowchart LR
    %% Actor
    Human((Human))

    %% Kubernetes Cluster Boundary
    subgraph K8s[Kubernetes Cluster]
        direction LR

        %% Frontend
        Frontend[Frontend]

        %% Controller
        subgraph Controller
            direction TB
            Stats[Statistics Engine]
            Runtime[Runtime Optimizer]
        end

        %% API Server
        APIServer[kube-api-server]

        %% Webhook
        subgraph Webhook
            Admission[Admission Optimizer]
        end

        %% Data & Metrics
        Database[(Database)]
        Prometheus[Prometheus]
    end

    %% User Flow
    Human --> Frontend
    Frontend --> Controller

    %% Control Plane Flow
    Controller --> APIServer
    APIServer <--> Webhook

    %% Data Flow
    Controller --> Database
    Webhook --> Database
    Controller --> Prometheus

The high-level architecture consists of four components, deployed via the CruiseKube Helm chart:

  • Controller
    • Statistics Engine - Collects metrics from the cluster and stores them in the database
    • Runtime Optimizer - Optimizes the resources of workloads already running on the cluster
  • Webhook
    • Admission Optimizer - Intercepts new pod creations and optimizes the pod's resources before it is scheduled
  • Frontend
    • Observable interface for recommendations - Provides a user-friendly view of the recommendations and the potential savings
    • Per-workload user configuration - Lets the user set configurations such as priority and mode for each workload
    • Savings preview - Shows the potential savings once CruiseKube is enabled in Cruise mode
  • Database
    • Stores the statistics generated by the Statistics Engine
    • Stores the per-workload user configurations (priority, mode, etc.)

Statistics Engine

  • Continuously evaluates CPU and memory usage for each workload
  • Tracks instances of high CPU load and memory OOMs
  • Derives stable statistics (percentiles, headroom, variability)
  • Persists computed metrics in an internal datastore
  • Built on Prometheus as the primary metrics source

Runtime Optimizer

  • Implemented as a reconciliation loop in cruisekube-controller
  • Iteratively optimizes running workloads, one node at a time
  • Takes the priority of individual workloads into account to minimise disruption

Admission Optimizer

  • Implemented as a mutating admission webhook
  • Intercepts new pod creations
  • Rewrites resource requests using learned recommendations

OOM Observer & Processor

  • Monitors Kubernetes pod status and eviction events for OOM kills
  • Records OOM memory values in workload statistics
  • Triggers pod eviction when OOM events occur

Control Flows

Statistics Engine

sequenceDiagram
  %% CruiseKube Statistics Engine – Metrics Collection & Feature Building

  participant P as Prometheus
  participant T as Statistics Engine
  participant S as Kubernetes API Server
  participant DB as Database

  loop Every scrape/aggregation interval
    T->>S: List Pods, Nodes (metadata)
    T->>P: Query metrics (usage, throttling, pressure)
    T->>T: Calculate stats per namespace/workload/container
    T->>T: Compute aggregates (e.g., percentiles, peaks, trends)
    T->>DB: Persist container stats
  end
  1. Connects to the target cluster's Prometheus and Kubernetes API server
  2. Calculates statistics for CPU usage, CPU pressure, memory usage, OOM instances, etc.
  3. Stores the calculated statistics in the database
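The aggregation step in this loop can be sketched in Python. This is an illustrative model only: `ContainerStats` and `build_stats` are hypothetical names, not CruiseKube's actual schema, and the real engine computes these values over Prometheus query results rather than an in-memory sample list.

```python
# Sketch: derive percentile-style statistics for one container
# from raw CPU usage samples. Names and fields are illustrative.
from dataclasses import dataclass
from statistics import quantiles

@dataclass
class ContainerStats:
    p50_cpu: float   # typical (stable) demand
    p95_cpu: float   # spiky demand
    peak_cpu: float  # worst observed sample
    oom_count: int   # OOM kills seen in the window

def build_stats(cpu_samples, oom_count=0):
    # quantiles(n=100) yields the 1st..99th percentile cut points
    cuts = quantiles(cpu_samples, n=100)
    return ContainerStats(
        p50_cpu=cuts[49],
        p95_cpu=cuts[94],
        peak_cpu=max(cpu_samples),
        oom_count=oom_count,
    )
```

In this sketch the p50/p95 pair maps onto the "stable vs. spiky" split that the optimizers consume later.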

Runtime Optimizer Flow

sequenceDiagram
  %% CruiseKube Core Loop (simplified)
  participant C as CruiseKube Controller
  participant M as Database
%%   participant N as Node
  participant K as Kube API

  loop Every reconcile interval
    C->>M: Read usage + pressure signals (per pod/container)
    C->>K: Read node allocatable + pod placement
    %% C->>C: Partition pods (optimizable vs reserved)
    C->>C: Estimate StableDemand + SpikeDemand (PSI-adjusted if available)
    alt Fits within node capacity
      C->>C: Distribute SpikeDemand amongst pods
      C->>K: Patch pod resources in-place
    else Not feasible
      C->>C: Choose eviction candidates by priority
      C->>K: Evict pods until feasible
      C->>C: Distribute SpikeDemand amongst pods
      C->>K: Patch pod resources in-place
    end
  end
  1. Connects to the target cluster and iterates over its nodes
  2. Fetches workload statistics from the database
  3. Adjusts resources in-place for the pods on each node
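The feasibility branch of the loop above can be sketched as follows. This is a simplified model under assumed semantics (higher `priority` means more important, `stable` is a pod's StableDemand in cores, and one shared SpikeDemand buffer is distributed per node); the real reconciler operates on live Kubernetes objects via in-place resize and eviction.

```python
# Sketch: decide per-node pod requests, evicting lowest-priority
# pods when stable demand plus the spike buffer exceeds capacity.
def plan_node(pods, allocatable_cpu, spike_buffer):
    """pods: list of dicts with 'name', 'stable' (cores), 'priority'."""
    # Sort so the lowest-priority pods sit at the end of the list.
    survivors = sorted(pods, key=lambda p: p["priority"], reverse=True)
    evicted = []
    while survivors and (
        sum(p["stable"] for p in survivors) + spike_buffer > allocatable_cpu
    ):
        evicted.append(survivors.pop())  # evict lowest priority first
    # Distribute the spike buffer evenly amongst the remaining pods.
    share = spike_buffer / len(survivors) if survivors else 0.0
    requests = {p["name"]: p["stable"] + share for p in survivors}
    return requests, [p["name"] for p in evicted]
```

With two 1-core pods on a 2-core node and a 0.5-core spike buffer, the lower-priority pod is evicted and the survivor's request becomes 1.5 cores.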

Admission Optimizer Flow

sequenceDiagram
  %% CruiseKube Admission Webhook – Scheduling-Time Optimization

  participant U as User / Controller
  participant K as Kubernetes API Server
  participant W as CruiseKube Mutating Webhook
  participant M as Database
%%   participant S as Scheduler
%%   participant N as Node

  U->>K: Create Pod
  K->>W: AdmissionReview (PodSpec)
  W->>M: Fetch historical usage (for workload / container)
  W->>W: Estimate StableDemand + SpikeDemand
  W->>W: Compute initial pod requests = Stable + Spike
  W-->>K: Mutated PodSpec (updated requests/limits)
%%   K->>S: Schedule Pod
%%   S->>N: Bind Pod to Node
%%   N-->>K: Pod running
  1. Intercepts the pod spec
  2. Fetches statistics from the database
  3. Mutates requests before scheduling
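For illustration, a mutating admission webhook's response carries a base64-encoded JSONPatch against the incoming PodSpec. The sketch below builds such a patch for a single container's CPU request; `stable_cpu` and `spike_cpu` (in cores) are assumed inputs, not CruiseKube's actual fields.

```python
# Sketch: build the JSONPatch a mutating webhook would return to
# rewrite one container's CPU request to StableDemand + SpikeDemand.
import base64
import json

def build_patch(container_index, stable_cpu, spike_cpu):
    request = f"{int((stable_cpu + spike_cpu) * 1000)}m"  # cores -> millicores
    patch = [{
        "op": "replace",
        "path": f"/spec/containers/{container_index}/resources/requests/cpu",
        "value": request,
    }]
    # AdmissionReview responses carry the patch base64-encoded,
    # with patchType "JSONPatch".
    return base64.b64encode(json.dumps(patch).encode()).decode()
```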

Reactive OOM Handling Flow

sequenceDiagram
  %% CruiseKube OOM Handling – Reactive Memory Optimization

  participant K as Kubernetes API Server
  participant O as OOM Observer
  participant P as OOM Processor
  participant DB as Database
  participant W as Admission Webhook
  participant RC as ReplicaSet Controller

  Note over K,O: Container OOM Event Occurs
  K->>O: Pod Status Update (OOMKilled)
  O->>P: OOM Event Notification

  P->>DB: Check Cooldown Period
  alt Cooldown Active
    P-->>P: Skip (prevent thrashing)
  else Cooldown Expired
    P->>DB: Record OOM Event
    P->>DB: Update OOMMemory in Stats
    P->>K: Fetch Pod Details
    P->>P: Check Exclusions & Overrides
    alt Pod Should Be Evicted
      P->>K: Evict Pod
      K->>RC: Pod Deleted (replica missing)
      RC->>K: Create New Pod
      K->>W: AdmissionReview (new pod)
      W->>DB: Fetch Stats (with OOMMemory)
      W->>W: Calculate Memory Limit<br/>(2x max(usage, OOMMemory))
      W-->>K: Mutated Pod (higher memory)
      K->>K: Schedule & Start Pod
    else Skip Eviction
      P-->>P: Skip (excluded/disabled)
    end
  end
  1. Monitors pod status for OOM kill events
  2. Records the OOM memory value and updates workload statistics
  3. Evaluates the eviction decision against policies (see OOM Handling for the detailed decision flow)
  4. Evicts the pod and lets the ReplicaSet recreate it
  5. The admission webhook then applies the updated memory limits to the new pod
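Steps 2 and 5 can be sketched numerically. The `2x max(usage, OOMMemory)` rule and the cooldown check come from the flow above; the 600-second window and the function names are illustrative assumptions.

```python
# Sketch: reactive memory bump after an OOM kill, plus the cooldown
# check the OOM Processor uses to avoid eviction thrashing.
COOLDOWN_SECONDS = 600  # assumed window, not a CruiseKube default

def new_memory_limit(usage_bytes, oom_memory_bytes):
    # The admission webhook sets the limit to 2x max(usage, OOMMemory)
    return 2 * max(usage_bytes, oom_memory_bytes)

def cooldown_expired(last_oom_ts, now):
    # The OOM Processor skips events arriving within the cooldown
    return (now - last_oom_ts) >= COOLDOWN_SECONDS
```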

Next Steps

  • Get started with installation here