Kubernetes Monitoring
Complete observability for your Kubernetes clusters. Monitor nodes, pods, containers, and services with automatic discovery, Prometheus compatibility, and intelligent alerting.
Full-Stack Kubernetes Observability
From cluster health to individual container metrics, get complete visibility into your Kubernetes environment.
Cluster Level
Control plane health, API server latency, etcd performance, scheduler metrics
Node Level
CPU, memory, disk, network, kubelet status, node conditions
Pod Level
Pod lifecycle, restart counts, resource requests vs limits, readiness
Container Level
CPU throttling, memory usage, OOM events, container states
What We Monitor
Control Plane
Monitor the heart of your Kubernetes cluster for reliability and performance.
- API Server request latency
- etcd health and latency
- Scheduler queue depth
- Controller manager metrics
- Certificate expiration
Nodes & Kubelet
Track node health and kubelet performance across your cluster.
- Node CPU, memory, disk
- Kubelet health status
- Node conditions (Ready, DiskPressure, etc.)
- Pod capacity and allocation
- Container runtime metrics
Pods & Containers
Deep visibility into workload performance and resource consumption.
- CPU usage and throttling
- Memory usage and OOM kills
- Restart counts and crash loops
- Resource requests vs limits
- Container states and events
Services & Networking
Monitor service endpoints and network connectivity.
- Service endpoint health
- Ingress traffic and latency
- Network policy effectiveness
- DNS resolution times
- Service mesh metrics (Istio, Linkerd)
Workload Resources
Track Deployments, StatefulSets, DaemonSets, and Jobs.
- Deployment replica status
- Rolling update progress
- StatefulSet ordering
- DaemonSet coverage
- Job and CronJob completion
Persistent Storage
Monitor PersistentVolumes and storage performance.
- PV/PVC binding status
- Storage capacity usage
- I/O throughput and latency
- StorageClass provisioning
- Volume mount errors
Kubernetes-Native Features
Auto-Discovery
Automatically discover and monitor pods, services, and endpoints. No manual configuration needed as workloads scale.
Prometheus Compatible
Native PromQL support. Scrape existing Prometheus endpoints. Use your existing recording rules and alerts.
Label-Aware
Filter and aggregate by Kubernetes labels and annotations. Group metrics by namespace, deployment, or custom labels.
Resource Optimization
Right-size resource requests and limits based on actual usage. Identify over-provisioned and under-provisioned workloads.
Smart Alerting
Pre-configured alerts for common K8s issues: CrashLoopBackOff, pending pods, node NotReady, certificate expiry.
Multi-Cluster
Monitor multiple Kubernetes clusters from a single dashboard. Compare performance across environments.
Helm Deployment
Deploy the Bleemeo agent with a single Helm chart. GitOps-ready with full customization options.
OpenTelemetry
Ingest traces and metrics via OpenTelemetry. Correlate infrastructure metrics with application traces.
Quick Setup with Helm
Add Bleemeo Helm Repository
Add the official Bleemeo Helm chart repository to your Helm installation.
helm repo add bleemeo-agent https://packages.bleemeo.com/bleemeo-agent/helm-charts
helm repo update
Install the Agent
Deploy the Glouton agent as a DaemonSet with your account credentials.
helm upgrade --install glouton bleemeo-agent/glouton \
  --set account_id="your_account_id" \
  --set registration_key="your_registration_key" \
  --set config.kubernetes.clustername="my_k8s_cluster_name" \
  --set namespace="default"
View Your Cluster
Nodes, pods, and services appear automatically in your Bleemeo dashboard within seconds.
Pre-Built Kubernetes Alerts
Get notified about common Kubernetes issues before they impact your users.
Pod Issues
- CrashLoopBackOff detected
- Pod stuck in Pending
- High restart count
- OOMKilled containers
Node Issues
- Node NotReady
- High CPU/memory pressure
- Disk space low
- Too many pods scheduled
Cluster Issues
- API server errors
- etcd latency high
- Certificate expiring
- PVC pending
Workload Issues
- Deployment replicas unavailable
- StatefulSet not ready
- Job failed
- HPA at max replicas
Why Bleemeo for Kubernetes?
Real-Time Visibility
See pod creation, scaling events, and failures as they happen. No delay in metrics collection.
Cost Optimization
Identify resource waste and right-size your workloads. Reduce cloud spending without impacting performance.
Lightweight Agent
Glouton uses minimal resources: less than 100 MB of memory per node, so it won't compete with your workloads.
13-Month Retention
Keep historical data for capacity planning and trend analysis. Compare performance over time.
Want to go further?
Read the Documentation
Frequently Asked Questions
Everything you need to know about Bleemeo's Kubernetes monitoring
How do I deploy Bleemeo in my Kubernetes cluster?
Bleemeo deploys via Helm chart as a DaemonSet, placing one Glouton agent on each node. Simply add the Bleemeo Helm repository, then run helm upgrade --install with your account credentials and cluster name. The agent automatically discovers all pods and services. You can also deploy using plain kubectl with our provided manifests. GitOps tools like ArgoCD and Flux are fully supported.
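For GitOps workflows, the settings passed with --set can instead live in a values file kept under version control. The following is a minimal sketch, assuming the chart reads these keys from a values file the same way it reads them from --set (standard Helm behavior). The file name glouton-values.yaml is an arbitrary choice for illustration, and the credentials are placeholders.

```shell
# Write the same settings used with --set into a values file.
# The file name is a hypothetical choice for this sketch.
cat > glouton-values.yaml <<'EOF'
account_id: "your_account_id"
registration_key: "your_registration_key"
namespace: "default"
config:
  kubernetes:
    clustername: "my_k8s_cluster_name"
EOF

# Print the install command instead of running it, so the sketch
# works without cluster access. Remove "echo" to actually deploy.
echo helm upgrade --install glouton bleemeo-agent/glouton -f glouton-values.yaml
```

In a real repository, the registration key would normally come from a secret store rather than a committed file.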
What Kubernetes metrics does Bleemeo collect?
Bleemeo collects comprehensive metrics including: Pod metrics (counts by state, restart counts, CPU/memory usage vs requests/limits), Node metrics (CPU, memory, disk, network, kubelet status), Cluster metrics (node count, namespace count, API status), and Certificate expiration (CA and node certificates). Metrics are labeled by namespace, owner kind (Deployment, DaemonSet), and owner name for easy filtering.
Does Bleemeo auto-discover services in my pods?
Yes, automatic service discovery is a core feature. The Bleemeo agent detects all running services in your pods (databases, web servers, message queues, etc.) and starts monitoring them without manual configuration. It recognizes 100+ services out of the box. As pods scale up or down, monitoring follows automatically, so ephemeral workloads need no reconfiguration.
Can I scrape Prometheus metrics from my applications?
Yes, Bleemeo supports Prometheus-style scraping via pod annotations. Add prometheus.io/scrape: "true" to your pods, and optionally specify prometheus.io/path and prometheus.io/port for custom metrics endpoints. The agent automatically discovers and scrapes these endpoints. You can also use PromQL to query metrics in your dashboards.
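As an illustration, the annotations can be added to the pod template of an existing Deployment with a patch, since pods managed by a Deployment inherit their annotations from spec.template. This is a sketch under stated assumptions: the Deployment name my-app, port 8080, and path /metrics are hypothetical values, not defaults documented by Bleemeo.

```shell
# Build a patch that sets the scrape annotations on the pod template.
# Port and path are hypothetical; use the values your application exposes.
cat > prometheus-annotations-patch.yaml <<'EOF'
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "8080"
EOF

# Print the kubectl command rather than running it, so the sketch works
# without a cluster. Remove "echo" to apply it to the hypothetical
# Deployment "my-app".
echo kubectl patch deployment my-app --patch-file prometheus-annotations-patch.yaml
```

Patching the pod template triggers a rolling update, so the new pods start with the annotations in place.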
What are the resource requirements for the agent?
The Glouton agent is designed to be lightweight. It typically uses less than 100MB of memory and minimal CPU per node. The agent won't compete with your production workloads for resources. Resource requests and limits can be customized in the Helm values if needed. The agent is optimized for high-density environments with many pods per node.
Which Kubernetes distributions are supported?
Bleemeo works with all major Kubernetes distributions: Managed services (EKS, GKE, AKS, DigitalOcean Kubernetes), Self-managed (kubeadm, k3s, k0s, microk8s), and Enterprise distributions (OpenShift, Rancher, Tanzu). We support Kubernetes 1.19+. The agent adapts to different container runtimes including containerd, CRI-O, and Docker.
Can I monitor multiple Kubernetes clusters?
Yes, Bleemeo supports multi-cluster monitoring. Each cluster appears as a separate entity in your dashboard with its own name (configured via config.kubernetes.clustername). You can view all clusters in a unified dashboard, compare metrics across clusters, and drill down into individual cluster details. This is ideal for managing development, staging, and production environments.
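One way to set this up is to run the same Helm command once per cluster, changing only the kubeconfig context and the cluster name. The sketch below assumes three hypothetical kubeconfig contexts (dev, staging, production) and reuses each context name as the cluster name. It prints the commands instead of running them.

```shell
# install_cmd CONTEXT
# Prints the Helm command for one cluster. The context name doubles as
# the cluster name shown in the dashboard (an assumption of this sketch).
install_cmd() {
  echo helm upgrade --install glouton bleemeo-agent/glouton \
    --kube-context "$1" \
    --set account_id="your_account_id" \
    --set registration_key="your_registration_key" \
    --set config.kubernetes.clustername="$1"
}

# Hypothetical contexts, one per cluster.
for ctx in dev staging production; do
  install_cmd "$ctx"
done
```

Replacing echo with a direct invocation would perform the installs. Using a distinct clustername per cluster is what keeps them separate in the dashboard.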
What alerts are pre-configured for Kubernetes?
Bleemeo includes pre-built alerts for common Kubernetes issues: Pod issues (CrashLoopBackOff, pending pods, high restart counts, OOMKilled), Node issues (NotReady, disk/memory pressure), Cluster issues (API server errors, certificate expiring), and Workload issues (deployment replicas unavailable, failed jobs). You can customize thresholds or create additional alerts.
How do I track resource requests vs actual usage?
Bleemeo collects both resource requests/limits and actual usage for CPU and memory. Dashboards show the comparison between what pods requested and what they're actually using, helping you identify over-provisioned workloads (wasting resources) and under-provisioned ones (at risk of throttling or OOM). This enables effective right-sizing of your workloads.
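The comparison comes down to a simple ratio of usage to request. The sketch below works through it with made-up numbers. The helper usage_ratio is hypothetical and not part of any Bleemeo tooling.

```shell
# usage_ratio REQUESTED USED
# Prints actual usage as a percentage of the requested amount.
usage_ratio() {
  awk -v req="$1" -v use="$2" 'BEGIN { printf "%.0f%%\n", 100 * use / req }'
}

# A pod requesting 500m CPU but using 120m: likely over-provisioned.
usage_ratio 500 120   # prints 24%

# A pod requesting 256 MiB of memory and using 250 MiB: little headroom.
usage_ratio 256 250   # prints 98%
```

A persistently low ratio suggests the request can be reduced. A ratio near or above 100% suggests the workload is at risk of throttling or OOM kills.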
Does Bleemeo monitor container logs?
Yes, with log collection enabled, Glouton automatically captures logs from all containers in your Kubernetes cluster. Logs are collected from container stdout/stderr without additional configuration. You can apply custom parsers and filters using pod annotations (glouton.log_format, glouton.log_filter). Logs can be correlated with metrics for comprehensive troubleshooting.
Start Monitoring Your Kubernetes Clusters
Deploy in minutes. Get full visibility into your K8s infrastructure.