Key Concepts

This page introduces the fundamental concepts and terminology used throughout Bleemeo. Understanding these concepts will help you get the most out of the platform.

A metric is a numerical measurement collected at regular intervals. Metrics represent the state of your infrastructure at a specific point in time. Examples include CPU usage percentage, memory consumption in bytes, or the number of HTTP requests per second.

Bleemeo collects metrics at 10-second granularity, providing high-resolution data for real-time monitoring and troubleshooting.

A time series is a sequence of metric values recorded over time. Each data point consists of a timestamp and a value. Time series data allows you to:

  • Visualize trends and patterns
  • Detect anomalies
  • Set up alerting based on historical behavior
  • Perform capacity planning
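
As a rough illustration of the structure described above, the sketch below models a time series as an ordered list of (timestamp, value) points spaced 10 seconds apart. The sample values and the helper names `mean` and `peak` are hypothetical and are not part of any Bleemeo API.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sample: one minute of CPU usage (%) at 10-second resolution.
start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
values = [12.0, 15.5, 14.0, 71.0, 16.5, 13.0]

# A time series is an ordered list of (timestamp, value) data points.
series = [(start + timedelta(seconds=10 * i), v) for i, v in enumerate(values)]


def mean(points):
    """Average value over the series, a simple way to summarize a trend."""
    return sum(v for _, v in points) / len(points)


def peak(points):
    """Data point with the highest value, a crude hint at an anomaly."""
    return max(points, key=lambda p: p[1])
```

Here `peak(series)` returns the point recorded 30 seconds after `start`, where CPU usage spikes well above the rest of the samples.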

Labels (also called tags) are key-value pairs attached to metrics that provide additional context and allow filtering. For example, a CPU metric might have labels like:

  • hostname: the server name
  • cpu: the CPU core number
  • instance: the specific service instance

Labels enable you to query and aggregate metrics across your infrastructure.
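
To make filtering and aggregation by label concrete, here is a minimal sketch in plain Python. The sample data, the dictionary layout, and the helpers `select` and `avg_by` are assumptions made for illustration; they do not reflect how Bleemeo stores or queries metrics internally.

```python
# Hypothetical samples: each metric carries a name, a set of labels, and a value.
samples = [
    {"name": "cpu_used", "labels": {"hostname": "web-1", "cpu": "0"}, "value": 40.0},
    {"name": "cpu_used", "labels": {"hostname": "web-1", "cpu": "1"}, "value": 60.0},
    {"name": "cpu_used", "labels": {"hostname": "db-1", "cpu": "0"}, "value": 20.0},
]


def select(samples, name, **matchers):
    """Keep samples with the given name whose labels equal every matcher."""
    return [
        s for s in samples
        if s["name"] == name
        and all(s["labels"].get(k) == v for k, v in matchers.items())
    ]


def avg_by(samples, label):
    """Average the sample values, grouped by the value of one label."""
    groups = {}
    for s in samples:
        groups.setdefault(s["labels"][label], []).append(s["value"])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}
```

For example, `select(samples, "cpu_used", hostname="web-1")` keeps the two per-core samples of `web-1`, and `avg_by(samples, "hostname")` yields one average per server.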

Thresholds define the boundaries that determine when a metric value is considered normal, warning, or critical:

  • Warning threshold: The first alert level, indicating a potential issue
  • Critical threshold: A severe alert level requiring immediate attention

When a metric crosses a threshold, an incident is created and notifications can be triggered.
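
The classification described above can be sketched as a small function. This is a simplified illustration that assumes upper thresholds only and an inclusive comparison; the function name `classify` and the status strings are hypothetical.

```python
def classify(value, warning, critical):
    """Return the status of a value against upper thresholds.

    Assumes critical >= warning and that reaching a threshold counts
    as crossing it (both are assumptions of this sketch).
    """
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "ok"
```

With a warning threshold of 80 and a critical threshold of 90, a value of 85 is classified as `"warning"` and a value of 95 as `"critical"`.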

Glouton is the Bleemeo monitoring agent. It’s a lightweight, open-source agent that:

  • Collects system and application metrics
  • Automatically discovers services running on your systems
  • Sends data to Bleemeo Cloud in real time
  • Requires minimal configuration

The agent can run on Linux, Windows, Docker containers, or Kubernetes clusters (as a DaemonSet).

Agent facts are system properties automatically detected by the agent. These include:

Fact                      Description
glouton_version           Version of the installed agent
cpu_cores                 Number of CPU cores
memory                    Total system memory
os_pretty_name            Operating system name and version
hostname                  System hostname
kubernetes_cluster_name   Kubernetes cluster name (if applicable)

Facts help identify and categorize your infrastructure in dashboards and alerts.

The Bleemeo agent automatically discovers services running on your systems. When a service is detected (e.g., Apache, MySQL, Redis), the agent:

  1. Creates a tag with the service name
  2. Starts collecting service-specific metrics
  3. Sets up health checks for the service
  4. Applies default thresholds

Over 100 services are supported out of the box, requiring no manual configuration.

Each discovered service has a status that indicates its health:

Status Code   Meaning
0             Check passed successfully
1             Check passed with a warning (e.g., HTTP 404 response)
2             Check detected an issue with the service
3             Check status unknown (timeout or check failure)

Service status is stored in the service_status metric and displayed in status dashboards.
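
When processing service_status values in a script, it can help to give the codes readable names. The sketch below does so with an enum; the member names (OK, WARNING, CRITICAL, UNKNOWN) are labels chosen for this example rather than names defined by Bleemeo, and `describe` is a hypothetical helper.

```python
from enum import IntEnum


class ServiceStatus(IntEnum):
    """Status codes from the table above; member names are illustrative."""
    OK = 0        # check passed successfully
    WARNING = 1   # check passed with a warning
    CRITICAL = 2  # check detected an issue with the service
    UNKNOWN = 3   # timeout or check failure


def describe(code):
    """Turn a raw service_status value into a readable name."""
    return ServiceStatus(int(code)).name.lower()
```

A raw value of `2` read from the metric maps to `"critical"`, and an out-of-range value raises `ValueError` instead of being silently accepted.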

Monitors are endpoints checked for availability from external locations. They verify that your services are accessible from the internet. Monitors can check:

  • HTTP/HTTPS endpoints
  • TCP ports
  • DNS resolution
  • SSL certificate expiration

Bleemeo provides public probes in multiple locations worldwide (Frankfurt, Milan, Ohio, Paris, Singapore, Spain, Stockholm) to check your services from different regions.

You can also use private probes (Glouton agents) to monitor internal services not accessible from the public internet.

An event is a record of something that happened in your infrastructure, such as a metric crossing a threshold or a service becoming unavailable.

An incident is an ongoing problem that requires attention. Incidents are created when issues are detected and resolved when the situation returns to normal.
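
The incident lifecycle can be illustrated with a short sketch that walks a sequence of status observations, opening an incident when the status leaves "ok" and resolving it when the status returns to "ok". The function name, the tuple layout, and the status strings are assumptions for this example only.

```python
def incidents_from_statuses(points):
    """Derive incidents from (timestamp, status) pairs, oldest first.

    Returns a list of (opened_at, resolved_at) tuples; resolved_at is
    None for an incident that is still ongoing.
    """
    incidents = []
    open_since = None
    for ts, status in points:
        if status != "ok" and open_since is None:
            open_since = ts  # the problem starts: open an incident
        elif status == "ok" and open_since is not None:
            incidents.append((open_since, ts))  # back to normal: resolve it
            open_since = None
    if open_since is not None:
        incidents.append((open_since, None))  # still unresolved at the end
    return incidents
```

Consecutive non-ok observations extend the same incident instead of opening a new one, which matches the idea of an incident as a single ongoing problem.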

Notifications are alerts sent to users when incidents occur. Bleemeo supports multiple notification channels:

  • Email
  • SMS
  • Slack
  • Microsoft Teams
  • PagerDuty
  • OpsGenie
  • Webhooks

A contact group is a collection of notification targets that can be reused across multiple alerting rules. For example, you might have:

  • An “On-Call Team” group for critical alerts
  • A “Development” group for application-specific notifications

Server groups allow you to organize agents and apply shared settings. Each agent belongs to at least one group; unless you assign it elsewhere, this is the “Default” group. Server groups let you:

  • Apply different thresholds to different environments
  • Organize infrastructure by function or team
  • Configure group-specific alerting rules

A silence temporarily suppresses notifications for specific metrics or services. Use silences during:

  • Planned maintenance windows
  • Known issues being worked on
  • Testing or deployment periods
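
Conceptually, a silence is a time window attached to a target. The sketch below checks whether a notification would be suppressed at a given moment; the dictionary fields (`target`, `start`, `end`) and the function `is_silenced` are hypothetical and only illustrate the idea.

```python
from datetime import datetime, timezone

# Hypothetical silence covering a one-hour maintenance window for "mysql".
silences = [
    {
        "target": "mysql",
        "start": datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc),
        "end": datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc),
    },
]


def is_silenced(now, silences, target):
    """True if any silence for this target covers `now` (end is exclusive)."""
    return any(
        s["target"] == target and s["start"] <= now < s["end"]
        for s in silences
    )
```

A problem detected on `mysql` at 02:30 UTC falls inside the window and would not notify anyone, while the same problem at 03:00 UTC would.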

Recording rules are PromQL-based expressions that generate new metrics from existing ones. They allow you to:

  • Pre-compute complex queries for faster dashboards
  • Create custom aggregations
  • Set thresholds on derived metrics

Recording rules are available on Starter and Professional plans.

Bleemeo provides several types of dashboards:

The agent list displays all connected agents and their current status. Use it to get an overview of your entire infrastructure.

The status dashboard shows current issues at a glance, including:

  • Services with problems
  • Metrics exceeding thresholds
  • Recent incidents

Custom dashboards are fully customizable. In them you can:

  • Add any metric from any source
  • Create charts, graphs, and tables
  • Use PromQL queries for advanced visualizations

Pre-built dashboard templates cover common infrastructure components. They are applied automatically when services are discovered.

Annotations mark specific time ranges on dashboards to correlate events with metric changes. Use them to mark deployments, incidents, or maintenance windows.

Bleemeo offers several plans to match your monitoring needs:

Plan           Description
Free           Up to 3 agents, 5 monitors, basic features
Starter        Mid-tier with extended limits, 200 metrics per agent
Professional   Full features, 2000 metrics per agent, container monitoring included

Bleemeo billing is based on:

  • Agents: Number of monitored servers or devices
  • Containers: Containerized services (Professional plan includes 20 simultaneous containers per agent)
  • Monitors: Number of availability checks
  • Metrics: Total unique metrics collected (evaluated hourly)

New accounts receive a 15-day free trial with access to all Professional features. No credit card is required to start a trial.

PromQL (Prometheus Query Language) is the query language used to retrieve and manipulate metrics in Bleemeo. It’s compatible with Prometheus and allows you to:

  • Filter metrics by labels
  • Aggregate data across dimensions
  • Perform mathematical operations
  • Create complex alerting conditions

Example queries:

# Average CPU usage across all servers
avg(cpu_used)

# Memory usage for a specific server
mem_used{hostname="web-server-1"}

# Per-second request rate, averaged over the last 5 minutes
rate(http_requests_total[5m])
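
To give an intuition for what a rate query computes, the sketch below derives a per-second rate from raw counter samples. It is a deliberate simplification: it handles counter resets but omits the window-boundary extrapolation that Prometheus applies, so its results can differ slightly from a real `rate()` call. The function name `simple_rate` is hypothetical.

```python
def simple_rate(points):
    """Per-second increase of a counter over a window.

    points: list of (unix_seconds, counter_value), oldest first, with
    strictly increasing timestamps. Returns None if fewer than two
    samples are available.
    """
    if len(points) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(points, points[1:]):
        # A drop means the counter was reset; count from zero again.
        increase += cur - prev if cur >= prev else cur
    elapsed = points[-1][0] - points[0][0]
    return increase / elapsed
```

A counter going from 100 to 200 over 20 seconds gives `simple_rate([(0, 100), (10, 150), (20, 200)])`, which evaluates to 5.0 requests per second.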

Applications are logical groupings of related services. By tagging services with an application name, you can:

  • View application-level dashboards
  • Set up application-specific alerts
  • Track dependencies between services

For example, an e-commerce application might include web servers, databases, cache servers, and message queues, all grouped under a single application view.

Now that you understand the key concepts, you can: