Key Concepts

This page introduces the fundamental concepts and terminology used throughout Bleemeo. Understanding these concepts will help you get the most out of the platform.

General Monitoring Concepts

Metrics

A metric is a numerical measurement collected at regular intervals. Metrics represent the state of your infrastructure at a specific point in time. Examples include CPU usage percentage, memory consumption in bytes, or the number of HTTP requests per second.

Bleemeo collects metrics at 10-second granularity, providing high-resolution data for real-time monitoring and troubleshooting.

Time Series

A time series is a sequence of metric values recorded over time. Each data point consists of a timestamp and a value. Time series data allows you to:

Visualize trends and patterns
Detect anomalies
Set up alerting based on historical behavior
Perform capacity planning
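As a rough illustration of the timestamp-plus-value structure, the Python sketch below models a time series sampled every 10 seconds and flags an obvious outlier. The `DataPoint` class, the sample values, and the `find_spikes` heuristic are all hypothetical; this is not how Bleemeo stores data or detects anomalies.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class DataPoint:
    timestamp: int  # Unix time in seconds
    value: float


# Hypothetical CPU usage samples (percent), one every 10 seconds.
cpu_series = [
    DataPoint(1_700_000_000 + 10 * i, v)
    for i, v in enumerate([12.0, 14.0, 13.0, 15.0, 90.0, 14.0])
]


def find_spikes(series, factor=3.0):
    """Return points whose value exceeds `factor` times the mean of the other points."""
    spikes = []
    for i, point in enumerate(series):
        others = [p.value for j, p in enumerate(series) if j != i]
        if others and point.value > factor * mean(others):
            spikes.append(point)
    return spikes


spikes = find_spikes(cpu_series)
```

Here only the 90.0 sample is flagged, because it is far above the average of its neighbours.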

Labels and Tags

Labels (also called tags) are key-value pairs attached to metrics that provide additional context and allow filtering. For example, a CPU metric might have labels like:

hostname: the server name
cpu: the CPU core number
instance: the specific service instance

Labels enable you to query and aggregate metrics across your infrastructure.
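To make filtering and aggregation concrete, here is a minimal Python sketch that selects samples by label and averages them per host. The sample data, the dictionary layout, and the helper names `select` and `average_by` are assumptions made for illustration; in Bleemeo you would do this with a query rather than application code.

```python
from collections import defaultdict

# Hypothetical samples: each metric value carries a dict of labels.
samples = [
    {"name": "cpu_used", "labels": {"hostname": "web-1", "cpu": "0"}, "value": 40.0},
    {"name": "cpu_used", "labels": {"hostname": "web-1", "cpu": "1"}, "value": 60.0},
    {"name": "cpu_used", "labels": {"hostname": "db-1", "cpu": "0"}, "value": 20.0},
]


def select(samples, name, **matchers):
    """Keep samples with the given metric name whose labels equal every matcher."""
    return [
        s
        for s in samples
        if s["name"] == name
        and all(s["labels"].get(key) == want for key, want in matchers.items())
    ]


def average_by(samples, label):
    """Average sample values grouped by the value of one label."""
    groups = defaultdict(list)
    for s in samples:
        groups[s["labels"].get(label)].append(s["value"])
    return {key: sum(values) / len(values) for key, values in groups.items()}


web_cores = select(samples, "cpu_used", hostname="web-1")
per_host = average_by(select(samples, "cpu_used"), "hostname")
```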

Thresholds

Thresholds define the boundaries that determine when a metric value is considered normal, warning, or critical:

Warning threshold: The first alert level, indicating a potential issue
Critical threshold: A severe alert level requiring immediate attention

When a metric crosses a threshold, an incident is created and notifications can be triggered.
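The three states can be sketched as a small Python function. This assumes a metric where higher values are worse and treats a value equal to a threshold as having crossed it; both choices, along with the function name and the example numbers, are illustrative assumptions rather than documented Bleemeo behavior.

```python
def classify(value, warning, critical):
    """Classify a value against upper thresholds (higher is worse)."""
    if critical < warning:
        raise ValueError("critical threshold must not be below the warning threshold")
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "normal"


# Example: disk usage in percent with a warning at 80 and a critical at 90.
states = [classify(v, warning=80, critical=90) for v in (50, 85, 95)]
```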

Bleemeo-Specific Concepts

Glouton Agent

Glouton is the Bleemeo monitoring agent. It’s a lightweight, open-source agent that:

Collects system and application metrics
Automatically discovers services running on your systems
Sends data to Bleemeo Cloud in real time
Requires minimal configuration

The agent can run on Linux, Windows, Docker containers, or Kubernetes clusters (as a DaemonSet).

Agent Facts

Agent facts are system properties automatically detected by the agent. These include:

Fact	Description
`glouton_version`	Version of the installed agent
`cpu_cores`	Number of CPU cores
`memory`	Total system memory
`os_pretty_name`	Operating system name and version
`hostname`	System hostname
`kubernetes_cluster_name`	Kubernetes cluster name (if applicable)

Facts help identify and categorize your infrastructure in dashboards and alerts.

Service Discovery

The Bleemeo agent automatically discovers services running on your systems. When a service is detected (e.g., Apache, MySQL, Redis), the agent:

Creates a tag with the service name
Starts collecting service-specific metrics
Sets up health checks for the service
Applies default thresholds

Over 100 services are supported out of the box, requiring no manual configuration.

Service Status

Each discovered service has a status that indicates its health:

Status Code	Meaning
0	Check passed successfully
1	Check passed with a warning (e.g., HTTP 404 response)
2	Check detected an issue with the service
3	Check status unknown (timeout or check failure)

Service status is stored in the `service_status` metric and displayed in status dashboards.
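If you process status values in your own tooling, the codes in the table can be mapped to readable names. The Python sketch below uses the member names `OK`, `WARNING`, `CRITICAL`, and `UNKNOWN`, which are an assumption borrowed from common monitoring conventions; only the numeric codes and their meanings come from the table.

```python
from enum import IntEnum


class ServiceStatus(IntEnum):
    # Member names are illustrative; the numeric codes follow the table.
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


DESCRIPTIONS = {
    ServiceStatus.OK: "check passed successfully",
    ServiceStatus.WARNING: "check passed with a warning",
    ServiceStatus.CRITICAL: "check detected an issue with the service",
    ServiceStatus.UNKNOWN: "check status unknown",
}


def describe(code):
    """Translate a numeric status code into a readable string."""
    try:
        status = ServiceStatus(code)
    except ValueError:
        return f"unrecognized status code {code}"
    return f"{status.name}: {DESCRIPTIONS[status]}"
```

For instance, `describe(2)` yields a string starting with `CRITICAL`, while a code outside 0 to 3 is reported as unrecognized instead of raising.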

Monitors

Monitors are endpoints checked for availability from external locations. They verify that your services are accessible from the internet. Monitors can check:

HTTP/HTTPS endpoints
TCP ports
DNS resolution
SSL certificate expiration

Bleemeo provides public probes in multiple locations worldwide (Frankfurt, Milan, Ohio, Paris, Singapore, Spain, Stockholm) to check your services from different regions.

You can also use private probes (Glouton agents) to monitor internal services not accessible from the public internet.

Alerting Concepts

Events and Incidents

An event is a record of something that happened in your infrastructure, such as a metric crossing a threshold or a service becoming unavailable.

An incident is an ongoing problem that requires attention. Incidents are created when issues are detected and resolved when the situation returns to normal.
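The relationship between individual status changes and incidents can be sketched as follows. This Python example is a simplified model with hypothetical names and data: an incident opens on the first non-"ok" status and is resolved at the next "ok". It ignores details such as grace periods or severity changes that a real platform may apply.

```python
def incidents_from_statuses(samples):
    """Turn time-ordered (timestamp, status) pairs into (opened_at, resolved_at) incidents.

    An incident that is still ongoing has resolved_at set to None.
    """
    incidents = []
    opened_at = None
    for timestamp, status in samples:
        if status != "ok" and opened_at is None:
            opened_at = timestamp  # first bad sample opens an incident
        elif status == "ok" and opened_at is not None:
            incidents.append((opened_at, timestamp))  # back to normal resolves it
            opened_at = None
    if opened_at is not None:
        incidents.append((opened_at, None))
    return incidents


history = [(0, "ok"), (10, "critical"), (20, "critical"), (30, "ok"), (40, "warning")]
incidents = incidents_from_statuses(history)
```

Note that the two consecutive "critical" samples belong to one incident rather than two, which is the distinction between an ongoing incident and the events that make it up.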

Notifications

Notifications are alerts sent to users when incidents occur. Bleemeo supports multiple notification channels:

Email
SMS
Slack
Microsoft Teams
PagerDuty
OpsGenie
Webhooks

Contact Groups

A contact group is a collection of notification targets that can be reused across multiple alerting rules. For example, you might have:

An “On-Call Team” group for critical alerts
A “Development” group for application-specific notifications

Server Groups

Server groups allow you to organize agents and apply shared settings. Each agent belongs to at least one group (the “Default” group). Server groups let you:

Apply different thresholds to different environments
Organize infrastructure by function or team
Configure group-specific alerting rules

Silence

A silence temporarily suppresses notifications for specific metrics or services. Use silences during:

Planned maintenance windows
Known issues being worked on
Testing or deployment periods
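Conceptually, a silence is a time window attached to a target. The Python sketch below shows one way to decide whether a notification should go out; the `Silence` class, its fields, and the half-open window (start included, end excluded) are assumptions for illustration, not Bleemeo's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Silence:
    target: str      # hypothetical identifier of the silenced metric or service
    start: datetime
    end: datetime

    def covers(self, target, when):
        """True if this silence applies to `target` at time `when`."""
        return self.target == target and self.start <= when < self.end


def should_notify(target, when, silences):
    """Send a notification only if no silence covers the target right now."""
    return not any(s.covers(target, when) for s in silences)


maintenance = Silence(
    target="web-1",
    start=datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc),
    end=datetime(2024, 1, 1, 4, 0, tzinfo=timezone.utc),
)
```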

Recording Rules

Recording rules are PromQL-based expressions that generate new metrics from existing ones. They allow you to:

Pre-compute complex queries for faster dashboards
Create custom aggregations
Set thresholds on derived metrics

Recording rules are available on Starter and Professional plans.
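The idea of deriving a new metric from existing ones can be illustrated outside PromQL. The Python sketch below combines two hypothetical series, `mem_used` and `mem_total`, into a ratio at each shared timestamp. The series names, values, and the `record` helper are assumptions; an actual recording rule is written as a PromQL expression and evaluated by the platform.

```python
def record(series_a, series_b, combine):
    """Build a new series by combining two series at their shared timestamps."""
    b_by_timestamp = dict(series_b)
    return [
        (timestamp, combine(a_value, b_by_timestamp[timestamp]))
        for timestamp, a_value in series_a
        if timestamp in b_by_timestamp
    ]


# Hypothetical memory series in bytes, as (timestamp, value) pairs.
mem_used = [(0, 2.0e9), (10, 3.0e9)]
mem_total = [(0, 8.0e9), (10, 8.0e9)]

# The derived series is computed once and can then be graphed or thresholded directly.
mem_used_ratio = record(mem_used, mem_total, lambda used, total: used / total)
```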

Dashboard Concepts

Bleemeo provides several types of dashboards:

Agent Dashboard

Displays all connected agents and their current status. Use it to get an overview of your entire infrastructure.

Status Dashboard

Shows current issues at a glance, including:

Services with problems
Metrics exceeding thresholds
Recent incidents

Custom Dashboards

Fully customizable dashboards where you can:

Add any metric from any source
Create charts, graphs, and tables
Use PromQL queries for advanced visualizations

Dashboard Templates

Pre-built dashboard templates for common infrastructure components. Templates are automatically applied when services are discovered.

Annotations

Annotations mark specific time ranges on dashboards to correlate events with metric changes. Use them to mark deployments, incidents, or maintenance windows.

Plans and Billing

Plan Types

Bleemeo offers several plans to match your monitoring needs:

Plan	Description
Free	Up to 3 agents, 5 monitors, basic features
Starter	Mid-tier with extended limits, 200 metrics per agent
Professional	Full features, 2000 metrics per agent, container monitoring included
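As a quick way to reason about the per-agent metric limits in the table, the Python sketch below lists which plans could accommodate a given metric count. It uses only the two limits stated above; the Free plan is left out because its per-agent limit is not listed here, and treating a count equal to the limit as acceptable is an assumption.

```python
# Per-agent metric limits taken from the plan table; Free is omitted because
# the table does not state its limit.
METRICS_PER_AGENT = {"Starter": 200, "Professional": 2000}


def plans_that_fit(metric_count):
    """Return the plans whose per-agent metric limit covers `metric_count`."""
    return [
        plan for plan, limit in METRICS_PER_AGENT.items() if metric_count <= limit
    ]
```

For example, an agent reporting 500 metrics would exceed the Starter limit but fit within Professional.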

Billing Units

Bleemeo billing is based on:

Agents: Number of monitored servers or devices
Containers: Containerized services (Professional plan includes 20 simultaneous containers per agent)
Monitors: Number of availability checks
Metrics: Total unique metrics collected (evaluated hourly)

Trial

New accounts receive a 15-day free trial with access to all Professional features. No credit card is required to start a trial.

PromQL

PromQL (Prometheus Query Language) is the query language used to retrieve and manipulate metrics in Bleemeo. It’s compatible with Prometheus and allows you to:

Filter metrics by labels
Aggregate data across dimensions
Perform mathematical operations
Create complex alerting conditions

Example queries:

# Average CPU usage across all servers
avg(cpu_used)

# Memory usage for a specific server
mem_used{hostname="web-server-1"}

# Per-second request rate, averaged over the last 5 minutes
rate(http_requests_total[5m])
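To give an intuition for what `rate()` returns, the Python sketch below computes the per-second increase of a counter over a trailing window. It is a deliberately simplified approximation: Prometheus-compatible `rate()` also compensates for counter resets and extrapolates to the window boundaries, which this sketch does not. The function name and sample data are hypothetical.

```python
def simple_rate(points, window_seconds, now):
    """Approximate the per-second increase of a counter over a trailing window.

    `points` is a list of (timestamp, counter_value) pairs sorted by timestamp.
    Returns None when fewer than two samples fall inside the window.
    """
    in_window = [(t, v) for t, v in points if now - window_seconds <= t <= now]
    if len(in_window) < 2:
        return None
    (t_first, v_first), (t_last, v_last) = in_window[0], in_window[-1]
    if t_last == t_first:
        return None
    return (v_last - v_first) / (t_last - t_first)


# Hypothetical request counter: 300 more requests over 300 seconds.
requests_total = [(0, 100), (60, 160), (120, 220), (300, 400)]
per_second = simple_rate(requests_total, window_seconds=300, now=300)
```

With these samples the result is one request per second, which is the kind of value a 5-minute `rate()` over a steadily increasing counter would show.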

Applications

Applications are logical groupings of related services. By tagging services with an application name, you can:

View application-level dashboards
Set up application-specific alerts
Track dependencies between services

For example, an e-commerce application might include web servers, databases, cache servers, and message queues, all grouped under a single application view.

Next Steps

Now that you understand the key concepts, you can:

Install the Bleemeo agent on your first server
Configure service monitoring for your applications
Set up notifications to stay informed about issues
Create custom dashboards to visualize your metrics