Key Concepts
This page introduces the fundamental concepts and terminology used throughout Bleemeo. Understanding these concepts will help you get the most out of the platform.
General Monitoring Concepts
Metrics
A metric is a numerical measurement collected at regular intervals. Metrics represent the state of your infrastructure at a specific point in time. Examples include CPU usage percentage, memory consumption in bytes, or the number of HTTP requests per second.
Bleemeo collects metrics at 10-second granularity, providing high-resolution data for real-time monitoring and troubleshooting.
Time Series
A time series is a sequence of metric values recorded over time. Each data point consists of a timestamp and a value. Time series data allows you to:
- Visualize trends and patterns
- Detect anomalies
- Set up alerting based on historical behavior
- Perform capacity planning
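As a rough illustration, a time series can be modeled as an ordered list of timestamp and value pairs. The sketch below is not Bleemeo's storage format: the `Point` type, the `build_series` helper, and the sample values are assumptions made up for this example. It only shows points spaced 10 seconds apart and a simple average computed over them.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Point:
    timestamp: int  # Unix time in seconds
    value: float    # metric value at that instant


RESOLUTION_SECONDS = 10  # collection interval described in the Metrics section


def build_series(start: int, values: list[float]) -> list[Point]:
    """Attach evenly spaced timestamps to a list of sampled values."""
    return [Point(start + i * RESOLUTION_SECONDS, v) for i, v in enumerate(values)]


# Hypothetical CPU usage samples (percent), one every 10 seconds.
cpu_used = build_series(1_700_000_000, [12.0, 15.5, 14.0, 80.0])

average = mean(p.value for p in cpu_used)
span = cpu_used[-1].timestamp - cpu_used[0].timestamp
print(f"{len(cpu_used)} points over {span}s, average {average:.1f}%")
```

Keeping the timestamp next to each value is what makes trend charts and comparisons against historical behavior possible.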
Labels and Tags
Labels (also called tags) are key-value pairs attached to metrics that provide additional context and allow filtering. For example, a CPU metric might have labels like:
- hostname: the server name
- cpu: the CPU core number
- instance: the specific service instance
Labels enable you to query and aggregate metrics across your infrastructure.
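To make filtering and aggregation by label more concrete, here is a minimal sketch in plain Python. It does not use any Bleemeo API: the `samples` structure, the `select` and `average_by` helpers, and the values are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical samples: each one is a value plus its labels.
samples = [
    {"labels": {"hostname": "web-server-1", "cpu": "0"}, "value": 40.0},
    {"labels": {"hostname": "web-server-1", "cpu": "1"}, "value": 60.0},
    {"labels": {"hostname": "db-server-1", "cpu": "0"}, "value": 20.0},
]


def select(samples, **matchers):
    """Keep the samples whose labels equal every given key-value pair."""
    return [
        s for s in samples
        if all(s["labels"].get(k) == v for k, v in matchers.items())
    ]


def average_by(samples, label):
    """Group samples by one label and average the values in each group."""
    groups = {}
    for s in samples:
        groups.setdefault(s["labels"].get(label), []).append(s["value"])
    return {key: mean(vals) for key, vals in groups.items()}


web = select(samples, hostname="web-server-1")  # filter on one label
per_host = average_by(samples, "hostname")      # aggregate across CPU cores
print(len(web), per_host)
```

The same two operations, selecting by label and aggregating across a label, are what a query language such as PromQL expresses in a single line.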
Thresholds
Thresholds define the boundaries that determine when a metric value is considered normal, warning, or critical:
- Warning threshold: The first alert level, indicating a potential issue
- Critical threshold: A severe alert level requiring immediate attention
When a metric crosses a threshold, an incident is created and notifications can be triggered.
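The classification described above can be sketched as a small function. This is an illustration under stated assumptions, not Bleemeo's evaluation logic: it assumes a metric where higher values are worse, an inclusive comparison, and hypothetical threshold values of 80 and 90.

```python
from typing import Optional


def classify(value: float, warning: Optional[float], critical: Optional[float]) -> str:
    """Return "critical", "warning", or "ok" for a metric where high is bad."""
    # Check the most severe level first so it wins when both are crossed.
    if critical is not None and value >= critical:
        return "critical"
    if warning is not None and value >= warning:
        return "warning"
    return "ok"


# Hypothetical thresholds for a CPU usage percentage.
for sample in (50.0, 85.0, 95.0):
    print(sample, classify(sample, warning=80.0, critical=90.0))
```

A threshold set to `None` is simply skipped, so a metric with no thresholds always classifies as "ok" in this sketch.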
Bleemeo-Specific Concepts
Glouton Agent
Glouton is the Bleemeo monitoring agent. It’s a lightweight, open-source agent that:
- Collects system and application metrics
- Automatically discovers services running on your systems
- Sends data to Bleemeo Cloud in real-time
- Requires minimal configuration
The agent can run on Linux, Windows, Docker containers, or Kubernetes clusters (as a DaemonSet).
Agent Facts
Agent facts are system properties automatically detected by the agent. These include:
| Fact | Description |
|---|---|
| glouton_version | Version of the installed agent |
| cpu_cores | Number of CPU cores |
| memory | Total system memory |
| os_pretty_name | Operating system name and version |
| hostname | System hostname |
| kubernetes_cluster_name | Kubernetes cluster name (if applicable) |
Facts help identify and categorize your infrastructure in dashboards and alerts.
Service Discovery
The Bleemeo agent automatically discovers services running on your systems. When a service is detected (e.g., Apache, MySQL, Redis), the agent:
- Creates a tag with the service name
- Starts collecting service-specific metrics
- Sets up health checks for the service
- Applies default thresholds
Over 100 services are supported out of the box, requiring no manual configuration.
Service Status
Each discovered service has a status that indicates its health:
| Status Code | Meaning |
|---|---|
| 0 | Check passed successfully |
| 1 | Check passed with a warning (e.g., HTTP 404 response) |
| 2 | Check detected an issue with the service |
| 3 | Check status unknown (timeout or check failure) |
Service status is stored in the service_status metric and displayed in status dashboards.
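When processing raw service_status values in your own scripts, it can help to map the codes from the table to readable names. The sketch below is one possible mapping: the member names `OK`, `WARNING`, `CRITICAL`, and `UNKNOWN` are labels chosen for this example and do not come from the table, and `describe` is a hypothetical helper.

```python
from enum import IntEnum


class ServiceStatus(IntEnum):
    OK = 0        # check passed successfully
    WARNING = 1   # check passed with a warning
    CRITICAL = 2  # check detected an issue with the service
    UNKNOWN = 3   # status unknown (timeout or check failure)


def describe(code: int) -> str:
    """Translate a raw status code into a readable name."""
    try:
        return ServiceStatus(code).name.lower()
    except ValueError:
        # Codes outside 0-3 are not listed in the table above.
        return f"unrecognized status code {code}"


for code in (0, 1, 2, 3):
    print(code, describe(code))
```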
Monitors
Monitors are endpoints checked for availability from external locations. They verify that your services are accessible from the internet. Monitors can check:
- HTTP/HTTPS endpoints
- TCP ports
- DNS resolution
- SSL certificate expiration
Bleemeo provides public probes in multiple locations worldwide (Frankfurt, Milan, Ohio, Paris, Singapore, Spain, Stockholm) to check your services from different regions.
You can also use private probes (Glouton agents) to monitor internal services not accessible from the public internet.
Alerting Concepts
Events and Incidents
An event is a record of something that happened in your infrastructure, such as a metric crossing a threshold or a service becoming unavailable.
An incident is an ongoing problem that requires attention. Incidents are created when issues are detected and resolved when the situation returns to normal.
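The relationship between events and incidents can be sketched as a small fold over a stream of events. This is a simplified model, not Bleemeo's implementation: the `Incident` type, the `(timestamp, is_problem)` event shape, and the sample data are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    opened_at: int
    resolved_at: Optional[int] = None  # None while the incident is ongoing


def incidents_from_events(events):
    """Fold (timestamp, is_problem) events into a list of incidents."""
    incidents = []
    current = None
    for timestamp, is_problem in events:
        if is_problem and current is None:
            current = Incident(opened_at=timestamp)  # issue detected
            incidents.append(current)
        elif not is_problem and current is not None:
            current.resolved_at = timestamp          # back to normal
            current = None
    return incidents


# Hypothetical events: a problem from t=10 to t=30, then a new one at t=40.
events = [(0, False), (10, True), (20, True), (30, False), (40, True)]
result = incidents_from_events(events)
print(result)
```

Repeated problem events while an incident is already open do not create a second incident, which is why an incident describes an ongoing condition rather than a single occurrence.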
Notifications
Notifications are alerts sent to users when incidents occur. Bleemeo supports multiple notification channels:
- SMS
- Slack
- Microsoft Teams
- PagerDuty
- OpsGenie
- Webhooks
Contact Groups
A contact group is a collection of notification targets that can be reused across multiple alerting rules. For example, you might have:
- An “On-Call Team” group for critical alerts
- A “Development” group for application-specific notifications
Server Groups
Server groups allow you to organize agents and apply shared settings. Each agent belongs to at least one group (the “Default” group). Server groups let you:
- Apply different thresholds to different environments
- Organize infrastructure by function or team
- Configure group-specific alerting rules
Silence
A silence temporarily suppresses notifications for specific metrics or services. Use silences during:
- Planned maintenance windows
- Known issues being worked on
- Testing or deployment periods
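Conceptually, a silence is a time window attached to a target, and a notification is suppressed when it falls inside that window. The sketch below illustrates the idea only: the `Silence` fields, the `is_silenced` helper, and the choice of an inclusive start with an exclusive end are assumptions, not a description of how Bleemeo stores or evaluates silences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Silence:
    start: int   # Unix time, inclusive
    end: int     # Unix time, exclusive
    target: str  # hypothetical: the metric or service name being silenced


def is_silenced(silences, target: str, timestamp: int) -> bool:
    """Return True if any silence covers this target at this time."""
    return any(
        s.target == target and s.start <= timestamp < s.end
        for s in silences
    )


# Hypothetical maintenance window for a MySQL service.
silences = [Silence(start=1000, end=2000, target="mysql")]

print(is_silenced(silences, "mysql", 1500))  # inside the window
print(is_silenced(silences, "mysql", 2500))  # after the window
```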
Recording Rules
Recording rules are PromQL-based expressions that generate new metrics from existing ones. They allow you to:
- Pre-compute complex queries for faster dashboards
- Create custom aggregations
- Set thresholds on derived metrics
Recording rules are available on Starter and Professional plans.
Dashboard Concepts
Bleemeo provides several types of dashboards:
Agent Dashboard
Displays all connected agents and their current status. Use it to get an overview of your entire infrastructure.
Status Dashboard
Shows current issues at a glance, including:
- Services with problems
- Metrics exceeding thresholds
- Recent incidents
Custom Dashboards
Fully customizable dashboards where you can:
- Add any metric from any source
- Create charts, graphs, and tables
- Use PromQL queries for advanced visualizations
Dashboard Templates
Pre-built dashboard templates for common infrastructure components. Templates are automatically applied when services are discovered.
Annotations
Annotations mark specific time ranges on dashboards to correlate events with metric changes. Use them to mark deployments, incidents, or maintenance windows.
Plans and Billing
Plan Types
Bleemeo offers several plans to match your monitoring needs:
| Plan | Description |
|---|---|
| Free | Up to 3 agents, 5 monitors, basic features |
| Starter | Mid-tier with extended limits, 200 metrics per agent |
| Professional | Full features, 2000 metrics per agent, container monitoring included |
Billing Units
Bleemeo billing is based on:
- Agents: Number of monitored servers or devices
- Containers: Containerized services (Professional plan includes 20 simultaneous containers per agent)
- Monitors: Number of availability checks
- Metrics: Total unique metrics collected (evaluated hourly)
New accounts receive a 15-day free trial with access to all Professional features. No credit card is required to start a trial.
PromQL
PromQL (Prometheus Query Language) is the query language used to retrieve and manipulate metrics in Bleemeo. It’s compatible with Prometheus and allows you to:
- Filter metrics by labels
- Aggregate data across dimensions
- Perform mathematical operations
- Create complex alerting conditions
Example queries:
    # Average CPU usage across all servers
    avg(cpu_used)

    # Memory usage for a specific server
    mem_used{hostname="web-server-1"}

    # Request rate over the last 5 minutes
    rate(http_requests_total[5m])

Applications
Applications are logical groupings of related services. By tagging services with an application name, you can:
- View application-level dashboards
- Set up application-specific alerts
- Track dependencies between services
For example, an e-commerce application might include web servers, databases, cache servers, and message queues, all grouped under a single application view.
Next Steps
Now that you understand the key concepts, you can:
- Install the Bleemeo agent on your first server
- Configure service monitoring for your applications
- Set up notifications to stay informed about issues
- Create custom dashboards to visualize your metrics