Services Monitoring

Free: only availability check
Starter: only availability check
Professional: all specific metrics

The Bleemeo agent can discover services running on your system and automatically monitor metrics specific to each of them. For example, with the Apache HTTP server, the number of requests served is monitored automatically. For each detected service, a tag with the service name is created, which lets you filter your agents by the services running on them.

If you have any service not listed on this page, you can define a custom check or define a custom metric.

If you want to disable metrics for a service, you can ignore some services.

The agent checks TCP sockets for each service. By default, a simple TCP connection is used to test the service, but some services support a specific check. See the service details below for the supported specific checks. Those checks are executed every minute. They may be run earlier for TCP services: the Bleemeo agent keeps a connection open with the service and if that connection is broken the check is executed immediately.

The current status of the service can be viewed in the Status Dashboard and you can configure a notification to be alerted when the status changes.

The history of the status is also stored in a metric named service_status. One metric is created per service, with two labels to identify the service:

  • service: the kind of the service, like “apache”, “nginx”…
  • service_instance: the container name. This label is absent when the service isn’t running in a container.

The value of this metric is:

  • 0 when the check passed successfully
  • 1 when the check passed with a warning. For example an Apache server responded with a 404 page
  • 2 when the check detected an issue with the service
  • 3 when the check doesn’t know the status of the service. This happens when the check itself failed, usually due to a timeout

It’s possible to override the auto-discovery parameters of any service using configuration files, Docker labels or Kubernetes annotations.

To use Bleemeo agent configuration files, add entries like the following to /etc/glouton/conf.d/90-service-override.conf:

service:
  - type: "apache"
    ignore_ports:
      - 8000
      - 9000
  - type: "mysql"
    instance: "name_of_a_container"
    username: root
    password: root

The service key contains a list of services whose settings are overridden. Add one entry per service. Each service is identified by the pair “type” and “instance”. The service “type” is not a customizable name: unless you are creating a custom check (see Custom checks), the value must match one of the supported service types (“apache”, “nginx”, “postgresql”…). See below for the full list of supported services. The service “instance” can be omitted when the service runs outside a container; for a containerized service, it is the container name.

All other values (ignore_ports, username and password in the example above) are overridable settings, which are described below.

Using Docker labels or Kubernetes annotations

If you are using Docker or Kubernetes, you can use Docker labels or Kubernetes annotations instead of the Bleemeo agent configuration file. Any overridable setting can be set using the name glouton.SETTING.

For instance, to ignore ports 8000 and 9000 on a Docker container, use:

docker run --label glouton.ignore_ports="8000,9000" [...]

The same thing for a Kubernetes deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: "my-application"
spec:
  template:
    metadata:
      # Create the annotations on the pod, not on the deployment
      annotations:
        glouton.ignore_ports: "8000,9000"
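
Since Docker labels can also be declared in a Compose file, the same override could be expressed there. The fragment below is only a sketch: the service name and image are hypothetical, and it assumes labels set through Compose are read by the agent the same way as labels passed with `docker run --label`.

```yaml
# docker-compose.yml (hypothetical service and image names)
services:
  my-application:
    image: "my-application:latest"
    labels:
      # Same setting as the docker run example: ignore ports 8000 and 9000
      glouton.ignore_ports: "8000,9000"
```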

Sample of a service with all settings overridden:

# This is only a sample listing all possible values. It doesn't make sense to have all
# settings on a single service. Only add the settings you need to override.
service:
  - type: apache
    instance: my_container
    address: 127.0.0.1
    port: 1234
    ignore_ports:
      - 8000
      - 9000
    tags:
      - mytag1
      - mytag2
    interval: 60
    http_path: /my_custom_path
    http_status_code: 200
    http_host: example.com
    check_type: nagios
    match_process: command-to-check
    check_command: command-to-run
    nagios_nrpe_name: nagios_nrpe_name
    username: username
    password: secret
    metrics_unix_socket: /var/run/mysqld/mysqld.sock
    stats_url: http://localhost:9000/status
    stats_port: 9000
    stats_protocol: http
    detailed_items:
      - table1
      - table2
    included_items:
      - job1
    excluded_items:
      - job2
    jmx_port: 3333
    jmx_username: monitorRole
    jmx_password: secret
    jmx_metrics: []
    ssl: false
    ssl_insecure: false
    starttls: false
    ca_file: /etc/ssl/certs/ca-certificates.crt
    cert_file: /etc/ssl/certs/client.crt
    key_file: /etc/ssl/private/client.key
    log_files: []
    log_format: custom_apache_format
    log_filter: custom_apache_filter

The address setting is the IP address on which the service is listening.

The port setting is the TCP port on which the service is listening.

The ignore_ports setting is a list of ports to exclude from auto-discovery. Auto-discovery may find that a service is listening on multiple ports, and the service would then be monitored on every port. This can be wrong: for example, a Docker nginx image may expose two ports in its Dockerfile (80 and 443) but only listen on 80. In that case you will want to ignore port 443.
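
For the Docker nginx case described above, an override could look like the following sketch. The container name is hypothetical; use the name of your own container.

```yaml
service:
  - type: "nginx"
    instance: "my_nginx_container"  # hypothetical container name
    ignore_ports:
      - 443  # exposed in the Dockerfile, but nginx does not listen on it
```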

The tags setting is a list of tag names to associate with the service.

The interval setting is the interval, in seconds, between checks of the service status. It cannot be smaller than 60 seconds, which is the default, but it can be increased if your service check is expensive.
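
As an illustration, a service whose check is considered expensive could be checked every five minutes instead of every minute. The service type and the value of 300 are only assumptions chosen for the example.

```yaml
service:
  - type: "postgresql"
    interval: 300  # seconds; must not be lower than the default of 60
```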

For services that expose an HTTP endpoint, the agent by default uses the service’s address and port to query http://address:port/ and expects an HTTP 200 status code. The service’s address is an IP address, likely “127.0.0.1” when not using a container, which may not match your service configuration, especially when using virtual hosts.

http_host specifies the HTTP Host header sent in the request. This is useful when the HTTP server is configured with virtual hosts and doesn’t reply to requests on something like http://127.0.0.1.

http_path lets the check target a sub-path, like “/ready”, for example a page dedicated to service checking.

http_status_code lets the check expect a different HTTP status code.

This is supported by the following services:

For example, with the following service override:

service:
  - type: apache
    address: 127.0.0.1
    port: 8080
    http_path: /ready
    http_status_code: 204
    http_host: example.com

The Bleemeo agent will connect to 127.0.0.1 on port 8080, send the HTTP request http://example.com/ready and expect a 204 status code.

The check_type, match_process and check_command settings are used to configure a custom check, see Custom checks for details.

If the NRPE server is enabled in the agent configuration (see nrpe.enable), you can expose any service check to your Nagios server with the nagios_nrpe_name setting.

Example:

service:
  - type: "apache"
    nagios_nrpe_name: check_apache

Some services require authentication for metrics collection and/or the service check. The username and password settings configure the credentials to use.

This is supported by the following services:

  • Jenkins
  • MySQL / MariaDB
  • OpenLDAP
  • PostgreSQL
  • RabbitMQ
  • UPSD

Some services use a different port for the service itself and for exposing monitoring information. The stats_url setting specifies the URL where the service exposes its metrics.

This is supported by the following services:

  • HAProxy
  • Jenkins
  • PHP-FPM

For example:

service:
  - type: haproxy
    port: 80
    stats_url: "http://localhost:8080/statistics"

The agent will check that the HAProxy service is listening on port 80, but it will use the URL on port 8080 to collect HAProxy metrics.

Some services use a different port for the service itself and for exposing monitoring information. The stats_port setting specifies the port where the service exposes its metrics.

This setting is similar to stats_url but only specifies the port rather than the full URL.

This is supported by the following services:

  • NATS
  • RabbitMQ
  • uWSGI
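
As a sketch for RabbitMQ, the override below assumes a default installation where the broker listens on 5672 and the management interface, taken here to be the source of metrics, listens on 15672. Adjust both ports to your own setup.

```yaml
service:
  - type: "rabbitmq"
    port: 5672         # AMQP port checked for availability (assumed default)
    stats_port: 15672  # management interface port (assumed default)
```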

Some services can use multiple protocols to expose monitoring information. The stats_protocol setting specifies which protocol to use. It works together with stats_port.

This is supported by the following services:

  • uWSGI

For example:

service:
  - type: "uwsgi"
    address: "127.0.0.1"
    port: 8080
    stats_port: 1717
    # This assumes uWSGI uses the --stats-http option and exposes its stats on port 1717
    stats_protocol: "http"

Some services can listen on a Unix socket rather than a TCP socket. The metrics_unix_socket setting configures the path to the Unix socket, and the agent will try to use that socket for metrics collection.

This is supported by the following services:

  • MySQL / MariaDB
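
For instance, a MySQL override pointing at a Unix socket might look like this sketch. The socket path is the one used in the full sample above, and the credentials are placeholders.

```yaml
service:
  - type: "mysql"
    username: "root"  # placeholder credentials
    password: "root"
    metrics_unix_socket: "/var/run/mysqld/mysqld.sock"
```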

Some services expose metrics per item (per table, per database…). The detailed_items setting configures the items for which metrics should be collected. Those services also expose global metrics, which are enabled by default.

This is supported by the following services:

  • PostgreSQL: for detailed metrics on databases
  • Cassandra: for detailed metrics on tables
  • Kafka: for detailed metrics on topics
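
A minimal sketch for PostgreSQL, assuming two hypothetical databases named "db1" and "db2" for which detailed metrics are wanted:

```yaml
service:
  - type: "postgresql"
    detailed_items:
      - "db1"  # hypothetical database names
      - "db2"
```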

Some services expose metrics per item (per job…). The included_items and excluded_items settings configure the items for which metrics should be collected. Unlike detailed_items, they are used for services that don’t expose a global metric aggregating the per-item metrics.

This is supported by the following services:

  • Jenkins: for metrics per job
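
A sketch for Jenkins, reusing the hypothetical job names from the full sample above. How the two lists interact when both are set is not specified on this page, so treat the combination as an assumption.

```yaml
service:
  - type: "jenkins"
    included_items:
      - "job1"  # hypothetical job to collect metrics for
    excluded_items:
      - "job2"  # hypothetical job to skip
```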

jmx_port, jmx_username, jmx_password, jmx_metrics

For a Java service, configure how to collect metrics using JMX. See Java metrics for details.

ssl, ssl_insecure, starttls, ca_file, cert_file, key_file

Some services can expose a plain-text or a TLS version of the service. These settings let you enable TLS or stay in plain text (the default).

This is supported by the following services:

  • OpenLDAP

For example:

service:
  - type: "openldap"
    address: "127.0.0.1"
    port: 389
    starttls: true
    # By default, SSL certificates are verified. This means that your OpenLDAP service needs to either
    # use a certificate signed by a trusted authority or provide the ca_file of your self-signed CA.
    ssl_insecure: false
    ca_file: "/path/to/your-self-signed-ca.crt"

To customize the handling of log files related to a service, two approaches are possible:

  1. Specifying a log format and/or a log filter that will apply to all related log files:

    service:
      - type: "my-app"
        ...
        log_format: "my-app-format"
        log_filter: "my-app-filter"
  2. Specifying the format/filter for each file (or file pattern):

    service:
      - type: "my-app"
        ...
        log_files:
          - file_path: "/var/log/my-app/access.log"
            log_format: "my-app-access-format"
            log_filter: "my-app-access-filter"
          - file_path: "/var/log/my-app/error.*.log"
            log_format: "my-app-error-format"
            log_filter: "my-app-error-filter"

    When using the second approach, if an item of the log_files list leaves the log_format or log_filter property blank, the service’s own log_format or log_filter value is used instead (if defined).

The services for which Glouton comes with default log configuration are: