How to Define a Custom Metric

Professional

The Bleemeo agent can relay your custom metrics to the Bleemeo Cloud platform. There are several ways to gather custom metrics; the one described here is a custom check.

The Bleemeo agent can check that your custom application is alive. It can use a built-in check (HTTP and TCP), a Nagios check command, or check for a running process.

To configure an additional check, add the following to your Bleemeo agent configuration (/etc/glouton/conf.d/50-custom-check.conf):

service:
  - type: "myapplication"
    port: 8080
    check_type: "nagios"
    check_command: "command-to-run"
  - type: "custom_webserver"
    port: 8181
    check_type: "http"
  - type: "process_check_name"
    check_type: "process"
    match_process: "command-to-check"

Custom checks are defined under service in the configuration.

For example, add the following to your configuration file (any *.conf file in /etc/glouton/conf.d/):

service:
  # Use "/path/to/bin --with-option" to check the service. Also keep a
  # TCP connection open to 127.0.0.1:8080; if that connection is closed,
  # run a check immediately instead of waiting 1 minute.
  - type: "service_name"
    port: 8080
    address: "127.0.0.1"
    interval: 60
    check_type: "nagios"
    check_command: "/path/to/bin --with-option"
  # HTTPS check to URL https://127.0.0.1:8443/check/
  - type: "an_https_server"
    port: 8443
    check_type: "https"
    http_path: "/check/"
  # TCP check on 127.0.0.1:22
  - type: "another_service"
    port: 22
  # Process check for Docker daemon
  - type: "process_check_name"
    check_type: "process"
    match_process: "/usr/bin/dockerd"

Custom checks are run every minute by default. If port is provided, Bleemeo Agent maintains a TCP connection with this port, and if that connection is closed the check is run immediately.

Fields are:

  • type: Name of your service. This name must be unique per Bleemeo Agent.

  • port: TCP port number. This field is mandatory unless check_type is “nagios” or “process”.

  • address: IP address associated with the TCP port. Default value is “127.0.0.1”.

  • interval: Delay between two consecutive checks, in seconds. The default and minimum value is 60 (1 minute).

  • check_type: Check used for this service. Possible values are:

    • “tcp”: The default, it checks that a TCP connection could be opened to the given address/port.
    • “http”: It checks that an HTTP request on the given address/port has a status code 2xx or 3xx.
    • “https”: Same as the HTTP check, but using an HTTPS connection. Certificates are not validated.
    • “nagios”: It uses check_command to test the liveness of the service.
    • “process”: Check that the process defined in match_process is running.
  • http_path: The path of the URL checked with the HTTP or HTTPS check_type. Default value is “/”.

  • http_status_code: The expected status code of HTTP response (only valid if check_type is HTTP or HTTPS). Default is unset.

    When set, the check passes if the response status code is equal to the expected one, and is critical otherwise.

    When unset, the result depends on the response status code:

    • strictly less than 400: the check passes
    • between 400 and 499 (both included): the check is in warning state
    • greater than or equal to 500: the check is critical
  • check_command: Command used for a Nagios check. This field is mandatory if check_type is “nagios”.

    The command must conform to the Nagios check standard, that is:

    • A successful check must exit with a return code of 0
    • A check in warning state must exit with a return code of 1
    • A critical check must exit with a return code of 2

    In addition, the command may print to stdout a short explanation of its result. For example, if a check fails to connect to your service, the check may print: “CRITICAL - connection refused”

  • match_process: Regular expression to match in a process check, written in the RE2 syntax.

    If no process matches, or if all matched processes are in a zombie state, the check is critical; otherwise it passes.
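
The http_status_code rules listed above can be summarised as a small decision function. The following Python sketch is only an illustration of those rules, not the agent's implementation; the function name http_check_status and the result strings are hypothetical.

```python
from typing import Optional


def http_check_status(status_code: int, expected: Optional[int] = None) -> str:
    """Map an HTTP response status code to a check result.

    `expected` plays the role of http_status_code (None means unset).
    The function name and return strings are illustrative assumptions.
    """
    if expected is not None:
        # http_status_code is set: only an exact match passes.
        return "ok" if status_code == expected else "critical"
    # http_status_code is unset: fall back to the range-based rules.
    if status_code < 400:
        return "ok"
    if status_code < 500:
        return "warning"
    return "critical"
```

For instance, with http_status_code unset a 404 response yields a warning, whereas with http_status_code set to 404 the same response passes.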
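
To make the check_command contract more concrete, here is a minimal Python sketch of a command that follows the return codes listed above (0, 1, 2) and prints a one-line explanation. Everything specific in it is an assumption made for illustration: the function names run_check and main, the choice of a TCP connection as the thing being tested, and the warn_after and timeout thresholds.

```python
import socket
import time

# Exit codes from the Nagios check convention described above.
OK, WARNING, CRITICAL = 0, 1, 2


def run_check(host: str, port: int, warn_after: float = 1.0, timeout: float = 5.0):
    """Try to open a TCP connection and return (exit_code, message).

    warn_after and timeout are hypothetical thresholds, in seconds.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.monotonic() - start
    except OSError as exc:
        # Connection refused, timeout, unreachable host, etc.
        return CRITICAL, f"CRITICAL - {exc}"
    if elapsed > warn_after:
        return WARNING, f"WARNING - connected in {elapsed:.2f}s"
    return OK, f"OK - connected in {elapsed:.2f}s"


def main(argv):
    """Print the short explanation to stdout and return the exit code."""
    host, port = argv[0], int(argv[1])
    code, message = run_check(host, port)
    print(message)
    return code
```

To use this as an actual command, the script would end by calling sys.exit(main(sys.argv[1:])) under the usual __main__ guard, and check_command would point to it with a host and port as arguments (the script location is up to you).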
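
The process check rule (critical when nothing matches or when every match is a zombie) can be sketched as follows. This is an illustration under stated assumptions, not the agent's code: the function name process_check_status is hypothetical, the input is assumed to be (command line, state) pairs with "Z" marking a zombie, and Python's re module stands in for RE2, which it only approximates for simple patterns.

```python
import re
from typing import Iterable, Tuple


def process_check_status(pattern: str, processes: Iterable[Tuple[str, str]]) -> str:
    """Evaluate a process check against (command_line, state) pairs.

    Assumptions: the pattern is searched in the command line, and a
    state of "Z" marks a zombie process.
    """
    regex = re.compile(pattern)
    # Keep the state of every process whose command line matches.
    matched = [state for cmdline, state in processes if regex.search(cmdline)]
    # Critical when nothing matched or when every match is a zombie.
    if not matched or all(state == "Z" for state in matched):
        return "critical"
    return "ok"
```

With the pattern "/usr/bin/dockerd" from the example configuration, a single running dockerd process is enough for the check to pass, even if other matching processes are zombies.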