Services Monitoring

Free | Only availability check
Starter | Only availability check
Professional | All specific metrics

The Bleemeo agent can discover services running on your system and automatically monitor specific metrics for them. For example, with Apache HTTP Server, the number of requests served is monitored automatically. For each detected service, a tag with the service name is created, which lets you filter your agents by the services running on them.

If you run a service that is not listed on this page, you can define a custom check or a custom metric.

If you want to disable metrics for a service, you can ignore some services.

Common Features​

Service Status​

The agent checks TCP sockets for each service. By default, a simple TCP connection is used to test the service, but some services support a specific check. See the service details below for the supported specific checks. Those checks are executed every minute. They may be run earlier for TCP services: the Bleemeo agent keeps a connection open with the service and if that connection is broken the check is executed immediately.

The current status of the service can be viewed in the Status Dashboard and you can configure a notification to be alerted when the status changes.

The history of the status is also stored in a metric named service_status. One metric per service is created, the metric has two labels to identify the service:

  • service: the kind of the service, like "apache", "nginx"...
  • service_instance: the container name. This label is absent when the service isn't running in a container.

The value of this metric is:

  • 0 when the check passed successfully
  • 1 when the check passed with a warning. For example an Apache server responded with a 404 page
  • 2 when the check detected an issue with the service
  • 3 when the check doesn't know the status of the service. This happens when the check itself failed, usually due to a timeout
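The default TCP test and the status values above can be sketched as follows. This is only an illustration, not the agent's implementation: the function name tcp_check, the 5-second timeout and the short status names are assumptions made for this example.

```python
import socket

# Values of the service_status metric, as listed above.
# The short names are labels chosen for this example only.
STATUS_NAMES = {0: "ok", 1: "warning", 2: "critical", 3: "unknown"}


def tcp_check(address: str, port: int, timeout: float = 5.0) -> int:
    """Return a service_status-like value for a plain TCP connection test."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return 0  # connection accepted: the check passed
    except socket.timeout:
        return 3  # the check itself could not conclude (timeout)
    except OSError:
        return 2  # connection refused or unreachable: issue with the service


if __name__ == "__main__":
    status = tcp_check("127.0.0.1", 80)
    print(status, STATUS_NAMES[status])
```

A plain TCP test can only report 0, 2 or 3. The warning value 1 requires a service-specific check, such as the HTTP check used for Apache.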

Overridable auto-discovery settings​

It's possible to override the auto-discovery parameters of any service using configuration files, Docker labels or Kubernetes annotations.

Using configuration file​

To use Bleemeo agent configuration files, add entries like the following to /etc/glouton/conf.d/90-service-override.conf:

service:
  - type: "apache"
    ignore_ports:
      - 8000
      - 9000
  - type: "mysql"
    instance: "name_of_a_container"
    username: root
    password: root

The service key contains a list of services whose settings are overridden. Add one entry per service. Each service is identified by the pair "type" and "instance". The service "type" is not a customizable name: unless you are creating a custom check, the value must match one of the supported service types ("apache", "nginx", "postgresql"...). See below for the full list of supported services. The service instance can be omitted when the service runs outside a container; for a containerized service, it is the container name.

All other values (ignore_ports, username and password in the example above) are overridable settings, which are described below.

Using Docker labels or Kubernetes annotations​

If you are using Docker or Kubernetes, you can use Docker labels or Kubernetes annotations instead of the Bleemeo agent configuration file. Any overridable setting can be added using the name glouton.SETTING.

For instance, to ignore ports 8000 and 9000 on a Docker container, use:

docker run --label glouton.ignore_ports="8000,9000" [...]

The same thing for a Kubernetes deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: "my-application"
spec:
  template:
    metadata:
      # Create the annotations on the pod, not on the deployment
      annotations:
        glouton.ignore_ports: "8000,9000"
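As an illustration of the glouton.SETTING naming, the following sketch turns a dictionary of overridable settings into arguments for Docker's --label flag. The helper name docker_label_args is hypothetical, and joining list values with commas is assumed from the "8000,9000" example above.

```python
import shlex


def docker_label_args(settings: dict) -> list:
    """Turn overridable settings into `--label glouton.SETTING=value` arguments."""
    args = []
    for name, value in settings.items():
        if isinstance(value, (list, tuple)):
            # Lists are written comma-separated, as in "8000,9000" above.
            value = ",".join(str(item) for item in value)
        args += ["--label", f"glouton.{name}={value}"]
    return args


if __name__ == "__main__":
    args = docker_label_args({"ignore_ports": [8000, 9000], "http_path": "/ready"})
    print("docker run " + " ".join(shlex.quote(a) for a in args) + " [...]")
```

The same key/value pairs can be used as Kubernetes pod annotations.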

Overridable settings​

Sample of a service with all settings overridden:

# This is only a sample to list all possible values. It doesn't make sense to have all
# settings on a single service. Only add the settings you need to override.

service:
  - type: apache
    instance: my_container
    address: 127.0.0.1
    port: 1234
    ignore_ports:
      - 8000
      - 9000
    tags:
      - mytag1
      - mytag2
    interval: 60
    http_path: /my_custom_path
    http_status_code: 200
    http_host: example.com
    check_type: nagios
    match_process: command-to-check
    check_command: command-to-run
    nagios_nrpe_name: nagios_nrpe_name
    username: username
    password: secret
    metrics_unix_socket: /var/run/mysqld/mysqld.sock
    stats_url: http://localhost:9000/status
    stats_port: 9000
    stats_protocol: http
    detailed_items:
      - table1
      - table2
    included_items:
      - job1
    excluded_items:
      - job2
    jmx_port: 3333
    jmx_username: monitorRole
    jmx_password: secret
    jmx_metrics: []
    ssl: false
    ssl_insecure: false
    starttls: false
    ca_file: /etc/ssl/certs/ca-certificates.crt
    cert_file: /etc/ssl/certs/client.crt
    key_file: /etc/ssl/private/client.key

address​

This is the IP address on which the service is listening.

port​

This is the TCP port on which the service is listening.

ignore_ports​

This is a list of ports to ignore during auto-discovery. Auto-discovery may find that a service is listening on multiple ports, in which case the service is monitored on every port. This can be wrong: for example, a Docker nginx image may expose two ports in its Dockerfile (80 and 443) while nginx only listens on 80. In that case, you will want to ignore port 443.

tags​

This is a list of tag names to associate with the service.

interval​

This is the interval, in seconds, between checks of the service status. It cannot be smaller than 60 seconds, which is the default, but it can be increased if your service check is expensive.

http_path, http_status_code, http_host​

For services that expose an HTTP endpoint, the agent by default uses the service's address and port to query http://address:port/ and expects an HTTP 200 status code. The service's address is an IP address, likely "127.0.0.1" when not using a container; this may not match your service configuration, especially when using virtual hosts.

http_host sets the HTTP Host header sent in the request. This is useful when the HTTP server is configured with virtual hosts and doesn't reply to requests for something like http://127.0.0.1.

http_path sets the path to check, like "/ready", for services that provide a page dedicated to health checking.

http_status_code sets the HTTP status code the check expects, when it is not 200.

This is supported by the following services:

For example, with the following service override:

service:
  - type: apache
    address: 127.0.0.1
    port: 8080
    http_path: /ready
    http_status_code: 204
    http_host: example.com

The Bleemeo agent will connect to 127.0.0.1 on port 8080, send the HTTP request http://example.com/ready and expect a 204 status code.
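How the three HTTP settings combine can be sketched as below. The function plan_http_check is hypothetical and only describes the request; the GET method and the fallback Host header of "address:port" are assumptions, while the default path "/" and status code 200 come from the text above.

```python
def plan_http_check(override: dict) -> dict:
    """Describe the HTTP availability check implied by a service override."""
    address = override["address"]
    port = override["port"]
    # Defaults taken from the text above: path "/" and status code 200.
    path = override.get("http_path", "/")
    expected = override.get("http_status_code", 200)
    # Assumption: without http_host, the Host header is "address:port".
    host = override.get("http_host", f"{address}:{port}")
    return {
        "connect_to": (address, port),           # where the TCP connection goes
        "request_line": f"GET {path} HTTP/1.1",  # assumption: a GET request
        "headers": {"Host": host},
        "url": f"http://{host}{path}",
        "expected_status": expected,
    }


if __name__ == "__main__":
    print(plan_http_check({
        "address": "127.0.0.1",
        "port": 8080,
        "http_path": "/ready",
        "http_status_code": 204,
        "http_host": "example.com",
    }))
```

The point of the separation is that the TCP connection always targets the service's address and port, while http_host only changes the Host header of the request.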

check_type, match_process, check_command​

These settings are used to configure a custom check; see Custom checks for details.

nagios_nrpe_name​

If the NRPE server is enabled in the agent configuration (see nrpe.enable), you can expose any service check to your Nagios server.

Example:

service:
  - type: "apache"
    nagios_nrpe_name: check_apache

username, password​

Some services require authentication for metrics collection and/or the service check. These settings configure the credentials to use.

This is supported by the following services:

  • Jenkins
  • MySQL
  • OpenLDAP
  • PostgreSQL
  • RabbitMQ
  • UPSD

stats_url​

Some services use a different port for the service itself and for exposing monitoring information. This setting specifies the URL where the service exposes its metrics.

This is supported by the following services:

  • HAProxy
  • Jenkins
  • PHP-FPM

For example:

service:
  - type: haproxy
    port: 80
    stats_url: "http://localhost:8080/statistics"

The agent will check that the HAProxy service is listening on port 80, but it will use the URL on port 8080 to collect HAProxy metrics.

stats_port​

Some services use a different port for the service itself and for exposing monitoring information. This setting specifies the port where the service exposes its metrics.

This setting is similar to stats_url but only specifies the port rather than the full URL.

This is supported by the following services:

  • NATS
  • RabbitMQ
  • uWSGI

stats_protocol​

Some services can use multiple protocols to expose monitoring information. This setting specifies the protocol to use. It works together with stats_port.

This is supported by the following services:

  • uWSGI

For example:

service:
  - type: "uwsgi"
    address: "127.0.0.1"
    port: 8080
    stats_port: 1717
    # This assumes uWSGI uses the --stats-http option and exposes its stats on port 1717
    stats_protocol: "http"

metrics_unix_socket​

Some services can listen on a Unix socket rather than a TCP socket. This setting configures the path to the Unix socket, and the agent will try to use that socket for metrics collection.

This is supported by the following services:

  • MySQL

detailed_items​

Some services expose metrics per item (per table, per database...). This setting configures the items for which metrics should be collected. Those services also expose global metrics, which are enabled by default.

This is supported by the following services:

  • PostgreSQL: for detailed metrics on databases
  • Cassandra: for detailed metrics on tables
  • Kafka: for detailed metrics on topics

included_items, excluded_items​

Some services expose metrics per item (per job...). These settings configure the items for which metrics should be collected. Unlike detailed_items, they are used on services that don't expose a global metric aggregating the per-item metrics.

This is supported by the following services:

  • Jenkins: for metrics per job

jmx_port, jmx_username, jmx_password, jmx_metrics​

For a Java service, configure how to collect metrics using JMX. See Java metrics for details.

ssl, ssl_insecure, starttls, ca_file, cert_file, key_file​

Some services can be exposed in plain text or over TLS. These settings enable TLS or keep the connection in plain text (the default).

This is supported by the following services:

  • OpenLDAP

For example:

service:
  - type: "openldap"
    address: "127.0.0.1"
    port: 389
    starttls: true
    # By default, SSL certificates are verified. This means that your OpenLDAP service needs to either
    # use a certificate signed by a trusted authority, or you need to provide the ca_file of your self-signed CA.
    ssl_insecure: false
    ca_file: "/path/to/your-self-signed-ca.crt"
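To make the meaning of ssl_insecure, ca_file, cert_file and key_file more concrete, here is a sketch of how such settings could map onto a TLS client context using Python's standard ssl module. The function build_tls_context is hypothetical and says nothing about how the agent itself is implemented.

```python
import ssl


def build_tls_context(ssl_insecure=False, ca_file=None, cert_file=None, key_file=None):
    """Build an ssl.SSLContext from settings named like the ones above."""
    # Certificates are verified by default. When ca_file is given, it is used
    # instead of the system trust store (useful for a self-signed CA).
    context = ssl.create_default_context(cafile=ca_file)
    if ssl_insecure:
        # Order matters: hostname checking must be disabled before verification.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    if cert_file:
        # Optional client certificate, for servers requiring mutual TLS.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


if __name__ == "__main__":
    strict = build_tls_context()
    print(strict.verify_mode == ssl.CERT_REQUIRED)
```

With starttls, the connection would start in plain text on port 389 and be upgraded using such a context; that protocol step is not shown here.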

Apache HTTP​

Service Detection | Specific Check | Metrics
✅ | ✅ | ✅

The agent uses an HTTP check if the service listens on port 80.

To enable metrics gathering, ensure the Bleemeo agent can access the Apache status page at http://server-address/server-status?auto. This should work by default with Apache on Ubuntu or Debian. If not, you usually need to add the following to your default virtual host (e.g. the first file in /etc/apache2/sites-enabled/):

LoadModule status_module /usr/lib/apache2/modules/mod_status.so

<Location /server-status>
    SetHandler server-status
    # You may want to limit access only to Bleemeo agent
    # require ip IP-OF-BLEEMEO-AGENT/32
</Location>
ExtendedStatus On

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For an Apache running outside a container
  - type: "apache"
    address: "127.0.0.1"
    port: 80 # HTTP listener, the agent doesn't support HTTPS here
    http_path: "/" # This is the path used for the availability check, not for metrics gathering.
    http_host: "127.0.0.1:80" # Host header sent. Like the other options, it can be omitted and the default value will be used.

  # For an additional Apache running outside a container
  - type: "apache"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 81 # HTTP listener, the agent doesn't support HTTPS here
    http_path: "/" # This is the path used for the availability check, not for metrics gathering.
    http_host: "127.0.0.1:80" # Host header sent. Like the other options, it can be omitted and the default value will be used.

  # For an Apache running in a Docker container
  - type: "apache"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 80 # HTTP listener, the agent doesn't support HTTPS here
    http_path: "/" # This is the path used for the availability check, not for metrics gathering.
    http_host: "127.0.0.1:80" # Host header sent. Like the other options, it can be omitted and the default value will be used.

When using Docker, you may use Docker labels to set http_path and http_host:

docker run --label glouton.http_path="/readiness" --label glouton.http_host="my-host" [...]

Agent gathers the following metrics:

Metric | Description
service_status | Status of Apache
apache_busy_workers | Number of busy Apache workers
apache_busy_workers_perc | Busy Apache workers in percent
apache_bytes | Network traffic sent by Apache in bytes per second
apache_connections | Number of client connections to the Apache server
apache_idle_workers | Number of Apache workers waiting for an incoming request
apache_max_workers | Maximum number of Apache workers configured
apache_requests | Number of requests per second
apache_uptime | Time elapsed since the Apache server started, in seconds

Apache keeps track of server activity in a structure known as the scoreboard. There is a slot in the scoreboard for each worker, and it contains the status of this worker. The size of the scoreboard is the maximum number of concurrent requests that Apache can serve.

Metric | Description
apache_scoreboard_waiting | Number of workers waiting for an incoming request. It's the same as apache_idle_workers
apache_scoreboard_starting | Number of workers starting up
apache_scoreboard_reading | Number of workers reading an incoming request
apache_scoreboard_sending | Number of workers processing a client request
apache_scoreboard_keepalive | Number of workers waiting for another request via keepalive
apache_scoreboard_dnslookup | Number of workers looking up a hostname
apache_scoreboard_closing | Number of workers closing their connection
apache_scoreboard_logging | Number of workers writing in log files
apache_scoreboard_finishing | Number of workers gracefully finishing a request
apache_scoreboard_idle_cleanup | Number of idle workers being killed
apache_scoreboard_open | Number of slots with no worker in the scoreboard

Apache does not start all workers when they are not needed (e.g. if enough workers are waiting, Apache reuses them and doesn't start a new one).

The sum of all scoreboard items is the maximum of concurrent requests Apache can serve. This sum is calculated and stored in apache_max_workers metric.
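The relation between the scoreboard and apache_max_workers can be illustrated with a short sketch that counts the slots in the "Scoreboard:" line of the /server-status?auto output. The character keys are the ones used by mod_status; the function name parse_scoreboard is hypothetical, and the mapping to metric names follows the table above rather than the agent's actual code.

```python
from collections import Counter

# Scoreboard characters used by mod_status, mapped to the metric names above.
SCOREBOARD_KEYS = {
    "_": "apache_scoreboard_waiting",
    "S": "apache_scoreboard_starting",
    "R": "apache_scoreboard_reading",
    "W": "apache_scoreboard_sending",
    "K": "apache_scoreboard_keepalive",
    "D": "apache_scoreboard_dnslookup",
    "C": "apache_scoreboard_closing",
    "L": "apache_scoreboard_logging",
    "G": "apache_scoreboard_finishing",
    "I": "apache_scoreboard_idle_cleanup",
    ".": "apache_scoreboard_open",
}


def parse_scoreboard(status_text: str) -> dict:
    """Count scoreboard slots in the output of /server-status?auto."""
    metrics = {name: 0 for name in SCOREBOARD_KEYS.values()}
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            counts = Counter(line.split(":", 1)[1].strip())
            for char, name in SCOREBOARD_KEYS.items():
                metrics[name] = counts.get(char, 0)
    # The sum of all scoreboard items is the maximum number of workers.
    metrics["apache_max_workers"] = sum(metrics.values())
    return metrics


if __name__ == "__main__":
    sample = "BusyWorkers: 2\nIdleWorkers: 3\nScoreboard: __W_K.....\n"
    print(parse_scoreboard(sample))
```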

Asterisk​

Service Detection | Specific Check | Metrics
✅ | ❌ | ❌

Bitbucket​

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

To enable metrics gathering, Bleemeo agent needs to be installed with JMX enabled, see Java Metrics for setup details.

In addition, Bitbucket needs to expose JMX over a TCP port. To enable JMX, you can follow Enabling JMX counters for performance monitoring in the Atlassian documentation.

Here is a summary for unauthenticated access:

  • Add the options -Dcom.sun.management.jmxremote.port=3333 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false to the JVM. The default Bitbucket startup script accepts them from the JMX_OPTS or JAVA_OPTS environment variable.
  • Add jmx.enable=true to $BITBUCKET_HOME/shared/bitbucket.properties

Warning: this will allow unauthenticated access. Make sure no untrusted access to this port is possible, or set up authenticated JMX access.

The Bleemeo agent should auto-detect the JMX port. If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a Bitbucket running outside a container
  - type: "bitbucket"
    address: "127.0.0.1"
    port: 7990
    jmx_port: 3333
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

  # For an additional Bitbucket running outside a container
  - type: "bitbucket"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 7991
    jmx_port: 3334
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

  # For a Bitbucket running in a Docker container
  - type: "bitbucket"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 7990
    jmx_port: 3333
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

Agent gathers the following metrics:

Metric | Description
service_status | Status of Bitbucket
bitbucket_events | Number of events per second
bitbucket_io_tasks | Number of events per second
bitbucket_jvm_gc | Number of garbage collections per second
bitbucket_jvm_gc_utilization | Garbage collection utilization in percent
bitbucket_jvm_heap_used | Heap memory used in bytes
bitbucket_jvm_non_heap_used | Non-heap memory used in bytes
bitbucket_pulls | Number of pulls per second
bitbucket_pushes | Number of pushes per second
bitbucket_queued_events | Number of events queued
bitbucket_queued_scm_clients | Number of SCM clients queued
bitbucket_queued_scm_commands | Number of SCM commands queued
bitbucket_request_time | Average request time in seconds
bitbucket_requests | Number of requests per second
bitbucket_ssh_connections | Number of SSH connections per second
bitbucket_tasks | Number of scheduled tasks per second

Cassandra​

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

To enable metrics gathering, Bleemeo agent needs to be installed with JMX enabled, see Java Metrics for setup details.

In addition, Cassandra needs to expose JMX over a TCP port. To enable JMX:

  • If using Docker, add the environment variable JVM_OPTS=-Dcassandra.jmx.remote.port=7199
  • If running Cassandra and the Bleemeo agent as native packages, nothing should be needed. Cassandra already exposes JMX on port 7199 on localhost by default.
  • In other cases, add the options -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.authenticate=false to the JVM.

Warning: this will allow unauthenticated access. Make sure no untrusted access to this port is possible, or set up authenticated JMX access.

The Bleemeo agent should auto-detect the JMX port. If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a Cassandra running outside a container
  - type: "cassandra"
    address: "127.0.0.1"
    port: 9042
    jmx_port: 7199
    jmx_username: "cassandra" # by default, no authentication is done
    jmx_password: "cassandra"

  # For an additional Cassandra running outside a container
  - type: "cassandra"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 9043
    jmx_port: 7200
    jmx_username: "cassandra" # by default, no authentication is done
    jmx_password: "cassandra"

  # For a Cassandra running in a Docker container
  - type: "cassandra"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 9042
    jmx_port: 7199
    jmx_username: "cassandra" # by default, no authentication is done
    jmx_password: "cassandra"

Agent gathers the following metrics:

Metric | Description
service_status | Status of Cassandra
cassandra_bloom_filter_false_ratio | False positive ratio of the bloom filter in percent
cassandra_jvm_gc | Number of garbage collections per second
cassandra_jvm_gc_utilization | Garbage collection utilization in percent
cassandra_jvm_heap_used | Heap memory used in bytes
cassandra_jvm_non_heap_used | Non-heap memory used in bytes
cassandra_read_requests | Number of read requests per second
cassandra_read_time | Average time of read requests in seconds
cassandra_sstable | Number of SSTables
cassandra_write_requests | Number of write requests per second
cassandra_write_time | Average time of write requests in seconds

Bleemeo also supports detailed monitoring of specific Cassandra tables. To enable this, add the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a Cassandra running outside a container
  - type: "cassandra"
    detailed_items:
      - "keyspace1.table1"
      - "keyspace2.table2"
    [...]

  # For a Cassandra running in a Docker container
  - type: "cassandra"
    instance: "CONTAINER_NAME"
    detailed_items:
      - "keyspace1.table1"
      - "keyspace2.table2"
    [...]

The following per-table metrics will be gathered:

Metric | Description
cassandra_bloom_filter_false_ratio | False positive ratio of the bloom filter in percent
cassandra_read_requests | Number of read requests per second
cassandra_read_time | Average time of read requests in seconds
cassandra_sstable | Number of SSTables
cassandra_write_requests | Number of write requests per second
cassandra_write_time | Average time of write requests in seconds

Confluence​

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

To enable metrics gathering, Bleemeo agent needs to be installed with JMX enabled, see Java Metrics for setup details.

In addition, Confluence needs to expose JMX over a TCP port. To enable JMX, you can follow Live Monitoring Using the JMX Interface.

Here is a summary for unauthenticated access:

  • Add the options -Dcom.sun.management.jmxremote.port=3333 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false to the JVM. The default Confluence startup script accepts them from the CATALINA_OPTS or JAVA_OPTS environment variable.

Warning: this will allow unauthenticated access. Make sure no untrusted access to this port is possible, or set up authenticated JMX access.

The Bleemeo agent should auto-detect the JMX port. If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a Confluence running outside a container
  - type: "confluence"
    address: "127.0.0.1"
    port: 8090
    jmx_port: 3333
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

  # For an additional Confluence running outside a container
  - type: "confluence"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 8091
    jmx_port: 3334
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

  # For a Confluence running in a Docker container
  - type: "confluence"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 8090
    jmx_port: 3333
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

Agent gathers the following metrics:

Metric | Description
service_status | Status of Confluence
confluence_db_query_time | Time of an example database query in seconds
confluence_jvm_gc | Number of garbage collections per second
confluence_jvm_gc_utilization | Garbage collection utilization in percent
confluence_jvm_heap_used | Heap memory used in bytes
confluence_jvm_non_heap_used | Non-heap memory used in bytes
confluence_last_index_time | Time of last indexing task in seconds
confluence_queued_error_mails | Number of mails in error queued
confluence_queued_index_tasks | Number of indexing tasks queued
confluence_queued_mails | Number of mails queued
confluence_request_time | Average request time in seconds
confluence_requests | Number of requests per second

BIND​

Service Detection | Specific Check | Metrics
✅ | ❌ | ❌

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a BIND running outside a container
  - type: "bind"
    address: "127.0.0.1"
    port: 53

  # For an additional BIND running outside a container
  - type: "bind"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 54

  # For a BIND running in a Docker container
  - type: "bind"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 53

Only the metric from service check is produced:

Metric | Description
service_status | Status of BIND

Dovecot​

Service Detection | Specific Check | Metrics
✅ | ✅ | ❌

The agent uses an IMAP check if the service listens on port 143.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a Dovecot running outside a container
  - type: "dovecot"
    address: "127.0.0.1"
    port: 143 # IMAP listener, the agent doesn't support IMAPS here

  # For an additional Dovecot running outside a container
  - type: "dovecot"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 144 # IMAP listener, the agent doesn't support IMAPS here

  # For a Dovecot running in a Docker container
  - type: "dovecot"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 143 # IMAP listener, the agent doesn't support IMAPS here

Only the metric from service check is produced:

Metric | Description
service_status | Status of Dovecot

ejabberd​

Service Detection | Specific Check | Metrics
✅ | ❌ | ❌

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For an ejabberd running outside a container
  - type: "ejabberd"
    address: "127.0.0.1"
    port: 5672

  # For an additional ejabberd running outside a container
  - type: "ejabberd"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 5673

  # For an ejabberd running in a Docker container
  - type: "ejabberd"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 5672

Only the metric from service check is produced:

Metric | Description
service_status | Status of ejabberd

Elasticsearch​

Service Detection | Specific Check | Metrics
✅ | ✅ | ✅

The agent uses an HTTP check if the service listens on port 9200.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For an Elasticsearch running outside a container
  - type: "elasticsearch"
    address: "127.0.0.1"
    port: 9200

  # For an additional Elasticsearch running outside a container
  - type: "elasticsearch"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 9201

  # For an Elasticsearch running in a Docker container
  - type: "elasticsearch"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 9200

Agent gathers the following metrics:

Metric | Description
service_status | Status of Elasticsearch
elasticsearch_docs_count | Number of documents stored in all indices
elasticsearch_jvm_gc | Number of garbage collections per second
elasticsearch_jvm_gc_utilization | Garbage collection utilization in percent
elasticsearch_jvm_heap_used | Heap memory used in bytes
elasticsearch_jvm_non_heap_used | Non-heap memory used in bytes
elasticsearch_size | Size of all indices in bytes
elasticsearch_search | Number of searches in a shard per second
elasticsearch_search_time | Average time taken by searches in seconds
elasticsearch_cluster_docs_count | Number of documents stored in all indices of the cluster
elasticsearch_cluster_size | Size of all indices of the cluster in bytes

Exim4​

Service Detection | Specific Check | Metrics
✅ | ✅ | ✅

The agent uses an SMTP check if the service listens on port 25.

To enable metrics gathering, ensure the Bleemeo agent can run the mailq command. This usually means adding the following to your Exim configuration (e.g. /etc/exim4/conf.d/main/99_local):

queue_list_requires_admin=false

You will then need to refresh the configuration (warning: the following will lose any local changes in /etc/exim4/exim4.conf.template):

update-exim4.conf.template --run
update-exim4.conf
service exim4 restart

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For an Exim running outside a container
  - type: "exim"
    address: "127.0.0.1"
    port: 25

  # For an additional Exim running outside a container
  - type: "exim"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 26

  # For an Exim running in a Docker container
  - type: "exim"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 25

Agent gathers the following metrics:

Metric | Description
service_status | Status of Exim
exim_queue_size | Number of mails queued

Fail2ban​

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

The agent will detect Fail2ban on your server and check that the server process stays active.

If you installed Glouton using wget or a system package, Glouton gathers metrics with no further configuration. Otherwise, you need to allow Glouton to run fail2ban-client status as root. You can do so by adding the following to /etc/sudoers.d/glouton:

Cmnd_Alias FAIL2BAN = /usr/bin/fail2ban-client status, /usr/bin/fail2ban-client status *
glouton ALL=(root) NOEXEC: NOPASSWD: FAIL2BAN
Defaults!FAIL2BAN !logfile, !syslog, !pam_session

The following metrics are gathered:

Metric | Description
service_status | Status of Fail2ban
fail2ban_failed | Number of failed authentications
fail2ban_banned | Number of banned IPs
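As a rough illustration of where such numbers can come from, the sketch below runs fail2ban-client status for one jail through sudo (as allowed by the sudoers rule above) and extracts the current counters with a regular expression. The function names are hypothetical, and it is an assumption that the "Currently failed" and "Currently banned" lines are the ones behind fail2ban_failed and fail2ban_banned; the agent may use other values or aggregate several jails.

```python
import re
import subprocess


def parse_jail_status(output: str) -> dict:
    """Extract the current failed and banned counts from one jail's status."""
    values = {}
    for key, label in (("failed", "Currently failed"), ("banned", "Currently banned")):
        match = re.search(rf"{label}:\s*(\d+)", output)
        values[key] = int(match.group(1)) if match else 0
    return values


def jail_status(jail: str) -> dict:
    """Run `fail2ban-client status JAIL` as root without a password prompt."""
    result = subprocess.run(
        ["sudo", "-n", "/usr/bin/fail2ban-client", "status", jail],
        capture_output=True, text=True, check=True,
    )
    return parse_jail_status(result.stdout)


if __name__ == "__main__":
    print(jail_status("sshd"))  # "sshd" is only an example jail name
```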

FreeRADIUS​

Service Detection | Specific Check | Metrics
✅ | ❌ | ❌

HAProxy​

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

To enable metrics gathering, the Bleemeo agent needs access to the HAProxy stats page and needs to know where to find it.

On the HAProxy side, you need to enable the statistics page on an HTTP(S) frontend, for example:

frontend api-http
    bind 0.0.0.0:80
    stats enable
    stats uri /statistics

On the Bleemeo agent side, you need to configure the check by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a HAProxy running outside a container
  - type: "haproxy"
    address: "127.0.0.1"
    port: 80
    stats_url: "http://my-server/statistics"
    # For authenticated access, use
    # stats_url: "http://username:password@my-server/statistics"

  # For an additional HAProxy running outside a container
  - type: "haproxy"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 81
    stats_url: "http://my-server:81/statistics"
    # For authenticated access, use
    # stats_url: "http://username:password@my-server:81/statistics"

  # For a HAProxy running in a Docker container
  - type: "haproxy"
    instance: "CONTAINER_NAME"
    address: "127.0.0.1"
    port: 80
    stats_url: "http://my-server/statistics"
    # For authenticated access, use
    # stats_url: "http://username:password@my-server/statistics"

Agent gathers the following metrics:

Metric | Description
service_status | Status of HAProxy
haproxy_act | Number of active servers
haproxy_bin | Network traffic received from clients in bytes per second
haproxy_bout | Network traffic sent to clients in bytes per second
haproxy_ctime | Average time spent opening a connection in seconds
haproxy_dreq | Number of requests denied by HAProxy per second
haproxy_dresp | Number of responses denied by HAProxy per second
haproxy_econ | Number of failed connections per second
haproxy_ereq | Number of requests in error per second
haproxy_eresp | Number of responses in error per second
haproxy_qcur | Number of currently queued requests
haproxy_qtime | Average time requests spent in queue in seconds
haproxy_req_tot | Number of HTTP requests per second
haproxy_rtime | Average response time in seconds
haproxy_scur | Number of sessions opened
haproxy_stot | Number of sessions per second
haproxy_ttime | Average session time in seconds
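The metric names above mirror the column names of HAProxy's CSV statistics, which HAProxy serves when ";csv" is appended to the stats URI (for example http://my-server/statistics;csv). The following sketch parses such a CSV export; the function names are hypothetical, and the raw CSV values are cumulative counters or gauges, so turning them into the per-second rates listed above is left out.

```python
import csv
import io


def parse_haproxy_csv(text: str) -> list:
    """Parse HAProxy's CSV statistics into one dict per proxy/server row."""
    # The header line starts with "# "; strip it so DictReader sees clean names.
    reader = csv.DictReader(io.StringIO(text.lstrip("# ")))
    return [row for row in reader]


def numeric_fields(row: dict, names=("qcur", "scur", "stot", "bin", "bout")) -> dict:
    """Keep the selected columns that have a value, prefixed like the metrics above."""
    return {f"haproxy_{name}": int(row[name]) for name in names if row.get(name)}


if __name__ == "__main__":
    sample = (
        "# pxname,svname,qcur,scur,stot,bin,bout,\n"
        "api-http,FRONTEND,,3,120,4096,8192,\n"
    )
    for row in parse_haproxy_csv(sample):
        print(row["pxname"], row["svname"], numeric_fields(row))
```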

InfluxDB​

Service Detection | Specific Check | Metrics
✅ | ✅ | ❌

Agent uses an HTTP check if the service listens on port 8086.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For an InfluxDB running outside a container
  - type: "influxdb"
    address: "127.0.0.1"
    port: 8086

  # For an additional InfluxDB running outside a container
  - type: "influxdb"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 8087

  # For an InfluxDB running in a Docker container
  - type: "influxdb"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 8086

Only the metric from service check is produced:

Metric | Description
service_status | Status of InfluxDB

Jenkins​

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

To enable metric gathering, you need to configure the Jenkins URL and the credentials the agent should use to get information about the latest jobs.

Info: To create a Jenkins API token, click your name in the top right, then click Configure and Add new Token.

To configure your service, add the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a Jenkins running outside a container
  - type: "jenkins"
    address: "127.0.0.1"
    port: 8080
    # To enable metric gathering, the fields stats_url, username and password are required.
    # The other fields are optional and can be omitted.
    # Jenkins URL.
    stats_url: "http://jenkins.example.com"
    # Credentials used for authentication.
    username: my_user
    password: my_api_token
    # TLS configurations.
    ca_file: "/myca.pem"
    cert_file: "/mycert.pem"
    key_file: "/mykey.pem"
    # Skip chain and host verification.
    ssl_insecure: false
    # Choose jobs to include or exclude. When using both lists, exclude has priority.
    # Wildcards are supported: [ "jobA/*", "jobB/subjob1/*"]. If empty, all jobs are included.
    included_items: []
    excluded_items: []

  # For a Jenkins running in a Docker container
  - type: "jenkins"
    instance: "CONTAINER_NAME"
    port: 8080
    stats_url: "http://jenkins.example.com"
    username: my_user
    password: my_api_token
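
The include/exclude rules described in the configuration comments can be illustrated with a short Python sketch. It is not the agent's implementation: the function name job_is_monitored is hypothetical, and Python's fnmatch is used as a stand-in for the agent's wildcard matching (in fnmatch, "*" also matches "/", which may differ from the agent's behavior).

```python
from fnmatch import fnmatchcase


def job_is_monitored(job, included_items, excluded_items):
    """Return True if `job` passes the include/exclude lists.

    Hypothetical helper mirroring only the documented rules:
    - exclude has priority over include,
    - an empty include list means every job is included.
    """
    # Exclusion is checked first because it has priority.
    if any(fnmatchcase(job, pattern) for pattern in excluded_items):
        return False
    # No include list: all remaining jobs are monitored.
    if not included_items:
        return True
    return any(fnmatchcase(job, pattern) for pattern in included_items)
```

For instance, with included_items set to ["jobA/*"] and excluded_items set to ["jobA/nightly"], the job "jobA/build" would be kept and "jobA/nightly" would be dropped.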

When using Docker, you may use labels to set the configuration:

docker run --label glouton.stats_url="http://jenkins.example.com" --label glouton.username="my_user" [...]

The following metrics are gathered:

Metric | Description
service_status | Status of Jenkins
jenkins_busy_executors | Number of busy executors
jenkins_total_executors | Total number of executors (both busy and idle)
jenkins_job_duration_seconds | Job duration in seconds
jenkins_job_number | Number of times this job has been run
jenkins_job_result_code | Job result code (0 = SUCCESS, 1 = FAILURE, 2 = NOT_BUILD, 3 = UNSTABLE, 4 = ABORTED)
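
When reading jenkins_job_result_code in a script, the numeric value can be turned back into its label. The following Python sketch only encodes the mapping listed in the table; the names JenkinsResult and describe_result are illustrative and not part of the agent.

```python
from enum import IntEnum


class JenkinsResult(IntEnum):
    """Result codes as listed for jenkins_job_result_code."""
    SUCCESS = 0
    FAILURE = 1
    NOT_BUILD = 2
    UNSTABLE = 3
    ABORTED = 4


def describe_result(code):
    """Return the label for a metric value; raises ValueError if unknown."""
    return JenkinsResult(int(code)).name
```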

JIRA

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

To enable metrics gathering, Bleemeo agent needs to be installed with JMX enabled, see Java Metrics for setup details.

In addition, JIRA needs to expose JMX over a TCP port. To enable JMX, you need to:

  • Add the options -Dcom.sun.management.jmxremote.port=3333 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false to the JVM. The default JIRA startup script accepts them from the CATALINA_OPTS or JAVA_OPTS environment variable.

Warning: this will allow unauthenticated access. Make sure no untrusted access to this port is possible, or set up authenticated JMX access.

The Bleemeo agent should auto-detect the JMX port. If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a JIRA running outside a container
  - type: "jira"
    address: "127.0.0.1"
    port: 8080
    jmx_port: 3333
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

  # For an additional JIRA running outside a container
  - type: "jira"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 8081
    jmx_port: 3334
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

  # For a JIRA running in a Docker container
  - type: "jira"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 8080
    jmx_port: 3333
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

Agent gathers the following metrics:

Metric | Description
service_status | Status of JIRA
jira_jvm_gc | Number of garbage collections per second
jira_jvm_gc_utilization | Garbage collection utilization in percent
jira_jvm_heap_used | Heap memory used in bytes
jira_jvm_non_heap_used | Non-heap memory used in bytes
jira_request_time | Average request time in seconds
jira_requests | Number of requests per second

Kafka

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

To enable metrics gathering, Bleemeo agent needs to be installed with JMX enabled, see Java Metrics for setup details.

In addition, Kafka needs to expose JMX over a TCP port.

If you are using Docker, add the environment variable KAFKA_JMX_PORT=1234, see the documentation for details.

In other cases, you need to export some environment variables before running kafka-server-start.sh:

export JMX_PORT=1234
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=127.0.0.1"

Warning: this will allow unauthenticated access. Make sure no untrusted access to this port is possible, or set up authenticated JMX access.

The Bleemeo agent should auto-detect the JMX port. If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a Kafka running outside a container
  - type: "kafka"
    address: "127.0.0.1"
    port: 9092
    jmx_port: 1099
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

  # For an additional Kafka running outside a container
  - type: "kafka"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 9092
    jmx_port: 1099
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

  # For a Kafka running in a Docker container
  - type: "kafka"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 9092
    jmx_port: 1099
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

The following metrics are gathered:

Metric | Description
service_status | Status of Kafka
kafka_jvm_gc | Number of garbage collections per second
kafka_jvm_gc_utilization | Garbage collection utilization in percent
kafka_jvm_heap_used | Heap memory used in bytes
kafka_jvm_non_heap_used | Non-heap memory used in bytes
kafka_topics_count | Total number of topics
kafka_fetch_requests_sum | Total number of fetch requests per second
kafka_fetch_time_average | Average time to process a fetch request in seconds
kafka_produce_requests_sum | Total number of produce requests per second
kafka_produce_time_average | Average time to process a produce request in seconds

Bleemeo also supports detailed monitoring of specific Kafka topics. To enable this, add the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a Kafka running outside a container
  - type: "kafka"
    detailed_items:
      - "topic1"
      - "topic2"
    [...]

  # For a Kafka running in a Docker container
  - type: "kafka"
    instance: "CONTAINER_NAME"
    detailed_items:
      - "topic1"
      - "topic2"
    [...]

When using Docker, you may use labels to set the detailed topics:

docker run --label glouton.detailed_items="topic1,topic2" [...]

The following per-topic metrics will be gathered:

Metric | Description
kafka_fetch_requests | Number of fetch requests per second
kafka_produce_requests | Number of produce requests per second

If you want to enable more JMX metrics, you can add custom JMX metrics. A list of all the available metrics is available here.

libvirtd

Service Detection | Specific Check | Metrics
✅ | ❌ | ❌

Memcached

Service Detection | Specific Check | Metrics
✅ | ✅ | ✅

Agent uses a Memcached check if the service listens on port 11211.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a Memcached running outside a container
  - type: "memcached"
    address: "127.0.0.1"
    port: 11211

  # For an additional Memcached running outside a container
  - type: "memcached"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 11212

  # For a Memcached running in a Docker container
  - type: "memcached"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 11211

Agent gathers the following metrics:

Metric | Description
service_status | Status of Memcached
memcached_command_get | Number of get requests per second
memcached_command_set | Number of set requests per second
memcached_connections_current | Number of client connections to Memcached
memcached_items_current | Current number of items stored
memcached_octets_rx | Network traffic received by Memcached in bytes per second
memcached_octets_tx | Network traffic sent by Memcached in bytes per second
memcached_ops_cas_hits | Number of successful CAS (Check-And-Set) requests per second
memcached_ops_cas_misses | Number of CAS (Check-And-Set) requests against missing keys per second
memcached_ops_decr_hits | Number of successful decr requests per second
memcached_ops_decr_misses | Number of decr requests against missing keys per second
memcached_ops_delete_hits | Number of successful delete requests per second
memcached_ops_delete_misses | Number of delete requests against missing keys per second
memcached_ops_evictions | Number of valid items removed from cache to free memory for new items per second
memcached_ops_get_hits | Number of successful get requests per second
memcached_ops_get_misses | Number of get requests against missing keys per second
memcached_ops_incr_hits | Number of successful incr requests per second
memcached_ops_incr_misses | Number of incr requests against missing keys per second
memcached_ops_touch_hits | Number of successful touch requests per second
memcached_ops_touch_misses | Number of touch requests against missing keys per second
memcached_percent_hitratio | Hit ratio of get requests in percent
memcached_ps_count_threads | Number of worker threads
memcached_uptime | Time elapsed since the Memcached server started, in seconds
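
To make the relationship between the get counters and memcached_percent_hitratio concrete, here is a minimal Python sketch. It assumes the usual definition of a hit ratio (hits divided by hits plus misses); the exact formula and time window used by the agent are not stated on this page, and the function name is illustrative.

```python
def hit_ratio_percent(get_hits, get_misses):
    """Hit ratio of get requests in percent.

    Assumes ratio = hits / (hits + misses); returns 0.0 when no
    get request was observed, to avoid a division by zero.
    """
    total = get_hits + get_misses
    if total == 0:
        return 0.0
    return 100.0 * get_hits / total
```

With 90 hits and 10 misses over the same period, this gives a hit ratio of 90 percent.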

MongoDB

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a MongoDB running outside a container
  - type: "mongodb"
    address: "127.0.0.1"
    port: 27017

  # For an additional MongoDB running outside a container
  - type: "mongodb"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 27018

  # For a MongoDB running in a Docker container
  - type: "mongodb"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 27017

Agent gathers the following metrics:

Metric | Description
service_status | Status of MongoDB
mongodb_open_connections | Number of client connections to the MongoDB server
mongodb_net_in_bytes | Network traffic received by MongoDB in bytes per second
mongodb_net_out_bytes | Network traffic sent by MongoDB in bytes per second
mongodb_queued_reads | Number of clients waiting to read data from MongoDB
mongodb_queued_writes | Number of clients waiting to write data to MongoDB
mongodb_active_reads | Number of clients performing read operations
mongodb_active_writes | Number of clients performing write operations
mongodb_queries | Number of queries per second

Mosquitto

Service Detection | Specific Check | Metrics
✅ | ❌ | ❌

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a Mosquitto running outside a container
  - type: "mosquitto"
    address: "127.0.0.1"
    port: 1883 # MQTT listener; the agent doesn't support MQTT-SSL here

  # For an additional Mosquitto running outside a container
  - type: "mosquitto"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 1884 # MQTT listener; the agent doesn't support MQTT-SSL here

  # For a Mosquitto running in a Docker container
  - type: "mosquitto"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 1883 # MQTT listener; the agent doesn't support MQTT-SSL here

Only the metric from service check is produced:

Metric | Description
service_status | Status of Mosquitto

MySQL

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

To enable metrics gathering, credentials are required. Bleemeo agent will find the credentials if:

  • MySQL and the agent are running on Ubuntu or Debian
  • MySQL is running in a Docker container and root password is set through the environment variable MYSQL_ROOT_PASSWORD

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a MySQL running outside a container
  - type: "mysql"
    username: "USERNAME"
    password: "PASSWORD"
    address: "127.0.0.1"
    # If set to an existing socket file, Glouton will prefer the Unix socket to
    # connect to MySQL and gather metrics.
    # The service check will continue to use the TCP address.
    metrics_unix_socket: "/var/run/mysqld/mysqld.sock"
    port: 3306

  # For an additional MySQL running outside a container
  - type: "mysql"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    username: "USERNAME"
    password: "PASSWORD"
    address: "127.0.0.1"
    # If set to an existing socket file, Glouton will prefer the Unix socket to
    # connect to MySQL and gather metrics.
    # The service check will continue to use the TCP address.
    metrics_unix_socket: "/var/run/mysqld/mysqld2.sock"
    port: 3307

  # For a MySQL running in a Docker container
  - type: "mysql"
    instance: "CONTAINER_NAME"
    username: "USERNAME"
    password: "PASSWORD"
    address: "172.17.0.2"
    port: 3306

Agent gathers the following metrics:

Metric | Description
service_status | Status of MySQL
mysql_cache_result_qcache_hits | Number of query cache hits per second
mysql_cache_result_qcache_inserts | Number of queries added to the query cache per second
mysql_cache_result_qcache_not_cached | Number of uncacheable queries per second
mysql_cache_result_qcache_prunes | Number of queries deleted from the query cache because of low memory per second
mysql_cache_blocksize_qcache | Number of blocks in the query cache
mysql_cache_free_blocks | Number of free memory blocks in the query cache
mysql_cache_free_memory | Amount of free memory for the query cache in bytes
mysql_cache_size_qcache | Number of queries registered in the query cache
mysql_locks_immediate | Number of table locks that could be granted immediately per second
mysql_locks_waited | Number of table locks that could not be granted immediately per second
mysql_innodb_history_list_len | Size of the InnoDB transaction history list
mysql_innodb_locked_transaction | Number of InnoDB transactions currently locked
mysql_octets_rx | Network traffic received from clients in bytes per second
mysql_octets_tx | Network traffic sent to clients in bytes per second
mysql_queries | Number of queries per second
mysql_slow_queries | Number of slow queries per second
mysql_threads_cached | Number of threads in the thread cache
mysql_threads_connected | Number of currently open connections
mysql_threads_running | Number of threads that are not sleeping
mysql_total_threads_created | Number of threads created per second
mysql_commands_begin | Number of "BEGIN" statements executed per second
mysql_commands_binlog | Number of "BINLOG" statements executed per second
mysql_commands_call_procedure | Number of "CALL PROCEDURE" statements executed per second
mysql_commands_change_master | Number of "CHANGE MASTER" statements executed per second
mysql_commands_change_repl_filter | Number of "CHANGE REPL FILTER" statements executed per second
mysql_commands_check | Number of "CHECK TABLE" statements executed per second
mysql_commands_checksum | Number of "CHECKSUM TABLE" statements executed per second
mysql_commands_commit | Number of "COMMIT" statements executed per second
mysql_commands_dealloc_sql | Number of "DEALLOCATE PREPARE" statements executed per second
mysql_commands_stmt_close | Number of "DEALLOCATE PREPARE" statements executed per second
mysql_commands_delete_multi | Number of multi-table "DELETE" statements executed per second
mysql_commands_delete | Number of "DELETE" statements executed per second
mysql_commands_do | Number of "DO" statements executed per second
mysql_commands_execute_sql | Number of "EXECUTE" statements executed per second
mysql_commands_stmt_execute | Number of "EXECUTE" statements executed per second
mysql_commands_explain_other | Number of "EXPLAIN FOR CONNECTION" statements executed per second
mysql_commands_flush | Number of "FLUSH" statements executed per second
mysql_commands_ha_close | Number of "HA CLOSE" statements executed per second
mysql_commands_ha_open | Number of "HA OPEN" statements executed per second
mysql_commands_ha_read | Number of "HA READ" statements executed per second
mysql_commands_insert_select | Number of "INSERT ... SELECT" statements executed per second
mysql_commands_insert | Number of "INSERT" statements executed per second
mysql_commands_kill | Number of "KILL" statements executed per second
mysql_commands_preload_keys | Number of "LOAD INDEX INTO CACHE" statements executed per second
mysql_commands_load | Number of "LOAD" statements executed per second
mysql_commands_lock_tables | Number of "LOCK TABLES" statements executed per second
mysql_commands_optimize | Number of "OPTIMIZE" statements executed per second
mysql_commands_prepare_sql | Number of "PREPARE" statements executed per second
mysql_commands_stmt_prepare | Number of "PREPARE" statements executed per second
mysql_commands_purge_before_date | Number of "PURGE BEFORE DATE" statements executed per second
mysql_commands_purge | Number of "PURGE" statements executed per second
mysql_commands_release_savepoint | Number of "RELEASE SAVEPOINT" statements executed per second
mysql_commands_repair | Number of "REPAIR" statements executed per second
mysql_commands_replace_select | Number of "REPLACE SELECT" statements executed per second
mysql_commands_replace | Number of "REPLACE" statements executed per second
mysql_commands_reset | Number of "RESET" statements executed per second
mysql_commands_resignal | Number of "RESIGNAL" statements executed per second
mysql_commands_rollback_to_savepoint | Number of "ROLLBACK TO SAVEPOINT" statements executed per second
mysql_commands_rollback | Number of "ROLLBACK" statements executed per second
mysql_commands_savepoint | Number of "SAVEPOINT" statements executed per second
mysql_commands_select | Number of "SELECT" statements executed per second
mysql_commands_signal | Number of "SIGNAL" statements executed per second
mysql_commands_slave_start | Number of "START SLAVE" statements executed per second
mysql_commands_group_replication_start | Number of "START" statements for group replication executed per second
mysql_commands_stmt_fetch | Number of "STMT FETCH" statements executed per second
mysql_commands_stmt_reprepare | Number of "STMT REPREPARE" statements executed per second
mysql_commands_stmt_reset | Number of "STMT RESET" statements executed per second
mysql_commands_stmt_send_long_data | Number of "STMT SEND LONG DATA" statements executed per second
mysql_commands_slave_stop | Number of "STOP SLAVE" statements executed per second
mysql_commands_group_replication_stop | Number of "STOP" statements for group replication executed per second
mysql_commands_truncate | Number of "TRUNCATE" statements executed per second
mysql_commands_unlock_tables | Number of "UNLOCK TABLES" statements executed per second
mysql_commands_update_multi | Number of multi-table "UPDATE" statements executed per second
mysql_commands_update | Number of "UPDATE" statements executed per second
mysql_commands_xa_commit | Number of "XA COMMIT" statements executed per second
mysql_commands_xa_end | Number of "XA END" statements executed per second
mysql_commands_xa_prepare | Number of "XA PREPARE" statements executed per second
mysql_commands_xa_recover | Number of "XA RECOVER" statements executed per second
mysql_commands_xa_rollback | Number of "XA ROLLBACK" statements executed per second
mysql_commands_xa_start | Number of "XA START" statements executed per second
mysql_commands_assign_to_keycache | Number of assign to keycache commands per second
mysql_handler_commit | Number of internal commit requests per second
mysql_handler_delete | Number of rows deleted from tables per second
mysql_handler_write | Number of rows inserted per second
mysql_handler_update | Number of rows updated per second
mysql_handler_rollback | Number of transaction rollback requests given to a storage engine per second
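
Many of the metrics above are expressed "per second", while MySQL itself reports most of these values as counters that only increase. As an illustration only, and assuming the rate is derived from two successive counter samples (this page does not describe how the agent computes it), a per-second rate could be obtained like this; the function name is hypothetical.

```python
def per_second_rate(prev_value, prev_time, cur_value, cur_time):
    """Rate of a monotonically increasing counter between two samples.

    Times are in seconds. Returns None when the rate cannot be computed:
    no elapsed time, or a counter that went backwards (for example after
    a server restart).
    """
    elapsed = cur_time - prev_time
    if elapsed <= 0 or cur_value < prev_value:
        return None
    return (cur_value - prev_value) / elapsed
```

A counter going from 1000 to 1600 over 60 seconds corresponds to 10 events per second.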

NATS

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

To enable metrics gathering, you need to enable the NATS monitoring endpoint on your server, see NATS Server Monitoring for details.

By default, the agent assumes that the monitoring endpoint is enabled on port 8222. If your server uses another port, you can change it in the configuration.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a NATS running outside a container
  - type: "nats"
    address: "127.0.0.1"
    port: 4222
    stats_port: 8222

  # For an additional NATS running outside a container
  - type: "nats"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 4222
    stats_port: 8222

  # For a NATS running in a Docker container
  - type: "nats"
    instance: "CONTAINER_NAME"
    port: 4222
    stats_port: 8222

The following metrics are gathered:

Metric | Description
service_status | Status of NATS
nats_uptime | Time elapsed since the NATS server started, in nanoseconds
nats_routes | Number of registered routes
nats_slow_consumers | Number of slow consumers
nats_subscriptions | Number of active subscriptions
nats_in_bytes | Amount of incoming bytes
nats_out_bytes | Amount of outgoing bytes
nats_in_msgs | Number of incoming messages
nats_out_msgs | Number of outgoing messages
nats_connections | Number of currently active clients
nats_total_connections | Total number of created clients
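
Since nats_uptime is expressed in nanoseconds, it usually needs converting before being displayed. A minimal Python sketch follows; the function name is illustrative and sub-microsecond precision is deliberately dropped.

```python
from datetime import timedelta


def uptime_from_nanoseconds(nats_uptime):
    """Convert a nats_uptime value (nanoseconds) to a timedelta.

    timedelta has microsecond resolution, so the value is truncated
    to whole microseconds.
    """
    return timedelta(microseconds=int(nats_uptime) // 1_000)
```

For example, a value of 3600000000000 corresponds to one hour of uptime.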

NFS

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

Metrics are gathered using /proc/self/mountstats.

NFS should be detected automatically on your server, but you can manually enable it by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  - type: "nfs"

The following metrics are gathered:

Metric | Description
nfs_ops | The number of operations executed per second
nfs_transmitted_bits | The data exchanged in bits/s
nfs_rtt_per_op_seconds | The average round-trip time per operation in seconds
nfs_retrans | The number of times an operation had to be retried per second

Nginx

Service Detection | Specific Check | Metrics
✅ | ✅ | ✅

Agent uses an HTTP check if the service listens on port 80.

To enable metrics gathering, ensure Bleemeo agent can access nginx status on the URL http://server-address/nginx_status. This usually means you should add the following to your site definition (e.g. /etc/nginx/sites-enabled/default):

location /nginx_status {
    stub_status on;
}

If your nginx is not built with stub_status, or if you need more information about nginx stub_status, see https://nginx.org/en/docs/http/ngx_http_stub_status_module.html

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For an nginx running outside a container
  - type: "nginx"
    address: "127.0.0.1"
    port: 80 # HTTP listener; the agent doesn't support HTTPS here
    http_path: "/" # This is the path used for the availability check, not for metrics gathering.
    http_host: "127.0.0.1:80" # Host header sent. Like other options, it can be omitted and the default value will be used.

  # For an additional nginx running outside a container
  - type: "nginx"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 81 # HTTP listener; the agent doesn't support HTTPS here
    http_path: "/" # This is the path used for the availability check, not for metrics gathering.
    http_host: "127.0.0.1:81" # Host header sent. Like other options, it can be omitted and the default value will be used.

  # For an nginx running in a Docker container
  - type: "nginx"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 80 # HTTP listener; the agent doesn't support HTTPS here
    http_path: "/" # This is the path used for the availability check, not for metrics gathering.
    http_host: "127.0.0.1:80" # Host header sent. Like other options, it can be omitted and the default value will be used.

When using Docker, you may use Docker labels to set http_path and http_host:

docker run --label glouton.http_path="/readiness" --label glouton.http_host="my-host" [...]

Agent gathers the following metrics:

Metric | Description
service_status | Status of nginx
nginx_requests | Number of requests per second
nginx_connections_accepted | Number of client connections established per second
nginx_connections_handled | Number of client connections processed per second
nginx_connections_active | Number of client connections to nginx server
nginx_connections_waiting | Number of idle client connections waiting for a request
nginx_connections_reading | Number of client connections where nginx is reading the request header
nginx_connections_writing | Number of client connections where nginx is writing the response
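
To see where these values come from, the text returned by the stub_status page can be parsed with a few lines of Python. This is a sketch, not the agent's code: the sample response and the function name parse_stub_status are illustrative, and the accepts, handled and requests fields are cumulative counters, so the per-second metrics above would be derived from successive samples.

```python
import re

# Illustrative response body in the format produced by stub_status.
SAMPLE_STUB_STATUS = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Extract gauges and cumulative counters from a stub_status body."""
    active = int(re.search(r"Active connections:\s+(\d+)", text).group(1))
    # The counters line holds exactly three integers: accepts, handled, requests.
    counters = re.search(r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE)
    accepts, handled, requests = (int(value) for value in counters.groups())
    states = re.search(r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text)
    reading, writing, waiting = (int(value) for value in states.groups())
    return {
        "active": active,
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }
```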

NTP

Service Detection | Specific Check | Metrics
✅ | ✅ | ❌

Agent uses an NTP check if the service listens on port 123.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For an NTP running outside a container
  - type: "ntp"
    address: "127.0.0.1"
    port: 123

  # For an additional NTP running outside a container
  - type: "ntp"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 124

  # For an NTP running in a Docker container
  - type: "ntp"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 123

Only the metric from service check is produced:

Metric | Description
service_status | Status of NTP

OpenLDAP

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

To enable metrics gathering, you need to enable the slapd monitoring backend, and to set a bind CN and password in the agent configuration so it can access the metrics.

If some auto-detected parameters are wrong or you want to configure metric gathering, you can add the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  # For an OpenLDAP running outside a container
  - type: "openldap"
    address: "127.0.0.1"
    port: 389
    # DN/password to bind with. If username is empty, an anonymous bind is performed.
    username: "cn=admin,dc=example,dc=org"
    password: "adminpassword"
    # Use ldaps. Note that the port will likely need to be changed to 636.
    ssl: true
    # Use StartTLS (ssl and starttls are mutually exclusive; enable only one of them).
    starttls: true
    # Don't verify the host certificate.
    ssl_insecure: true
    # Path to the PEM-encoded root certificate used to verify the server certificate.
    ca_file: "/myca"

  # For an additional OpenLDAP running outside a container
  - type: "openldap"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 390
    username: "cn=admin,dc=example,dc=org"
    password: "adminpassword"

  # For an OpenLDAP running in a Docker container
  - type: "openldap"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 389
    username: "cn=admin,dc=example,dc=org"
    password: "adminpassword"

The following metrics are gathered:

Metric | Description
service_status | Status of OpenLDAP
openldap_connections_current | Current number of active connections
openldap_waiters_read | Number of threads blocked waiting to read data from a client
openldap_waiters_write | Number of threads blocked waiting to write data to a client
openldap_threads_active | Threads (operations) currently active in slapd
openldap_statistics_bytes | Outgoing bytes per second
openldap_statistics_entries | Outgoing entries per second
openldap_operations_add_completed | Number of add operations per second
openldap_operations_bind_completed | Number of bind operations per second
openldap_operations_delete_completed | Number of delete operations per second
openldap_operations_modify_completed | Number of modify operations per second
openldap_operations_search_completed | Number of search operations per second

OpenVPN

Service Detection | Specific Check | Metrics
✅ | ❌ | ❌

Only the metric from service check is produced:

Metric | Description
service_status | Status of OpenVPN

PHP-FPM

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

To enable metrics gathering, the Bleemeo agent needs access to the PHP-FPM status page and must know where to find it.

PHP-FPM needs to expose its status. For example the pool configuration should include:

pm.status_path = /status

By default, the Bleemeo agent tries fcgi://<fpm-address>:<fpm-port>/status, i.e. it uses FastCGI over the TCP port on which PHP-FPM listens, with the "/status" path.

If this does not match your configuration, you will need to override the "stats_url" parameter by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  # For a PHP-FPM running outside a container
  - type: "phpfpm"
    address: "127.0.0.1"
    port: 9000
    stats_url: "fcgi://127.0.0.1:9000/status"
    # For UNIX socket access, use
    # stats_url: "/var/run/php5-fpm.sock"
    # See below for additional note on UNIX socket permission

  # For an additional PHP-FPM running outside a container
  - type: "phpfpm"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 9001
    stats_url: "fcgi://127.0.0.1:9001/status"
    # For UNIX socket access, use
    # stats_url: "/var/run/php5-fpm2.sock"
    # See below for additional note on UNIX socket permission

  # For a PHP-FPM running in a Docker container
  - type: "phpfpm"
    instance: "CONTAINER_NAME"
    port: 9000
    stats_url: "fcgi://my-server:9000/status"

If using a UNIX socket, make sure the Glouton user has permission to access the socket. This usually means the Glouton user should be a member of the group running php-fpm, which is www-data on Debian/Ubuntu systems. Therefore, running sudo adduser glouton www-data and restarting the Bleemeo agent will grant access.

Agent gathers the following metrics:

Metric | Description
service_status | Status of PHP-FPM
phpfpm_accepted_conn | Number of requests per second
phpfpm_active_processes | Number of active processes
phpfpm_idle_processes | Number of idle processes
phpfpm_listen_queue | Number of requests in the queue of pending connections
phpfpm_listen_queue_len | Size of the queue of pending connections
phpfpm_max_active_processes | Maximum number of active processes since FPM started
phpfpm_max_children_reached | Number of times the process limit has been reached
phpfpm_max_listen_queue | Maximum number of requests in the queue of pending connections since FPM started
phpfpm_slow_requests | Number of slow requests per second
phpfpm_start_since | Time elapsed since PHP-FPM started, in seconds
phpfpm_total_processes | Number of idle + active processes
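
The metric names above follow the field names of the PHP-FPM status page closely. The Python sketch below shows that correspondence on an illustrative status output; the sample text and the function name parse_fpm_status are assumptions for the example, and it does not reproduce how the agent converts accepted conn and slow requests into per-second rates.

```python
# Illustrative status page body (plain-text format, values made up).
SAMPLE_FPM_STATUS = """pool:                 www
process manager:      dynamic
start since:          1234
accepted conn:        12073
listen queue:         0
max listen queue:     1
listen queue len:     128
idle processes:       2
active processes:     1
total processes:      3
max active processes: 2
max children reached: 0
slow requests:        0
"""


def parse_fpm_status(text):
    """Map numeric fields of the status page to phpfpm_* style names."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Keep only "name: <integer>" lines; text fields such as the pool
        # name or the process manager are skipped.
        if sep and value.isdigit():
            metrics["phpfpm_" + key.strip().replace(" ", "_")] = int(value)
    return metrics
```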

Postfix

Service Detection | Specific Check | Metrics
✅ | ✅ | ✅

Agent uses an SMTP check if the service listens on port 25.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]

  # For a Postfix running outside a container
  - type: "postfix"
    address: "127.0.0.1"
    port: 25

  # For an additional Postfix running outside a container
  - type: "postfix"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 26

  # For a Postfix running in a Docker container
  - type: "postfix"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 25

Agent gathers the following metrics:

Metric | Description
service_status | Status of Postfix
postfix_queue_size | Number of mails queued

PostgreSQL

Service Detection | Specific Check | Metrics
✅ | ❌ | ✅

To enable metrics gathering, credentials are required. The agent will find the credentials by itself if PostgreSQL is running in a Docker container and username/password are set through the environment variables POSTGRES_USER (which defaults to postgres) and POSTGRES_PASSWORD.

By default, only sum metrics are gathered. If you want to monitor specific databases, add them to the detailed_items setting.

If some auto-detected parameters are wrong, or you want to monitor specific databases, you can add the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  # For a PostgreSQL running outside a container
  - type: "postgresql"
    username: "USERNAME"
    password: "PASSWORD"
    address: "127.0.0.1"
    port: 5432

  # For an additional PostgreSQL running outside a container
  - type: "postgresql"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    username: "USERNAME"
    password: "PASSWORD"
    address: "127.0.0.1"
    port: 5433

  # For a PostgreSQL running in a Docker container.
  - type: "postgresql"
    instance: "CONTAINER_NAME"
    username: "USERNAME"
    password: "PASSWORD"
    address: "172.17.0.2"
    port: 5432
    # Monitor the databases called "bleemeo" and "postgres".
    detailed_items:
      - bleemeo
      - postgres

When using Docker, you may use labels to set the databases to monitor and the credentials:

docker run --label glouton.detailed_items="bleemeo,postgres" --label glouton.username="USERNAME" --label glouton.password="PASSWORD" [...]

Agent gathers the following metrics:

Metric | Description
service_status | Status of PostgreSQL
postgresql_blk_read_utilization | PostgreSQL reading data file blocks utilization
postgresql_blk_read_utilization_sum | Sum of PostgreSQL reading data file blocks utilization
postgresql_blk_write_utilization | PostgreSQL writing data file blocks utilization
postgresql_blk_write_utilization_sum | Sum of PostgreSQL writing data file blocks utilization
postgresql_blks_hit_sum | Number of blocks read from PostgreSQL cache per second
postgresql_blks_read_sum | Number of blocks read from disk per second
postgresql_commit_sum | Number of commits per second
postgresql_rollback_sum | Number of rollbacks per second
postgresql_temp_bytes_sum | Temporary file write throughput in bytes per second
postgresql_temp_files_sum | Number of temporary files created per second
postgresql_tup_deleted_sum | Number of rows deleted per second
postgresql_tup_fetched_sum | Number of rows fetched per second
postgresql_tup_inserted_sum | Number of rows inserted per second
postgresql_tup_returned_sum | Number of rows returned per second
postgresql_tup_updated_sum | Number of rows updated per second

Except for the status metric, all metrics also exist without the _sum suffix if detailed_items is used. For instance, the metric postgresql_commit with the item mycontainer_mydb corresponds to the number of commits per second on the database mydb running in the container mycontainer.
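The relationship between per-database metrics and their _sum counterpart can be illustrated with a minimal Python sketch. It is not the agent's code: the helper names item_name and sum_metric are hypothetical, the sample values are invented, and it assumes that the item is simply the database name when PostgreSQL does not run in a container and that the _sum value is the plain sum over all databases.

```python
from typing import Dict, Optional


def item_name(database: str, container: Optional[str] = None) -> str:
    """Build the item as "<container>_<database>".

    Returning the bare database name when there is no container is an
    assumption made for this sketch.
    """
    return f"{container}_{database}" if container else database


def sum_metric(per_database: Dict[str, float]) -> float:
    """Aggregate per-database rates into one value (assumed meaning of _sum)."""
    return sum(per_database.values())


# Hypothetical commit rates (commits per second) for two databases.
commits = {
    item_name("bleemeo", "mycontainer"): 12.5,
    item_name("postgres", "mycontainer"): 0.5,
}
print(sorted(commits))      # items carried by postgresql_commit
print(sum_metric(commits))  # value that postgresql_commit_sum would take here
```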

RabbitMQ​

| Service Detection | Specific Check | Metrics |
| --- | --- | --- |
| ✅ | ✅ | ✅ |

The agent uses an AMQP check if the service listens on port 5672.

To enable metrics gathering, credentials are required. By default "guest/guest" is used.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  # For a RabbitMQ running outside a container
  - type: "rabbitmq"
    username: "USERNAME"
    password: "PASSWORD"
    address: "127.0.0.1"
    port: 5672 # Port of AMQP service
    stats_port: 15672 # Port of RabbitMQ management interface

  # For an additional RabbitMQ running outside a container
  - type: "rabbitmq"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    username: "USERNAME"
    password: "PASSWORD"
    address: "127.0.0.1"
    port: 5673 # Port of AMQP service
    stats_port: 15673 # Port of RabbitMQ management interface

  # For a RabbitMQ running in a Docker container
  - type: "rabbitmq"
    instance: "CONTAINER_NAME"
    username: "USERNAME"
    password: "PASSWORD"
    address: "172.17.0.2"
    port: 5672 # Port of AMQP service
    stats_port: 15672 # Port of RabbitMQ management interface

Agent gathers the following metrics:

| Metric | Description |
| --- | --- |
| service_status | Status of RabbitMQ |
| rabbitmq_connections | Number of client connections to RabbitMQ server |
| rabbitmq_consumers | Number of consumers |
| rabbitmq_messages_acked | Number of messages acknowledged per second |
| rabbitmq_messages_count | Number of messages |
| rabbitmq_messages_delivered | Number of messages delivered per second |
| rabbitmq_messages_published | Number of messages published per second |
| rabbitmq_messages_unacked_count | Number of messages waiting for an acknowledgement from a consumer |
| rabbitmq_queues | Number of queues |
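To check that the credentials and stats_port are right, you can query the RabbitMQ management interface yourself. The following Python sketch reads the /api/overview endpoint of the management HTTP API and extracts a few counters. The function names are hypothetical, and the mapping from API fields to the metric names above is an assumption made for illustration, not a description of how the agent computes them.

```python
import base64
import json
import urllib.request


def fetch_overview(address: str, stats_port: int, username: str, password: str) -> dict:
    """Fetch /api/overview from the RabbitMQ management interface."""
    request = urllib.request.Request(f"http://{address}:{stats_port}/api/overview")
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


def summarize_overview(overview: dict) -> dict:
    """Pick the gauge-like counters; the metric name mapping is assumed."""
    objects = overview.get("object_totals", {})
    queues = overview.get("queue_totals", {})
    return {
        "rabbitmq_connections": objects.get("connections", 0),
        "rabbitmq_consumers": objects.get("consumers", 0),
        "rabbitmq_queues": objects.get("queues", 0),
        "rabbitmq_messages_count": queues.get("messages", 0),
        "rabbitmq_messages_unacked_count": queues.get("messages_unacknowledged", 0),
    }
```

For example, summarize_overview(fetch_overview("127.0.0.1", 15672, "guest", "guest")) should return the counters if the management plugin is enabled and the default credentials are still valid.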

Redis​

| Service Detection | Specific Check | Metrics |
| --- | --- | --- |
| ✅ | ✅ | ✅ |

The agent uses a Redis check if the service listens on port 6379.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  # For a Redis running outside a container
  - type: "redis"
    address: "127.0.0.1"
    port: 6379
    password: "password-if-configured"

  # For an additional Redis running outside a container
  - type: "redis"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 6380
    password: "password-if-configured"

  # For a Redis running in a Docker container
  - type: "redis"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 6379
    password: "password-if-configured"

Agent gathers the following metrics:

| Metric | Description |
| --- | --- |
| service_status | Status of Redis |
| redis_current_connections_clients | Number of connected clients |
| redis_current_connections_slaves | Number of connected slaves |
| redis_evicted_keys | Number of keys evicted due to the maxmemory limit per second |
| redis_expired_keys | Number of key expiration events per second |
| redis_keyspace_hits | Number of successful key lookups per second |
| redis_keyspace_misses | Number of failed key lookups per second |
| redis_keyspace_hitrate | Hit ratio of key lookups in percent |
| redis_memory | Memory allocated by Redis in bytes |
| redis_memory_lua | Memory used by Lua engine in bytes |
| redis_memory_peak | Peak memory consumed by Redis in bytes |
| redis_memory_rss | Memory used by Redis as seen by system in bytes |
| redis_pubsub_channels | Global number of pub/sub channels with client subscriptions |
| redis_pubsub_patterns | Global number of pub/sub patterns with client subscriptions |
| redis_total_connections | Number of connections (client or slave) per second |
| redis_total_operations | Number of commands processed by the server per second |
| redis_uptime | Time spent since Redis server start in seconds |
| redis_volatile_changes | Number of changes since the last dump |
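As an illustration of how a hit ratio relates to the hit and miss counters, here is a small Python sketch. The formula hits / (hits + misses) is the usual definition of a cache hit ratio and is assumed here. The exact computation done by the agent is not documented on this page, and returning None when there are no lookups is a choice made for this example.

```python
from typing import Optional


def keyspace_hit_rate(hits: float, misses: float) -> Optional[float]:
    """Return the hit ratio in percent, or None when there were no lookups."""
    lookups = hits + misses
    if lookups == 0:
        return None  # the ratio is undefined without any lookup
    return 100.0 * hits / lookups


# 90 successful lookups and 10 failed lookups give a 90 % hit ratio.
print(keyspace_hit_rate(90, 10))
```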

Salt​

| Service Detection | Specific Check | Metrics |
| --- | --- | --- |
| ✅ | ❌ | ❌ |

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  # For a Salt master running outside a container
  - type: "salt"
    address: "127.0.0.1"
    port: 4505

  # For an additional Salt master running outside a container
  - type: "salt"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 4505

  # For a Salt master running in a Docker container
  - type: "salt"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 4505

Only the metric from the service check is produced:

| Metric | Description |
| --- | --- |
| service_status | Status of Salt |

Squid3​

| Service Detection | Specific Check | Metrics |
| --- | --- | --- |
| ✅ | ✅ | ❌ |

The agent uses an HTTP check if the service listens on port 3128.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  # For a Squid running outside a container
  - type: "squid"
    address: "127.0.0.1"
    port: 3128
    http_path: "/" # Path used for the availability check, not for metrics gathering.
    http_host: "127.0.0.1:80" # Host header sent. Like the other options, it can be omitted and the default value is used.

  # For an additional Squid running outside a container
  - type: "squid"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 3129
    http_path: "/" # Path used for the availability check, not for metrics gathering.
    http_host: "127.0.0.1:81" # Host header sent. Like the other options, it can be omitted and the default value is used.

  # For a Squid running in a Docker container
  - type: "squid"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 3128
    http_path: "/" # Path used for the availability check, not for metrics gathering.
    http_host: "127.0.0.1:80" # Host header sent. Like the other options, it can be omitted and the default value is used.

When using Docker, you may use Docker labels to set http_path and http_host:

docker run --label glouton.http_path="/readiness" --label glouton.http_host="my-host" [...]
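To see how http_path and http_host take part in an availability check, consider the following Python sketch of an HTTP check. It is only an approximation written for this page and the function names are hypothetical. It follows the service_status values described in the Service Status section (0 OK, 1 warning, 2 critical, 3 unknown). Treating 4xx responses as a warning matches the 404 example given there, while treating 5xx responses and connection errors as critical is an assumption.

```python
import http.client
import socket
from typing import Optional


def status_from_http_code(code: int) -> int:
    """Map an HTTP status code to a service_status value (assumed mapping)."""
    if code >= 500:
        return 2  # critical: the server reported an error
    if code >= 400:
        return 1  # warning: for example a 404 page
    return 0      # OK


def http_check(address: str, port: int, http_path: str = "/",
               http_host: Optional[str] = None, timeout: float = 5.0) -> int:
    """Send GET http_path with an optional Host header and return a status."""
    headers = {"Host": http_host} if http_host else {}
    connection = http.client.HTTPConnection(address, port, timeout=timeout)
    try:
        connection.request("GET", http_path, headers=headers)
        return status_from_http_code(connection.getresponse().status)
    except socket.timeout:
        return 3  # unknown: the check itself timed out
    except (OSError, http.client.HTTPException):
        return 2  # critical: connection refused or invalid response
    finally:
        connection.close()
```

For example, http_check("127.0.0.1", 3128, "/readiness", "my-host") performs the request that the labels above describe.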

Only the metric from the service check is produced:

| Metric | Description |
| --- | --- |
| service_status | Status of Squid3 |

UPSD​

| Service Detection | Specific Check | Metrics |
| --- | --- | --- |
| ✅ | ❌ | ✅ |

The agent gathers metrics on UPS devices. For this, you need to install and configure nut-server.

To configure NUT, please refer to the documentation.

You may need to provide the UPSD user credentials to the agent depending on your configuration.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  - type: "upsd"
    address: "127.0.0.1"
    port: 3493 # Default port of the NUT upsd server
    # UPSD user credentials.
    username: ""
    password: ""

The following metrics are gathered:

| Metric | Description |
| --- | --- |
| service_status | Status of UPSD |
| upsd_battery_status | Battery status. The status is critical when the UPS is overloaded, on battery and when the battery is low or needs to be replaced. |
| upsd_status_flags | Status flags, see apcupsd for details. |
| upsd_battery_voltage | Battery voltage |
| upsd_input_voltage | Input voltage |
| upsd_output_voltage | Output voltage |
| upsd_load_percent | Load of the UPS in percent |
| upsd_battery_charge_percent | Battery charge in percent |
| upsd_internal_temp | Internal temperature in °C |
| upsd_input_frequency | Input frequency in Hz |
| upsd_time_left_seconds | Time left on battery in seconds |
| upsd_time_on_battery_seconds | Time spent on battery in seconds |

uWSGI​

| Service Detection | Specific Check | Metrics |
| --- | --- | --- |
| ✅ | ❌ | ✅ |

The agent gathers metrics using the uWSGI stats server, so you should enable it.

You can do so by adding the options --stats 127.0.0.1:1717 --memory-report to your server. Note that without --memory-report, the metric uwsgi_memory_used won't be available.

By default, the agent assumes that the stats server is enabled on port 1717. If your server uses another port, you can change it in the configuration.
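To verify that the stats server answers on the expected port, you can read it directly: with --stats, uWSGI sends a JSON document on a plain TCP connection and then closes it. The Python sketch below reads that document and adds up a few per-worker counters. The function names are hypothetical, and the link between these raw counters and the agent's metrics is an assumption: the counters are cumulative, so a per-second rate would be derived from the difference between two reads.

```python
import json
import socket


def read_stats(address: str = "127.0.0.1", stats_port: int = 1717,
               timeout: float = 5.0) -> dict:
    """Read the JSON document sent by the uWSGI stats server over TCP."""
    chunks = []
    with socket.create_connection((address, stats_port), timeout=timeout) as sock:
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closes the connection after the dump
                break
            chunks.append(chunk)
    return json.loads(b"".join(chunks))


def worker_totals(stats: dict) -> dict:
    """Sum cumulative per-worker counters from a stats document."""
    workers = stats.get("workers", [])
    return {
        "requests": sum(w.get("requests", 0) for w in workers),
        "exceptions": sum(w.get("exceptions", 0) for w in workers),
        "harakiri_count": sum(w.get("harakiri_count", 0) for w in workers),
        "tx_bytes": sum(w.get("tx", 0) for w in workers),
        "rss_bytes": sum(w.get("rss", 0) for w in workers),  # needs --memory-report
    }
```

Calling worker_totals(read_stats()) twice, a few seconds apart, and dividing the differences by the elapsed time gives rates comparable to the per-second metrics.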

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  # For a uWSGI running outside a container
  - type: "uwsgi"
    address: "127.0.0.1"
    port: 8080
    stats_port: 1717
    # If your server uses the --stats-http option, you should set the protocol to "http".
    # If the protocol is not set, "tcp" is used by default.
    stats_protocol: "tcp"

  # For an additional uWSGI running outside a container
  - type: "uwsgi"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 8080
    stats_port: 1717

  # For a uWSGI running in a Docker container
  - type: "uwsgi"
    instance: "CONTAINER_NAME"
    port: 8080
    stats_port: 1717

The following metrics are gathered:

| Metric | Description |
| --- | --- |
| service_status | Status of uWSGI |
| uwsgi_requests | Number of requests per second |
| uwsgi_transmitted | Amount of data transmitted in bits/s |
| uwsgi_memory_used | Memory used in bytes |
| uwsgi_avg_request_time | Average time to process a request in seconds |
| uwsgi_exceptions | Number of exceptions per second |
| uwsgi_harakiri_count | Number of worker timeouts per second |

Valkey​

| Service Detection | Specific Check | Metrics |
| --- | --- | --- |
| ✅ | ✅ | ✅ |

Agent uses a Redis check if the service listens on port 6379.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  # For a Valkey running outside a container
  - type: "valkey"
    address: "127.0.0.1"
    port: 6379
    password: "password-if-configured"

  # For an additional Valkey running outside a container
  - type: "valkey"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 6380
    password: "password-if-configured"

  # For a Valkey running in a Docker container
  - type: "valkey"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 6379
    password: "password-if-configured"

Agent gathers the same metrics as Redis.

Varnish​

| Service Detection | Specific Check | Metrics |
| --- | --- | --- |
| ✅ | ❌ | ❌ |

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  # For a Varnish running outside a container
  - type: "varnish"
    address: "127.0.0.1"
    port: 6082

  # For an additional Varnish running outside a container
  - type: "varnish"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 6083

  # For a Varnish running in a Docker container
  - type: "varnish"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 6082

Only the metric from the service check is produced:

| Metric | Description |
| --- | --- |
| service_status | Status of Varnish |

ZooKeeper​

| Service Detection | Specific Check | Metrics |
| --- | --- | --- |
| ✅ | ✅ | ✅ |

The agent uses a ZooKeeper check if the service listens on port 2181.

The status check uses the ruok command. You may need to add it to the 4lw.commands.whitelist setting; see the documentation for details.
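To test by hand whether ruok is accepted, you can send the four-letter word yourself: a healthy ZooKeeper server that allows the command answers imok and closes the connection. The Python sketch below does this with a plain TCP socket. The function names are hypothetical and the sketch is not the agent's implementation.

```python
import socket


def send_four_letter_word(command: bytes, address: str = "127.0.0.1",
                          port: int = 2181, timeout: float = 5.0) -> bytes:
    """Send a four-letter word and return the reply read until the server closes."""
    chunks = []
    with socket.create_connection((address, port), timeout=timeout) as sock:
        sock.sendall(command)
        while True:
            chunk = sock.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


def interpret_ruok(reply: bytes) -> bool:
    """Return True only when the server answered exactly "imok"."""
    return reply.strip() == b"imok"
```

interpret_ruok(send_four_letter_word(b"ruok")) returns True for a healthy server. When the command is not whitelisted, the server replies with an explanatory message instead of imok, so the function returns False.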

To enable metrics gathering, Bleemeo agent needs to be installed with JMX enabled, see Java Metrics for setup details.

To gather JMX metrics, ZooKeeper must expose JMX over a TCP port. To enable JMX you will need to:

  • If using Docker for ZooKeeper, add the environment variable JMXPORT=1234
  • In other cases, add the options -Dcom.sun.management.jmxremote.port=1234 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false to the JVM, usually in JAVA_OPTS in /etc/default/zookeeper.

Warning: in both cases, this allows unauthenticated access. Make sure that no untrusted host can reach this port, or set up authenticated JMX access.

If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:

service:
  [...]
  # For a ZooKeeper running outside a container
  - type: "zookeeper"
    address: "127.0.0.1"
    port: 2181
    jmx_port: "JMX_PORT"
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

  # For an additional ZooKeeper running outside a container
  - type: "zookeeper"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 2182
    jmx_port: "JMX_PORT"
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

  # For a ZooKeeper running in a Docker container
  - type: "zookeeper"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 2181
    jmx_port: "JMX_PORT"
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"

Agent gathers the following metrics:

| Metric | Description |
| --- | --- |
| service_status | Status of ZooKeeper |
| zookeeper_connections | Number of client connections to ZooKeeper server |
| zookeeper_packets_received | Number of packets received per second |
| zookeeper_packets_sent | Number of packets sent per second |
| zookeeper_ephemerals_count | Number of ephemeral nodes |
| zookeeper_watch_count | Number of ZooKeeper watches |
| zookeeper_znode_count | Number of znodes |

Agent gathers the following metrics through JMX:

| Metric | Description |
| --- | --- |
| zookeeper_jvm_gc | Number of garbage collections per second |
| zookeeper_jvm_gc_utilization | Garbage collection utilization in percent |
| zookeeper_jvm_heap_used | Heap memory used in bytes |
| zookeeper_jvm_non_heap_used | Non-Heap memory used in bytes |