Services Monitoring
The Bleemeo agent can discover services running on your system and automatically monitor service-specific metrics. For example, with Apache HTTP server, the number of requests served is automatically monitored. For each detected service, a tag with the service name is created, which lets you filter your agents by the services running on them.
If you have any service not listed on this page, you can define a custom check or define a custom metric.
If you want to disable metrics for a service, you can ignore some services.
Common Features
Service Status
The agent checks TCP sockets for each service. By default, a simple TCP connection is used to test the service, but some services support a specific check. See the service details below for the supported specific checks. Those checks are executed every minute. They may be run earlier for TCP services: the Bleemeo agent keeps a connection open with the service and if that connection is broken the check is executed immediately.
The current status of the service can be viewed in the Status Dashboard and you can configure a notification to be alerted when the status changes.
The history of the status is also stored in a metric named service_status. One metric per service is created, and the metric has two labels to identify the service:
- service: the kind of service, like "apache", "nginx"...
- service_instance: the container name. This label is absent when the service isn't running in a container.
The value of this metric is:
- 0 when the check passed successfully
- 1 when the check passed with a warning. For example an Apache server responded with a 404 page
- 2 when the check detected an issue with the service
- 3 when the check doesn't know the status of the service. This happens when the check itself failed, usually due to a timeout
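The status values and the default TCP connection test described above can be sketched in a few lines of Python. This is only an illustration, not the agent's implementation: the names `ServiceStatus` and `tcp_check`, the status labels (borrowed from the usual Nagios convention) and the 5-second timeout are assumptions.

```python
import socket
from enum import IntEnum


class ServiceStatus(IntEnum):
    """Values of the service_status metric (label names are assumed)."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def tcp_check(address: str, port: int, timeout: float = 5.0) -> ServiceStatus:
    """Open a TCP connection and map the outcome to a status value."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return ServiceStatus.OK
    except socket.timeout:
        # The check itself could not conclude, so the status is unknown.
        return ServiceStatus.UNKNOWN
    except OSError:
        # Connection refused, host unreachable, and similar failures.
        return ServiceStatus.CRITICAL
```

A simple TCP test like this one can only report 0, 2 or 3. A warning (1) needs a service-specific check, such as an HTTP check that inspects the response.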
Overridable auto-discovery settings
It's possible to override the auto-discovery parameters of any service using configuration files, Docker labels or Kubernetes annotations.
Using configuration file
To use Bleemeo agent configuration files, add entries like the following to /etc/glouton/conf.d/90-service-override.conf:
service:
  - type: "apache"
    ignore_ports:
      - 8000
      - 9000
  - type: "mysql"
    instance: "name_of_a_container"
    username: root
    password: root
The service key contains a list of services whose settings are overridden. Add one entry per service.
Each service is identified by the pair "type" and "instance". The service "type" is not a customizable name: unless you are creating a custom check, the value should match one of the supported service types ("apache", "nginx", "postgresql"...). See below for the full list of supported services.
The service instance can be omitted when the service is running outside a container. For a containerized service, it is the container name.
All other values (ignore_ports, username and password in the example above) are overridable settings, which are described below.
Using Docker labels or Kubernetes annotations
If you are using Docker or Kubernetes, you can use Docker labels or Kubernetes annotations instead of the Bleemeo agent configuration file. Any overridable setting can be added using the name glouton.SETTING.
For instance, to ignore ports 8000 and 9000 on a Docker container, use:
docker run --label glouton.ignore_ports="8000,9000" [...]
The same thing for a Kubernetes deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: "my-application"
spec:
  template:
    metadata:
      # Create the annotations on the pod, not on the deployment
      annotations:
        glouton.ignore_ports: "8000,9000"
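To make the glouton.SETTING naming concrete, here is a minimal Python sketch that turns a set of container labels into a dictionary of overrides. The function name `overrides_from_labels` is hypothetical, and the way the agent really parses labels is not described here. The only assumption taken from the examples above is that a list setting such as ignore_ports is written as comma-separated values.

```python
def overrides_from_labels(labels: dict, prefix: str = "glouton.") -> dict:
    """Collect glouton.SETTING labels into a settings dictionary."""
    overrides = {}
    for name, value in labels.items():
        if not name.startswith(prefix):
            continue  # this label belongs to another tool
        setting = name[len(prefix):]
        if setting == "ignore_ports":
            # Assumption: list settings are comma-separated, as in the examples.
            overrides[setting] = [int(p) for p in value.split(",") if p.strip()]
        else:
            overrides[setting] = value
    return overrides


labels = {
    "glouton.ignore_ports": "8000,9000",
    "glouton.http_path": "/ready",
    "com.example.team": "web",  # unrelated label, ignored
}
print(overrides_from_labels(labels))
```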
Overridable settings
Sample of a service with all settings overridden:
# This is only a sample listing all possible values. It doesn't make sense to set all
# settings on a single service. Only add the settings you need to override.
service:
  - type: apache
    instance: my_container
    address: 127.0.0.1
    port: 1234
    ignore_ports:
      - 8000
      - 9000
    tags:
      - mytag1
      - mytag2
    interval: 60
    http_path: /my_custom_path
    http_status_code: 200
    http_host: example.com
    check_type: nagios
    match_process: command-to-check
    check_command: command-to-run
    nagios_nrpe_name: nagios_nrpe_name
    username: username
    password: secret
    metrics_unix_socket: /var/run/mysqld/mysqld.sock
    stats_url: http://localhost:9000/status
    stats_port: 9000
    stats_protocol: http
    detailed_items:
      - table1
      - table2
    included_items:
      - job1
    excluded_items:
      - job2
    jmx_port: 3333
    jmx_username: monitorRole
    jmx_password: secret
    jmx_metrics: []
    ssl: false
    ssl_insecure: false
    starttls: false
    ca_file: /etc/ssl/certs/ca-certificates.crt
    cert_file: /etc/ssl/certs/client.crt
    key_file: /etc/ssl/private/client.key
address
This is the IP address on which the service is listening.
port
This is the TCP port on which the service is listening.
ignore_ports
This is a list of ports to ignore during auto-discovery. Auto-discovery may find that a service is listening on multiple ports, and the service would then be monitored on every port. This can be wrong: for example, a Docker nginx container may have two exposed ports in its Dockerfile (80 and 443) but only listen on 80. In that case, you will want to ignore port 443.
tags
This is a list of tag names to associate with the service.
interval
This is the interval, in seconds, between service status checks. It cannot be smaller than 60 seconds, which is the default, but it can be increased if your service check is expensive.
http_path, http_status_code, http_host
For services that expose an HTTP endpoint, the agent by default uses the service's address and port to query http://address:port/ and expects an HTTP 200 status code. The service's address is an IP address, likely "127.0.0.1" when not using a container, which may not match your service configuration, especially when using virtual hosts.
http_host specifies the HTTP Host header sent in the request. This is useful when the HTTP server is configured with virtual hosts and doesn't reply to requests on something like http://127.0.0.1.
http_path makes the check use a sub-path, like "/ready", to query a page dedicated to service checking.
http_status_code makes the check expect another HTTP status code.
This is supported by the following services:
- Apache HTTP
- InfluxDB
- Nginx
- Squid
- custom HTTP check
For example, with the following service override:
service:
  - type: apache
    address: 127.0.0.1
    port: 8080
    http_path: /ready
    http_status_code: 204
    http_host: example.com
The Bleemeo agent will connect to 127.0.0.1 on port 8080, send the HTTP request http://example.com/ready and expect a 204 status code.
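The split between the connection target and the requested URL can be shown with a small Python sketch. It only builds the pieces of the check and does no network I/O. The function name `build_http_check`, the use of a GET request and the default Host header are assumptions for illustration; the defaults for path and status code follow the description above.

```python
def build_http_check(override: dict) -> tuple:
    """Return (connect_target, raw_request, expected_status) for an HTTP check."""
    address = override.get("address", "127.0.0.1")
    port = override.get("port", 80)
    path = override.get("http_path", "/")
    # Assumption: without http_host, the Host header mirrors the address and port.
    host = override.get("http_host", f"{address}:{port}")
    expected = override.get("http_status_code", 200)
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return (address, port), request, expected


target, request, expected = build_http_check({
    "address": "127.0.0.1",
    "port": 8080,
    "http_path": "/ready",
    "http_status_code": 204,
    "http_host": "example.com",
})
print(target, expected)
print(request)
```

The TCP connection goes to the address and port, while http_host and http_path only change what is written inside the request.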
check_type, match_process, check_command
These settings are used to configure a custom check, see Custom checks for details.
nagios_nrpe_name
If the NRPE server is enabled in the agent configuration (see nrpe.enable), you can expose any service check to your Nagios server.
Example:
service:
  - type: "apache"
    nagios_nrpe_name: check_apache
username, password
Some services require authentication for metrics collection and/or the service check. This setting configures the credentials to use.
This is supported by the following services:
- Jenkins
- MySQL
- OpenLDAP
- PostgreSQL
- RabbitMQ
- UPSD
stats_url
Some services use a different port for the service itself and for exposing monitoring information. This setting specifies where the service exposes its metrics.
This is supported by the following services:
- HAProxy
- Jenkins
- PHP-FPM
For example:
service:
  - type: haproxy
    port: 80
    stats_url: "http://localhost:8080/statistics"
The agent will check that the HAProxy service is listening on port 80, but it will use the URL on port 8080 to collect HAProxy metrics.
stats_port
Some services use a different port for the service itself and for exposing monitoring information. This setting specifies the port where the service exposes its metrics.
This setting is similar to stats_url but only specifies the port rather than the full URL.
This is supported by the following services:
- NATS
- RabbitMQ
- uWSGI
stats_protocol
Some services can use multiple protocols to expose monitoring information. This setting specifies which protocol to use. It works together with stats_port.
This is supported by the following services:
- uWSGI
For example:
service:
  - type: "uwsgi"
    address: "127.0.0.1"
    port: 8080
    stats_port: 1717
    # This assumes uWSGI uses the --stats-http option and exposes its stats on port 1717
    stats_protocol: "http"
metrics_unix_socket
Some services can listen on a Unix socket rather than a TCP socket. This setting configures the path to the Unix socket, and the agent will try to use that socket for service metrics collection.
This is supported by the following services:
- MySQL
detailed_items
Some services expose metrics per item (per table, per database...). This setting configures the items for which metrics should be collected. Those services also expose global metrics, which are enabled by default.
This is supported by the following services:
- PostgreSQL: for detailed metrics on databases
- Cassandra: for detailed metrics on tables
- Kafka: for detailed metrics on topics
included_items, excluded_items
Some services expose metrics per item (per job...). These settings configure the items for which metrics should be collected. Unlike detailed_items, they are used on services that don't expose a global metric aggregating the per-item metrics.
This is supported by the following services:
- Jenkins: for metrics per job
jmx_port, jmx_username, jmx_password, jmx_metrics
For a Java service, these settings configure how to collect metrics using JMX. See Java metrics for details.
ssl, ssl_insecure, starttls, ca_file, cert_file, key_file
Some services can expose a plain-text or a TLS version of the service. These settings allow you to enable TLS or stay in plain text (the default).
This is supported by the following services:
- OpenLDAP
For example:
service:
  - type: "openldap"
    address: "127.0.0.1"
    port: 389
    starttls: true
    # By default SSL certificates are verified. This means that your OpenLDAP service needs either
    # to use a certificate signed by a trusted authority, or ca_file must point to your self-signed CA.
    ssl_insecure: false
    ca_file: "/path/to/your-self-signed-ca.crt"
Apache HTTP
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
The agent uses an HTTP check if the service listens on port 80.
To enable metrics gathering, ensure the Bleemeo agent can access the Apache status page at the URL http://server-address/server-status?auto. This should be the case by default with Apache on Ubuntu or Debian. If not, you usually need to add the following to your default virtual host (e.g. the first file in /etc/apache2/sites-enabled/):
LoadModule status_module /usr/lib/apache2/modules/mod_status.so
<Location /server-status>
    SetHandler server-status
    # You may want to limit access only to Bleemeo agent
    # require ip IP-OF-BLEEMEO-AGENT/32
</Location>
ExtendedStatus On
If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For an Apache running outside a container
  - type: "apache"
    address: "127.0.0.1"
    port: 80 # HTTP listener, the agent doesn't support HTTPS here
    http_path: "/" # This is the path used for the availability check, not for metrics gathering.
    http_host: "127.0.0.1:80" # Host header sent. Like other options, it can be omitted and the default value will be used.
  # For an additional Apache running outside a container
  - type: "apache"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 81 # HTTP listener, the agent doesn't support HTTPS here
    http_path: "/" # This is the path used for the availability check, not for metrics gathering.
    http_host: "127.0.0.1:80" # Host header sent. Like other options, it can be omitted and the default value will be used.
  # For an Apache running in a Docker container
  - type: "apache"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 80 # HTTP listener, the agent doesn't support HTTPS here
    http_path: "/" # This is the path used for the availability check, not for metrics gathering.
    http_host: "127.0.0.1:80" # Host header sent. Like other options, it can be omitted and the default value will be used.
When using Docker, you may use Docker labels to set http_path and http_host:
docker run --label glouton.http_path="/readiness" --label glouton.http_host="my-host" [...]
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of Apache |
apache_busy_workers | Number of busy Apache workers |
apache_busy_workers_perc | Percentage of busy Apache workers |
apache_bytes | Network traffic sent by Apache in bytes per second |
apache_connections | Number of client connections to Apache server |
apache_idle_workers | Number of Apache workers waiting for an incoming request |
apache_max_workers | Maximum number of Apache workers configured |
apache_requests | Number of requests per second |
apache_uptime | Time spent since Apache server start in seconds |
Apache keeps track of server activity in a structure known as the scoreboard. There is a slot in the scoreboard for each worker, and it contains the status of this worker. The size of the scoreboard is the maximum number of concurrent requests that Apache can serve.
Metric | Description |
---|---|
apache_scoreboard_waiting | Number of workers waiting for an incoming request. It's the same as apache_idle_workers |
apache_scoreboard_starting | Number of workers starting up |
apache_scoreboard_reading | Number of workers reading an incoming request |
apache_scoreboard_sending | Number of workers processing a client request |
apache_scoreboard_keepalive | Number of workers waiting for another request via keepalive |
apache_scoreboard_dnslookup | Number of workers looking up a hostname |
apache_scoreboard_closing | Number of workers closing their connection |
apache_scoreboard_logging | Number of workers writing in log files |
apache_scoreboard_finishing | Number of workers gracefully finishing a request |
apache_scoreboard_idle_cleanup | Number of idle workers being killed |
apache_scoreboard_open | Number of slots with no worker in the scoreboard |
Apache does not start all workers when they are not needed (e.g. if enough workers are waiting, Apache reuses them and doesn't start a new one).
The sum of all scoreboard items is the maximum of concurrent requests Apache can serve. This sum is calculated and stored in apache_max_workers metric.
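As an illustration of how the scoreboard relates to these metrics, the following Python sketch counts the slots of a `Scoreboard:` line as returned by /server-status?auto and sums them. The character legend is the one documented by Apache mod_status. Pairing each character with a metric name from the table above is an assumption, and the function name `scoreboard_metrics` is hypothetical.

```python
from collections import Counter

# mod_status scoreboard characters, paired with the metric names above (assumed pairing).
SCOREBOARD_KEYS = {
    "_": "apache_scoreboard_waiting",
    "S": "apache_scoreboard_starting",
    "R": "apache_scoreboard_reading",
    "W": "apache_scoreboard_sending",
    "K": "apache_scoreboard_keepalive",
    "D": "apache_scoreboard_dnslookup",
    "C": "apache_scoreboard_closing",
    "L": "apache_scoreboard_logging",
    "G": "apache_scoreboard_finishing",
    "I": "apache_scoreboard_idle_cleanup",
    ".": "apache_scoreboard_open",
}


def scoreboard_metrics(status_text: str) -> dict:
    """Count scoreboard slots by state and derive apache_max_workers."""
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            board = line.split(":", 1)[1].strip()
            break
    else:
        raise ValueError("no Scoreboard line found")
    counts = Counter(board)
    metrics = {name: counts.get(char, 0) for char, name in SCOREBOARD_KEYS.items()}
    # The sum of all slots is the maximum number of concurrent requests.
    metrics["apache_max_workers"] = sum(metrics.values())
    return metrics


sample = "BusyWorkers: 3\nIdleWorkers: 2\nScoreboard: _W_K..R\n"
print(scoreboard_metrics(sample)["apache_max_workers"])  # prints 7
```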
Asterisk
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Bitbucket
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
To enable metrics gathering, the Bleemeo agent needs to be installed with JMX enabled, see Java Metrics for setup details.
In addition, Bitbucket needs to expose JMX over a TCP port. To enable JMX, you can follow Enabling JMX counters for performance monitoring in the Atlassian documentation.
Here is a summary for unauthenticated access:
- Add the options -Dcom.sun.management.jmxremote.port=3333 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false to the JVM. The default Bitbucket startup script accepts them from the JMX_OPTS or JAVA_OPTS environment variable.
- Add jmx.enable=true to $BITBUCKET_HOME/shared/bitbucket.properties
Warning: this will allow unauthenticated access. Make sure no untrusted access to this port is possible, or set up authenticated JMX access.
The Bleemeo agent should auto-detect the JMX port. If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a Bitbucket running outside a container
  - type: "bitbucket"
    address: "127.0.0.1"
    port: 7990
    jmx_port: 3333
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"
  # For an additional Bitbucket running outside a container
  - type: "bitbucket"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 7991
    jmx_port: 3334
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"
  # For a Bitbucket running in a Docker container
  - type: "bitbucket"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 7990
    jmx_port: 3333
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of Bitbucket |
bitbucket_events | Number of events per second |
bitbucket_io_tasks | Number of I/O tasks per second |
bitbucket_jvm_gc | Number of garbage collection per second |
bitbucket_jvm_gc_utilization | Garbage collection utilization in percent |
bitbucket_jvm_heap_used | Heap memory used in bytes |
bitbucket_jvm_non_heap_used | Non-Heap memory used in bytes |
bitbucket_pulls | Number of pulls per second |
bitbucket_pushes | Number of pushes per second |
bitbucket_queued_events | Number of events queued |
bitbucket_queued_scm_clients | Number of SCM clients queued |
bitbucket_queued_scm_commands | Number of SCM commands queued |
bitbucket_request_time | Average time of request in seconds |
bitbucket_requests | Number of requests per second |
bitbucket_ssh_connections | Number of SSH connections per second |
bitbucket_tasks | Number of scheduled tasks per second |
Cassandra
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
To enable metrics gathering, the Bleemeo agent needs to be installed with JMX enabled, see Java Metrics for setup details.
In addition, Cassandra needs to expose JMX over a TCP port. To enable JMX:
- If using Docker, add the environment variable JVM_OPTS=-Dcassandra.jmx.remote.port=7199
- If running Cassandra and the Bleemeo agent as native packages, nothing should be needed. Cassandra already exposes JMX on port 7199 on localhost by default.
- In other cases, add the options -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.authenticate=false to the JVM.
Warning: this will allow unauthenticated access. Make sure no untrusted access to this port is possible, or set up authenticated JMX access.
The Bleemeo agent should auto-detect the JMX port. If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a Cassandra running outside a container
  - type: "cassandra"
    address: "127.0.0.1"
    port: 9042
    jmx_port: 7199
    jmx_username: "cassandra" # by default, no authentication is done
    jmx_password: "cassandra"
  # For an additional Cassandra running outside a container
  - type: "cassandra"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 9043
    jmx_port: 7200
    jmx_username: "cassandra" # by default, no authentication is done
    jmx_password: "cassandra"
  # For a Cassandra running in a Docker container
  - type: "cassandra"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 9042
    jmx_port: 7199
    jmx_username: "cassandra" # by default, no authentication is done
    jmx_password: "cassandra"
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of Cassandra |
cassandra_bloom_filter_false_ratio | Number of false positive of the bloom filter in percent |
cassandra_jvm_gc | Number of garbage collection per second |
cassandra_jvm_gc_utilization | Garbage collection utilization in percent |
cassandra_jvm_heap_used | Heap memory used in bytes |
cassandra_jvm_non_heap_used | Non-Heap memory used in bytes |
cassandra_read_requests | Number of read requests per second |
cassandra_read_time | Average time of read requests in seconds |
cassandra_sstable | Number of SSTable |
cassandra_write_requests | Number of write requests per second |
cassandra_write_time | Average time of write requests in seconds |
Bleemeo also supports detailed monitoring of specific Cassandra tables. To enable this, add the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a Cassandra running outside a container
  - type: "cassandra"
    detailed_items:
      - "keyspace1.table1"
      - "keyspace2.table2"
    [...]
  # For a Cassandra running in a Docker container
  - type: "cassandra"
    instance: "CONTAINER_NAME"
    detailed_items:
      - "keyspace1.table1"
      - "keyspace2.table2"
    [...]
The following per-table metrics will be gathered:
Metric | Description |
---|---|
cassandra_bloom_filter_false_ratio | Number of false positive of the bloom filter in percent |
cassandra_read_requests | Number of read requests per second |
cassandra_read_time | Average time of read requests in seconds |
cassandra_sstable | Number of SSTable |
cassandra_write_requests | Number of write requests per second |
cassandra_write_time | Average time of write requests in seconds |
Confluence
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
To enable metrics gathering, the Bleemeo agent needs to be installed with JMX enabled, see Java Metrics for setup details.
In addition, Confluence needs to expose JMX over a TCP port. To enable JMX, you can follow Live Monitoring Using the JMX Interface.
Here is a summary for unauthenticated access:
- Add the options -Dcom.sun.management.jmxremote.port=3333 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false to the JVM. The default Confluence startup script accepts them from the CATALINA_OPTS or JAVA_OPTS environment variable.
Warning: this will allow unauthenticated access. Make sure no untrusted access to this port is possible, or set up authenticated JMX access.
The Bleemeo agent should auto-detect the JMX port. If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a Confluence running outside a container
  - type: "confluence"
    address: "127.0.0.1"
    port: 8090
    jmx_port: 3333
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"
  # For an additional Confluence running outside a container
  - type: "confluence"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 8091
    jmx_port: 3334
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"
  # For a Confluence running in a Docker container
  - type: "confluence"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 8090
    jmx_port: 3333
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of Confluence |
confluence_db_query_time | Example for database query time in seconds |
confluence_jvm_gc | Number of garbage collection per second |
confluence_jvm_gc_utilization | Garbage collection utilization in percent |
confluence_jvm_heap_used | Heap memory used in bytes |
confluence_jvm_non_heap_used | Non-Heap memory used in bytes |
confluence_last_index_time | Time of last indexing task in seconds |
confluence_queued_error_mails | Number of mails in error queued |
confluence_queued_index_tasks | Number of indexing tasks queued |
confluence_queued_mails | Number of mails queued |
confluence_request_time | Average time of request in seconds |
confluence_requests | Number of requests per second |
BIND
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a BIND running outside a container
  - type: "bind"
    address: "127.0.0.1"
    port: 53
  # For an additional BIND running outside a container
  - type: "bind"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 54
  # For a BIND running in a Docker container
  - type: "bind"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 53
Only the metric from service check is produced:
Metric | Description |
---|---|
service_status | Status of BIND |
Dovecot
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
The agent uses an IMAP check if the service listens on port 143.
If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a Dovecot running outside a container
  - type: "dovecot"
    address: "127.0.0.1"
    port: 143 # IMAP listener, the agent doesn't support IMAPS here
  # For an additional Dovecot running outside a container
  - type: "dovecot"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 144 # IMAP listener, the agent doesn't support IMAPS here
  # For a Dovecot running in a Docker container
  - type: "dovecot"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 143 # IMAP listener, the agent doesn't support IMAPS here
Only the metric from service check is produced:
Metric | Description |
---|---|
service_status | Status of Dovecot |
ejabberd
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For an ejabberd running outside a container
  - type: "ejabberd"
    address: "127.0.0.1"
    port: 5672
  # For an additional ejabberd running outside a container
  - type: "ejabberd"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 5673
  # For an ejabberd running in a Docker container
  - type: "ejabberd"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 5672
Only the metric from service check is produced:
Metric | Description |
---|---|
service_status | Status of ejabberd |
Elasticsearch
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
The agent uses an HTTP check if the service listens on port 9200.
If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For an Elasticsearch running outside a container
  - type: "elasticsearch"
    address: "127.0.0.1"
    port: 9200
  # For an additional Elasticsearch running outside a container
  - type: "elasticsearch"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 9201
  # For an Elasticsearch running in a Docker container
  - type: "elasticsearch"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 9200
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of Elasticsearch |
elasticsearch_docs_count | Number of documents stored in all indices |
elasticsearch_jvm_gc | Number of garbage collection per second |
elasticsearch_jvm_gc_utilization | Garbage collection utilization in percent |
elasticsearch_jvm_heap_used | Heap memory used in bytes |
elasticsearch_jvm_non_heap_used | Non-Heap memory used in bytes |
elasticsearch_size | Size of all indices in bytes |
elasticsearch_search | Number of searches in a shard per second |
elasticsearch_search_time | Average time taken by searches in seconds |
elasticsearch_cluster_docs_count | Number of documents stored in all indices of the cluster |
elasticsearch_cluster_size | Size of all indices of the cluster in bytes |
Exim4
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
The agent uses an SMTP check if the service listens on port 25.
To enable metrics gathering, ensure the Bleemeo agent can run the mailq command. This usually means adding the following to your Exim configuration (e.g. /etc/exim4/conf.d/main/99_local):
queue_list_requires_admin=false
You will then need to refresh the configuration (warning: the following will discard any local changes in /etc/exim4/exim4.conf.template):
update-exim4.conf.template --run
update-exim4.conf
service exim4 restart
If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For an Exim running outside a container
  - type: "exim"
    address: "127.0.0.1"
    port: 25
  # For an additional Exim running outside a container
  - type: "exim"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 26
  # For an Exim running in a Docker container
  - type: "exim"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 25
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of Exim |
exim_queue_size | Number of mails queued |
Fail2ban
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
The agent will detect Fail2ban on your server and check that the server process stays active.
If you installed Glouton using wget or with a system package, Glouton will gather metrics with no further configuration. Otherwise, you need to allow Glouton to run fail2ban-client status as root. You can do so by adding the following to /etc/sudoers.d/glouton:
Cmnd_Alias FAIL2BAN = /usr/bin/fail2ban-client status, /usr/bin/fail2ban-client status *
glouton ALL=(root) NOEXEC: NOPASSWD: FAIL2BAN
Defaults!FAIL2BAN !logfile, !syslog, !pam_session
The following metrics are gathered:
Metric | Description |
---|---|
service_status | Status of Fail2ban |
fail2ban_failed | Number of failed authentications |
fail2ban_banned | Number of banned IP |
FreeRADIUS
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
HAProxy
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
To enable metrics gathering, the Bleemeo agent needs access to the HAProxy stats page and needs to know where to find it.
On the HAProxy side, you need to enable the statistics page on an HTTP(S) frontend, for example:
frontend api-http
    bind 0.0.0.0:80
    stats enable
    stats uri /statistics
On the Bleemeo agent side, you need to configure the check by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a HAProxy running outside a container
  - type: "haproxy"
    address: "127.0.0.1"
    port: 80
    stats_url: "http://my-server/statistics"
    # For authenticated access, use
    # stats_url: "http://username:password@my-server/statistics"
  # For an additional HAProxy running outside a container
  - type: "haproxy"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 81
    stats_url: "http://my-server:81/statistics"
    # For authenticated access, use
    # stats_url: "http://username:password@my-server:81/statistics"
  # For a HAProxy running in a Docker container
  - type: "haproxy"
    instance: "CONTAINER_NAME"
    address: "127.0.0.1"
    port: 80
    stats_url: "http://my-server/statistics"
    # For authenticated access, use
    # stats_url: "http://username:password@my-server/statistics"
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of HAProxy |
haproxy_act | Number of active servers |
haproxy_bin | Network traffic received from clients in bytes per second |
haproxy_bout | Network traffic sent to clients in bytes per second |
haproxy_ctime | Average time spent opening a connection in seconds |
haproxy_dreq | Number of requests denied by HAProxy per second |
haproxy_dresp | Number of responses denied by HAProxy per second |
haproxy_econ | Number of failed connections per second |
haproxy_ereq | Number of requests in error per second |
haproxy_eresp | Number of responses in error per second |
haproxy_qcur | Number of currently queued requests |
haproxy_qtime | Average time requests spent in queue in seconds |
haproxy_req_tot | Number of HTTP requests per second |
haproxy_rtime | Average response time in seconds |
haproxy_scur | Number of sessions opened |
haproxy_stot | Number of sessions per second |
haproxy_ttime | Average time of session in seconds |
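Several metric names above (qcur, scur, stot, bin, bout...) match column names of the CSV export of the HAProxy stats page, which HAProxy serves when `;csv` is appended to the stats URI. The Python sketch below parses such an export. It is an illustration only: whether the agent reads the CSV export this way is an assumption, the function name `parse_haproxy_csv` is hypothetical, and the sample is shortened to the first few columns.

```python
import csv
import io


def parse_haproxy_csv(text: str) -> list:
    """Parse the CSV export of the HAProxy stats page into a list of dicts."""
    # The header line starts with "# "; strip it so column names are clean.
    if text.startswith("# "):
        text = text[2:]
    return list(csv.DictReader(io.StringIO(text)))


# Shortened sample: a real export has many more columns.
sample = (
    "# pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,\n"
    "api-http,FRONTEND,,,3,10,2000,120,34567,890123,\n"
)
for row in parse_haproxy_csv(sample):
    print(row["pxname"], row["svname"], row["scur"], row["stot"])
```

Counters such as stot, bin and bout are cumulative in the export, so per-second metrics would be computed from the difference between two successive reads.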
InfluxDB
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Agent uses an HTTP check if the service listens on port 8086.
If some auto-detected parameters are wrong, you can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For an InfluxDB running outside a container
  - type: "influxdb"
    address: "127.0.0.1"
    port: 8086
  # For an additional InfluxDB running outside a container
  - type: "influxdb"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 8087
  # For an InfluxDB running in a Docker container
  - type: "influxdb"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 8086
Only the metric from service check is produced:
Metric | Description |
---|---|
service_status | Status of InfluxDB |
Jenkins
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
To enable metric gathering, you need to configure the Jenkins URL and the credentials the agent should use to get information about the latest jobs.
To create a Jenkins API token, click your name on the top right, then click Configure and Add new Token.
To configure your service, add the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a Jenkins running outside a container
  - type: "jenkins"
    address: "127.0.0.1"
    port: 8080
    # To enable metric gathering, the fields stats_url, username and password are required.
    # The other fields are optional and can be omitted.
    # Jenkins URL.
    stats_url: "http://jenkins.example.com"
    # Credentials used for authentication.
    username: my_user
    password: my_api_token
    ## TLS configurations.
    ca_file: "/myca.pem"
    cert_file: "/mycert.pem"
    key_file: "/mykey.pem"
    # Skip chain and host verification.
    ssl_insecure: false
    # Choose jobs to include or exclude. When using both lists, exclude has priority.
    # Wildcards are supported: [ "jobA/*", "jobB/subjob1/*"]. If empty, all jobs are included.
    included_items: []
    excluded_items: []
  # For a Jenkins running in a Docker container
  - type: "jenkins"
    instance: "CONTAINER_NAME"
    port: 8080
    stats_url: "http://jenkins.example.com"
    username: my_user
    password: my_api_token
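The include/exclude rule described in the configuration comments can be read as the selection logic below. This is only a sketch in Python: it assumes shell-style wildcard matching as provided by fnmatch, the function name job_is_selected is hypothetical, and the agent's actual matcher may treat "/" differently.

```python
from fnmatch import fnmatchcase


def job_is_selected(job, included_items, excluded_items):
    """Return True if a job path passes the include/exclude lists."""
    # Exclusion has priority over inclusion.
    if any(fnmatchcase(job, pattern) for pattern in excluded_items):
        return False
    # An empty include list means every job is included.
    if not included_items:
        return True
    return any(fnmatchcase(job, pattern) for pattern in included_items)


print(job_is_selected("jobA/build", ["jobA/*"], []))
print(job_is_selected("jobA/build", ["jobA/*"], ["jobA/build"]))
```

The second call shows a job that matches both lists and is therefore left out.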
When using Docker, you may use labels to set the configuration:
docker run --label glouton.stats_url="http://jenkins.example.com" --label glouton.username="my_user" [...]
The following metrics are gathered:
Metric | Description |
---|---|
service_status | Status of Jenkins |
jenkins_busy_executors | Number of busy executors |
jenkins_total_executors | Total number of executors (both busy and idle) |
jenkins_job_duration_seconds | Job duration in seconds |
jenkins_job_number | Number of times this job has been run |
jenkins_job_result_code | Job result code (0 = SUCCESS, 1 = FAILURE, 2 = NOT_BUILT, 3 = UNSTABLE, 4 = ABORTED) |
JIRA
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
To enable metrics gathering, Bleemeo agent needs to be installed with JMX enabled; see Java Metrics for setup details.
In addition, JIRA needs to expose JMX over a TCP port. To enable JMX you need to:
- Add the options -Dcom.sun.management.jmxremote.port=3333 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false to the JVM. The default JIRA startup script accepts them from the CATALINA_OPTS or JAVA_OPTS environment variable.
Warning: this allows unauthenticated access. Make sure no untrusted access to this port is possible, or set up authenticated JMX access.
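Once the JVM has been restarted with these options, it can help to confirm that the JMX port accepts TCP connections from the host running the agent. The Python sketch below is only an illustration: the helper name port_is_open is hypothetical, and port 3333 is the value used in the option above.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 3333 is the JMX port used in the option above.
    print(port_is_open("127.0.0.1", 3333))
```

Running the same check from an untrusted host is a quick way to verify that the port is not reachable from where it should not be.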
Bleemeo agent should auto-detect the JMX port. If some auto-detected parameters are wrong, you
can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a JIRA running outside a container
  - type: "jira"
    address: "127.0.0.1"
    port: 8080
    jmx_port: 3333
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"
  # For an additional JIRA running outside a container
  - type: "jira"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 8081
    jmx_port: 3334
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"
  # For a JIRA running in a Docker container
  - type: "jira"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 8080
    jmx_port: 3333
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of JIRA |
jira_jvm_gc | Number of garbage collections per second |
jira_jvm_gc_utilization | Garbage collection utilization in percent |
jira_jvm_heap_used | Heap memory used in bytes |
jira_jvm_non_heap_used | Non-Heap memory used in bytes |
jira_request_time | Average request time in seconds |
jira_requests | Number of requests per second |
Kafka
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
To enable metrics gathering, Bleemeo agent needs to be installed with JMX enabled; see Java Metrics for setup details.
In addition, Kafka needs to expose JMX over a TCP port.
If you are using Docker, add the environment variable KAFKA_JMX_PORT=1234; see the documentation for details.
In other cases, you need to export some environment variables before running kafka-server-start.sh:
export JMX_PORT=1234
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=127.0.0.1"
Warning: this allows unauthenticated access. Make sure no untrusted access to this port is possible, or set up authenticated JMX access.
The Bleemeo agent should auto-detect the JMX port. If some auto-detected parameters are wrong, you
can manually override them by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a Kafka running outside a container
  - type: "kafka"
    address: "127.0.0.1"
    port: 9092
    jmx_port: 1099
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"
  # For an additional Kafka running outside a container
  - type: "kafka"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 9092
    jmx_port: 1099
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"
  # For a Kafka running in a Docker container
  - type: "kafka"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 9092
    jmx_port: 1099
    jmx_username: "monitorRole" # by default, no authentication is done
    jmx_password: "secret"
The following metrics are gathered:
Metric | Description |
---|---|
service_status | Status of Kafka |
kafka_jvm_gc | Number of garbage collections per second |
kafka_jvm_gc_utilization | Garbage collection utilization in percent |
kafka_jvm_heap_used | Heap memory used in bytes |
kafka_jvm_non_heap_used | Non-Heap memory used in bytes |
kafka_topics_count | Total number of topics |
kafka_fetch_requests_sum | Total number of fetch requests per second |
kafka_fetch_time_average | Average time to process a fetch request in seconds |
kafka_produce_requests_sum | Total number of produce requests per second |
kafka_produce_time_average | Average time to process a produce request in seconds |
Bleemeo also supports detailed monitoring of specific Kafka topics. To enable this,
add the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a Kafka running outside a container
  - type: "kafka"
    detailed_items:
      - "topic1"
      - "topic2"
    [...]
  # For a Kafka running in a Docker container
  - type: "kafka"
    instance: "CONTAINER_NAME"
    detailed_items:
      - "topic1"
      - "topic2"
    [...]
When using Docker, you may use labels to set the detailed topics:
docker run --label glouton.detailed_items="topic1,topic2" [...]
The following per-topic metrics will be gathered:
Metric | Description |
---|---|
kafka_fetch_requests | Number of fetch requests per second |
kafka_produce_requests | Number of produce requests per second |
If you want to enable more JMX metrics, you can add custom JMX metrics. A list of all the available metrics is available here.
libvirtd
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Memcached
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Agent uses a Memcached check if the service listens on port 11211.
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a Memcached running outside a container
  - type: "memcached"
    address: "127.0.0.1"
    port: 11211
  # For an additional Memcached running outside a container
  - type: "memcached"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 11212
  # For a Memcached running in a Docker container
  - type: "memcached"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 11211
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of Memcached |
memcached_command_get | Number of get requests per second |
memcached_command_set | Number of set requests per second |
memcached_connections_current | Number of client connections to Memcached |
memcached_items_current | Current number of items stored |
memcached_octets_rx | Network traffic received by Memcached in bytes per second |
memcached_octets_tx | Network traffic sent by Memcached in bytes per second |
memcached_ops_cas_hits | Number of successful CAS (Check-And-Set) requests per second |
memcached_ops_cas_misses | Number of CAS (Check-And-Set) requests against missing keys per second |
memcached_ops_decr_hits | Number of successful decr requests per second |
memcached_ops_decr_misses | Number of decr requests against missing keys per second |
memcached_ops_delete_hits | Number of successful delete requests per second |
memcached_ops_delete_misses | Number of delete requests against missing keys per second |
memcached_ops_evictions | Number of valid items removed from cache to free memory for new items per second |
memcached_ops_get_hits | Number of successful get requests per second |
memcached_ops_get_misses | Number of get requests against missing keys per second |
memcached_ops_incr_hits | Number of successful incr requests per second |
memcached_ops_incr_misses | Number of incr requests against missing keys per second |
memcached_ops_touch_hits | Number of successful touch requests per second |
memcached_ops_touch_misses | Number of touch requests against missing keys per second |
memcached_percent_hitratio | Hit ratio of get requests in percent |
memcached_ps_count_threads | Number of worker threads |
memcached_uptime | Time elapsed since the Memcached server started, in seconds |
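To illustrate how memcached_percent_hitratio relates to the get hit and miss metrics, here is a minimal Python sketch. It assumes the ratio is the number of hits divided by the total number of get requests; the agent's exact computation is not described on this page, and the function name is hypothetical.

```python
def hit_ratio_percent(get_hits, get_misses):
    """Return the get hit ratio in percent, or None without any get request."""
    total = get_hits + get_misses
    if total == 0:
        return None  # no get request yet, the ratio is undefined
    return 100.0 * get_hits / total


print(hit_ratio_percent(90, 10))
```

A persistently low value usually means clients ask for keys that were never stored or were evicted, which is worth comparing with memcached_ops_evictions.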
MongoDB
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a MongoDB running outside a container
  - type: "mongodb"
    address: "127.0.0.1"
    port: 27017
  # For an additional MongoDB running outside a container
  - type: "mongodb"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 27018
  # For a MongoDB running in a Docker container
  - type: "mongodb"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 27017
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of MongoDB |
mongodb_open_connections | Number of client connections to the MongoDB server |
mongodb_net_in_bytes | Network traffic received by MongoDB in bytes per second |
mongodb_net_out_bytes | Network traffic sent by MongoDB in bytes per second |
mongodb_queued_reads | Number of clients waiting to read data from MongoDB |
mongodb_queued_writes | Number of clients waiting to write data to MongoDB |
mongodb_active_reads | Number of clients performing read operations |
mongodb_active_writes | Number of clients performing write operations |
mongodb_queries | Number of queries per second |
Mosquitto
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a Mosquitto running outside a container
  - type: "mosquitto"
    address: "127.0.0.1"
    port: 1883 # MQTT listener, the agent doesn't support MQTT-SSL here
  # For an additional Mosquitto running outside a container
  - type: "mosquitto"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 1884 # MQTT listener, the agent doesn't support MQTT-SSL here
  # For a Mosquitto running in a Docker container
  - type: "mosquitto"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 1883 # MQTT listener, the agent doesn't support MQTT-SSL here
Only the metric from service check is produced:
Metric | Description |
---|---|
service_status | Status of Mosquitto |
MySQL
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
To enable metrics gathering, credentials are required. Bleemeo agent will find the credentials if:
- MySQL and the agent are running on Ubuntu or Debian
- MySQL is running in a Docker container and the root password is set through the environment variable MYSQL_ROOT_PASSWORD
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a MySQL running outside a container
  - type: "mysql"
    username: "USERNAME"
    password: "PASSWORD"
    address: "127.0.0.1"
    # If set to an existing socket file, Glouton will prefer using the Unix socket to
    # connect and gather metrics from MySQL.
    # The service check will continue to use the TCP address.
    metrics_unix_socket: "/var/run/mysqld/mysqld.sock"
    port: 3306
  # For an additional MySQL running outside a container
  - type: "mysql"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    username: "USERNAME"
    password: "PASSWORD"
    address: "127.0.0.1"
    # If set to an existing socket file, Glouton will prefer using the Unix socket to
    # connect and gather metrics from MySQL.
    # The service check will continue to use the TCP address.
    metrics_unix_socket: "/var/run/mysqld/mysqld2.sock"
    port: 3307
  # For a MySQL running in a Docker container
  - type: "mysql"
    instance: "CONTAINER_NAME"
    username: "USERNAME"
    password: "PASSWORD"
    address: "172.17.0.2"
    port: 3306
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of MySQL |
mysql_cache_result_qcache_hits | Number of query cache hits per second |
mysql_cache_result_qcache_inserts | Number of queries added to the query cache per second |
mysql_cache_result_qcache_not_cached | Number of uncacheable queries per second |
mysql_cache_result_qcache_prunes | Number of queries that were deleted from the query cache because of low memory per second |
mysql_cache_blocksize_qcache | Number of blocks in the query cache |
mysql_cache_free_blocks | Number of free memory blocks in the query cache |
mysql_cache_free_memory | Amount of free memory for the query cache in bytes |
mysql_cache_size_qcache | Number of queries registered in the query cache |
mysql_locks_immediate | Number of table locks that could be granted immediately per second |
mysql_locks_waited | Number of table locks that could not be granted immediately per second |
mysql_innodb_history_list_len | Size of InnoDB transaction history list |
mysql_innodb_locked_transaction | Number of InnoDB transactions currently locked |
mysql_octets_rx | Network traffic received from clients in bytes per second |
mysql_octets_tx | Network traffic sent to clients in bytes per second |
mysql_queries | Number of queries per second |
mysql_slow_queries | Number of slow queries per second |
mysql_threads_cached | Number of threads in the thread cache |
mysql_threads_connected | Number of currently open connections |
mysql_threads_running | Number of threads that are not sleeping |
mysql_total_threads_created | Number of threads created per second |
mysql_commands_begin | Number of "BEGIN" statements executed per second |
mysql_commands_binlog | Number of "BINLOG" statements executed per second |
mysql_commands_call_procedure | Number of "CALL PROCEDURE" statements executed per second |
mysql_commands_change_master | Number of "CHANGE MASTER" statements executed per second |
mysql_commands_change_repl_filter | Number of "CHANGE REPL FILTER" statements executed per second |
mysql_commands_check | Number of "CHECK TABLE" statements executed per second |
mysql_commands_checksum | Number of "CHECKSUM TABLE" statements executed per second |
mysql_commands_commit | Number of "COMMIT" statements executed per second |
mysql_commands_dealloc_sql | Number of "DEALLOCATE PREPARE" statements executed per second |
mysql_commands_stmt_close | Number of "DEALLOCATE PREPARE" statements executed per second |
mysql_commands_delete_multi | Number of multiple-table "DELETE" statements executed per second |
mysql_commands_delete | Number of "DELETE" statements executed per second |
mysql_commands_do | Number of "DO" statements executed per second |
mysql_commands_execute_sql | Number of "EXECUTE" statements executed per second |
mysql_commands_stmt_execute | Number of "EXECUTE" statements executed per second |
mysql_commands_explain_other | Number of "EXPLAIN FOR CONNECTION" statements executed per second |
mysql_commands_flush | Number of "FLUSH" statements executed per second |
mysql_commands_ha_close | Number of "HA CLOSE" statements executed per second |
mysql_commands_ha_open | Number of "HA OPEN" statements executed per second |
mysql_commands_ha_read | Number of "HA READ" statements executed per second |
mysql_commands_insert_select | Number of "INSERT ... SELECT" statements executed per second |
mysql_commands_insert | Number of "INSERT" statements executed per second |
mysql_commands_kill | Number of "KILL" statements executed per second |
mysql_commands_preload_keys | Number of "LOAD INDEX INTO CACHE" statements executed per second |
mysql_commands_load | Number of "LOAD" statements executed per second |
mysql_commands_lock_tables | Number of "LOCK TABLES" statements executed per second |
mysql_commands_optimize | Number of "OPTIMIZE" statements executed per second |
mysql_commands_prepare_sql | Number of "PREPARE" statements executed per second |
mysql_commands_stmt_prepare | Number of "PREPARE" statements executed per second |
mysql_commands_purge_before_date | Number of "PURGE BEFORE DATE" statements executed per second |
mysql_commands_purge | Number of "PURGE" statements executed per second |
mysql_commands_release_savepoint | Number of "RELEASE SAVEPOINT" statements executed per second |
mysql_commands_repair | Number of "REPAIR" statements executed per second |
mysql_commands_replace_select | Number of "REPLACE SELECT" statements executed per second |
mysql_commands_replace | Number of "REPLACE" statements executed per second |
mysql_commands_reset | Number of "RESET" statements executed per second |
mysql_commands_resignal | Number of "RESIGNAL" statements executed per second |
mysql_commands_rollback_to_savepoint | Number of "ROLLBACK TO SAVEPOINT" statements executed per second |
mysql_commands_rollback | Number of "ROLLBACK" statements executed per second |
mysql_commands_savepoint | Number of "SAVEPOINT" statements executed per second |
mysql_commands_select | Number of "SELECT" statements executed per second |
mysql_commands_signal | Number of "SIGNAL" statements executed per second |
mysql_commands_slave_start | Number of "START SLAVE" statements executed per second |
mysql_commands_group_replication_start | Number of group replication "START" statements executed per second |
mysql_commands_stmt_fetch | Number of "STMT FETCH" statements executed per second |
mysql_commands_stmt_reprepare | Number of "STMT REPREPARE" statements executed per second |
mysql_commands_stmt_reset | Number of "STMT RESET" statements executed per second |
mysql_commands_stmt_send_long_data | Number of "STMT SEND LONG DATA" statements executed per second |
mysql_commands_slave_stop | Number of "STOP SLAVE" statements executed per second |
mysql_commands_group_replication_stop | Number of group replication "STOP" statements executed per second |
mysql_commands_truncate | Number of "TRUNCATE" statements executed per second |
mysql_commands_unlock_tables | Number of "UNLOCK TABLES" statements executed per second |
mysql_commands_update_multi | Number of multiple-table "UPDATE" statements executed per second |
mysql_commands_update | Number of "UPDATE" statements executed per second |
mysql_commands_xa_commit | Number of "XA COMMIT" statements executed per second |
mysql_commands_xa_end | Number of "XA END" statements executed per second |
mysql_commands_xa_prepare | Number of "XA PREPARE" statements executed per second |
mysql_commands_xa_recover | Number of "XA RECOVER" statements executed per second |
mysql_commands_xa_rollback | Number of "XA ROLLBACK" statements executed per second |
mysql_commands_xa_start | Number of "XA START" statements executed per second |
mysql_commands_assign_to_keycache | Number of assign to keycache commands per second |
mysql_handler_commit | Number of internal commit requests per second |
mysql_handler_delete | Number of rows deleted from tables per second |
mysql_handler_write | Number of rows inserted per second |
mysql_handler_update | Number of rows updated per second |
mysql_handler_rollback | Number of transaction rollback requests given to a storage engine per second |
NATS
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
To enable metrics gathering, you need to enable the NATS monitoring endpoint on your server; see NATS Server Monitoring for details.
By default, the agent assumes that the monitoring endpoint is enabled on port 8222. If your server uses another port, you can change it in the configuration.
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a NATS running outside a container
  - type: "nats"
    address: "127.0.0.1"
    port: 4222
    stats_port: 8222
  # For an additional NATS running outside a container
  - type: "nats"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 4222
    stats_port: 8222
  # For a NATS running in a Docker container
  - type: "nats"
    instance: "CONTAINER_NAME"
    port: 4222
    stats_port: 8222
The following metrics are gathered:
Metric | Description |
---|---|
service_status | Status of NATS |
nats_uptime | Time elapsed since the NATS server started, in nanoseconds |
nats_routes | Number of registered routes |
nats_slow_consumers | Number of slow consumers |
nats_subscriptions | Number of active subscriptions |
nats_in_bytes | Amount of incoming bytes |
nats_out_bytes | Amount of outgoing bytes |
nats_in_msgs | Number of incoming messages |
nats_out_msgs | Number of outgoing messages |
nats_connections | Number of currently active clients |
nats_total_connections | Total number of created clients |
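Most of these values correspond to fields exposed by the NATS monitoring endpoint. The Python sketch below assumes the /varz endpoint on the stats port and keeps a few counters from its JSON response. The sample payload is invented for illustration, the helper names are hypothetical, and this is not necessarily how the agent collects the metrics.

```python
import json
from urllib.request import urlopen

# Fields of the /varz response used in this sketch.
FIELDS = ("connections", "total_connections", "routes", "subscriptions",
          "slow_consumers", "in_msgs", "out_msgs", "in_bytes", "out_bytes")


def extract_counters(varz):
    """Keep only the fields listed in FIELDS from a decoded /varz document."""
    return {name: varz[name] for name in FIELDS if name in varz}


def fetch_counters(address="127.0.0.1", stats_port=8222):
    """Query the monitoring endpoint; requires a running NATS server."""
    with urlopen(f"http://{address}:{stats_port}/varz", timeout=5) as resp:
        return extract_counters(json.load(resp))


# Invented sample payload, so the sketch runs without a server.
sample = json.loads('{"connections": 3, "in_msgs": 120, "out_msgs": 118, "server_id": "X"}')
print(extract_counters(sample))
```

If fetch_counters fails with a connection error, the monitoring endpoint is probably not enabled or stats_port does not match the server configuration.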
NFS
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Metrics are gathered using /proc/self/mountstats.
NFS should be detected automatically on your server, but you can manually enable it
by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  - type: "nfs"
The following metrics are gathered:
Metric | Description |
---|---|
nfs_ops | The number of operations executed per second |
nfs_transmitted_bits | The data exchanged in bits/s |
nfs_rtt_per_op_seconds | The average round-trip time per operation in seconds |
nfs_retrans | The number of times an operation had to be retried per second |
Nginx
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Agent uses an HTTP check if the service listens on port 80.
To enable metrics gathering, ensure Bleemeo agent can access the nginx status page
at the URL http://server-address/nginx_status. This usually means you should add the
following to your site definition (e.g. /etc/nginx/sites-enabled/default):
location /nginx_status {
    stub_status on;
}
If your nginx is not built with stub_status, or if you need more information about nginx stub_status, see https://nginx.org/en/docs/http/ngx_http_stub_status_module.html
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For an nginx running outside a container
  - type: "nginx"
    address: "127.0.0.1"
    port: 80 # HTTP listener, the agent doesn't support HTTPS here
    http_path: "/" # This is the path used for the availability check, not metrics gathering.
    http_host: "127.0.0.1:80" # Host header sent. Like other options, it can be omitted and the default value will be used.
  # For an additional nginx running outside a container
  - type: "nginx"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 81 # HTTP listener, the agent doesn't support HTTPS here
    http_path: "/" # This is the path used for the availability check, not metrics gathering.
    http_host: "127.0.0.1:81" # Host header sent. Like other options, it can be omitted and the default value will be used.
  # For an nginx running in a Docker container
  - type: "nginx"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 80 # HTTP listener, the agent doesn't support HTTPS here
    http_path: "/" # This is the path used for the availability check, not metrics gathering.
    http_host: "127.0.0.1:80" # Host header sent. Like other options, it can be omitted and the default value will be used.
When using Docker, you may use Docker labels to set http_path and http_host:
docker run --label glouton.http_path="/readiness" --label glouton.http_host="my-host" [...]
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of nginx |
nginx_requests | Number of requests per second |
nginx_connections_accepted | Number of client connections established per second |
nginx_connections_handled | Number of client connections processed per second |
nginx_connections_active | Number of client connections to nginx server |
nginx_connections_waiting | Number of idle client connections waiting for a request |
nginx_connections_reading | Number of client connections where nginx is reading the request header |
nginx_connections_writing | Number of client connections where nginx is writing the response |
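To see where these values come from, the Python sketch below parses a response in the plain-text format served by the stub_status module. The sample text and the function name are illustrative. The accepted, handled and requests values on the page are cumulative counters, so the per-second metrics listed above are assumed to be derived from successive readings.

```python
import re

SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Parse the plain-text stub_status page into a dict of integers."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    totals = re.search(r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE)
    states = re.search(r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text)
    if not (active and totals and states):
        raise ValueError("unexpected stub_status format")
    accepted, handled, requests = (int(v) for v in totals.groups())
    reading, writing, waiting = (int(v) for v in states.groups())
    return {
        "active": int(active.group(1)),
        "accepted": accepted,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


print(parse_stub_status(SAMPLE))
```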
NTP
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Agent uses an NTP check if the service listens on port 123.
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For an NTP running outside a container
  - type: "ntp"
    address: "127.0.0.1"
    port: 123
  # For an additional NTP running outside a container
  - type: "ntp"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 124
  # For an NTP running in a Docker container
  - type: "ntp"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 123
Only the metric from service check is produced:
Metric | Description |
---|---|
service_status | Status of NTP |
OpenLDAP
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
To enable metrics gathering, you need to enable the slapd monitoring backend and set a bind DN and password in the agent configuration so that the agent can access the metrics.
If some auto-detected parameters are wrong or you want to configure metric gathering,
you can add the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For an OpenLDAP running outside a container
  - type: "openldap"
    address: "127.0.0.1"
    port: 389
    # DN and password to bind with. If username is empty, an anonymous bind is performed.
    username: "cn=admin,dc=example,dc=org"
    password: "adminpassword"
    # Use ldaps, note that port will likely need to be changed to 636.
    ssl: true
    # Use StartTLS (note that you can't enable both ssl and starttls at the same time).
    starttls: true
    # Don't verify host certificate.
    ssl_insecure: true
    # Path to PEM-encoded Root certificate to use to verify the server certificate.
    ca_file: "/myca"
  # For an additional OpenLDAP running outside a container
  - type: "openldap"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 390
    username: "cn=admin,dc=example,dc=org"
    password: "adminpassword"
  # For an OpenLDAP running in a Docker container
  - type: "openldap"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 389
    username: "cn=admin,dc=example,dc=org"
    password: "adminpassword"
The following metrics are gathered:
Metric | Description |
---|---|
service_status | Status of OpenLDAP |
openldap_connections_current | Current number of active connections |
openldap_waiters_read | Number of threads blocked waiting to read data from a client |
openldap_waiters_write | Number of threads blocked waiting to write data to a client |
openldap_threads_active | Threads (operations) currently active in slapd |
openldap_statistics_bytes | Outgoing bytes per second |
openldap_statistics_entries | Outgoing entries per second |
openldap_operations_add_completed | Number of add operations per second |
openldap_operations_bind_completed | Number of bind operations per second |
openldap_operations_delete_completed | Number of delete operations per second |
openldap_operations_modify_completed | Number of modify operations per second |
openldap_operations_search_completed | Number of search operations per second |
OpenVPN
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Only the metric from service check is produced:
Metric | Description |
---|---|
service_status | Status of OpenVPN |
PHP-FPM
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
To enable metrics gathering, Bleemeo agent needs access to the PHP-FPM status page and needs to know where to find it.
PHP-FPM needs to expose its status. For example, the pool configuration should include:
pm.status_path = /status
By default, Bleemeo agent tries fcgi://<fpm-address>:<fpm-port>/status, i.e. FCGI over the
TCP port on which PHP-FPM listens, with the "/status" path.
If this does not match your configuration, you will need to override the
"stats_url" parameter by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a PHP-FPM running outside a container
  - type: "phpfpm"
    address: "127.0.0.1"
    port: 9000
    stats_url: "fcgi://127.0.0.1:9000/status"
    # For UNIX socket access, use
    # stats_url: "/var/run/php5-fpm.sock"
    # See below for an additional note on UNIX socket permissions
  # For an additional PHP-FPM running outside a container
  - type: "phpfpm"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 9001
    stats_url: "fcgi://127.0.0.1:9001/status"
    # For UNIX socket access, use
    # stats_url: "/var/run/php5-fpm2.sock"
    # See below for an additional note on UNIX socket permissions
  # For a PHP-FPM running in a Docker container
  - type: "phpfpm"
    instance: "CONTAINER_NAME"
    port: 9000
    stats_url: "fcgi://my-server:9000/status"
If using a UNIX socket, make sure the Glouton user has permission to access the socket.
This usually means that the Glouton user should be a member of the group running php-fpm, which is www-data
on Debian/Ubuntu systems. Therefore, running sudo adduser glouton www-data
and restarting the Bleemeo agent will grant access.
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of PHP-FPM |
phpfpm_accepted_conn | Number of requests per second |
phpfpm_active_processes | Number of active processes |
phpfpm_idle_processes | Number of idle processes |
phpfpm_listen_queue | Number of requests in the queue of pending connections |
phpfpm_listen_queue_len | Size of the queue of pending connections |
phpfpm_max_active_processes | Maximum number of active processes since FPM started |
phpfpm_max_children_reached | Number of times the process limit has been reached |
phpfpm_max_listen_queue | Maximum number of requests in the queue of pending connections since FPM started |
phpfpm_slow_requests | Number of slow requests per second |
phpfpm_start_since | Time elapsed since PHP-FPM started, in seconds |
phpfpm_total_processes | Number of idle + active processes |
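The metric names above follow the field names of the PHP-FPM status page closely. As an illustration, the Python sketch below turns the numeric fields of a plain-text status page into metric-like names. The sample text is invented, the function name is hypothetical, and the sketch only shows the naming relationship: it does not convert counters such as "accepted conn" into the per-second rates the agent reports.

```python
SAMPLE = """\
pool:                 www
process manager:      dynamic
start since:          3600
accepted conn:        1200
listen queue:         0
max listen queue:     2
listen queue len:     128
idle processes:       4
active processes:     1
total processes:      5
max active processes: 3
max children reached: 0
slow requests:        0
"""


def parse_fpm_status(text):
    """Map numeric fields of the plain-text status page to metric-like names."""
    values = {}
    for line in text.splitlines():
        key, sep, raw = line.partition(":")
        raw = raw.strip()
        if not sep or not raw.isdigit():
            continue  # skip non-numeric fields such as "pool"
        values["phpfpm_" + key.strip().replace(" ", "_")] = int(raw)
    return values


print(parse_fpm_status(SAMPLE))
```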
Postfix
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Agent uses an SMTP check if the service listens on port 25.
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf:
service:
  [...]
  # For a Postfix running outside a container
  - type: "postfix"
    address: "127.0.0.1"
    port: 25
  # For an additional Postfix running outside a container
  - type: "postfix"
    instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
    address: "127.0.0.1"
    port: 26
  # For a Postfix running in a Docker container
  - type: "postfix"
    instance: "CONTAINER_NAME"
    address: "172.17.0.2"
    port: 25
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of Postfix |
postfix_queue_size | Number of mails queued |
PostgreSQL
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
To enable metrics gathering, credentials are required. The agent will find
the credentials by itself if PostgreSQL is running in a Docker container and username/password are set
through the environment variables POSTGRES_USER
(which defaults to postgres
)
and POSTGRES_PASSWORD
.
By default, only sum metrics are gathered. To monitor specific databases,
add them to the detailed_items
setting.
If some auto-detected parameters are wrong, or you want to monitor specific databases,
you can add the following to /etc/glouton/conf.d/99-local.conf
:
service:
[...]
# For a PostgreSQL running outside a container
- type: "postgresql"
username: "USERNAME"
password: "PASSWORD"
address: "127.0.0.1"
port: 5432
# For an additional PostgreSQL running outside a container
- type: "postgresql"
instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
username: "USERNAME"
password: "PASSWORD"
address: "127.0.0.1"
port: 5433
# For a PostgreSQL running in a Docker container.
- type: "postgresql"
instance: "CONTAINER_NAME"
username: "USERNAME"
password: "PASSWORD"
address: "172.17.0.2"
port: 5432
# Monitor the databases called "bleemeo" and "postgres".
detailed_items:
- bleemeo
- postgres
When using Docker, you may use labels to set the databases to monitor and the credentials:
docker run --label glouton.detailed_items="bleemeo,postgres" --label glouton.username="USERNAME" --label glouton.password="PASSWORD" [...]
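To make the label convention concrete, here is a small Python sketch of how labels carrying the `glouton.` prefix could be turned into service overrides. The helper name `overrides_from_labels` and the comma-splitting of `detailed_items` are assumptions based on the example above, not a description of the agent's internals.

```python
def overrides_from_labels(labels: dict) -> dict:
    """Extract 'glouton.*' labels and return them as a settings dictionary."""
    prefix = "glouton."
    overrides = {}
    for name, value in labels.items():
        if not name.startswith(prefix):
            continue  # unrelated label, ignored
        key = name[len(prefix):]
        if key == "detailed_items":
            # Assumed format: a comma-separated list of database names.
            overrides[key] = [item.strip() for item in value.split(",") if item.strip()]
        else:
            overrides[key] = value
    return overrides


if __name__ == "__main__":
    print(overrides_from_labels({
        "glouton.detailed_items": "bleemeo,postgres",
        "glouton.username": "USERNAME",
        "maintainer": "someone",
    }))
```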
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of PostgreSQL |
postgresql_blk_read_utilization | PostgreSQL reading data file blocks utilization |
postgresql_blk_read_utilization_sum | Sum of PostgreSQL reading data file blocks utilization |
postgresql_blk_write_utilization | PostgreSQL writing data file blocks utilization |
postgresql_blk_write_utilization_sum | Sum of PostgreSQL writing data file blocks utilization |
postgresql_blks_hit_sum | Number of blocks read from PostgreSQL cache per second |
postgresql_blks_read_sum | Number of blocks read from disk per second |
postgresql_commit_sum | Number of commits per second |
postgresql_rollback_sum | Number of rollbacks per second |
postgresql_temp_bytes_sum | Temporary file write throughput in bytes per second |
postgresql_temp_files_sum | Number of temporary files created per second |
postgresql_tup_deleted_sum | Number of rows deleted per second |
postgresql_tup_fetched_sum | Number of rows fetched per second |
postgresql_tup_inserted_sum | Number of rows inserted per second |
postgresql_tup_returned_sum | Number of rows returned per second |
postgresql_tup_updated_sum | Number of rows updated per second |
Except for the status metric, all metrics also exist without the _sum suffix when detailed_items is used. For instance, the metric postgresql_commit with the item mycontainer_mydb corresponds to the number of commits per second on the database mydb running in the container mycontainer.
RabbitMQ
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Agent uses an AMQP check if the service listens on port 5672.
To enable metrics gathering, credentials are required. By default "guest/guest" is used.
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf
:
service:
[...]
# For a RabbitMQ running outside a container
- type: "rabbitmq"
username: "USERNAME"
password: "PASSWORD"
address: "127.0.0.1"
port: 5672 # Port of AMQP service
stats_port: 15672 # Port of RabbitMQ management interface
# For an additional RabbitMQ running outside a container
- type: "rabbitmq"
instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
username: "USERNAME"
password: "PASSWORD"
address: "127.0.0.1"
port: 5673 # Port of AMQP service
stats_port: 15673 # Port of RabbitMQ management interface
# For a RabbitMQ running in a Docker container
- type: "rabbitmq"
instance: "CONTAINER_NAME"
username: "USERNAME"
password: "PASSWORD"
address: "172.17.0.2"
port: 5672 # Port of AMQP service
stats_port: 15672 # Port of RabbitMQ management interface
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of RabbitMQ |
rabbitmq_connections | Number of client connections to RabbitMQ server |
rabbitmq_consumers | Number of consumers |
rabbitmq_messages_acked | Number of messages acknowledged per second |
rabbitmq_messages_count | Number of messages |
rabbitmq_messages_delivered | Number of messages delivered per second |
rabbitmq_messages_published | Number of messages published per second |
rabbitmq_messages_unacked_count | Number of messages waiting for an acknowledgement from a consumer |
rabbitmq_queues | Number of queues |
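These metrics rely on the RabbitMQ management interface listening on stats_port. As a hedged sketch, the function below extracts comparable values from a JSON document shaped like the management API's `/api/overview` response. The mapping from API fields to Bleemeo metric names is an assumption made for illustration and may differ from what the agent really does.

```python
def metrics_from_overview(overview: dict) -> dict:
    """Build a metric dictionary from a parsed /api/overview document."""
    totals = overview.get("object_totals", {})
    queues = overview.get("queue_totals", {})
    stats = overview.get("message_stats", {})

    def rate(name: str) -> float:
        # Rates are exposed under "<counter>_details": {"rate": ...}.
        return float(stats.get(name + "_details", {}).get("rate", 0.0))

    return {
        "rabbitmq_connections": totals.get("connections", 0),
        "rabbitmq_consumers": totals.get("consumers", 0),
        "rabbitmq_queues": totals.get("queues", 0),
        "rabbitmq_messages_count": queues.get("messages", 0),
        "rabbitmq_messages_unacked_count": queues.get("messages_unacknowledged", 0),
        "rabbitmq_messages_published": rate("publish"),
        "rabbitmq_messages_delivered": rate("deliver"),
        "rabbitmq_messages_acked": rate("ack"),
    }
```

The overview document itself would be fetched over HTTP with the configured username and password.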
Redis
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Agent uses a Redis check if the service listens on port 6379.
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf
:
service:
[...]
# For a Redis running outside a container
- type: "redis"
address: "127.0.0.1"
port: 6379
password: "password-if-configured"
# For an additional Redis running outside a container
- type: "redis"
instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
address: "127.0.0.1"
port: 6380
password: "password-if-configured"
# For a Redis running in a Docker container
- type: "redis"
instance: "CONTAINER_NAME"
address: "172.17.0.2"
port: 6379
password: "password-if-configured"
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of Redis |
redis_current_connections_clients | Number of connected clients |
redis_current_connections_slaves | Number of connected slaves |
redis_evicted_keys | Number of keys evicted due to the maxmemory limit per second |
redis_expired_keys | Number of key expiration events per second |
redis_keyspace_hits | Number of successful key lookups per second |
redis_keyspace_misses | Number of failed key lookups per second |
redis_keyspace_hitrate | Hit ratio of keys lookup in percent |
redis_memory | Memory allocated by Redis in bytes |
redis_memory_lua | Memory used by Lua engine in bytes |
redis_memory_peak | Peak memory consumed by Redis in bytes |
redis_memory_rss | Memory used by Redis as seen by system in bytes |
redis_pubsub_channels | Global number of pub/sub channels with client subscriptions |
redis_pubsub_patterns | Global number of pub/sub patterns with client subscriptions |
redis_total_connections | Number of connections (client or slave) per second |
redis_total_operations | Number of commands processed by the server per second |
redis_uptime | Time spent since Redis server start in seconds |
redis_volatile_changes | Number of changes since the last dump |
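The redis_keyspace_hitrate metric can be understood as the share of key lookups that found a key. A minimal sketch, assuming the ratio is computed as hits / (hits + misses) and expressed in percent (returning 0 when there were no lookups is also an assumption):

```python
def keyspace_hitrate(hits: float, misses: float) -> float:
    """Return the percentage of key lookups that were hits."""
    total = hits + misses
    if total == 0:
        return 0.0  # no lookups at all, so avoid dividing by zero
    return 100.0 * hits / total


if __name__ == "__main__":
    print(keyspace_hitrate(75, 25))
```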
Salt
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf
:
service:
[...]
# For a Salt master running outside a container
- type: "salt"
address: "127.0.0.1"
port: 4505
# For an additional Salt master running outside a container
- type: "salt"
instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
address: "127.0.0.1"
port: 4505
# For a Salt master running in a Docker container
- type: "salt"
instance: "CONTAINER_NAME"
address: "172.17.0.2"
port: 4505
Only the service check metric is produced:
Metric | Description |
---|---|
service_status | Status of Salt |
Squid3
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Agent uses an HTTP check if the service listens on port 3128.
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf
:
service:
[...]
# For a Squid running outside a container
- type: "squid"
address: "127.0.0.1"
port: 3128
http_path: "/" # This is the path used for the availability check, not for metrics gathering.
http_host: "127.0.0.1:80" # Host header sent. Like the other options, it can be omitted and the default value is used.
# For an additional Squid running outside a container
- type: "squid"
instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
address: "127.0.0.1"
port: 3129
http_path: "/" # This is the path used for the availability check, not for metrics gathering.
http_host: "127.0.0.1:81" # Host header sent. Like the other options, it can be omitted and the default value is used.
# For a Squid running in a Docker container
- type: "squid"
instance: "CONTAINER_NAME"
address: "172.17.0.2"
port: 3128
http_path: "/" # This is the path used for the availability check, not for metrics gathering.
http_host: "127.0.0.1:80" # Host header sent. Like the other options, it can be omitted and the default value is used.
When using Docker, you may use Docker labels to set http_path and http_host:
docker run --label glouton.http_path="/readiness" --label glouton.http_host="my-host" [...]
Only the service check metric is produced:
Metric | Description |
---|---|
service_status | Status of Squid3 |
UPSD
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
For the agent to gather UPS metrics, you need to install and configure nut-server
.
To configure NUT, please refer to the documentation.
You may need to provide the UPSD user credentials to the agent depending on your configuration.
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf
:
service:
[...]
- type: "upsd"
address: "127.0.0.1"
port: 3493
# UPSD user credentials.
username: ""
password: ""
The following metrics are gathered:
Metric | Description |
---|---|
service_status | Status of UPSD |
upsd_battery_status | Battery status. The status is critical when the UPS is overloaded, on battery and when the battery is low or needs to be replaced. |
upsd_status_flags | Status flags, see apcupsd for details. |
upsd_battery_voltage | Battery voltage |
upsd_input_voltage | Input voltage |
upsd_output_voltage | Output voltage |
upsd_load_percent | Load of the UPS in percent |
upsd_battery_charge_percent | Battery charge in percent |
upsd_internal_temp | Internal temperature in °C |
upsd_input_frequency | Input frequency in Hz |
upsd_time_left_seconds | Time left on battery in seconds |
upsd_time_on_battery_seconds | Time spent on battery in seconds |
uWSGI
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
The agent gathers metrics using the uWSGI stats server, so you should enable it.
You can do so by adding the options --stats 127.0.0.1:1717 --memory-report
to your server.
Note that without --memory-report
, the metric uwsgi_memory_used
won't be available.
By default, the agent assumes that the stats server is enabled on port 1717. If your server uses another port, you can change it in the configuration.
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf
:
service:
[...]
# For a uWSGI running outside a container
- type: "uwsgi"
address: "127.0.0.1"
port: 8080
stats_port: 1717
# If your server uses the --stats-http option, you should set the protocol to "http".
# If the protocol is not set, "tcp" is used by default.
stats_protocol: "tcp"
# For an additional uWSGI running outside a container
- type: "uwsgi"
instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
address: "127.0.0.1"
port: 8080
stats_port: 1717
# For a uWSGI running in a Docker container
- type: "uwsgi"
instance: "CONTAINER_NAME"
port: 8080
stats_port: 1717
The following metrics are gathered:
Metric | Description |
---|---|
service_status | Status of uWSGI |
uwsgi_requests | Number of requests per second |
uwsgi_transmitted | Amount of data transmitted in bits/s |
uwsgi_memory_used | Memory used in bytes |
uwsgi_avg_request_time | Average time to process a request in seconds |
uwsgi_exceptions | Number of exceptions per second |
uwsgi_harakiri_count | Number of worker timeouts per second |
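When started with --stats, the uWSGI stats server sends a JSON document describing each worker to any client that connects, then closes the connection. The sketch below is illustrative and not the agent's implementation: it reads that document over TCP and sums a few per-worker counters. The function names and the choice of aggregated fields are assumptions. The `rss` field is typically only meaningful when --memory-report is enabled.

```python
import json
import socket


def read_uwsgi_stats(address: str = "127.0.0.1", port: int = 1717, timeout: float = 5.0) -> dict:
    """Read the whole JSON document sent by a uWSGI stats server over TCP."""
    chunks = []
    with socket.create_connection((address, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection after sending
                break
            chunks.append(data)
    return json.loads(b"".join(chunks))


def summarize_workers(stats: dict) -> dict:
    """Sum per-worker counters from a parsed uWSGI stats document."""
    workers = stats.get("workers", [])
    return {
        "requests": sum(w.get("requests", 0) for w in workers),
        "exceptions": sum(w.get("exceptions", 0) for w in workers),
        "harakiri_count": sum(w.get("harakiri_count", 0) for w in workers),
        "memory_rss": sum(w.get("rss", 0) for w in workers),
    }
```

The totals are cumulative counters, so a per-second rate such as uwsgi_requests would be derived from the difference between two successive reads.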
Valkey
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
Agent uses a Redis check if the service listens on port 6379.
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf
:
service:
[...]
# For a Valkey running outside a container
- type: "valkey"
address: "127.0.0.1"
port: 6379
password: "password-if-configured"
# For an additional Valkey running outside a container
- type: "valkey"
instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
address: "127.0.0.1"
port: 6380
password: "password-if-configured"
# For a Valkey running in a Docker container
- type: "valkey"
instance: "CONTAINER_NAME"
address: "172.17.0.2"
port: 6379
password: "password-if-configured"
Agent gathers the same metrics as Redis.
Varnish
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf
:
service:
[...]
# For a Varnish running outside a container
- type: "varnish"
address: "127.0.0.1"
port: 6082
# For an additional Varnish running outside a container
- type: "varnish"
instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
address: "127.0.0.1"
port: 6083
# For a Varnish running in a Docker container
- type: "varnish"
instance: "CONTAINER_NAME"
address: "172.17.0.2"
port: 6082
Only the service check metric is produced:
Metric | Description |
---|---|
service_status | Status of Varnish |
ZooKeeper
Service Detection | Specific Check | Metrics |
---|---|---|
β | β | β |
The agent uses a ZooKeeper check if the service listens on port 2181.
The status check uses the ruok
command, which you may need to add to the whitelist in 4lw.commands.whitelist
, see
the documentation for details.
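The ruok exchange is a four-letter-word command: the client sends `ruok` and a healthy server answers `imok`. The following Python sketch shows the idea under stated assumptions: the function names are hypothetical, and the return values simply reuse the service_status convention (0 for OK, 2 for an issue, 3 for unknown). It is not the agent's actual check.

```python
import socket


def status_from_ruok(reply: bytes) -> int:
    """Map the reply to 'ruok' to a service_status-like value."""
    return 0 if reply.strip() == b"imok" else 2


def check_zookeeper(address: str = "127.0.0.1", port: int = 2181, timeout: float = 5.0) -> int:
    """Send 'ruok' and return 0 (OK), 2 (issue) or 3 (unknown)."""
    try:
        with socket.create_connection((address, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            return status_from_ruok(conn.recv(64))
    except socket.timeout:
        return 3  # the check itself failed, so the status is unknown
    except OSError:
        return 2  # connection refused or reset
```

If the command is not whitelisted, the server answers with something other than `imok`, which this sketch reports as an issue.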
To enable metrics gathering, Bleemeo agent needs to be installed with JMX enabled, see Java Metrics for setup details.
To gather JMX metrics, ZooKeeper must expose JMX over a TCP port. To enable JMX you will need to:
- If using Docker for ZooKeeper, add environment variable
JMXPORT=1234
- In other cases, add the option
-Dcom.sun.management.jmxremote.port=1234 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
to the JVM, usually through JAVA_OPTS in /etc/default/zookeeper.
Warning: in both cases, this allows unauthenticated access. Make sure no untrusted access to this port is possible, or set up authenticated JMX access.
If some auto-detected parameters are wrong, you can manually override them
by adding the following to /etc/glouton/conf.d/99-local.conf
:
service:
[...]
# For a ZooKeeper running outside a container
- type: "zookeeper"
address: "127.0.0.1"
port: 2181
jmx_port: "JMX_PORT"
jmx_username: "monitorRole" # by default, no authentication is done
jmx_password: "secret"
# For an additional ZooKeeper running outside a container
- type: "zookeeper"
instance: "NAME_ASSOCIATED_WITH_YOUR_ADDITIONAL_SERVICE"
address: "127.0.0.1"
port: 2182
jmx_port: "JMX_PORT"
jmx_username: "monitorRole" # by default, no authentication is done
jmx_password: "secret"
# For a ZooKeeper running in a Docker container
- type: "zookeeper"
instance: "CONTAINER_NAME"
address: "172.17.0.2"
port: 2181
jmx_port: "JMX_PORT"
jmx_username: "monitorRole" # by default, no authentication is done
jmx_password: "secret"
Agent gathers the following metrics:
Metric | Description |
---|---|
service_status | Status of ZooKeeper |
zookeeper_connections | Number of client connections to ZooKeeper server |
zookeeper_packets_received | Number of packets received per second |
zookeeper_packets_sent | Number of packets sent per second |
zookeeper_ephemerals_count | Number of ephemeral nodes |
zookeeper_watch_count | Number of ZooKeeper watches |
zookeeper_znode_count | Number of znodes |
Agent gathers the following metrics through JMX:
Metric | Description |
---|---|
zookeeper_jvm_gc | Number of garbage collections per second |
zookeeper_jvm_gc_utilization | Garbage collection utilization in percent |
zookeeper_jvm_heap_used | Heap memory used in bytes |
zookeeper_jvm_non_heap_used | Non-Heap memory used in bytes |