
📡 High Availability

Every component of the Bleemeo Community Edition can be scaled up to provide high availability.

Quickstart

A Docker Compose file is available to start a monitoring stack with:

  • a Cassandra cluster with 3 nodes
  • a Redis cluster with 6 nodes
  • 2 SquirrelDB behind a load balancer (nginx)
  • a NATS cluster with 3 nodes
  • 2 SquirrelDB Ingestors
  • 2 Glouton

For true high availability, each component must run on a different node, which the Docker Compose file does not do.

NATS

We recommend using NATS as the MQTT server for Glouton because it supports clustering for high availability.

See the NATS clustering documentation to set up a NATS cluster.

SquirrelDB

SquirrelDB can be scaled up, and relies on Cassandra and Redis clusters for high availability.

See the SquirrelDB high availability documentation for details.

SquirrelDB Ingestor

SquirrelDB Ingestor can be scaled up with NATS subject mapping, which distributes the messages received on one topic across multiple topics. This makes it possible to run multiple ingestors, each consuming one of the destination topics. Note that this works only with nats-server > v2.9.4.

For example, to run two ingestors, you can use the following NATS mapping configuration:

mappings = {
  # Distribute the messages received on "v1/agent/+/data" to two topics.
  # NATS maps the MQTT separator "/" to "." and the wildcard "+" to "*".
  # One ingestor listens on "v1/agent/+/data-1" and the other on "v1/agent/+/data-2".
  v1.agent.*.data: [
    { destination: v1.agent.$1.data-1, weight: 50% },
    { destination: v1.agent.$1.data-2, weight: 50% }
  ]
}

Then you need to pass the argument --mqtt-id 1 to the first ingestor and --mqtt-id 2 to the second.
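
To make the effect of the mapping concrete, here is a minimal Python sketch that simulates it outside of NATS. It is not how nats-server implements mappings: the MAPPING table and the map_subject helper are hypothetical names introduced for illustration, and the random draw only approximates the weighted distribution.

```python
import random
import re

# Hypothetical in-memory copy of the mapping configuration:
# source pattern -> list of (destination template, weight in percent).
MAPPING = {
    "v1.agent.*.data": [
        ("v1.agent.$1.data-1", 50),
        ("v1.agent.$1.data-2", 50),
    ],
}


def map_subject(subject, rng=random):
    """Return the destination chosen for `subject`, or `subject` itself if no mapping matches."""
    for source, destinations in MAPPING.items():
        # Build a regex from the pattern: each "*" matches exactly one token.
        pattern = "^" + r"\.".join(
            "([^.]+)" if token == "*" else re.escape(token)
            for token in source.split(".")
        ) + "$"
        match = re.match(pattern, subject)
        if match is None:
            continue
        templates = [template for template, _ in destinations]
        weights = [weight for _, weight in destinations]
        chosen = rng.choices(templates, weights=weights, k=1)[0]
        # Replace $1, $2, ... with the tokens captured by the wildcards.
        return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), chosen)
    return subject


if __name__ == "__main__":
    # Count how many simulated messages land on each destination.
    counts = {}
    for _ in range(1000):
        destination = map_subject("v1.agent.agent-42.data")
        counts[destination] = counts.get(destination, 0) + 1
    print(counts)
```

Each message published by an agent ends up on exactly one of the two destinations, so each ingestor processes roughly half of the traffic. Adding a third ingestor would mean adding a third destination and adjusting the weights so that they still sum to 100%.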

Note that running multiple ingestors doesn't give you high availability: if one ingestor is down, the messages on its topic won't be processed until it is restarted. This is not much of an issue, because the messages are buffered in NATS and you won't lose any metrics. Moreover, the ingestor is stateless, so it can be restarted with the same configuration, and the restart can be automated by an orchestrator such as Kubernetes.