High Availability

Running ilagent in high availability setups across HTTP, MQTT, and Kafka consumer modes.

This page covers strategies for running ilagent in a highly available setup depending on your consumer type.

Kafka

Kafka has built-in consumer group support, making HA straightforward.

Run multiple ilagent instances with the same --kafka_group_id. Kafka automatically distributes partitions across consumers in the group. If one instance dies, its partitions are reassigned to the remaining consumers.

# Instance 1
ilagent daemon --kafka_brokers kafka:9092 --kafka_group_id ilagent -e 'events'

# Instance 2
ilagent daemon --kafka_brokers kafka:9092 --kafka_group_id ilagent -e 'events'

Requirements:

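  • The number of topic partitions must be equal to or greater than the number of instances. Kafka assigns each partition to at most one consumer in a group, so any extra instances sit idle until a rebalance frees a partition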

  • All instances must use the same --kafka_group_id

No additional configuration is needed — Kafka handles rebalancing, offset tracking, and failover automatically.
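The partition requirement above comes down to simple arithmetic, sketched here as a small shell helper. The function name idle_instances is hypothetical and not part of ilagent or Kafka; it only illustrates that instances beyond the partition count receive no work.

```shell
# Hypothetical helper (not part of ilagent or Kafka): given a partition
# count and an instance count, print how many instances would sit idle.
idle_instances() {
  partitions=$1
  instances=$2
  if [ "$instances" -gt "$partitions" ]; then
    echo $((instances - partitions))
  else
    echo 0
  fi
}

idle_instances 3 2   # 3 partitions, 2 instances: prints 0
idle_instances 3 5   # 3 partitions, 5 instances: prints 2
```

The current partition count can be read with Kafka's own tooling, for example kafka-topics.sh --describe --topic events --bootstrap-server kafka:9092, assuming the topic is named events as in the example above.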

HTTP proxy

Run multiple ilagent instances behind a load balancer. Each instance maintains its own SQLite retry queue.

clients ── load balancer ── ilagent-1 (own SQLite)
                        └── ilagent-2 (own SQLite)

ilert deduplicates events via alertKey, so if both instances receive and forward the same event, it will only create a single alert. Make sure your events include a consistent alertKey for deduplication to work correctly.
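One way to keep the alertKey consistent is to derive it only from fields that do not change between retries. The sketch below assumes a key built from a host name and a check name; the function alert_key and the key format are illustrative choices, not something ilagent or ilert prescribes.

```shell
# Hypothetical helper: derive an alertKey from fields that stay the same
# across retries (source host and check name), never from timestamps or
# random IDs, so repeated deliveries map to the same alert.
alert_key() {
  host=$1
  check=$2
  printf '%s/%s\n' "$host" "$check"
}

alert_key web-01 disk-usage   # prints web-01/disk-usage
```

Because the key depends only on its inputs, it makes no difference which instance behind the load balancer forwards the event.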

MQTT

MQTT is the most nuanced case because the protocol does not have a native consumer group concept. There are several strategies, each with different trade-offs.

Option 1: Shared subscriptions

MQTT v5 introduced shared subscriptions, which distribute messages across subscribers in a named group, similar to Kafka consumer groups. The broker delivers each message to only one subscriber in the group.

ilagent supports this via the --mqtt_shared_group flag. With the group name ilagent and the topic ilert/events, ilagent subscribes to $share/ilagent/ilert/events instead of ilert/events. The broker handles load balancing: it distributes messages across instances and delivers each one to exactly one consumer in the group.

Requirements:

  • Your MQTT broker must support MQTT v5 shared subscriptions (Mosquitto 2.x, HiveMQ, EMQX, VerneMQ, and most modern brokers)

  • All instances must use the same --mqtt_shared_group value

  • Combine with --mqtt_qos 1 to ensure at-least-once delivery
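The shared-subscription filter follows the MQTT v5 pattern $share/<group>/<topic>. The helper below is a minimal sketch of that string construction; the function name shared_filter is hypothetical, and ilagent builds the filter itself when --mqtt_shared_group is set.

```shell
# Build an MQTT v5 shared-subscription filter: $share/<group>/<topic>.
# Single quotes keep the shell from expanding $share as a variable.
shared_filter() {
  group=$1
  topic=$2
  printf '$share/%s/%s\n' "$group" "$topic"
}

shared_filter ilagent ilert/events   # prints $share/ilagent/ilert/events
```

If the Mosquitto command-line clients are installed, the same filter can be tried by hand with mosquitto_sub -V mqttv5 -q 1 -t '$share/ilagent/ilert/events' in two terminals against a local broker. Each published message should then appear in only one of them.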

Option 2: Active-passive failover

Run two instances, but only one actively subscribes. Use ilert heartbeat monitoring (-b il1hbt123...) on the active instance — if the heartbeat stops, your orchestration layer (Kubernetes, systemd, etc.) switches to the standby.

This approach is simple and works with any MQTT broker, but has a failover gap while the switch happens. Using --mqtt_buffer ensures that events received before a crash are persisted in SQLite and retried on restart.
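How the orchestration layer decides to switch is up to you. As a rough sketch, the decision can be reduced to comparing the age of the last heartbeat with an allowed silence window. The function should_failover and the 60-second window are assumptions for illustration, not ilagent behavior.

```shell
# Hypothetical watchdog check: decide whether the standby should take over.
# Arguments are Unix timestamps in seconds plus an allowed silence window.
should_failover() {
  last_heartbeat=$1
  now=$2
  timeout=$3
  [ $((now - last_heartbeat)) -gt "$timeout" ]
}

# Last heartbeat at t=1000, checked at t=1100, with a 60 second window.
if should_failover 1000 1100 60; then
  echo "active instance silent for too long: start the standby"
fi
```

A longer window reduces false failovers but widens the gap during which no instance is subscribed.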

Option 3: Active-active with idempotency

Run multiple instances, all subscribing to the same topics. Every instance receives and processes every message.

This works because:

  • Events — ilert deduplicates on alertKey, so duplicate submissions are harmless

  • Escalation policy updates — the PUT calls are idempotent (setting the same user on the same level twice has no side effect)

The trade-off is doubled API traffic and processing load. This approach requires no broker-side support and works with any MQTT version.
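The effect of deduplication on this setup can be illustrated with a small simulation. It assumes two instances and three made-up alertKey values; it only counts lines and does not call ilert.

```shell
# Illustration only: two instances each forward the same three events.
# Every line is one API submission, identified by its alertKey.
submissions() {
  for instance in ilagent-1 ilagent-2; do
    printf '%s\n' web-01/disk web-01/cpu db-01/replication
  done
}

total=$(submissions | wc -l | tr -d ' ')
unique=$(submissions | sort -u | wc -l | tr -d ' ')
echo "submissions=$total alerts=$unique"   # prints submissions=6 alerts=3
```

The API traffic doubles, while the number of distinct alertKey values, and therefore alerts, stays the same.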

Option 4: Topic partitioning

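Split your messages across different topics by site, zone, or function. Each ilagent instance handles a dedicated subset. For example, one instance could subscribe to the topics for one site while a second instance handles another.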

This requires cooperation from the publisher side but gives you precise control over load distribution. Failure of one instance only affects its assigned topics.
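A per-site split could be sketched as follows. The topic layout ilert/<site>/events and the site names are hypothetical, and the sketch assumes that -e selects the event topic for MQTT as it does in the Kafka example above.

```shell
# Hypothetical topic layout: ilert/<site>/events, one ilagent per site.
topic_for_site() {
  printf 'ilert/%s/events\n' "$1"
}

# Each instance would be started with its own topic via -e, for example:
#   instance 1: -e "$(topic_for_site berlin)"
#   instance 2: -e "$(topic_for_site munich)"
topic_for_site berlin   # prints ilert/berlin/events
topic_for_site munich   # prints ilert/munich/events
```

Publishers must use the same layout, so it helps to agree on the naming scheme before deploying the instances.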

Recommendations

| Setup | Strategy | Broker requirement |
|---|---|---|
| Kafka | Consumer groups (built-in) | - |
| HTTP | Load balancer + alertKey dedup | - |
| MQTT (modern broker) | Shared subscriptions | MQTT v5 |
| MQTT (any broker) | Active-active with idempotency | None |
| MQTT (simple) | Active-passive with heartbeat | None |

For most MQTT deployments, we recommend shared subscriptions (--mqtt_shared_group) combined with --mqtt_qos 1 and --mqtt_buffer for the strongest delivery guarantees.
