Agent Rate Limiting

jameinel · 9 March 2021 14:53

Juju 2.7 introduced the ability to control how we rate limit agents connecting to the Juju controller. Two settings are exposed in controller configuration:

agent-ratelimit-max (default 10)
agent-ratelimit-rate (default 250ms)

The agents are rate limited using a Token Bucket. The idea is that every agent login takes one “token” from a bucket that can only hold so many at a time (agent-ratelimit-max). The bucket is then refilled at a fixed rate (one token every agent-ratelimit-rate). That allows for short term bursts, but avoids overloading the system if the requests are sustained.
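To make the mechanism concrete, here is a minimal sketch of a token bucket in Python. It is not Juju's implementation (the controller is written in Go); the class name TokenBucket, the method try_login, and the simulated millisecond clock are all made up for illustration. The only things taken from this post are the two defaults.

```python
class TokenBucket:
    """Toy token bucket driven by a caller-supplied clock (in milliseconds)."""

    def __init__(self, capacity, refill_interval_ms):
        self.capacity = capacity                      # role of agent-ratelimit-max
        self.refill_interval_ms = refill_interval_ms  # role of agent-ratelimit-rate
        self.tokens = float(capacity)                 # assumption: bucket starts full
        self.last_ms = 0

    def try_login(self, now_ms):
        """Return True if a login at time now_ms gets a token, else False."""
        # Credit the tokens earned since the last call, capped at capacity.
        elapsed = now_ms - self.last_ms
        self.last_ms = now_ms
        self.tokens = min(self.capacity,
                          self.tokens + elapsed / self.refill_interval_ms)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(capacity=10, refill_interval_ms=250)

# 15 agents try to log in at the same instant: only the burst size gets through.
burst = [bucket.try_login(now_ms=0) for _ in range(15)]
print(burst.count(True))             # 10

# One refill interval later, exactly one more token is available.
print(bucket.try_login(now_ms=250))  # True
print(bucket.try_login(now_ms=250))  # False
```

A real limiter would typically make rejected callers wait or retry rather than just returning False, but the accounting is the same.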

Setting agent-ratelimit-max higher increases the burst potential, but means that the peak load on the system will be higher, which may increase the chance of swapping or of follow-up failures due to the increased load.

Setting agent-ratelimit-rate to a lower time interval means tokens get produced more often, which raises the sustained rate at which logins are allowed. It is generally better to tweak agent-ratelimit-rate than agent-ratelimit-max. Depending on the system, setting it as low as 10ms (meaning 100 new connections per second) seems reasonable. It would be surprising to need an interval much higher than the default: at 250ms only 4 new connections per second are admitted, and controllers should certainly be able to handle more than that.
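As a rough feel for the numbers, the interval can be converted into a sustained login rate, and into an estimate of how long a large group of agents takes to get back in (for example after a controller restart). This is a back-of-the-envelope sketch only: the function names are made up, and it assumes every agent arrives at once against a full bucket and retries as soon as a token is free, which ignores any retry backoff on the agent side.

```python
def sustained_logins_per_second(rate_ms):
    """Logins per second allowed once the initial burst is used up."""
    return 1000 / rate_ms


def seconds_to_admit(agents, max_burst, rate_ms):
    """Approximate time until the last of `agents` has been given a token."""
    # The first max_burst agents are served from the full bucket immediately;
    # each remaining agent has to wait for one refill interval.
    waiting = max(0, agents - max_burst)
    return waiting * rate_ms / 1000


print(sustained_logins_per_second(250))  # 4.0
print(sustained_logins_per_second(10))   # 100.0

# 1000 agents reconnecting at once, with agent-ratelimit-max at its default of 10:
print(seconds_to_admit(1000, 10, 250))   # 247.5
print(seconds_to_admit(1000, 10, 10))    # 9.9
```

Under those assumptions, dropping the interval from 250ms to 10ms takes the reconnect time for 1000 agents from about four minutes to about ten seconds, while the burst size stays the same.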

The other thing to be aware of is that the rate limit does not apply to client connections (so even if agents are being rate limited, you should still be able to run ‘juju status’).

To set and inspect the values, use juju controller-config. For example, to set a value:

juju controller-config agent-ratelimit-rate=100ms

To read the current value:

juju controller-config agent-ratelimit-rate

Note that if a value has never been set, you will likely get an error like:

ERROR key "agent-ratelimit-rate" not found in "lxd" controller

This is an unfortunate side effect of how we implemented the default values (rather than storing the default as the value, we check whether a value has been set and, if not, use the default).
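The difference between a stored value and an effective value can be illustrated with a small sketch. This is not Juju's code: the dictionary and the two function names are hypothetical, and only the key names and defaults come from this post.

```python
# Defaults live in the code, not in the stored controller configuration.
DEFAULTS = {
    "agent-ratelimit-max": 10,
    "agent-ratelimit-rate": "250ms",
}


def read_stored(config, key):
    """Look only at what was explicitly set; fails for a never-set key."""
    if key not in config:
        raise KeyError(key)
    return config[key]


def effective_value(config, key):
    """What the rate limiter would use: the stored value, else the default."""
    return config.get(key, DEFAULTS[key])


stored = {}  # nothing has been set yet
print(effective_value(stored, "agent-ratelimit-rate"))  # 250ms

stored["agent-ratelimit-rate"] = "100ms"  # after setting a value explicitly
print(read_stored(stored, "agent-ratelimit-rate"))      # 100ms
```

In this picture, reading the key before it has been set goes down the read_stored path and fails, even though the limiter is happily running with the default.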