Aeron Latency at Lower Throughputs

Aeron is a high-performance messaging system that has gained popularity in the financial, gaming, and telecommunications industries for its ability to deliver ultra-low latency. However, like any system, its performance can vary greatly depending on the configuration.

Recently I was experimenting with Aeron in a low-latency but relatively low-throughput use case and noticed some interesting effects. The configuration of Aeron (especially the stock configuration) was a significant factor in the system’s performance in my use case.

I may write another post with actual benchmarks at some point, but for now I will just focus on my observations. The setup used Aeron in the default DEDICATED threading mode, which is the one recommended for low latency.

The Unsettling Jitter with UDP

Starting with UDP, Aeron’s stock configuration added significant jitter at lower throughputs. Specifically, at up to 10k messages per second, the baseline latencies were in the range of 50 microseconds, but the 90th percentile latencies were over 6ms. That is an unsettling amount of jitter and definitely not something you want in a low-latency use case.
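To make the gap between baseline and tail latency concrete, here is a minimal sketch of how percentiles could be computed from a list of latency samples. The samples are synthetic and chosen only for illustration (they are not my measurements), and the `percentile` helper is a hypothetical nearest-rank implementation rather than anything from Aeron.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Synthetic data for illustration only (not measurements):
# 85 messages at a 50 us baseline, 15 messages stalled to 6000 us.
latencies_us = [50] * 85 + [6000] * 15

p50 = percentile(latencies_us, 50)
p90 = percentile(latencies_us, 90)
p99 = percentile(latencies_us, 99)
print(p50, p90, p99)  # 50 6000 6000
```

In this made-up data set the median still looks healthy, while anything from p90 upwards is dominated by the stalled messages, which is the same shape as the numbers above.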

Profiling Aeron with perf showed it spending a large amount of time in nanosleep calls. Digging further, it looks like the default configuration for SENDER_IDLE_STRATEGY and RECEIVER_IDLE_STRATEGY is backoff, which seems to be a combination of yields and sleeps.
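To illustrate why a backoff-style idle strategy could hurt at low throughput, here is a rough model of one. It assumes a spin, then yield, then sleep progression with a sleep period that doubles up to a cap. The class name, thresholds, and durations are made up for illustration and are not Aeron's implementation or its defaults. The model only records what it would do, it does not actually sleep.

```python
class BackoffIdler:
    """Simplified backoff idler: spin, then yield, then sleep with a doubling period."""

    def __init__(self, max_spins=10, max_yields=5,
                 min_sleep_ns=1_000, max_sleep_ns=1_000_000):
        # Thresholds are illustrative assumptions, not Aeron's actual defaults.
        self.max_spins = max_spins
        self.max_yields = max_yields
        self.min_sleep_ns = min_sleep_ns
        self.max_sleep_ns = max_sleep_ns
        self.reset()

    def reset(self):
        """Call when work was found: go back to the cheapest phase."""
        self.idle_count = 0
        self.sleep_ns = self.min_sleep_ns

    def idle(self):
        """Record one idle call and return the action as (name, sleep_ns)."""
        self.idle_count += 1
        if self.idle_count <= self.max_spins:
            return ("spin", 0)
        if self.idle_count <= self.max_spins + self.max_yields:
            return ("yield", 0)
        action = ("sleep", self.sleep_ns)
        # Double the next sleep period, capped at the maximum.
        self.sleep_ns = min(self.sleep_ns * 2, self.max_sleep_ns)
        return action


idler = BackoffIdler()
# 30 consecutive polls that find no work, as happens between sparse messages.
actions = [idler.idle() for _ in range(30)]
print(actions[0], actions[10], actions[15], actions[-1])
```

With sparse traffic the polling loop finds no work for long stretches, so in this model it quickly ends up in the sleep phase at the maximum period, and the next message has to wait for the thread to wake up. A pure spin strategy never leaves the first phase, at the cost of keeping a core busy.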

Switching this to spin fixed the UDP jitter, bringing p90+ and max latencies down to under 120 microseconds.

This can be done using the env vars:

AERON_SENDER_IDLE_STRATEGY=spin
AERON_RECEIVER_IDLE_STRATEGY=spin

IPC Jitter Even Worse

How about IPC? The sender and receiver idle strategies should not affect IPC results, so perhaps it performs better out of the box? It turned out that IPC, in its default configuration, offered better baseline latencies than UDP at about 40 microseconds, and much better p99 latencies at 150 microseconds.

However, the max latencies were over 100ms, so in the worst case IPC was introducing extreme jitter, which didn’t make a lot of sense.

After some digging, it turned out that this was due to the messages in my case being larger than the default IPC_MTU_LENGTH. Increasing this setting to 9000 solved the issue, with max latencies coming down to about 120 microseconds.
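One plausible reading is that a message larger than the MTU has to be split into several frames, while a larger MTU lets it go out as one. The sketch below only does the arithmetic. It assumes a 32-byte frame header and a 1408-byte default MTU, and it uses a hypothetical 4000-byte message, not my actual message size. Treat all three numbers as assumptions to check against your own setup.

```python
import math

# Assumptions for illustration: 32-byte frame header, 1408-byte default MTU.
FRAME_HEADER_BYTES = 32
DEFAULT_MTU_BYTES = 1408

def fragments_needed(message_bytes, mtu_bytes, header_bytes=FRAME_HEADER_BYTES):
    """Frames needed when each frame carries at most (mtu - header) payload bytes."""
    max_payload = mtu_bytes - header_bytes
    if max_payload <= 0:
        raise ValueError("MTU must be larger than the frame header")
    return max(1, math.ceil(message_bytes / max_payload))

message_bytes = 4000  # hypothetical message size
print(fragments_needed(message_bytes, DEFAULT_MTU_BYTES))  # 3
print(fragments_needed(message_bytes, 9 * 1024))           # 1
```

Under these assumptions the message needs three frames with the default MTU and a single frame once the MTU is raised to around 9000 bytes.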

This can be done using the following env var:

AERON_IPC_MTU_LENGTH=9k
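If the media driver is started from a script rather than a shell, the same settings can be passed through the process environment. This is a minimal sketch: `driver_env` and `launch_driver` are hypothetical helpers, and the driver path is a placeholder you would replace with your own binary. I am also assuming that the driver reads 9k as 9 * 1024 bytes.

```python
import os
import subprocess

# The settings discussed in this post. "9k" is assumed to mean 9 * 1024 bytes.
LOW_LATENCY_SETTINGS = {
    "AERON_SENDER_IDLE_STRATEGY": "spin",
    "AERON_RECEIVER_IDLE_STRATEGY": "spin",
    "AERON_IPC_MTU_LENGTH": "9k",
}

def driver_env(base_env=None):
    """Return a copy of the environment with the low-latency settings applied."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(LOW_LATENCY_SETTINGS)
    return env

def launch_driver(driver_path):
    """Start the media driver; driver_path is a placeholder for your own binary."""
    return subprocess.Popen([driver_path], env=driver_env())
```

Calling something like `launch_driver("/path/to/aeronmd")` would then start the driver with all three overrides in place, leaving the rest of the environment untouched.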

Conclusion

The key takeaway is that configuration has a significant influence on Aeron’s performance, and the right settings depend on the use case. Rather than relying on the defaults, it is worth carefully tuning the settings to suit the specific use case and conditions at hand.