Provisioning for Microbursts

By Fergal Toomey - Chief Scientist and Co-Founder at Corvil
In my last blog entry I discussed the prevalence of ‘microbursts’ in market data feeds – short, occasional bursts of activity in the feed during which data rates reach levels much higher than average. We saw that during busy periods the LatencyStats.com feeds can send many high-rate microbursts in succession, each burst lasting much less than one second. If the data rate during a microburst exceeds the capacity of a network link or a feed handler, the excess data will be forced to queue. This adds latency to the feed messages, and in the worst case messages may be discarded if a system runs out of buffer space to hold waiting data.
How should network and computing systems be engineered to handle microbursts? A simple but rather conservative approach is to ensure that the available capacity always exceeds the highest microburst data rate measured at some short timescale. For example, if the speed of a network link carrying a market data feed exceeds the feed’s highest 1-millisecond data rate, then the link will never be continuously busy for more than 1 millisecond at a time. Therefore no data will be delayed on the link for more than 1 millisecond.
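As a rough illustration of what "highest data rate at some short timescale" means, here is a minimal sketch that measures the peak rate of a packet trace over fixed windows. The trace format (integer microsecond timestamps paired with packet sizes in bytes) and the function name are assumptions made for this example, not anything taken from LatencyStats.com. It uses aligned windows, which can slightly understate the peak compared with a sliding window.

```python
from collections import defaultdict


def peak_rate_bps(packets, window_us):
    """Peak bit-rate over aligned windows of window_us microseconds.

    packets: iterable of (timestamp_us, size_bytes) pairs (hypothetical format).
    """
    bits_per_window = defaultdict(int)
    for ts_us, size_bytes in packets:
        # Integer division assigns each packet to exactly one window.
        bits_per_window[ts_us // window_us] += size_bytes * 8
    if not bits_per_window:
        return 0.0
    # Convert bits-per-window to bits-per-second.
    return max(bits_per_window.values()) * 1_000_000 / window_us


# Synthetic trace: a 10-packet burst inside one millisecond, then one late packet.
trace = [(100 * i, 1250) for i in range(10)] + [(500_000, 1250)]
peak_1ms = peak_rate_bps(trace, 1_000)
peak_1s = peak_rate_bps(trace, 1_000_000)
```

On this synthetic trace the 1-millisecond peak is far higher than the 1-second peak, which is the same effect the feed statistics show at much larger scale.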
In practice the capacity needed to keep latency below 1 millisecond is normally much less than the peak 1-millisecond data rate. This is because the microbursts in the feed tend not to be sustained over time. A 1-millisecond microburst exceeding link capacity will build up a queue of data waiting to be processed. But provided the system can buffer the queue and clear it quickly when the burst ends, it can still prevent any data from being delayed more than a millisecond. For example, on LatencyStats.com we display both the peak 1-millisecond rate and the network bandwidth required to avoid 1-millisecond latency for each feed – the latter value is computed using an algorithm based on queuing theory. At the time of writing (28 September 2010) the values displayed for ArcaBook for the last seven days are:
ArcaBook Equities Uncompressed (A)
| Peak 1-second bit-rate | 123.6 Mbps |
| Peak 1-millisecond bit-rate | 660.4 Mbps |
| Bandwidth required for less than 1 millisecond network queuing latency | 349.6 Mbps |
Plainly the peak 1-millisecond data rate is a conservative measure of how much bandwidth is needed. Nevertheless the actual bandwidth required is much higher than the peak 1-second data rate – showing the influence of short timescale microbursts.
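To make the gap between the peak rate and the required bandwidth concrete, here is a simplified sketch. It is not the queuing-theory algorithm used on LatencyStats.com, which is not described here. Instead it replays a trace through a fluid FIFO queue and bisects on link speed until the worst delay just meets the bound. The trace format, function names, and the choice to count a packet's own transmission time as part of its delay are all assumptions for illustration.

```python
def max_queuing_delay_s(packets, capacity_bps):
    """Worst delay (seconds) seen in a fluid FIFO queue drained at capacity_bps.

    packets: list of (timestamp_us, size_bytes), sorted by timestamp.
    """
    backlog_bits, last_ts, worst = 0.0, None, 0.0
    for ts_us, size_bytes in packets:
        if last_ts is not None:
            # The link drains the queue during the gap since the last arrival.
            drained = capacity_bps * (ts_us - last_ts) / 1e6
            backlog_bits = max(0.0, backlog_bits - drained)
        backlog_bits += size_bytes * 8
        # Time until the last bit of this packet has left the link.
        worst = max(worst, backlog_bits / capacity_bps)
        last_ts = ts_us
    return worst


def required_bandwidth_bps(packets, max_delay_s, tol_bps=1000.0):
    """Smallest capacity (within tol_bps) keeping the worst delay <= max_delay_s."""
    packets = sorted(packets)
    if not packets:
        return 0.0
    lo, hi = 0.0, 1000.0
    while max_queuing_delay_s(packets, hi) > max_delay_s:
        hi *= 2  # grow until the bound is met
    while hi - lo > tol_bps:
        mid = (lo + hi) / 2
        if max_queuing_delay_s(packets, mid) <= max_delay_s:
            hi = mid
        else:
            lo = mid
    return hi


# Synthetic burst: ten 1250-byte packets, 100 microseconds apart.
burst = [(100 * i, 1250) for i in range(10)]
needed = required_bandwidth_bps(burst, 0.001)
```

For this synthetic burst the 1-millisecond peak rate is 100 Mbps, yet a link of roughly 53 Mbps already keeps every packet's delay within 1 millisecond, because the queue built up during the burst clears quickly once it ends.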
What about users who receive multiple market data feeds together, over the same infrastructure? Normally one might hope that microbursts in different feeds would rarely coincide, and therefore the same resources can be shared among the feeds without much risk of overload. This phenomenon – called multiplexing gain – is in fact what usually happens when network and computing workloads from different sources are combined together. The resources needed to handle the total workload are much less than the sum of what each workload needs individually, for the same latency performance.
However, we saw last time that microbursts in the ArcaBook and OpenBook feeds do not occur independently and are in fact closely synchronized. Sudden spikes of activity in the feeds during busy periods of the day are found to coincide to within less than 1 millisecond. The correlated activity pattern is presumably down to the fact that these two feeds come from closely related markets. It implies that there will be little opportunity for multiplexing gain when these feeds are combined together over a shared resource.
To illustrate, I took a short two-minute period of data from each of the feeds (from the last hour of trading during a typical trading day) and calculated the network bandwidth needed to prevent 1-millisecond delays, for each feed individually and also for both feeds together.
The bandwidth needed when the feeds are combined is only slightly lower than the sum of their individual requirements.
Just to demonstrate the extent of multiplexing gain that you would normally expect to see when combining independent workloads of the same bursty nature, I also computed the same bandwidth values using feed data taken from periods on different days:
Taking the data from different days means that the microbursts from the two feeds no longer coincide. In that case, the bandwidth needed by the combined data set is only slightly larger than what ArcaBook needs by itself.
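The effect of synchronization on multiplexing gain can be reproduced with toy data. The sketch below uses two identical synthetic bursts rather than real feed data, a simple fluid-queue replay rather than the LatencyStats.com algorithm, and defines gain as one minus the ratio of combined to summed individual bandwidth. All of these, including the function names, are assumptions chosen for illustration.

```python
def max_delay_s(packets, capacity_bps):
    """Worst delay in a fluid FIFO queue; packets are sorted (ts_us, size_bytes)."""
    backlog_bits, last_ts, worst = 0.0, None, 0.0
    for ts_us, size_bytes in packets:
        if last_ts is not None:
            drained = capacity_bps * (ts_us - last_ts) / 1e6
            backlog_bits = max(0.0, backlog_bits - drained)
        backlog_bits += size_bytes * 8
        worst = max(worst, backlog_bits / capacity_bps)
        last_ts = ts_us
    return worst


def required_bps(packets, bound_s=0.001, tol_bps=1000.0):
    """Bisect for the smallest capacity that keeps the worst delay <= bound_s."""
    packets = sorted(packets)
    lo, hi = 0.0, 1000.0
    while max_delay_s(packets, hi) > bound_s:
        hi *= 2
    while hi - lo > tol_bps:
        mid = (lo + hi) / 2
        if max_delay_s(packets, mid) <= bound_s:
            hi = mid
        else:
            lo = mid
    return hi


def burst(start_us):
    """Ten 1250-byte packets, 100 microseconds apart, starting at start_us."""
    return [(start_us + 100 * i, 1250) for i in range(10)]


def multiplexing_gain(feed_a, feed_b):
    """Fraction of bandwidth saved by sharing one link instead of two."""
    separate = required_bps(feed_a) + required_bps(feed_b)
    combined = required_bps(feed_a + feed_b)
    return 1.0 - combined / separate


gain_sync = multiplexing_gain(burst(0), burst(0))       # bursts coincide
gain_offset = multiplexing_gain(burst(0), burst(5000))  # bursts 5 ms apart
```

With coinciding bursts the gain comes out close to zero, so the combined requirement is essentially the sum of the two. With the bursts 5 milliseconds apart the gain is close to one half, because the shared link only ever has to absorb one burst at a time.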
The absence of any significant multiplexing gain when combining these feeds unfortunately means that provisioning for them will be more expensive than it would otherwise be. On the plus side, it also means that it’s easy to estimate the network bandwidth they need when combined: just add the numbers shown on LatencyStats.com for the individual feeds. For example, doing this for the 7-day values currently shown gives a total bandwidth requirement of 676 Mbps (for no more than 1 millisecond of queuing latency). Note that this value may grow in the future if the size of the feeds increases.
