Dropwizard Reservoir Concurrency

Published 2018-06-06 on Farid Zakaria's Blog

Dropwizard Metrics

A senior colleague was recently writing some code that required some sliding window book-keeping. Worried about throughput and concurrency, the colleague opted for a home-grown solution following the single-writer principle.

From prior experience with Dropwizard Metrics, my quick quip was "meh, just use Dropwizard's SlidingTimeWindowReservoir", as I had come to expect the library to provide robust & highly concurrent data structures for metrics.

He ended up diving into the implementation and sure enough -- found it to be quite ingenious. It took me a little bit of understanding so I thought I would explain it here for my future self.

Underlying Datastructure

When drumming up ways to implement a SlidingTimeWindowReservoir, various data structures could be used however Dropwizard opt's for a ConcurrentSkipListMap, which is a lock free NavigableMap.

The map is sorted on tick (time), and the interface NavigableMap, allows for easy trimming.

private void trim() {
    measurements.headMap(getTick() - window).clear();
}

Concurrency

The key to the ConcurrentSkipListMap is the clock tick.

How do we solve the scenario where multiple writers try to record a value at the same clock granularity?

This is where the implementation is quite neat, by introducing a COLLISION_BUFFER.

Original source

    // allow for this many duplicate ticks before overwriting measurements
    private static final int COLLISION_BUFFER = 256;
   
    private long getTick() {
        for (; ; ) {
            final long oldTick = lastTick.get();
            final long tick = clock.getTick() * COLLISION_BUFFER;
            // ensure the tick is strictly incrementing even if there are duplicate ticks
            final long newTick = tick - oldTick > 0 ? tick : oldTick + 1;
            if (lastTick.compareAndSet(oldTick, newTick)) {
                return newTick;
            }
        }
    }

In the unlikely case where multiple writers are trying to add to the Map in the same clock granularity (i.e. clock.getTick() returns the same exact value) the use of a CAS allows the code to keep looping incrementing the tick value by 1 within a COLLISION_BUFFER.

Consider the simple case where clock.getTick() returns 2 & oldTick returns 256 (1 * 256).

The first writer does: tick - oldTick and assigns newTick as tick. The compareAndSet is successful and lastTick is set as 512.

The second writer fails the CAS and loops again but now lastTick is 512.
newTick will now be 513 and be set.