Skip to content

moming00/ddsketch-cpp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ddsketch-cpp

A cpp implementation of DDSketch

DDSketch, a quantile sketch with relative-error guarantees. This sketch computes quantile values with an approximation error that is relative to the actual quantile value. It works on both negative and non-negative input values.

For instance, using DDSketch with a relative accuracy guarantee set to 1%, if the expected quantile value is 100, the computed quantile value is guaranteed to be between 99 and 101. If the expected quantile value is 1000, the computed quantile value is guaranteed to be between 990 and 1010. DDSketch works by mapping floating-point input values to bins and counting the number of values for each bin. The underlying structure that keeps track of bin counts is store.

The memory size of the sketch depends on the range that is covered by the input values: the larger that range, the more bins are needed to keep track of the input values. As a rough estimate, if working on durations with a relative accuracy of 2%, about 2kB (275 bins) are needed to cover values between 1 millisecond and 1 minute, and about 6kB (802 bins) to cover values between 1 nanosecond and 1 day.

The size of the sketch can be have a fail-safe upper-bound by using collapsing stores. As shown in http://www.vldb.org/pvldb/vol12/p2195-masson.pdf the likelihood of a store collapsing when using the default bound is vanishingly small for most data.

DDSketch implementations are also available in: https://github.com/DataDog/sketches-go/ https://github.com/DataDog/sketches-py/ https://github.com/DataDog/sketches-js/

This project was inspired by https://github.com/tudor1805/sketches-cpp, with additional protobuf feature implemented.

Benchamrk test result based on the test script on ./test/main.cpp:

sketch count: 0   merge time: 24µs  serialize sketch: 74µs  deserialize sketch: 105µs   proto size: 15977B
sketch count: 1   merge time: 10µs  serialize sketch: 24µs  deserialize sketch: 64µs    proto size: 16041B
sketch count: 2   merge time: 10µs  serialize sketch: 35µs  deserialize sketch: 67µs    proto size: 16089B
sketch count: 3   merge time: 13µs  serialize sketch: 39µs  deserialize sketch: 111µs   proto size: 16145B
sketch count: 4   merge time: 7µs   serialize sketch: 26µs  deserialize sketch: 65µs    proto size: 16193B
sketch count: 5   merge time: 8µs   serialize sketch: 27µs  deserialize sketch: 63µs    proto size: 16249B
sketch count: 6   merge time: 1µs   serialize sketch: 27µs  deserialize sketch: 64µs    proto size: 16257B
sketch count: 7   merge time: 1µs   serialize sketch: 26µs  deserialize sketch: 63µs    proto size: 16289B
sketch count: 8   merge time: 1µs   serialize sketch: 35µs  deserialize sketch: 84µs    proto size: 16321B
sketch count: 9   merge time: 1µs   serialize sketch: 31µs  deserialize sketch: 65µs    proto size: 16369B
sketch count: 10  merge time: 1µs   serialize sketch: 41µs  deserialize sketch: 77µs    proto size: 16377B
sketch count: 11  merge time: 1µs   serialize sketch: 33µs  deserialize sketch: 66µs    proto size: 16409B
sketch count: 12  merge time: 7µs   serialize sketch: 28µs  deserialize sketch: 66µs    proto size: 16425B
sketch count: 13  merge time: 1µs   serialize sketch: 28µs  deserialize sketch: 69µs    proto size: 16449B
sketch count: 14  merge time: 1µs   serialize sketch: 28µs  deserialize sketch: 67µs    proto size: 16481B
sketch count: 15  merge time: 0µs   serialize sketch: 32µs  deserialize sketch: 67µs    proto size: 16489B
sketch count: 16  merge time: 1µs   serialize sketch: 31µs  deserialize sketch: 65µs    proto size: 16513B
sketch count: 17  merge time: 0µs   serialize sketch: 30µs  deserialize sketch: 65µs    proto size: 16537B
sketch count: 18  merge time: 1µs   serialize sketch: 31µs  deserialize sketch: 65µs    proto size: 16553B
sketch count: 19  merge time: 0µs   serialize sketch: 28µs  deserialize sketch: 64µs    proto size: 16569B
sketch count: 20  merge time: 0µs   serialize sketch: 33µs  deserialize sketch: 65µs    proto size: 16577B
sketch count: 21  merge time: 1µs   serialize sketch: 33µs  deserialize sketch: 86µs    proto size: 16601B
sketch count: 22  merge time: 0µs   serialize sketch: 29µs  deserialize sketch: 65µs    proto size: 16609B
sketch count: 23  merge time: 0µs   serialize sketch: 29µs  deserialize sketch: 63µs    proto size: 16625B
sketch count: 24  merge time: 1µs   serialize sketch: 28µs  deserialize sketch: 64µs    proto size: 16641B
sketch count: 25  merge time: 9µs   serialize sketch: 27µs  deserialize sketch: 61µs    proto size: 16657B
sketch count: 26  merge time: 0µs   serialize sketch: 29µs  deserialize sketch: 65µs    proto size: 16665B
sketch count: 27  merge time: 1µs   serialize sketch: 31µs  deserialize sketch: 65µs    proto size: 16673B
sketch count: 28  merge time: 0µs   serialize sketch: 31µs  deserialize sketch: 65µs    proto size: 16689B
sketch count: 29  merge time: 1µs   serialize sketch: 31µs  deserialize sketch: 68µs    proto size: 16705B
sketch count: 30  merge time: 1µs   serialize sketch: 32µs  deserialize sketch: 67µs    proto size: 16713B
sketch count: 31  merge time: 1µs   serialize sketch: 36µs  deserialize sketch: 71µs    proto size: 16721B
sketch count: 32  merge time: 1µs   serialize sketch: 29µs  deserialize sketch: 66µs    proto size: 16737B
sketch count: 33  merge time: 0µs   serialize sketch: 27µs  deserialize sketch: 64µs    proto size: 16745B
sketch count: 34  merge time: 0µs   serialize sketch: 27µs  deserialize sketch: 64µs    proto size: 16753B
sketch count: 35  merge time: 4µs   serialize sketch: 25µs  deserialize sketch: 62µs    proto size: 16761B
sketch count: 36  merge time: 2µs   serialize sketch: 43µs  deserialize sketch: 118µs   proto size: 16777B
sketch count: 37  merge time: 1µs   serialize sketch: 39µs  deserialize sketch: 106µs   proto size: 16785B

About

A cpp implementation of DDSketch

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages