perf: significantly improve the memory usage of histogram #610

Open · wants to merge 6 commits into master
Conversation

@shappir (Contributor) commented Feb 6, 2024

Significantly reduce the amount of memory used by histograms, especially with high cardinality and many buckets:

  1. Create the bucketValues object with a prototype of zero values instead of copying them. This way, a counter for a bucket is only allocated when its value becomes greater than zero (the first time it's incremented); see the sketch after this list.
  2. Allocate an empty bucketExemplars object (when using exemplars) instead of pre-filling it with nulls.
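
A minimal sketch of the lazy-allocation idea (illustrative only, not the actual lib/histogram.js code):

```js
// Shared prototype holding the initial zero for every bucket.
const bucketZeroes = {};
for (const upperBound of [0.005, 0.01, 0.025]) {
  bucketZeroes[upperBound] = 0;
}

// No own properties yet: untouched buckets cost no memory per label set.
const bucketValues = Object.create(bucketZeroes);

// `+= 1` reads the 0 from the prototype, then writes 1 as an own property.
bucketValues[0.01] += 1;

console.log(Object.keys(bucketValues)); // ['0.01'], only the touched bucket
console.log(bucketValues[0.005]);       // 0, still served by the prototype
```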

Additional optimizations:

  1. Insert valueFromMap into the hash only when it's first allocated (don't reinsert it on every observation); see the sketch after this list.
  2. Don't look up bucketExemplars again; reuse the previous lookup instead.
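
A sketch of the single-lookup pattern (hypothetical names, not the actual prom-client internals):

```js
// Look up the per-label-set entry once; insert it into the hash only on a
// miss, and reuse the same entry for exemplars instead of a second lookup.
function observe(histogram, hash, bucket) {
  let valueFromMap = histogram.hashMap[hash];
  if (!valueFromMap) {
    valueFromMap = { bucketValues: Object.create(histogram.bucketZeroes) };
    histogram.hashMap[hash] = valueFromMap; // inserted once, not on every observe
  }
  valueFromMap.bucketValues[bucket] += 1;
  return valueFromMap; // caller reuses this for bucketExemplars
}

// Usage with a stand-in histogram object:
const histogram = { hashMap: {}, bucketZeroes: { 0.005: 0, 0.01: 0 } };
observe(histogram, 'path:/api', 0.01);
```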

Notes:

  • Removed Object.freeze(this.bucketValues) because it causes += 1 to fail, even though the value in the prototype isn't actually changed. (This feels like a JS or V8 bug.)
  • Because the initial counter values live on a prototype, hasOwnProperty can't be used to check for a bucket's existence; see the sketch after this list.
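
A short demonstration of the hasOwnProperty caveat:

```js
// Untouched buckets exist only on the prototype, so hasOwnProperty misses
// them; the `in` operator (or a direct read) still finds them.
const proto = { 0.005: 0, 0.01: 0 };
const bucketValues = Object.create(proto);

console.log(Object.prototype.hasOwnProperty.call(bucketValues, '0.005')); // false
console.log('0.005' in bucketValues); // true, found via the prototype chain
console.log(bucketValues['0.005']);   // 0
```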

@shappir (Contributor, Author) commented Feb 11, 2024

This is the reason Object.freeze needs to be removed: https://github.com/tc39/how-we-work/blob/main/terminology.md#override-mistake
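
A minimal reproduction of the override mistake described at that link (illustrative bucket value):

```js
'use strict';

// Assignment through a frozen prototype throws in strict mode (and fails
// silently in sloppy mode), even though it would only create an own
// property and never change the prototype's value.
const frozenZeroes = Object.freeze({ 0.005: 0 });
const bucketValues = Object.create(frozenZeroes);

bucketValues[0.005] += 1; // TypeError: Cannot assign to read only property '0.005'
```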

@SimenB (Collaborator) left a comment

No immediate issues jump out when looking through this 👍 Do you have any numbers or graphs to confirm this helps things?

lib/histogram.js: review comment (outdated, resolved)
Co-authored-by: Simen Bekkhus <[email protected]>
@shappir (Contributor, Author) commented Feb 12, 2024

> Do you have any numbers or graphs to confirm this helps things?

No systematic results yet. I will try to get some.

@shappir (Contributor, Author) commented Feb 14, 2024

My tests show a memory saving of only 5% 😢
Guess I shouldn't have called it significant...
It's borderline worth it; your call.
(I will add the test example to the repo if you want.)

@zbjornson (Collaborator) commented

How many buckets in total, and how many with values, did you test with to get the 5% savings?

@shappir (Contributor, Author) commented Feb 15, 2024

@zbjornson @SimenB I tested one histogram with the default buckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]
Tests:

  1. 10 labels with random distribution, inserting 1000 random values
  2. 10 labels with random distribution, inserting 10000 random values
  3. 5 labels with random distribution, inserting 100000 random values

I got roughly the same results in all cases, peaking at a saving of 5%.

Generally speaking (in percentage terms):

  1. The lower the cardinality, the less benefit you'll get.
  2. There's a fixed overhead for the histogram itself, so the fewer values you have, the less benefit you'll get.
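
For reference, a minimal sketch of the kind of benchmark described above, using prom-client's public Histogram API (the exact script from this thread isn't shown; label count, value count, and the measurement approach are assumptions):

```js
const client = require('prom-client');

// One histogram with the default buckets, observed under a single label
// with 10 random label values (as in test 1 above).
const h = new client.Histogram({
  name: 'bench_duration_seconds',
  help: 'Histogram memory benchmark',
  labelNames: ['label'],
});

const labelValues = Array.from({ length: 10 }, (_, i) => `label-${i}`);

if (global.gc) global.gc(); // run with node --expose-gc for stable numbers
const before = process.memoryUsage().heapUsed;

for (let i = 0; i < 10000; i++) {
  const label = labelValues[Math.floor(Math.random() * labelValues.length)];
  h.observe({ label }, Math.random() * 10);
}

if (global.gc) global.gc();
const after = process.memoryUsage().heapUsed;
console.log(`heap delta: ${((after - before) / 1024).toFixed(1)} KiB`);
```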
