custom implementation of data storage #18
@UARTman try this one, please
Acknowledged. Before I start writing perf tests, we need to make a couple of decisions. First, the benchmarking instrument: we can either use criterion or the built-in nightly bencher. The second is whether we use Rust's custom test harness feature. Enabling it will make the benchmarking code cleaner (by implementing a macro similar to `#[bench]`).
@UARTman I vote in favor of criterion and nightly.
I was leaning this way myself, so I'm glad we're on the same page. I'll start prototyping performance tests with those tools; as soon as I get a working proof of concept that benchmarks some operations, I'll link the branch here.
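For reference, a minimal Criterion benchmark along these lines could look like the sketch below. The file name, the `put_small_chunk` stand-in, and the chunk size are hypothetical; a real benchmark would call the crate's own `put` instead, and the bench target needs `harness = false` in Cargo.toml.

```rust
// benches/put_bench.rs -- hypothetical file name; `criterion` goes in [dev-dependencies].
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Stand-in for a real Sodg operation: one heap allocation per call,
// which is exactly the behaviour this issue wants to measure.
fn put_small_chunk(data: &[u8]) -> Vec<u8> {
    data.to_vec()
}

fn bench_put(c: &mut Criterion) {
    let chunk = [0u8; 8]; // a chunk of "just a few bytes", the case the issue is about
    c.bench_function("put 8-byte chunk", |b| {
        b.iter(|| put_small_chunk(black_box(&chunk)))
    });
}

criterion_group!(benches, bench_put);
criterion_main!(benches);
```

Running `cargo bench` on such a target produces the per-iteration statistics and graphs mentioned later in the thread.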
I have some bad news: even with the nightly features on, we'll still have to keep the benchmarks outside of our `src` directory. Another possible issue is that … Regarding the benchmarking framework choice, we definitely made the right one: the graphing and the granular loops have turned out to be very useful.
@UARTman I think it's only logical to keep test code (benchmarking or whatever) outside of `src`.
In this case, it's probably best to count how many nodes a program uses (depending on its complexity), so that we know whether we're dealing with hundreds, thousands, or millions of nodes. That will probably tell us if this is even worth optimizing. BTW, I've got a question about the size of the data chunks you expect to store in practice.
@UARTman a small program will have 5-10K nodes. A big one may have 100K+. I don't think we will ever reach the point of 1M nodes, no matter how large the program is. It's just a guess for now. We will store anything in …
@UARTman any progress?
The last two weeks were very busy for me, so I haven't been able to work on the problem, but the exam week is over, so I'm returning to it. I do have an idea about what we can do about the unnecessary allocations, as well as a backlog of operations I haven't benchmarked yet. The idea is to make `Hex` keep small chunks inline, so that small `put` calls don't need a heap allocation at all. To make it work properly, I would probably need to redesign a couple of APIs around Hex to work with slices instead of `Vec<u8>`.
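A sketch of what such a change could look like, assuming the idea is a small-buffer optimization; the type name, the 24-byte threshold, and the method names are illustrative, not the actual `Hex` implementation.

```rust
/// Illustrative small-buffer-optimized container for data chunks.
pub enum SmallHex {
    /// Chunks up to 24 bytes live inline, so storing them never allocates.
    Inline { buf: [u8; 24], len: usize },
    /// Larger chunks still go to the heap.
    Heap(Vec<u8>),
}

impl SmallHex {
    pub fn from_slice(data: &[u8]) -> Self {
        if data.len() <= 24 {
            let mut buf = [0u8; 24];
            buf[..data.len()].copy_from_slice(data);
            SmallHex::Inline { buf, len: data.len() }
        } else {
            SmallHex::Heap(data.to_vec())
        }
    }

    /// Returning a borrowed slice keeps the inline/heap split an internal detail.
    pub fn as_slice(&self) -> &[u8] {
        match self {
            SmallHex::Inline { buf, len } => &buf[..*len],
            SmallHex::Heap(v) => v.as_slice(),
        }
    }
}
```

Taking and returning slices here is also where the `Vec<u8>`-to-slice API change mentioned above would surface: callers get a borrowed `&[u8]` instead of an owned vector.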
A remark on the current state of benchmarks: they're not particularly reliable right now, since they tend to produce pretty inconsistent results. So this is definitely something to look into later.
@UARTman if you put a heavy load on Sodg, I think any benchmark will give you reasonable data. Just run a billion operations and measure the time (it should be in the range of minutes). Then do the same after your refactoring. Obviously, if there is a positive effect, you will see it.
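As a rough sketch, that kind of coarse wall-clock measurement could be as simple as the loop below; the iteration count and the `to_vec()` stand-in are placeholders, and a real run would drive Sodg operations instead.

```rust
use std::time::Instant;

fn main() {
    let chunk = [0u8; 8];
    let iterations: u64 = 100_000_000; // scale this up toward a billion if needed

    let start = Instant::now();
    let mut total = 0usize;
    for _ in 0..iterations {
        // Stand-in for a real `put`: one small heap allocation per iteration.
        let v = chunk.to_vec();
        total += v.len(); // use the result so the loop isn't optimized away
    }
    println!("{} ops in {:?} ({} bytes touched)", iterations, start.elapsed(), total);
}
```

Comparing the elapsed time before and after the refactoring is the whole measurement; the variance of a single small benchmark matters much less at this scale.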
Please look at my basic re-implementation of Hex. The only significant difference in API is that …
@UARTman where should I look? :) Please, submit a pull request
@UARTman any more progress?
@yegor256 For now, I've been stalling on this because there isn't anything obvious to do. I've tried to make the benchmarking make sense and have been mostly unsuccessful: the timing of, say, 10,000 operations still changes by ~80% up and down between runs of the same code.
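One knob that sometimes helps with that kind of run-to-run variance (a suggestion, not something discussed in the thread) is giving Criterion a larger sample size and a longer measurement window; the numbers below are arbitrary.

```rust
use std::time::Duration;
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn bench_put(c: &mut Criterion) {
    // Same hypothetical small-chunk workload as before.
    c.bench_function("put 8-byte chunk", |b| b.iter(|| black_box([0u8; 8]).to_vec()));
}

criterion_group! {
    name = benches;
    // More samples and a longer measurement time trade runtime for stability.
    config = Criterion::default()
        .sample_size(200)
        .measurement_time(Duration::from_secs(10));
    targets = bench_put
}
criterion_main!(benches);
```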
At the moment, every time we call `put`, a new heap allocation happens (if I'm not mistaken). This may be very expensive if we have many `put` calls with small data chunks (just a few bytes). Let's investigate how we can optimize this and then implement a new method. Before we do this, let's implement a performance test in order to measure the speed of the current implementation. Then, compare the new implementation with the current one.
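For illustration, the pattern described above is roughly the following; the `Storage` type and its `put` are a hypothetical simplification, not the crate's actual code. Every call that ends up building an owned `Vec<u8>` pays for one heap allocation, which adds up quickly when the chunks are only a few bytes each.

```rust
use std::collections::HashMap;

// Hypothetical simplification of the behaviour described in the issue:
// each `put` builds an owned Vec<u8>, i.e. one heap allocation per call.
struct Storage {
    data: HashMap<u32, Vec<u8>>,
}

impl Storage {
    fn put(&mut self, v: u32, chunk: &[u8]) {
        // `to_vec()` allocates on the heap even for a 2-3 byte chunk.
        self.data.insert(v, chunk.to_vec());
    }
}

fn main() {
    let mut s = Storage { data: HashMap::new() };
    // A million small puts means a million small allocations.
    for v in 0..1_000_000u32 {
        s.put(v, &[0x01, 0x02, 0x03]);
    }
    println!("stored {} chunks", s.data.len());
}
```

A performance test of the kind requested above would time exactly this sort of loop once against the current `put` and once against the optimized one.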