Add write every 10K messages #12
base: master
Conversation
How did you pick ten thousand? I think a time-based threshold would make more sense, because time is what the users actually feel. Ten thousand messages could be a lot or not much; it's hard to say. |
Totally arbitrary, I wanted something between 1 and
Any thoughts on how you'd like to see this implemented? |
Would be interesting to see a performance test of this. |
Hmm, you must mean 700 megabytes of messages? @christianbundy I guess you'd have a timeout (unref it so it doesn't keep the node process going) that causes the write, and clear it if you do a write anyway because you're in sync. Hmm, funny feeling that sometimes I've seen… Hmm, also: maybe it would be best to fix this in async-single, and then call that on every write? Then flumeview-reduce could focus on just doing database stuff. If I recall correctly, the way I'm doing clearTimeout and setTimeout is gonna be a bit slow there... |
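For illustration only, here is a minimal sketch of that timeout approach; the names (`createTimedFlush`, `doWrite`) are made up and not flumeview-reduce's actual API:

```js
// A minimal sketch of the time-based flush described above: schedule a write
// on a timer, unref it so it doesn't keep the node process alive, and clear it
// whenever a write happens anyway because the view is in sync.
// `doWrite` is a hypothetical callback that persists the current reduce state.
function createTimedFlush (doWrite, interval) {
  var timer = null

  function schedule () {
    if (timer) return // a flush is already pending
    timer = setTimeout(function () {
      timer = null
      doWrite()
    }, interval)
    // don't let a pending flush keep the node process running
    if (timer.unref) timer.unref()
  }

  function wrote () {
    // a write just happened (e.g. because we're in sync), so the pending
    // timed flush is no longer needed
    if (timer) {
      clearTimeout(timer)
      timer = null
    }
  }

  return { schedule: schedule, wrote: wrote }
}
```

The unref call is the detail mentioned above: a pending flush should not be the only thing keeping the node process alive.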
This code looks good to me, seems like a smart improvement. |
@mixmix mean message size is about 700 bytes, because the most popular message types, contacts (follows) and votes (likes), are small. |
That would mean ~100 write calls over 700MB... still OK.
Could make it 20e3 and cut that down to 50. I guess another way to think about it is what the time period between saves is... but that might depend on the set of indexes you're running (to my eye they seem to increase in step, but I don't know what's governing that under the hood). |
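To spell out the rough arithmetic behind those figures (using the ~700-byte mean message size mentioned above; all numbers are ballpark):

```js
// 700 MB / ~700 B per message  ~= 1e6 messages in the log
// 1e6 messages / 10e3          ~= 100 forced writes per reduce view
// 1e6 messages / 20e3          ~= 50 forced writes per reduce view
// (multiply by the number of reduce views you run to get the total)
```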
@mixmix flumedb is meant to be generic, it's not just for ssb. That's why there is no "ssb-" at the start. Also, to get the correct write number, multiply by the number of reduces. |
Good point - forgot the general context
|
@mixmix see the organizational patterns doc https://hackmd.io/IM5_tWIfSFuNoe3jtrjrtQ#well-thought-out-good-ideas |
Been thinking about this and I should probably write it down. Are there any simple gradient descent patterns we could use here? I'm sure there are much better algorithms, but I'm imagining an implementation where you do batches of:
I'm sure we have better tools at our disposal, but I'd imagine that any solution that worked here would also work in flumedb/flumeview-level#15. |
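Purely as a hypothetical sketch of the kind of adaptive scheme being floated here (nothing like this exists in flumeview-reduce; every name below is invented): measure how long each flush takes and nudge the batch size toward a target flush time, a very crude stand-in for the gradient-descent idea.

```js
// Hypothetical adaptive batching: grow the batch while flushes are fast,
// shrink it when they get slow. Not part of flumeview-reduce.
function createAdaptiveBatch (opts) {
  var batchSize = (opts && opts.initial) || 10e3
  var targetMs = (opts && opts.targetMs) || 100
  var min = (opts && opts.min) || 1e3
  var max = (opts && opts.max) || 100e3

  return {
    size: function () { return batchSize },
    // call this after each flush with how long it took
    report: function (elapsedMs) {
      if (elapsedMs > targetMs) {
        batchSize = Math.max(min, Math.round(batchSize * 0.8))
      } else {
        batchSize = Math.min(max, Math.round(batchSize * 1.25))
      }
    }
  }
}
```

Whether flush latency is even the right signal to optimise is exactly the open question in the comments that follow.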
That's a good idea. Hmm, we've had this type of problem in a bunch of places... You could approach this by setting a relative value for latency and throughput (that value will be hard to define, of course). But also, you could try to maximize throughput alone: sending small packets would be fine, but not if there are many of them... oh hang on, let's say that each packet has a 10 byte overhead. How do you decide it's okay to send 1 packet a second, but not one packet a millisecond? Hmm, if you had a time box, then you'd maximize throughput within that time... but wouldn't that just be delaying until the end of that box? (not ideal) If you know the parameters of the system, say, in TCP the maximum transfer unit is 1500 bytes, then there's no reason to buffer bigger than that... but sometimes you don't know what the upper layers in the system will do (perhaps there is an encryption layer or other framing that adds more overhead...). Okay, maybe the right idea is to look at prior art: https://en.wikipedia.org/wiki/Nagle%27s_algorithm |
Sorry, brain splurge there. |
Turns out that Nagle's algorithm is really simple: we do a write, and if an unacknowledged packet has already been written, buffer that write until you get the ack. I already used that pattern with https://www.npmjs.com/package/pull-write: if the write hasn't called back yet, buffer until it does (pull-write also has a fixed max). The nice thing about Nagle is that it doesn't actually require a model of overhead, but that's also the problem. I found a recent comment from Nagle himself…
Here is another one: https://news.ycombinator.com/item?id=9050645 but this time he just says to disable delayed ack. |
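For reference, a rough sketch of that Nagle-style pattern (roughly what pull-write does, though this is not pull-write's API; `save(state, cb)` is a hypothetical async persistence function):

```js
// Write immediately if nothing is in flight, otherwise hold on to the latest
// state and flush it once the outstanding write calls back.
function createNagleWriter (save) {
  var writing = false
  var pending = null

  function flush (state) {
    writing = true
    save(state, function (err) {
      writing = false
      if (err) return console.error('write failed', err) // real code would retry or surface this
      if (pending != null) {
        var next = pending
        pending = null
        flush(next)
      }
    })
  }

  return function write (state) {
    if (writing) pending = state // buffer until the in-flight write acks
    else flush(state)
  }
}
```

Note how this needs no model of the overhead: the in-flight write itself is the signal to hold back.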
This is also interesting: https://en.wikipedia.org/wiki/TCP_tuning
With a half-second latency, Nagle would cause transmission to go very slowly, of course... |
https://en.wikipedia.org/wiki/Bufferbloat Some modern network devices have large buffers, but that means TCP doesn't know it's saturated: because of the extra buffering the connection stays reliable and doesn't slow down, so it just sends more data, which takes even longer for the buffer to drain. How to test for bufferbloat: https://www.bufferbloat.net/projects/bloat/wiki/Tests_for_Bufferbloat/ |
super interesting rabbit hole here: http://blog.cerowrt.org/post/net_neutrality_customers/ |
Folks, can we just merge this? You all seem to agree that for any practical purpose this works and is a net improvement for whoever is using the library. The only real argument against it that I have seen is from @dominictarr: that this might impact non-ssb applications because the value chosen is too low and would cause tons of writes. |
Resolves #11
This could probably be much more sophisticated, but given the fact that I have 700 million messages I don't think a few more save points would be too terrible.
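As a minimal illustration of the idea (not the actual diff in this PR; the names below are made up), counting messages and forcing a write every N of them looks roughly like this:

```js
// Count messages as they are processed and force a write every N of them,
// on top of whatever write-on-sync behaviour already exists.
var WRITE_EVERY = 10e3
var sinceLastWrite = 0

function onMessage (msg, write) {
  // ...update the reduce state with msg here...
  sinceLastWrite += 1
  if (sinceLastWrite >= WRITE_EVERY) {
    sinceLastWrite = 0
    write() // hypothetical: persist the current reduce state to disk
  }
}
```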