Implement group commit into journal #96

Open
marvin-j97 opened this issue Nov 17, 2024 · 2 comments
Labels: enhancement, good first issue, help wanted, performance, possibly breaking

Comments

@marvin-j97
Collaborator

No description provided.

marvin-j97 added the enhancement, help wanted, good first issue, and performance labels on Nov 17, 2024

@jcdickinson

I'm thinking about contributing to this. What's the plan? I assume something along the lines of:

  1. Create a transaction ID
  2. Mark journal entries with the transaction ID
  3. Create transaction completed journal event with the transaction ID

That should allow multiple active transactions to interleave. The main question is binary compatibility in the journal: is that a concern?
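
The three steps above can be sketched as follows. This is only an illustration of the idea, not the project's actual journal format: the `Record` enum, its fields, and the `recover` function are hypothetical names. Recovery applies a write only if its transaction ID has a matching commit marker, which is what lets entries from different transactions interleave safely. For simplicity, the sketch applies committed writes in journal order.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical journal record; an assumption, not the real on-disk format.
#[derive(Debug, Clone, PartialEq)]
enum Record {
    /// A write that belongs to the transaction `tx_id` (steps 1 and 2).
    Insert { tx_id: u64, key: String, value: String },
    /// Marks every earlier record with this `tx_id` as committed (step 3).
    Commit { tx_id: u64 },
}

/// Replays a journal, applying only writes whose transaction was committed.
fn recover(journal: &[Record]) -> HashMap<String, String> {
    // Pass 1: collect the IDs of all transactions that have a commit marker.
    let committed: HashSet<u64> = journal
        .iter()
        .filter_map(|r| match r {
            Record::Commit { tx_id } => Some(*tx_id),
            _ => None,
        })
        .collect();

    // Pass 2: apply committed writes in journal order, skip everything else.
    let mut state = HashMap::new();
    for r in journal {
        if let Record::Insert { tx_id, key, value } = r {
            if committed.contains(tx_id) {
                state.insert(key.clone(), value.clone());
            }
        }
    }
    state
}
```

In this sketch, a journal that interleaves inserts from transactions 1 and 2 but only contains a commit marker for transaction 1 recovers the writes of transaction 1 and drops those of transaction 2.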

@marvin-j97
Collaborator Author

marvin-j97 commented Nov 18, 2024

@jcdickinson The main idea is to amortize the fsync costs in multithreaded workloads. Right now it's effectively single-threaded:

t1:insert
t1:sync
t2:insert
t2:sync
t3:insert
t3:sync

The rough idea is the same as what RocksDB is doing:

As most other systems relying on logs, RocksDB supports group commit to improve WAL writing throughput, as well as write amplification. RocksDB's group commit is implemented in a naive way: when different threads are writing to the same DB at the same time, all outstanding writes that qualify to be combined will be combined together and write to WAL once, with one fsync. In this way, more writes can be completed by the same number of I/Os.

https://github.com/facebook/rocksdb/wiki/WAL-Performance

So it becomes:

t1:insert
t2:insert
t3:insert
t1:sync
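
A minimal sketch of how such a leader/follower group commit could look, using only the standard library. Everything here is an assumption for illustration (the `Sink` trait, `GroupCommitLog`, and its fields are hypothetical and not part of this project or of RocksDB): each thread appends its record to a shared buffer, then either becomes the leader and syncs everything queued so far, or waits until a leader has made its record durable. A failed sync is treated as fatal for all later commits.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::sync::{Condvar, Mutex};

/// Hypothetical sink: one call writes a batch and makes it durable.
trait Sink {
    fn write_and_sync(&mut self, batch: &[u8]) -> io::Result<()>;
}

impl Sink for File {
    fn write_and_sync(&mut self, batch: &[u8]) -> io::Result<()> {
        self.write_all(batch)?;
        self.sync_data() // a single sync covers the whole batch
    }
}

#[derive(Default)]
struct State {
    pending: Vec<u8>,    // appended but not yet handed to the sink
    appended: u64,       // number of records appended so far
    durable: u64,        // records 1..=durable are known to be synced
    leader_active: bool, // a thread is currently writing and syncing
    failed: bool,        // a sync failed; all further commits are rejected
}

struct GroupCommitLog<S: Sink> {
    state: Mutex<State>,
    flushed: Condvar,
    sink: Mutex<S>,
}

impl<S: Sink> GroupCommitLog<S> {
    fn new(sink: S) -> Self {
        Self {
            state: Mutex::new(State::default()),
            flushed: Condvar::new(),
            sink: Mutex::new(sink),
        }
    }

    /// Appends `record` and returns once it is durable.
    fn commit(&self, record: &[u8]) -> io::Result<()> {
        let mut st = self.state.lock().unwrap();
        st.pending.extend_from_slice(record);
        st.appended += 1;
        let my_seq = st.appended;

        loop {
            if st.failed {
                return Err(io::Error::new(io::ErrorKind::Other, "journal sync failed"));
            }
            if st.durable >= my_seq {
                return Ok(());
            }
            if st.leader_active {
                // Follower: another thread is syncing; wait until it is done.
                st = self.flushed.wait(st).unwrap();
                continue;
            }
            // Leader: take everything queued so far and sync it in one go.
            st.leader_active = true;
            let batch = std::mem::take(&mut st.pending);
            let high = st.appended;
            drop(st); // let other threads keep appending during the I/O

            let result = self.sink.lock().unwrap().write_and_sync(&batch);

            st = self.state.lock().unwrap();
            st.leader_active = false;
            match &result {
                Ok(()) => st.durable = high,
                Err(_) => st.failed = true,
            }
            self.flushed.notify_all();
            result?;
        }
    }
}
```

Records appended while the leader is busy with I/O accumulate in `pending` and are picked up by the next leader, so the number of syncs can be lower than the number of commits when several threads write at once.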

One problem is that the insert() method is not aware of sync=true/false. However, it should be implementable for Batch or WriteTransaction when their durability is set to Sync*, e.g. let tx = keyspace.write_tx().durability(Some(SyncData)).

It should be easy to benchmark this by writing with 1, 2, 4, and 8 threads with SyncData or SyncAll on a fairly slow device (e.g. HDD).
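
A rough harness for that kind of measurement might look like the sketch below. It deliberately uses plain files from the standard library rather than this project's API, so `bench`, `bench_sync_per_write`, `run_all`, and `RECORD` are all hypothetical names. The baseline closure mirrors the current behavior (every write pays for its own sync); a group-commit implementation could be passed to `bench` in its place for comparison.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;
use std::sync::Mutex;
use std::thread;
use std::time::{Duration, Instant};

/// Placeholder payload (10 bytes); real workloads would vary key and value.
const RECORD: &[u8] = b"key=value\n";

/// Calls `write` from `threads` threads, `writes_per_thread` times each,
/// and returns the wall-clock time until all of them have finished.
fn bench<F>(threads: usize, writes_per_thread: usize, write: F) -> io::Result<Duration>
where
    F: Fn(&[u8]) -> io::Result<()> + Sync,
{
    let start = Instant::now();
    thread::scope(|s| -> io::Result<()> {
        let mut handles = Vec::new();
        for _ in 0..threads {
            handles.push(s.spawn(|| -> io::Result<()> {
                for _ in 0..writes_per_thread {
                    write(RECORD)?;
                }
                Ok(())
            }));
        }
        for h in handles {
            h.join().expect("writer thread panicked")?;
        }
        Ok(())
    })?;
    Ok(start.elapsed())
}

/// Baseline: each write takes the lock, appends, and syncs on its own.
fn bench_sync_per_write(
    path: &Path,
    threads: usize,
    writes_per_thread: usize,
) -> io::Result<Duration> {
    let file = Mutex::new(File::create(path)?);
    bench(threads, writes_per_thread, |rec| {
        let mut f = file.lock().unwrap();
        f.write_all(rec)?;
        f.sync_data()
    })
}

/// Runs the baseline with 1, 2, 4, and 8 threads and prints the timings.
fn run_all(path: &Path) -> io::Result<()> {
    for threads in [1, 2, 4, 8] {
        let elapsed = bench_sync_per_write(path, threads, 1_000)?;
        println!("{threads} threads: {elapsed:?}");
    }
    Ok(())
}
```

The path passed to `run_all` should point at the slow device under test, since the gap between per-write sync and group commit depends heavily on sync latency.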

If this cannot be implemented without a breaking change, this would be a good candidate for a v3 change. TBD
