Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async mempool scanning #970

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

conduition
Copy link
Contributor

@conduition conduition commented Dec 8, 2023

Fixes #968. Mempool scanning is now done asynchronously in a separate thread called the mempool_sync thread.

Method

My approach was to split the Mempool::sync method into two parts: (1) fetching the updated mempool transactions [slow, read-only] and (2) applying those transactions [fast, write].

  • Mempool scanning is first triggered when the main thread sends a HashSet<Txid> to the mempool_sync thread. This set represents the current set of TXIDs in the mempool which we already know about.
  • The mempool_sync thread proceeds to scan the mempool from the remote node. While the scan is ongoing (could take a long time) the main thread can still process block updates, script notifications, and electrum client requests.
  • Once the mempool scan finishes, the mempool_sync thread blocks until it can send a MempoolSyncUpdate to the main thread.
  • When the main thread has a spare moment to receive this packaged update, it synchronously applies the update to the Mempool, via Rpc::mempool_apply. This adds new transactions, removes outdated ones, and updates the fee histogram.

Advantages

Scanning is asynchronous and read-only, while writing the update to the mempool's in-memory state is synchronous WRT the main-thread. This way, we don't introduce any mutexes because we don't need concurrent read/writes of Mempool. Electrum client requests are always processed with a consistent view of the mempool, even if a scan is ongoing and could finish at any time.

Histogram Perf

While i was in there, I improved the performance of applying mempool updates, by avoiding needless recomputation of the FeeHistogram from scratch every time. Instead we dynamically update fee histogram bins according to which transactions have been added or removed. I added unit tests to demonstrate the logic is sound. If preferable, I can separate the FeeHistogram changes and package it into another PR (although it will probably conflict with this one).

@romanz
Copy link
Owner

romanz commented Dec 9, 2023

Many thanks!
Could you please open a separate PR for the first two commits?
(You can keep these commits also in this PR, to prevent merge conflicts)

src/mempool.rs Outdated Show resolved Hide resolved
src/mempool.rs Outdated Show resolved Hide resolved
src/mempool.rs Outdated Show resolved Hide resolved
@conduition conduition force-pushed the async-mempool branch 2 times, most recently from da11fc8 to 0162c5f Compare December 10, 2023 03:47
@conduition
Copy link
Contributor Author

conduition commented Dec 10, 2023

Requested changes force pushed, amending the existing commits. @romanz you can view the diffs here. If you feel comfortable with that, i'll submit 9b5f925 and ed42680 as a new PR.

@romanz
Copy link
Owner

romanz commented Dec 10, 2023

If you feel comfortable with that, i'll submit 9b5f925 and ed42680 as a new PR.

Please do :)

@conduition
Copy link
Contributor Author

PR open here, this PR is now dependent on that one

@romanz
Copy link
Owner

romanz commented Dec 11, 2023

Please rebase over master (since #971 is merged).

@conduition
Copy link
Contributor Author

Rebase complete 👍

@romanz
Copy link
Owner

romanz commented Dec 12, 2023

Seems that the CI failure is not related to 06303d9 - I'll merge it via #973

@romanz
Copy link
Owner

romanz commented Dec 12, 2023

Please rebase over latest master.

@romanz
Copy link
Owner

romanz commented Dec 12, 2023

Could you please take a look at the failure?
You can reproduce it locally by running:

$ docker build -f Dockerfile.ci . --rm -t electrs:tests
$ docker run -v $PWD/contrib/:/contrib -v $PWD/tests/:/tests --rm electrs:tests bash -x /tests/run.sh
...
+ TXID=20521100abc8df819901ccc5f964059ffc02303c4b421dccbf43322928221a2b
+ echo ' * get_tx_status'
 * get_tx_status
++ electrum --regtest --wallet=data/electrum/wallet get_tx_status 20521100abc8df819901ccc5f964059ffc02303c4b421dccbf43322928221a2b
++ jq -c .
Transaction not in wallet.
+ test '' == '{"confirmations":0}'
...

IIUC, it seems that the broadcasted transaction is not being reported back to Electrum client.

@romanz
Copy link
Owner

romanz commented Dec 12, 2023

You can modify tests/run.sh to debug this issue, e.g.:

diff --git a/tests/run.sh b/tests/run.sh
index e19b4d2..b19d8cd 100755
--- a/tests/run.sh
+++ b/tests/run.sh
@@ -13,6 +13,7 @@ cleanup() {
   	kill $j
   	wait $j
   done
+  cat data/electrs/regtest-debug.log
 }
 trap cleanup SIGINT SIGTERM EXIT
 
@@ -68,6 +69,8 @@ NEW_ADDR=`$EL getunusedaddress`
 echo " * payto & broadcast"
 TXID=$($EL broadcast $($EL payto $NEW_ADDR 123 --fee 0.001 --password=''))
 
+sleep 20
+
 echo " * get_tx_status"
 test "`$EL get_tx_status $TXID | jq -c .`" == '{"confirmations":0}'

@conduition
Copy link
Contributor Author

conduition commented Dec 14, 2023

Rebased ✅

I'm now looking into the mempool missing TX error. My guess is that async mempool scanning results in a delayed reaction where electrs only receives the newly broadcasted TX back from bitcoind after the current mempool scan iteration runs once or twice.

Prior to this PR, the synchronous nature of mempool scanning meant that after broadcasting a transaction, the next iteration of the main thread's event loop triggered a mempool scan, and further client requests (like the subsequent TX lookup) weren't processed until the mempool scan was complete, so the request always got the most up-to-date view of the mempool possible after broadcasting.

I see two options to address this:

  1. After a TX broadcast, immediately execute a synchronous mempool poll/update cycle in the main thread (blocking other client requests)
  2. After a TX broadcast, add the new transaction into the Mempool state manually (using Mempool::apply_sync_update).

I think option 2 is the more appropriate choice. I don't see any obvious associated negatives. The mempool sync thread should remove the entry later if the broadcasted TX we add manually is evicted for some reason.

Still waiting for the dockerfile to build for the moment. i'll return tomorrow with an update once I've verified my suspicions

@conduition
Copy link
Contributor Author

@romanz sorry for the delay. My suspicions were correct, and I applied a fix which automatically appends the newly broadcasted TX to the mempool immediately upon broadcast.

@romanz
Copy link
Owner

romanz commented Dec 20, 2023

@conduition Could you please review #979 which should also help when syncing large mempools?

The mempool scanning procedure can now be launched
in a separate thread, while mempool updates are
applied synchronously in the main thread.
@conduition
Copy link
Contributor Author

conduition commented Dec 24, 2023

@romanz I rebased this now that #979 has been merged. I think this PR still has merit, because even if mempool scans are now much faster, there's still no need to block client requests on them. For my use case, where i have bitcoind on a separate machine, this PR is a must-have

@conduition
Copy link
Contributor Author

Putting this briefly into Draft state; i found I was getting spurious logs after a07b70d.

[2023-12-24T18:21:13.575Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:13.583Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:13.621Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:13.675Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:13.682Z DEBUG electrs::p2p] loading 1 blocks
[2023-12-24T18:21:15.677Z DEBUG electrs::status] 1 transactions from 1 blocks
[2023-12-24T18:21:15.677Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:15.684Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:15.759Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:15.764Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed

will fix

@conduition conduition marked this pull request as draft December 24, 2023 18:27
@conduition conduition marked this pull request as ready for review December 25, 2023 18:20
@conduition
Copy link
Contributor Author

OK should be fixed now

@romanz romanz self-requested a review December 29, 2023 09:53
@romanz romanz added enhancement New feature or request performance mempool labels Dec 29, 2023
@conduition
Copy link
Contributor Author

Hey @romanz, just wanted to check in and see if I should take the time to rebase this or?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Initial mempool sync may take a lot of time
2 participants