Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ar_data_sync: Skip unpacking if the chunk was downloaded from the local peer #637

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

shizzard
Copy link
Collaborator

No description provided.

@shizzard
Copy link
Collaborator Author

@JamesPiechota I did implement the requested packing of the served chunks for local peers only.

I did test all the possible combinations of packer and miner configurations, including:

  • for packer:
    • setting local_peer address and not setting it (to check the local_peers filtering for the packer);
    • setting the pack_served_chunks and not setting it (to check on-demand packing);
  • for miner:
    • setting sync_from_local_peers_only to ensure syncing only from local packer;
    • setting data_sync_request_packed_chunks: to test on-demand packing);
    • setting local_peer address and not setting it (to check the local_peers filtering for requesting packed chunks);

I didn't find any node misbehaving.

@JamesPiechota
Copy link
Collaborator

I'm rerunning the auto tests. ar_packing_tests failed - in the past that hasn't been that flaky, so it may indicate a real issue, but re-running just in case.

@JamesPiechota
Copy link
Collaborator

@JamesPiechota I did implement the requested packing of the served chunks for local peers only.

I did test all the possible combinations of packer and miner configurations, including:

  • for packer:

    • setting local_peer address and not setting it (to check the local_peers filtering for the packer);
    • setting the pack_served_chunks and not setting it (to check on-demand packing);
  • for miner:

    • setting sync_from_local_peers_only to ensure syncing only from local packer;
    • setting data_sync_request_packed_chunks: to test on-demand packing);
    • setting local_peer address and not setting it (to check the local_peers filtering for requesting packed chunks);

I didn't find any node misbehaving.

Those tests look good! Were you able to confirm when chunk repacking actually occurs on both packer and miner? I think last time we only checked that the requests were formatted correctly, and missed the fact that even with correctly formatted requests, the miner had to do some unpacking to validate.

@shizzard
Copy link
Collaborator Author

Were you able to confirm when chunk repacking actually occurs on both packer and miner? I think last time we only checked that the requests were formatted correctly, and missed the fact that even with correctly formatted requests, the miner had to do some unpacking to validate.

No, how can I actually check that the chunk was packed correctly?

@@ -2056,6 +2062,9 @@ handle_get_chunk(OffsetBinary, Req, Encoding) ->
{400, #{}, jiffy:encode(#{ error => invalid_offset }), Req}
end.

is_matching_addr({{O1, O2, O3, O4}, _Port1}, {O1, O2, O3, O4, _Port2}) -> true;
is_matching_addr(_CowboyPeer, _LocalPeer) -> false.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really have a good place for peer-related utilities. But there are a few in ar_util. I think is_matching_addr and maybe even the local_peers fold could live there with the other peer utilities (e.g. maybe ar_util:peer_addr_is_member or something)

@JamesPiechota
Copy link
Collaborator

Were you able to confirm when chunk repacking actually occurs on both packer and miner? I think last time we only checked that the requests were formatted correctly, and missed the fact that even with correctly formatted requests, the miner had to do some unpacking to validate.

No, how can I actually check that the chunk was packed correctly?

Easiest way is probably to monitor the metrics. Every time a node does a pack, unpack, or repack, the packing_requests metric is incremented. You can use this Grafana board to monitor packing behavior, or also manually check the /metrics endpoint and look at the packing_requests value. If you want to use the grafana board we'll have to add your IP:PORT to our prometheus server, If you DM me those values I can make the addition.

@shizzard shizzard force-pushed the ar_data_sync-sjup-local-peer-verification branch from 7e809c0 to 3c05276 Compare November 4, 2024 15:40
IsLocalPeerAddr = lists:foldl(fun
(_, true) -> true;
(LocalPeer, _) -> is_matching_addr(Peer, LocalPeer)
end, false, Config#config.local_peers),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

local_peers should be all in the same format so I think we can just use lists:member

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually not, that's the problem here.
Cowboy uses standard Erlang address tuple: {{O1, O2, O3, O4}, Port}, while local peers are parsed into five-tuple: {O1, O2, O3, O4, Port}.
Probably should change this, but I'm afraid I can suddenly break something. Should I try?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shizzard unfortunately it looks ar_util:safe_parse_peer and ar_util:parse_peer are used to parse all config peers and they always dump to the 5 element tuple. I agree probably too risky to update that now, but can you log a github issue? Seems like something we should rationalize with cowboy.

FWIW it looks like the parse_peer code is from 7 years ago (!)

ok = ar_semaphore:acquire(get_and_pack_chunk, infinity),
{RequestedPacking, ok};
{_, _} ->
{none, {reply, {403, #{}, <<>>, Req}}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a 404 to match the old behavior: https://github.com/ArweaveTeam/arweave/pull/637/files#diff-0b7084531612365a22d52032a422eb664adc91b34cefb5f8d9499584072aa489L2013

We could have something like {true, false} be a 403

But I think best to just always 404 if the server can't or is unwilling to serve the chunk in the requested format

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I did this change to distinct this particular response from others during the tests. Will revert.

JamesPiechota and others added 5 commits November 11, 2024 22:25
Prior to this change ar_tx_blacklist_tests relied on every test node
having pack_served_chunks set and being able to repack chunks on
demand. To preserve that behavior we have to add each node to
the local_peers config for the test nodes.

Note: I haven't run this change on other tests yet, it's possible
it causes something else to fail (in which case we may want to
clear local_peers in that failing test)
vird and others added 4 commits November 25, 2024 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants