Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support 'metadata update' in 'sign_metadata' task #355

Merged
merged 15 commits into from
Nov 6, 2023

Conversation

lukpueh
Copy link
Collaborator

@lukpueh lukpueh commented Aug 14, 2023

related to #336

Implement support for distributed asynchronous root metadata signging in the course of a "metadata update" event.

Other than the already supported "bootstrap" signing event, signatures added to root during "metadata update" must validate with keys from trusted OR new root, and meet the signature threshold of trusted AND new root.

Related changes:

  • Refactor _validate_{signature, threshold} helpers to accept an optional delegator (e.g. trusted root).
  • Add _sign_result helper to return a "sign metadata"-specific task result.

msg = f"Root v{root.signed.version} is pending signatures"
return self._sign_result(msg, True)

# TODO: Refactor `_root_metadata_update` to de-duplicate validation
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know, if I should refactor _root_metadata_update in this PR or in a follow-up. I think it's correct as is, but performs some unnecessary cycles.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You made a minimalist change in this PR, great work.
So, we can accommodate the refactoring here, but it is up to you.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to do it in a follow-up. :) I can ticketize.

@lukpueh lukpueh marked this pull request as draft August 14, 2023 11:01
@lukpueh lukpueh marked this pull request as ready for review August 14, 2023 11:59
@lukpueh
Copy link
Collaborator Author

lukpueh commented Aug 14, 2023

Need to fix tests (some expected messages changed) and add new ones for the new feature. Otherwise this should be good for review.

@codecov
Copy link

codecov bot commented Aug 16, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (32394d9) 100.00% compared to head (b82fbb2) 100.00%.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #355   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           14        14           
  Lines          960       999   +39     
=========================================
+ Hits           960       999   +39     
Files Coverage Δ
repository_service_tuf_worker/repository.py 100.00% <100.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@lukpueh
Copy link
Collaborator Author

lukpueh commented Aug 17, 2023

This is ready for another round of review. I adopted updates in existing tests and added new tests for:

  • invalid signature (different combinations)
  • valid signature and invalid threshold (different combinations)
  • valid signature and valid threshold -> finalise metadata update

Copy link
Member

@MVrachev MVrachev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw that ROLE_SIGNING is used in many places where we have documented SIGNING_ROLE: https://repository-service-tuf.readthedocs.io/en/latest/devel/design.html#rstuf-repository-settings-configuration
Do we want to update the docs or update our code?
@lukpueh @kairoaraujo

Also, what should we do with the payload['role'] argument?
If we decide to keep it we should make the code less opinionated about the role and not mention root everywhere (with the exception of the bootstrap case of course).
I think there is a possibility we would introduce offline key(s) support for the Targets role in the future considering we want to give more flexibility on how users want to delegate trust, so I am in favor of keeping this argument for now.

repository_service_tuf_worker/repository.py Outdated Show resolved Hide resolved
repository_service_tuf_worker/repository.py Show resolved Hide resolved
repository_service_tuf_worker/repository.py Show resolved Hide resolved
repository_service_tuf_worker/repository.py Show resolved Hide resolved
repository_service_tuf_worker/repository.py Show resolved Hide resolved
repository_service_tuf_worker/repository.py Show resolved Hide resolved
repository_service_tuf_worker/repository.py Show resolved Hide resolved
repository_service_tuf_worker/repository.py Outdated Show resolved Hide resolved
repository_service_tuf_worker/repository.py Outdated Show resolved Hide resolved
@@ -3311,3 +3267,124 @@ def fake_get_fresh(key):
assert test_repo.write_repository_settings.calls == [
pretend.call("ROOT_SIGNING", "fake_metadata")
]

@pytest.mark.parametrize(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a discussion with @kairoaraujo and @KAUTH regarding table testing done in this way and we are all with mixed feelings.
The biggest issues I see with them:

  1. it's harder to debug which test case fails when you see only test_sign_metadata__update with a specific assert that not always point you to the correct use case.
  2. it could be a little complicated for new users.

What do you think @kairoaraujo and @KAUTH ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a discussion with @kairoaraujo and @KAUTH regarding table testing done in this way and we are all with mixed feelings. The biggest issues I see with them:

Yeah, I figured. That's why I used the copy/paste approach for the first two main test cases, and only refactored on top of that. (Note though that the table does have a few extra test cases).

  1. it's harder to debug which test case fails when you see only test_sign_metadata__update with a specific assert that not always point you to the correct use case.

That's not necessarily true, if you look at the log you'll see that each call of the method is listed as separat test.

  1. it could be a little complicated for new users.

🤷 I guess, that's debatable. If we expect devs to get familiar with pytest, monkey patching, pretend, mock, etc., we might as well ask them to read pytest docs for parametrize. I personally find thousands of lines of testing code more intimidating. But yeah I agree with the intention of making testing code as little engineered as possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to decide about this test and methodology @kairoaraujo and @lukpueh.
Do we want to meet or discuss it here?

PS: Now when I read it it's not immediately clear to me which test case is testing what.
Maybe if there was some kind of a title of a test case would be better.
Something familiar to what we did in python-tuf here: https://github.com/theupdateframework/python-tuf/blob/038ecd65dc6b74e9d0306d3c57c3e7d621155e47/tests/test_updater_key_rotations.py#L92

@lukpueh
Copy link
Collaborator Author

lukpueh commented Aug 24, 2023

Thanks for the review, @MVrachev! I'll push a commit for some of your inline suggestions. Regarding the things below:

I saw that ROLE_SIGNING is used in many places where we have documented SIGNING_ROLE: https://repository-service-tuf.readthedocs.io/en/latest/devel/design.html#rstuf-repository-settings-configuration Do we want to update the docs or update our code? @lukpueh @kairoaraujo

Let's fix this in the docs, as agreed on slack.

Also, what should we do with the payload['role'] argument? If we decide to keep it we should make the code less opinionated about the role and not mention root everywhere (with the exception of the bootstrap case of course). I think there is a possibility we would introduce offline key(s) support for the Targets role in the future considering we want to give more flexibility on how users want to delegate trust, so I am in favor of keeping this argument for now.

See #355 (comment) for my feelings about payload['role']. If you folks insist on leaving it there, I can add a check for it.

Regarding making the function more generic, as you suggest inline, e.g. using payload['role'] instead of a hardcoded "root", I am rather opposed. The function is simple enough to update, when we support other metadata types. Until then I don't see a point in making it more complicated than necessary or look more generic than it is.

@lukpueh lukpueh requested a review from MVrachev August 24, 2023 14:55
@MVrachev
Copy link
Member

Thanks for the review, @MVrachev! I'll push a commit for some of your inline suggestions. Regarding the things below:

I saw that ROLE_SIGNING is used in many places where we have documented SIGNING_ROLE: https://repository-service-tuf.readthedocs.io/en/latest/devel/design.html#rstuf-repository-settings-configuration Do we want to update the docs or update our code? @lukpueh @kairoaraujo

Let's fix this in the docs, as agreed on slack.

Fixed by repository-service-tuf/repository-service-tuf#433.

Also, what should we do with the payload['role'] argument? If we decide to keep it we should make the code less opinionated about the role and not mention root everywhere (with the exception of the bootstrap case of course). I think there is a possibility we would introduce offline key(s) support for the Targets role in the future considering we want to give more flexibility on how users want to delegate trust, so I am in favor of keeping this argument for now.

See #355 (comment) for my feelings about payload['role']. If you folks insist on leaving it there, I can add a check for it.

Regarding making the function more generic, as you suggest inline, e.g. using payload['role'] instead of a hardcoded "root", I am rather opposed. The function is simple enough to update, when we support other metadata types. Until then I don't see a point in making it more complicated than necessary or look more generic than it is.

I don't insist on making this change now, but given we already accept role argument in the API payload:
https://github.com/repository-service-tuf/repository-service-tuf-api/blob/407ffffef7f2c94cddcee2db6fd58cb256e3fb5e/repository_service_tuf_api/metadata.py#L152
I think it's better we validate it at least and return a message if a non-root value is given.
Of course, another solution is changing the API itself.
Finally, of course, it's a solution to not do anything, but it's not ideal as we accept an argument we don't actually use.

The final question is about the testing. I think that this is a question that's debatable and we have different opinions.
I think it's better to not hurry to merge them directly, but try and come up with a decision with the rest of the team: @kairoaraujo and @KAUTH.
Given the circumstances and that we don't know how much time you will be around these weeks, I am open to merging this without the tests themselves, but we need to have an issue about this.

@lukpueh
Copy link
Collaborator Author

lukpueh commented Aug 25, 2023

I don't insist on making this change now, but given we already accept role argument in the API payload: https://github.com/repository-service-tuf/repository-service-tuf-api/blob/407ffffef7f2c94cddcee2db6fd58cb256e3fb5e/repository_service_tuf_api/metadata.py#L152 I think it's better we validate it at least and return a message if a non-root value is given. Of course, another solution is changing the API itself. Finally, of course, it's a solution to not do anything, but it's not ideal as we accept an argument we don't actually use.

You're right. Let's not decide, if the argument makes sense or not, but just accept its existence. Let me add a check and a test.

The final question is about the testing. I think that this is a question that's debatable and we have different opinions. I think it's better to not hurry to merge them directly, but try and come up with a decision with the rest of the team: @kairoaraujo and @KAUTH. Given the circumstances and that we don't know how much time you will be around these weeks, I am open to merging this without the tests themselves, but we need to have an issue about this.

Sounds good. I think we can merge the tests in their table form and open an issue where we discuss this. If the team is opposed, it will be easy to just copy and paste this function len(table) times.

@lukpueh
Copy link
Collaborator Author

lukpueh commented Oct 4, 2023

I rebased and force pushed this branch to resolve some conflicts, and also added another commit to re-use parts of the "metadata update" task (see last commit).

For this feature to be useful / usable, we still need to change the "metadata update" task, to accept partially signed metadata. I have another commit for that which comes at ~ 65 additions and 44 deletions (w/o tests).

Should I push it here, or open a separate PR?

@kairoaraujo
Copy link
Member

@lukpueh
I'm ok with pushing it here. I think we made good progress, and it is partially reviewed.

@lukpueh
Copy link
Collaborator Author

lukpueh commented Oct 5, 2023

@kairoaraujo, @MVrachev: This is ready for another round of review. The most recent commit patches metadata_update task interface to allow partially signed root metadata, which can then be finalised via sign_metadata.

The big caveat is that metadata_update enters the signing event in any case if the the threshold is not reached. Also if there is not initial signature at all, or if there are invalid signatures, etc. Changing this would require a more invasive patch, which I think we should hold off until #367.

@MVrachev
Copy link
Member

I will review the code today, but for testing, it's good if we have repository-service-tuf/repository-service-tuf#515.
Do we want to work on merging this without functional tests first or do we want to wait for them @kairoaraujo, @lukpueh?

@kairoaraujo
Copy link
Member

I will review the code today, but for testing, it's good if we have repository-service-tuf/repository-service-tuf#515.
Do we want to work on merging this without functional tests first or do we want to wait for them @kairoaraujo, @lukpueh?

I'm running the tests to check if it doesn't affect the fully signed metadata update.
I don't see issues so far, so merging this has no problems.

Copy link
Member

@MVrachev MVrachev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main code is clean, elegant and easy to read.
Left a couple of small nits about it.

We still need to discuss:

  • do we want to merge this without FT tests?
  • what to do with the parametrize test?

repository_service_tuf_worker/repository.py Outdated Show resolved Hide resolved
repository_service_tuf_worker/repository.py Outdated Show resolved Hide resolved
@@ -3311,3 +3267,124 @@ def fake_get_fresh(key):
assert test_repo.write_repository_settings.calls == [
pretend.call("ROOT_SIGNING", "fake_metadata")
]

@pytest.mark.parametrize(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to decide about this test and methodology @kairoaraujo and @lukpueh.
Do we want to meet or discuss it here?

PS: Now when I read it it's not immediately clear to me which test case is testing what.
Maybe if there was some kind of a title of a test case would be better.
Something familiar to what we did in python-tuf here: https://github.com/theupdateframework/python-tuf/blob/038ecd65dc6b74e9d0306d3c57c3e7d621155e47/tests/test_updater_key_rotations.py#L92

@lukpueh
Copy link
Collaborator Author

lukpueh commented Oct 10, 2023

We need to decide about this test and methodology @kairoaraujo and @lukpueh.

Let me know what it should be. If you are fine with my approach, I can add titles as Martin suggests. Otherwise, I can copy paste. Note that the difference is ~140 vs. ~450 LOC

@kairoaraujo
Copy link
Member

We need to decide about this test and methodology @kairoaraujo and @lukpueh.

Let me know what it should be. If you are fine with my approach, I can add titles as Martin suggests. Otherwise, I can copy paste. Note that the difference is ~140 vs. ~450 LOC

I'm fine with using pytest.mark.parametrize

I had an experience (with parametrize) when some contributors (actually coworkers) had difficulties identifying, understanding, and reading the tests. Most of the time, they avoid touching the code base as it requires changes in the test. Sometimes, they could change the code but needed to know how to fix or implement a test to avoid regression, adding code without tests.

I noticed that the difficulties with tests, in general, can scare contributors with short experience in the language.

That is why I choose to implement tests in very descriptive steps/repetitions, where whoever touches the code can read and follow the steps side-by-side.
tests/<test_type>/<package>/test_<module>.py and look for all test_<function name>_<scenario>
I'm not sure if I achieved it here. I'm sharing why it is like this now, but I'm open to change if we decide to.

@MVrachev
Copy link
Member

In a meeting with @kairoaraujo and our new contributors @enyinna1234 and @ivanayov they read the parametrize test and shared it was difficult for them to follow the test.

Maybe it's because we have to use a lot of mocking when we test sign and this combined with parametrize makes it more complicated. The example I give with https://github.com/theupdateframework/python-tuf/blob/038ecd65dc6b74e9d0306d3c57c3e7d621155e47/tests/test_updater_key_rotations.py#L92 the test code and code that is tested are simple. No mocking is required and things are more or less straightforward.

Either way, for this particular case it seems it's better to use the more descriptive approach with code duplications.
I think it's a good idea to experiment with parametrize more, but testing relatively easy functions with multiple possible branches and ideally without mocking anything.

@lukpueh
Copy link
Collaborator Author

lukpueh commented Oct 11, 2023

In a meeting with @kairoaraujo and our new contributors @enyinna1234 and @ivanayov they read the parametrize test and shared it was difficult for them to follow the test.

Okay, that's a good argument. One final note, though ... and then I'll gladly change it. Is it really the usage of parametrize that makes the test difficult to follow? 😄

As I said before, I think that nested functions, "fake" variables, pretend, stub, call_recorder, monkeypatch, lambda, iter, etc. set the bar for complicated constructs fairly high. It took me at least quite a while to understand their usage in other copy-paste style tests, when writing this one.

Would code documentation like this clarify the use of parametrize (it's really just an invisible for loop) ...?

# `parametrize` calls the test function for each "row" in the test data table. 
# The fields in a row are passed as arguments to the test function. 
#
#   payload_patch: input part for tested function (`sign_metadata`) 
#   validation_results: mock return value of called helper functions
#   details: expected return value part of test function
#   status: expected return value part of test function

I am very much open for renames and restructures of these variables, e.g. validation_results could easily be split in two variables and/or details and status could be combined in one. There could also be a title field, which describes each test scenario, as Martin suggested.

lukpueh and others added 13 commits October 20, 2023 12:08
Implement support for distributed asynchronous root metadata signing
in the course of a "metadata update" event.

Other than the already supported "bootstrap" signing event, signatures
added to root during "metadata update" must validate with keys from
trusted OR new root, and meet the signature threshold of trusted AND new
root.

*Related changes:*
- Ignore obsolete "rolename" in sign_metadata payload. We only support
  root, and check the type when loading "ROOT_SIGNING".
- Refactor `_validate_{signature, threshold}` helpers to accept an
  optional delegator (e.g. trusted root).
- Add local `_result` helper to return a "sign metadata"-specific
  task result.

Signed-off-by: Lukas Puehringer <[email protected]>
- remove obsolete "test_sign_metadata_root_signing_no_bootstrap"
  Now, if there is no ongoing "bootstrap", we just assume there is an
  ongoing "metadata update".
- adopt changes in "test_sign_metadata_invalid_role_type"
  - new expected error message
  - fail earlier, before consulting with "BOOTSTRAP" state variable

Signed-off-by: Lukas Puehringer <[email protected]>
Combine existing sign_metadata/update metadata tests in a single
parametrized test method and add additional test cases for different
return values from internal `_validate_{signature, threshold}` calls, in order
to test correct sage of OR/AND operators:
- signature must be valid according to trusted OR new root
- threshold must be met according to trusted AND new root

Note: The test removes asserts for internal method calls, which don't
seem so interesting, as long as we get the expected result.

Signed-off-by: Lukas Puehringer <[email protected]>
- Use elif instead of elsewhere appropriate
- Remove blank line in docstrings
- Clarify comment about signature/threshold validation

Co-authored-by: Martin Vrachev <[email protected]>
Signed-off-by: Lukas Puehringer <[email protected]>
This is a temporary measure until after _root_metadata_update has been
refactored (see code comment) to not fail silently e.g in in tests that
mock the argument passed to _root_metadata_update.

This commit also updates the related tests to now mock the
`_root_metadata_update` result too. As the wrong result would no longer
fail silently.

Signed-off-by: Lukas Puehringer <[email protected]>
'sign_metadata' only supports root, thus the role in the payload is
not relevant, and was ignored previously.

For consistency, this commit adds a check that the role is indeed
root and fails otherwise.

This is also tested by adding another column to the test table of
test_sign_metadata__update, used to patch the default payload in test
runs.

Signed-off-by: Lukas Puehringer <[email protected]>
Factor out "finalize" part of _root_metadata_update to re-use in
sign_metadata.

Prior to this commit, sign_metadata would call _root_metadata_update
duplicating much of the verification behavior, although it only cared
for the finalization part. Now, it can call into the desired subroutine
only.

Signed-off-by: Lukas Puehringer <[email protected]>
Update `_root_metadata_update` subroutine of `metadata_update` task
interface to accept partially signed metadata. If the required threshold
is not met, the passed metadata is written to the "ROOT_SIGNING"
repository setting and the task returns with a "pending signatures"
message

Missing signatures can then be added using the `sign_metadata` task
interface, which also finalizes the metadata update, as soon as the
threshold is reached.

NOTE: Currently, there is no sanity check of signatures below the
threshold.  A useful check might be, that passed metadata has at least 1
initial and only valid signatures, akin to bootstrap. repository-service-tuf#367 will make
this a lot easier.

This change also includes a reordering of the validation routine to check
the version increment prior to signature threshold. Otherwise, a bad
version would only be detected after all signatures have been added.

Signed-off-by: Lukas Puehringer <[email protected]>
Use test copy pasta instead of @parametrize.

Signed-off-by: Lukas Puehringer <[email protected]>
@lukpueh
Copy link
Collaborator Author

lukpueh commented Oct 20, 2023

Force pushed w/o change to rebase on main and added two new commits:

  • var renames, as requested by @MVrachev
  • copy tests instead of using @parametrize, as requested by all :)

Comment on lines +3553 to +3554
fake_signature_result = iter((False, True))
fake_threshold_result = iter((False, True))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting way of mocking return type.
I like it better than compared to using MagicMock as we did until now when we have to mock a different behavior of two calls.

Copy link
Member

@MVrachev MVrachev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reapprove to run functional tests with pr repository-service-tuf/repository-service-tuf#515

@kairoaraujo kairoaraujo merged commit 17d8dcb into repository-service-tuf:main Nov 6, 2023
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

3 participants