pkg/bisect: filter out obviously unrelated crash types #5414

Open
a-nogikh opened this issue Oct 18, 2024 · 2 comments

@a-nogikh
Collaborator

Random "lost connection" crashes seem to regularly derail the bisection process. A recent example:

https://lore.kernel.org/all/[email protected]/T/#ef85d48463732e6ac91be891e77e9bf90ba88ddee

In the case above, the problems began here:

testing commit b8c8ba73c68bb3c3e9dad22f488b86c540c839f9 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 2421239c9ec8055b97da598fa4ac0a54ba5b001988adf334fe6b4cd3791ada98
run #0: crashed: lost connection to test machine
run #1: crashed: lost connection to test machine
run #2: OK
run #3: OK
run #4: OK
run #5: OK
run #6: OK
run #7: OK
run #8: OK
run #9: OK
representative crash: lost connection to test machine, types: [UNKNOWN]
# git bisect bad b8c8ba73c68bb3c3e9dad22f488b86c540c839f9

We do have a lower bound on the number of observed crashes required to mark a step as bisect bad:

wantBadRuns := max(2, (total-infra)/6) // For 10 runs, require 2 crashes. For 20, require 3.
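
To make the arithmetic concrete, here is a minimal standalone sketch of that threshold. The wrapper function `wantBadRuns` and its parameters are hypothetical names used only for illustration; only the formula itself is taken from the line above.

```go
package main

import "fmt"

// wantBadRuns is a hypothetical wrapper around the formula quoted above:
// max(2, (total-infra)/6), using integer division.
func wantBadRuns(total, infra int) int {
	want := (total - infra) / 6
	if want < 2 {
		want = 2
	}
	return want
}

func main() {
	fmt.Println(wantBadRuns(10, 0)) // 10/6 = 1, raised to the minimum of 2
	fmt.Println(wantBadRuns(20, 0)) // 20/6 = 3
}
```

With 10 runs the threshold is 2, so the two "lost connection to test machine" runs in the log above are already enough to reach it.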

But apparently that is not enough. We should also, at the very least, check whether the observed crash types correspond to the crash type of the original issue being bisected.
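
One possible shape of such a check is sketched below. This is not the pkg/bisect code: the `Run` struct, the `countMatchingCrashes` function, and the exact-match policy are all assumptions made for illustration, and the original crash type (LOCKDEP) is assumed as well. The type strings follow the `types: [...]` field printed in the bisection logs.

```go
package main

import "fmt"

// Run is a hypothetical record of a single test run in one bisection step.
type Run struct {
	Crashed bool
	Title   string
	Type    string // crash type as printed in the log, e.g. "LOCKDEP" or "UNKNOWN"
}

// countMatchingCrashes counts only the crashed runs whose type equals the
// crash type of the original issue. Exact equality is a simplifying assumption.
func countMatchingCrashes(runs []Run, originalType string) int {
	n := 0
	for _, r := range runs {
		if r.Crashed && r.Type == originalType {
			n++
		}
	}
	return n
}

func main() {
	// Two lost connections followed by eight OK runs, as in the log above.
	runs := []Run{
		{Crashed: true, Title: "lost connection to test machine", Type: "UNKNOWN"},
		{Crashed: true, Title: "lost connection to test machine", Type: "UNKNOWN"},
	}
	for i := 0; i < 8; i++ {
		runs = append(runs, Run{})
	}
	// Assuming the original issue was a LOCKDEP report, none of the crashes match.
	fmt.Println(countMatchingCrashes(runs, "LOCKDEP"))
}
```

If only matching crashes were compared against the bad-run threshold, a step like this one would no longer be marked as bisect bad on the strength of two lost connections alone. What to do with such a step instead (retry, skip, or treat as good) is a separate design question that this sketch does not answer.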

@a-nogikh a-nogikh added the bug label Oct 18, 2024
@a-nogikh a-nogikh self-assigned this Oct 18, 2024
@a-nogikh
Collaborator Author

One more bisection that was derailed by lost connection crashes: https://syzkaller.appspot.com/x/bisect.txt?x=129f2d87980000

# git bisect bad 9bfae8f5ca6570c8afb7635a17328fa3f1b1a6c3
Bisecting: 139 revisions left to test after this (roughly 7 steps)
[729fe5cc34dc1b0c44a74fce3f6cec7a6c0735c7] Merge branch 'timers/core' into core/merge, to resolve conflict

testing commit 729fe5cc34dc1b0c44a74fce3f6cec7a6c0735c7 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 29f9ca9f1f1dfe31a876ea96af8f7fb943fc6792164ce16a820e238c5779dafa
all runs: crashed: WARNING: locking bug in trie_delete_elem
representative crash: WARNING: locking bug in trie_delete_elem, types: [LOCKDEP]
# git bisect bad 729fe5cc34dc1b0c44a74fce3f6cec7a6c0735c7
Bisecting: 74 revisions left to test after this (roughly 6 steps)
[4febce44cfebcb490b196d5d10ae9f403ca4c956] posix-timers: Cure si_sys_private race

testing commit 4febce44cfebcb490b196d5d10ae9f403ca4c956 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: ea4f16b5b804ab603220aba8e735c9000027f9ba77181a80b6c71ddac84b57ce
run #0: crashed: lost connection to test machine
run #1: crashed: lost connection to test machine
run #2: crashed: lost connection to test machine
run #3: crashed: lost connection to test machine
run #4: crashed: lost connection to test machine
run #5: crashed: lost connection to test machine
< ... >
run #7: OK
run #8: OK
run #9: OK
representative crash: lost connection to test machine, types: [UNKNOWN]
# git bisect bad 4febce44cfebcb490b196d5d10ae9f403ca4c956

@a-nogikh
Collaborator Author

One more case: https://syzkaller.appspot.com/x/bisect.txt?x=12a981a7980000

testing commit 0ac20437412bfc48d67d33eb4be139eafa4a0800 gcc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
kernel signature: 0d6cb28901f7c354cebda461fb76e7a1de117e58165cc52348fcbc8efb279da8
run #0: crashed: lost connection to test machine
run #1: crashed: lost connection to test machine
run #2: boot failed: can't ssh into the instance
run #3: OK
run #4: OK
run #5: OK
run #6: OK
run #7: OK
run #8: OK
run #9: OK
representative crash: lost connection to test machine, types: [UNKNOWN]
# git bisect bad 0ac20437412bfc48d67d33eb4be139eafa4a0800
