[AMDGPU] Emit a waitcnt instruction after each memory instruction #79236
Commits on Apr 4, 2024
[AMDGPU] Emit a waitcnt instruction after each memory instruction
This patch introduces a new command-line option for clang, namely, amdgpu-precise-mem-op. When this option is specified, a waitcnt instruction is generated after each memory load/store instruction. The counter values are always 0, but which counters are involved depends on the memory instruction.
Jun Wang committed Apr 4, 2024 (commit 502406d)
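The behaviour described in this first commit message can be illustrated with a toy model. This is a minimal sketch, not code from the patch: `MemKind`, `Inst`, `waitFor`, and `insertPreciseWaits` are hypothetical names, and the counter mapping assumes the older `vmcnt`/`lgkmcnt` scheme (newer targets split the counters further).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy classification of instructions; this is NOT LLVM's real opcode set.
enum class MemKind { VectorLoad, VectorStore, LdsAccess, ScalarLoad, Flat, NotMemory };

struct Inst {
  std::string text; // assembly text, for illustration only
  MemKind kind;
};

// Hypothetical helper: choose which counter(s) to wait on. The value is
// always 0; only the set of counters depends on the instruction kind.
// Assumed mapping: vector memory -> vmcnt, LDS/scalar -> lgkmcnt, flat -> both.
std::string waitFor(MemKind kind) {
  switch (kind) {
  case MemKind::VectorLoad:
  case MemKind::VectorStore:
    return "s_waitcnt vmcnt(0)";
  case MemKind::LdsAccess:
  case MemKind::ScalarLoad:
    return "s_waitcnt lgkmcnt(0)";
  case MemKind::Flat:
    return "s_waitcnt vmcnt(0) lgkmcnt(0)";
  case MemKind::NotMemory:
    break;
  }
  return "";
}

// Emit every instruction, followed by a zero-count wait if it touches memory.
std::vector<std::string> insertPreciseWaits(const std::vector<Inst> &block) {
  std::vector<std::string> out;
  for (const Inst &inst : block) {
    out.push_back(inst.text);
    const std::string wait = waitFor(inst.kind);
    if (!wait.empty())
      out.push_back(wait);
  }
  return out;
}
```

Because each wait sits directly after the operation it covers, a memory fault would be reported before any later instruction executes, at the cost of losing overlap between memory operations.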
Combined insertions of waitcnt with existing SIMemoryLegalizer code.
Jun Wang committed Apr 4, 2024 (commit 6624b6a)
Merge code for precise mem with the existing SICacheControl classes.
Jun Wang committed Apr 4, 2024 (commit 8abdc34)
Jun Wang committed Apr 4, 2024 (commit adaa16c)
Some small changes based on code review.
Jun Wang committed Apr 4, 2024 (commit c02b87b)
Jun Wang committed Apr 4, 2024 (commit 571ce58)
Change the option from amdgpu-precise-memory-op to precise-memory
for the backend.
Jun Wang committed Apr 4, 2024 (commit 93b00bf)
Move implementation from SIMemoryLegalizer to SIInsertWaitcnts.
Jun Wang committed Apr 4, 2024 (commit c42d3fb)
Jun Wang committed Apr 4, 2024 (commit 44bada0)
Use getAllZeroWaitcnt() when creating the Wait obj. Some changes
to the test file.
Jun Wang committed Apr 4, 2024 (commit 20312a1)
Use update_llc_test_checks.py on insert_waitcnt_for_precise_memory.ll.
Jun Wang committed Apr 4, 2024 (commit 12dde5f)
Replace testcases test_load_store() and test_load_store_as5() with
tail_call_byval_align16() to cover buffer_load/store, and scratch_load/store.
Jun Wang committed Apr 4, 2024 (commit 1e3c7dd)
Check whether a memory instruction is already immediately followed by a
waitcnt instruction. If so, do not insert another waitcnt. Also add a testcase that has ds_add_rtn. The formatting change made to SIMemoryLegalizer.cpp is reverted.
Jun Wang committed Apr 4, 2024 (commit 4f4bf31)
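The check described in this commit message might look roughly like the following. It is a sketch over plain assembly strings, not the patch's MachineInstr-based code: `isWaitcnt`, `isMemoryOp`, and `insertWaitsOnce` are hypothetical helpers, the prefix list is an assumption chosen for illustration, and the inserted wait conservatively names both counters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical predicate: does this line start with an s_waitcnt mnemonic?
bool isWaitcnt(const std::string &inst) {
  return inst.rfind("s_waitcnt", 0) == 0;
}

// Hypothetical predicate: crude prefix match for memory instructions.
bool isMemoryOp(const std::string &inst) {
  for (const char *prefix : {"global_", "flat_", "buffer_", "scratch_", "ds_"})
    if (inst.rfind(prefix, 0) == 0)
      return true;
  return false;
}

// Insert a wait after each memory op unless the next instruction is
// already a waitcnt, so no duplicate wait is emitted.
std::vector<std::string> insertWaitsOnce(const std::vector<std::string> &block) {
  std::vector<std::string> out;
  for (std::size_t i = 0; i < block.size(); ++i) {
    out.push_back(block[i]);
    const bool followedByWait = i + 1 < block.size() && isWaitcnt(block[i + 1]);
    if (isMemoryOp(block[i]) && !followedByWait)
      out.push_back("s_waitcnt vmcnt(0) lgkmcnt(0)");
  }
  return out;
}
```

This toy version does not verify that an existing waitcnt names the right counters or a zero count; a real implementation would need to.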
Change iterator update operator from post-inc to pre-inc.
Jun Wang committed Apr 4, 2024 (commit 4ae38b6)
With llvm#87539, the previous commit that checks for the instruction immediately after a load/store is not necessary.
Jun Wang committed Apr 4, 2024 (commit 49cad2d)
Commits on Apr 9, 2024
Add testcase that covers flat_atomic_swap, an atomic without return.
Jun Wang committed Apr 9, 2024 (commit 6d52f6e)