-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix undersized buffer for Vulkan pixel history with many fragments #3014
Fix undersized buffer for Vulkan pixel history with many fragments #3014
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks great, the fix is clean and fits in well, and I definitely appreciate you adding the test - it will be especially good once more of your improvements come in.
I've left some notes mostly of minor/stylistic things on the test, but nothing major.
util/test/demos/vk/vk_blend.cpp
Outdated
// Blending on an SRGB color space gives incorrect results for this test, | ||
// as eventually the values it outputs are too small to make a difference. | ||
// Use a linear color space instead. | ||
requestedSwapChainFormat = VK_FORMAT_B8G8R8A8_UNORM; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you need a specific image format to render to, please create an offscreen target to render to that. There are helpers around that will let you create images, renderpasses and framebuffers with minimal extra boilerplate. VK_Shader_Debug_Zoo
is an example that creates an RGBA32F target which is needed for testing full floating point output values.
Using a higher precision format like RGBA16F would also allow you to e.g. blend with 1.0
instead of 1.0/255.0
, and also clearly distinguish between 255 and 512 levels of overdraw.
The backbuffer should only be used when the precision doesn't matter, or as a preview (where also the specific format doesn't matter much), which means we then don't have to worry about e.g. if this format isn't supported by a platform's WSI etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed it to RGBA32F but left the values at 1.0/255.0 for now, and changed the test to handle that (which only requires treating the last value specially). But, currently the presented image shows up as black/undefined, which doesn't affect the test itself but still might be confusing. (1.0/255.0
seems better for showing it on the screen directly, though I'm sure it's also possible to normalize it afterwards; I just didn't find a helper for that.)
I'm not sure how VK_Shader_Debug_Zoo
handles that (though also, I couldn't run it on my machine as the demo isn't usable due to "Limit 'maxPerStageDescriptorStorageImages' 16 is insufficient (need at least 64)").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeh VK_Shader_Debug_Zoo
doesn't render to the backbuffer, it just clears it. Since it's only rendering single pixels there's not much to output there and all the actual python checks are on the offscreen target. Some tests like VK_CBuffer_Zoo
will use the blitToSwap()
helper to blit to the backbuffer, which you can use if you want to have a preview.
16 for maxPerStageDescriptorStorageImages sounds absolutely crippled. Are you running on a software emulation vulkan ICD, or a many-years-old vulkan 1.0 minimum spec device?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, blitToSwap
did what I wanted.
16 for maxPerStageDescriptorStorageImages sounds absolutely crippled. Are you running on a software emulation vulkan ICD, or a many-years-old vulkan 1.0 minimum spec device?
It's an Intel integrated GPU. It does support Vulkan 1.3, but it's generally fairly weak.
NUM_TRIANGLES_RED = 16 | ||
NUM_TRIANGLES_RED_REAL = NUM_TRIANGLES_RED * 2 - 1 | ||
NUM_TRIANGLES_GREEN = 255 | ||
NUM_TRIANGLES_BLUE = 512 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I've done in the past is pass these constants out through the capture via simple setMarker
labels in the capture. E.g. to have the demos program insert markers that say "NUM_TRIANGLES_RED: 16" and parse it out.
There's no reason you have to do that in this PR so you don't have to change this if you don't want to. I expect these values will probably not change or if they do it'll be obvious, but I thought it might be good to know for the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That (combined with #3014 (comment)) is a good idea for other uses, but yeah, I don't think it's helpful here, so I haven't changed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeh it's more something to have in your toolbox than particularly critical here.
red_last_eid = self.find_action("End of red").next.eventId | ||
green_eid = self.find_action("Green: 255 (the maximum we can handle) in a single drawcall").next.eventId | ||
blue_eid = self.find_action("Blue: 512 (more than the maximum) in a single drawcall").next.eventId | ||
all_eid = self.find_action("All of the above in a single drawcall").next.eventId |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these comments in the markers are good, but I'm a little uneasy about finding them exactly like this. It's a little brittle against any changes in the code, more than e.g. the number of triangle constants.
Since find_action
is a substring match, one easy way would be to make a more unique identifier like REDTEST: groups of repeated draws
and then serarch for REDTEST:
. You could also do multiple markers - one for a comment, and then the next one just before the draw as a more programmatic identifier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed it to check for only Red:
and similar.
Before, the buffer could be overrun, which could result in anything from garbage data to the GPU device being lost to segfaults. Now, correct data is gathered. We can't know in advance how many primitives will hit the targetted pixel, so it's not possible to create the buffer until after our first pass for each events (fortunately we do know the number of events in advance). It is possible that we don't need a larger buffer, though, in which case the original one can be re-used.
ba555d6
to
ef64a04
Compare
ef64a04
to
fe32013
Compare
Description
This PR fixes a bug with Vulkan pixel history where the results were corrupted (or the driver would crash) when a large number of primitives all hit a single pixel (compared to the number of events). The bug is fixed by ensuring that the buffer is sufficiently large, allocating a new one if necessary (but re-using the previous buffer for events if it works).
A demo and test case is also added for this. The demo repeatedly draws overlapping triangles with one of the color components being 1 and with additive blending enabled (alpha is not used). Without the bugfix commit, attempting to use pixel history with the demo will cause a crash. The test verifies that the corresponding color gets incremented. (It also verifies the current behavior when more than 255 primitives hit a single pixel in one drawcall, where only the first 255 are recorded (due to that being the max for the stencil) and the shader out value being that of the 255th draw, but the postmod value being that of the final draw.)
(Split off from #3003.)