-
I do like the idea of incrementally updating the texture without rendering. Not because it will allow dirty rect algorithms, but because it lets the user avoid keeping a copy of the framebuffer, and thus avoids a memcpy in some cases. I have written dirty rect bookkeeping in a game engine before, and my disappointment is that it amortizes to the same performance as copying the whole buffer each frame. I saw another experiment recently where the author noted the same issues: https://blogs.scummvm.org/subr3v/2014/08/11/dirty-rectangles-system-performance-considerations/ (I highly recommend reading the whole series; there are five parts, and this is the last of them.) In that specific case, it didn't help where you would want it most, when the FPS was lower than 60. But it did greatly improve scenes that already ran faster than 60 FPS without it. This is not a generalization, but it matches my own experience.
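To illustrate why the bookkeeping can amortize to a full copy, here is a minimal sketch of one naive strategy: folding all of a frame's dirty rects into a single bounding rect. The `Rect` type and `bounding_dirty` function are hypothetical names made up for this example. They are not taken from any engine or from the ScummVM series, which may merge rects quite differently.

```rust
/// Axis-aligned rectangle in pixel coordinates (hypothetical helper type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    x: usize,
    y: usize,
    w: usize,
    h: usize,
}

impl Rect {
    /// Smallest rectangle that contains both `self` and `other`.
    fn union(self, other: Rect) -> Rect {
        let x0 = self.x.min(other.x);
        let y0 = self.y.min(other.y);
        let x1 = (self.x + self.w).max(other.x + other.w);
        let y1 = (self.y + self.h).max(other.y + other.h);
        Rect { x: x0, y: y0, w: x1 - x0, h: y1 - y0 }
    }

    /// Number of pixels covered by the rectangle.
    fn area(self) -> usize {
        self.w * self.h
    }
}

/// Fold a frame's dirty rects into one bounding rect.
/// `None` means nothing changed this frame.
fn bounding_dirty(rects: &[Rect]) -> Option<Rect> {
    rects.iter().copied().reduce(Rect::union)
}

fn main() {
    let (width, height) = (320, 240);
    // Two small sprites changed, but in opposite corners of the frame.
    let dirty = [
        Rect { x: 0, y: 0, w: 16, h: 16 },
        Rect { x: 304, y: 224, w: 16, h: 16 },
    ];
    if let Some(bounds) = bounding_dirty(&dirty) {
        println!("upload {} of {} pixels", bounds.area(), width * height);
    }
}
```

With two 16x16 sprites in opposite corners, the bounding rect grows to cover the whole 320x240 frame, so the regional upload saves nothing. Tracking several disjoint rects avoids that, at the cost of more bookkeeping per frame.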
We can try it though, with smaller regional uploads. I don't anticipate it will be a marked improvement in the general case.

You brought up slow hardware, and I think it is worth digging into this topic in more detail. In the context of

There are two good articles (both commonly cited) that describe a good architecture for a game loop:

They both arrive at the same conclusion (and deWiTTERS even links directly to the Timestep article). For example, the game updates at a fixed 30 Hz and the display refreshes as fast as it is able. And "as fast as it is able" means one of two things:
But in the case of 2, as both articles astutely observe, refreshing the display that quickly for a 30 Hz game isn't really that useful! You might as well just cap the refresh rate at 30 FPS. It will look just as smooth as it would at 240 FPS. Instead, they recommend interpolating frames at higher refresh rates, which implies that you are re-rendering and re-uploading the whole frame regardless.

And that's my takeaway as well. If you want to avoid the slightly wasteful re-uploads with lower update frequencies, the best way to do that is by capping the frame rate with a sleep. This is the most common thing I found when I was exploring how other projects handled frame rate for #174.

The other issue with regard to frame rate is frame pacing. This is more about "emulating 30 FPS" on a higher-frequency display when updating the display at its native refresh rate. VRR helps a lot here when using a sleep to cap the frame rate. For machines without VRR, your sleep needs to account for the native refresh rate; for instance, so that it begins rendering precisely before every even-numbered frame on a 60 Hz display. If you go over the refresh rate budget, that causes frame pacing issues. In other words, you have to pay attention to the frequency of frame skipping. Don't just let it average out over a long period (like one second); you want the frame skipping to be relatively constant at all times.

To recap, these kinds of issues are largely outside of the scope of

pixels/examples/invaders/simple-invaders/src/lib.rs Lines 38 to 42 in 5461133

pixels/examples/invaders/src/main.rs Lines 153 to 158 in 5461133

Why is the update frequency so high in this example? It's for predictable and robust physics integration. It seems kind of silly to do this for a Space Invaders clone, but it makes sense when you add particle physics.
Like I did in this capture from an experimental branch: simple-invaders-particle-collisions.mov

Also, I think the linked articles make an implied assumption that state updates always take longer than drawing. In my experience it's the opposite. State updates are relatively simple and fast, at least in the kinds of games that I have worked on. Physics, AI, path finding, ray casting, they all have various optimizations and many of these can run in parallel, making use of SIMD and multiple CPU cores. On the other hand, taxing your GPU is ridiculously easy to do with complex shaders and scenes. Many of the optimizations happen on the CPU-side, like culling VBOs to reduce memory bandwidth. But that's still part of drawing. VBOs aren't really a problem with 2D games, but some postprocessing effects like blur and bloom can be notoriously expensive. But this is also outside of the

That's pretty much everything I know about the subject. Hope it is helpful.
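For reference, the fixed-update loop that both articles arrive at can be sketched roughly like this. It is a minimal accumulator, assuming the caller measures the elapsed time per displayed frame. `FixedTimestep`, `advance`, and `alpha` are hypothetical names for this example only; this is not the code from the invaders example or from either article.

```rust
use std::time::Duration;

/// Fixed-timestep accumulator (hypothetical type for illustration).
struct FixedTimestep {
    /// Length of one state update, e.g. 1/30 of a second.
    step: Duration,
    /// Wall-clock time that has not yet been consumed by updates.
    accumulator: Duration,
}

impl FixedTimestep {
    fn new(step: Duration) -> Self {
        Self { step, accumulator: Duration::ZERO }
    }

    /// Feed the time elapsed since the previous displayed frame.
    /// Returns how many fixed-size state updates should run now.
    fn advance(&mut self, frame_time: Duration) -> u32 {
        self.accumulator += frame_time;
        let mut updates = 0;
        while self.accumulator >= self.step {
            self.accumulator -= self.step;
            updates += 1;
        }
        updates
    }

    /// Fraction of a step left in the accumulator, in [0, 1).
    /// This is the factor used to interpolate between two states.
    fn alpha(&self) -> f64 {
        self.accumulator.as_secs_f64() / self.step.as_secs_f64()
    }
}

fn main() {
    // Roughly a 30 Hz update rate shown on a 60 Hz display.
    let mut timestep = FixedTimestep::new(Duration::from_micros(33_333));
    let frame_time = Duration::from_micros(16_667);

    for frame in 0..4 {
        let updates = timestep.advance(frame_time);
        for _ in 0..updates {
            // update(&mut state) would go here.
        }
        // draw(&state, timestep.alpha()) would go here.
        println!("frame {}: {} update(s), alpha = {:.2}", frame, updates, timestep.alpha());
    }
}
```

Note that the draw step still runs on every displayed frame. If it interpolates with `alpha`, the frame contents change each time, which is why the whole frame ends up re-rendered and re-uploaded anyway.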
-
Thanks for the detailed response. The requirements of an emulator can be a little unique compared to a game.
The usefulness is that we might have a render rate of some FPS, but a shader frame rate higher than that. Consider a CRT shader that does some sort of rolling raster effect at native refresh (144 Hz in my case). Or drawing egui at 60 FPS for smooth window dragging and scrolling, regardless of how fast the pixel buffer is updated. In these cases updating the pixel buffer each frame is a waste.
-
We may want to render the surface at a faster rate than we update the underlying pixel buffer, especially if we are using render_with to overlay a GUI or render with a custom shader. We could have a game tick or emulation target that runs at 30 Hz, while updating a shader or debug GUI at 60 Hz. Currently, the pixel buffer will be uploaded to the GPU on every call to render_with(). Although not an issue for most PCs, it can become an issue on lower-spec hardware.
I propose some sort of facility to render without re-uploading the pixel buffer. To avoid breaking changes, perhaps this can be a flag that is set (set_dirty(false)?) before calling render_with, or perhaps we can expand functionality with a separate function with a new interface, like render_with_ex. I think the latter may be preferable, especially since we could expand this function in several ways, like providing a rect to only upload a portion of the pixel buffer. wgpu seems to support updating regions of a texture via the 'origin' field of ImageCopyTexture and the 'size' argument to write_texture, unless I am mistaken.
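A rough sketch of the dirty-flag variant, to make the idea concrete. Everything here is hypothetical: `MockGpu` only counts uploads in place of a real texture write, and `FrameBuffer`, `frame_mut`, and `render` are stand-in names, not the existing pixels API.

```rust
/// Stand-in for the GPU side; counts uploads instead of writing a texture.
#[derive(Default)]
struct MockGpu {
    uploads: usize,
    bytes_uploaded: usize,
}

/// Hypothetical pixel buffer wrapper that remembers whether it changed.
struct FrameBuffer {
    pixels: Vec<u8>, // RGBA8, 4 bytes per pixel
    dirty: bool,
}

impl FrameBuffer {
    fn new(width: usize, height: usize) -> Self {
        // Start dirty so the first render always uploads.
        Self { pixels: vec![0; width * height * 4], dirty: true }
    }

    /// Handing out mutable access marks the buffer dirty,
    /// so the next render uploads it again.
    fn frame_mut(&mut self) -> &mut [u8] {
        self.dirty = true;
        &mut self.pixels
    }

    /// Upload only when the buffer changed, then draw.
    fn render(&mut self, gpu: &mut MockGpu) {
        if self.dirty {
            gpu.uploads += 1;
            gpu.bytes_uploaded += self.pixels.len();
            self.dirty = false;
        }
        // The draw pass (custom shader, GUI overlay) would run here
        // on every call, whether or not an upload happened.
    }
}

fn main() {
    let mut gpu = MockGpu::default();
    let mut fb = FrameBuffer::new(320, 240);

    // 60 rendered frames, with the emulation writing pixels on every other one.
    for frame in 0..60 {
        if frame % 2 == 0 {
            fb.frame_mut()[0] = frame as u8;
        }
        fb.render(&mut gpu);
    }
    println!("{} uploads for 60 rendered frames", gpu.uploads);
}
```

Setting the flag implicitly on mutable access keeps callers from forgetting it, but it also means a caller that grabs the buffer without writing still pays for an upload. An explicit flag or a rect argument would give finer control, at the cost of a larger interface.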