Hi, regarding your first question, the optimization: I already considered precomputing the frames, but this is not practical for larger displays. For example, the 1872*1404 7.8" display takes 1872 * 1404 * 2 bits per pixel / 8 bits per byte / 2**20 = ~0.63 MiB per frame, so we cannot fit enough frames. Many waveforms also have sequences longer than 32 frames, around 50. If you only use smaller displays it could work for you, but not in the general case.
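As a rough check of that arithmetic, here is a minimal sketch that computes the per-frame size and the total for a 32- and 50-frame sequence. The helper names are made up for illustration, and the 2 MiB budget is simply the PSRAM size mentioned in the question; it ignores the framebuffer and everything else that also has to live in PSRAM.

```python
def frame_bytes(width, height, bits_per_pixel=2):
    """Bytes needed to store one precomputed output frame (hypothetical helper)."""
    return width * height * bits_per_pixel // 8


def sequence_mib(width, height, frames, bits_per_pixel=2):
    """MiB needed to store a whole precomputed frame sequence."""
    return frames * frame_bytes(width, height, bits_per_pixel) / 2**20


one_frame = frame_bytes(1872, 1404)
print(one_frame)                      # 657072 bytes
print(round(one_frame / 2**20, 2))    # 0.63 MiB

for frames in (32, 50):
    print(frames, "frames:", round(sequence_mib(1872, 1404, frames), 1), "MiB")

# Assumption: a 2 MiB PSRAM budget with nothing else stored in it.
psram_bytes = 2 * 2**20
print("frames that fit in 2 MiB:", psram_bytes // one_frame)  # 3
```

So for this panel a full sequence lands at roughly 20 MiB (32 frames) to 31 MiB (50 frames), while a 2 MiB part holds only three frames.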
-
Hello,
I designed a custom PCB based on EPDiy version 7. Everything works well overall, but I keep running into the error "Assert failed: retrieve_line_isr render_lcd.c:41 (thread < NUM_RENDER_THREADS)" described on the EPDiy v7 common errors page (https://github.com/vroland/epdiy/wiki/v7-common-errors). Adjusting the bus speed did not resolve the issue. However, updating to one of the latest "main" releases (as of May 9th, 2024) improved the situation slightly. This improvement might be due to the optimized rendering computation (using assembly code, fantastic!).
I suspect the main issue with my custom board stems from using an ESP32-S3 module with 2 MB PSRAM Quad SPI. The original EPDiy version 7 utilizes an ESP32-S3 module with Octal SPI PSRAM, which may offer twice the speed of Quad SPI. Given that significant frame data resides in PSRAM, my Quad SPI setup may be too slow. Therefore, the next version of my custom PCB will incorporate an ESP32-S3 module with Octal SPI PSRAM.
I am currently exploring ways to further reduce the computation time required for rendering. As far as I understand, around 32 or more frames are needed to render the frame buffer on the display. Each frame is derived from the previous color or gray shade, the new color or gray shade, and the waveform. The current software version calculates the frames "on-the-fly": while one row of a frame is transmitted to the display via the LCD interface (using DMA), the software calculates the next row of the frame (very simply speaking).
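To make my understanding concrete, here is a minimal sketch of how one output row of one frame could be derived from the previous shade, the new shade, and a waveform lookup table. This is not EPDiy's actual code: the function name, the toy 4-level waveform, the action codes (0 = no-op, 1 = darken, 2 = lighten) and the pixel order within a byte are all assumptions for illustration.

```python
def render_row(prev_row, next_row, lut_for_frame):
    """Pack one output row: one 2-bit action per pixel, 4 pixels per byte.

    lut_for_frame[old][new] is the action for this frame of the waveform
    (hypothetical layout, lowest pixel index in the lowest bits).
    """
    out = bytearray((len(prev_row) + 3) // 4)
    for x, (old, new) in enumerate(zip(prev_row, next_row)):
        action = lut_for_frame[old][new] & 0b11
        out[x // 4] |= action << (2 * (x % 4))
    return bytes(out)


# Toy waveform frame for 4 gray levels (assumed: higher value = lighter).
NOOP, DARKEN, LIGHTEN = 0, 1, 2
toy_lut = [
    [NOOP if old == new else (LIGHTEN if new > old else DARKEN) for new in range(4)]
    for old in range(4)
]

prev_row = [0, 0, 3, 3]
next_row = [0, 3, 0, 3]
print(render_row(prev_row, next_row, toy_lut).hex())  # 18
```

In the on-the-fly scheme, a function like this has to run for every row of every frame while the previous row is still being sent out.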
This situation requires that the computation time for each row is lower than the transmission time of a row to the display, necessitating high computational power and fast memory access, which Quad SPI PSRAM may not sufficiently provide.
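The constraint can be written down as a simple deadline check. The sketch below is only a back-of-the-envelope model: the bus throughput and the per-row compute times are made-up placeholder values, not measurements, and the 1872-pixel row width is just an example of a large panel.

```python
def row_transmit_us(width_px, bus_bytes_per_s, bits_per_pixel=2):
    """Time to clock one row out to the display, in microseconds."""
    row_bytes = width_px * bits_per_pixel / 8
    return row_bytes / bus_bytes_per_s * 1e6


def meets_deadline(compute_us, transmit_us):
    """On-the-fly rendering only works if the next row is ready in time."""
    return compute_us < transmit_us


# Assumption: 10 MB/s effective bus throughput (placeholder value).
t_row = row_transmit_us(1872, 10_000_000)
print(round(t_row, 1))              # 46.8 us for a 468-byte row

# Placeholder compute times: one comfortably fast, one too slow.
print(meets_deadline(30.0, t_row))  # True
print(meets_deadline(60.0, t_row))  # False
```

If slower PSRAM pushes the per-row compute time past the transmit time, the row is not ready when the interrupt asks for it, which is the kind of failure I am seeing.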
Here's my question: Why not pre-calculate all frames and store each one in external PSRAM in advance, then transmit these frames to the display through the LCD interface using DMA in a second step? This approach would eliminate any "on-the-fly" calculation during transmission. Given the ample space available in external PSRAM, it should be feasible to store 32 or more frames.
Pre-calculating all frames could potentially eliminate the race condition encountered with the "on-the-fly" approach and may also be faster overall: Separating reads and writes to external PSRAM between the pre-calculation phase and the transmission phase could enhance speed. In the pre-calculation phase, only writes to external PSRAM are required (in the best case, if all other memories like frame buffers, LUTs, etc. are NOT in PSRAM), while the transmission phase involves only reads from external PSRAM. In contrast, the "on-the-fly" approach interleaves reads and writes from/to external PSRAM, which may take more time overall.