ESP8266 >22% speed improvement #146
-
Can you show your test code that can run faster?
-
The full source code would be about 10,000 lines long, because the example image alone is 9,614 lines. Here is the file that defines the MEMALIGN macro:
#ifndef PAJA_RAW_H_
#define PAJA_RAW_H_
#define MEMALIGN
#if defined(__AVR__)
# include <avr/io.h>
# include <avr/pgmspace.h>
#elif defined(ESP8266)
# include <pgmspace.h>
// The 8266 *demands* that flash reads happen on a DWord boundary - else it will sigseg, and reboot!
# define MEMALIGN __attribute__((aligned (4)))
#elif defined(__IMXRT1052__) || defined(__IMXRT1062__)
// PROGMEM is defined for T4 to place data in a specific memory section
# undef PROGMEM
# define PROGMEM
#else
# define PROGMEM
#endif
// This is the struct by which you will access the image data
// eg. pajaRaw_t* pAnImage = &img_testImage; int width = pAnImage->w;
// (the members are const, so they can only be set when the struct is initialised)
typedef
struct {
const unsigned long w;
const unsigned long h;
const unsigned long Bpp;
const unsigned char* img;
}
pajaRaw_t;
#endif // PAJA_RAW_H_
Here is a file which uses the MEMALIGN macro:
#include "paja_raw.h"
#define WIDTH ( 320) // Width in pixels
#define HEIGHT ( 240) // Height in pixels
#define BPP ( 2) // BYTES per pixel
// Image format is: 16bpp (bits per pixel)
// RGB 5-6-5 ... 5 bits each for Red & Blue {0..31}, 6 bits for Green {0..63}
// Big-endian, ie. |rrrr|rggg||gggb|bbbb|
// Coordinates are: {y, x} with {0, 0} in the Top-Left ...ie. {down[v], across[->]}
static const unsigned char img_testImage_data[] PROGMEM MEMALIGN = {
0x53,0x9e, 0x53,0x9e, 0x53,0x9e, 0x53,0x9e, 0x53,0x9e, 0x53,0x9e, 0x53,0x9e, 0x53,0x9e, // { 0, 0} .. { 0, 7}
0x53,0x9e, 0x53,0x9e, 0x53,0x9e, 0x53,0x9e, 0x53,0x9e, 0x53,0x9e, 0x53,0x9e, 0x53,0x9e, // { 0, 8} .. { 0, 15}
...
...
0x00,0x15, 0x00,0x14, 0x00,0x15, 0x00,0x14, 0x00,0x15, 0x00,0x14, 0x00,0x15, 0x00,0x14 // {239,312} .. {239,319}
};
pajaRaw_t img_testImage = (pajaRaw_t){WIDTH, HEIGHT, BPP, img_testImage_data};
Here is my patch to your
// workaround of a15 asm compile error
#ifdef ESP8266
# if !defined NOT_A15
# undef pgm_read_word
# define pgm_read_word(addr) (*(const unsigned short *)(addr))
# endif
#endif
Here is some code which uses
#define NOT_A15 // disable moon's bug workaround (what is "A15"??)
#include <Arduino_GFX_Library.h>
#undef NOT_A15
#include "paja_raw.h"
//----------------------------------------------------------------------------- ---------------------------------------
// Test timer
//
#define TIMER 0 // 0|1 : disable|enable timer
#if TIMER
# define TIMER_START uint16_t timer = millis()
# define TIMER_END(s) do { \
timer = millis() - timer; \
Serial.print(F(s)); \
Serial.print(timer, DEC); \
Serial.print(F("mS = ")); \
Serial.print(1000.0 / timer, DEC); \
Serial.println(F("fps")); \
} while (0)
#else
# define TIMER_START
# define TIMER_END(s)
#endif
//+============================================================================ =======================================
void showImage (pajaRaw_t* ip)
{
TIMER_START;
# define LINES (4) // each line costs 640bytes of "local variable RAM" (stack space)
PGM_VOID_P p = ip->img;
uint16_t bsize = ip->w * (sizeof(uint16_t) * LINES);
uint8_t* buf = (uint8_t*)alloca(bsize);
bus->beginWrite();
for (int y = 0; y < ip->h; y += LINES, p += bsize) {
memcpy_P(buf, p, bsize);
gfx->writeAddrWindow(0, y, ip->w, LINES);
bus->writeBytes(buf, bsize);
}
bus->endWrite();
# undef LINES
TIMER_END("image speed: ");
}
If you do NOT use
If you need to see both cases for yourself, I suggest you use
See OP for:
Yes, I have tried BC
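For anyone decoding the image bytes by hand: the comments in the image file describe big-endian RGB 5-6-5, ie. |rrrr|rggg||gggb|bbbb|. Below is a minimal sketch of that layout in plain C. The helper names rgb565_pack_be() and rgb565_unpack_be() are made up for this illustration and are not part of Arduino_GFX.

```c
#include <stdint.h>

/* Hypothetical helper: pack 5-bit red, 6-bit green, 5-bit blue into two
 * bytes, high byte first (big-endian), matching |rrrr|rggg||gggb|bbbb|. */
static void rgb565_pack_be(uint8_t r5, uint8_t g6, uint8_t b5, uint8_t out[2])
{
    uint16_t px = (uint16_t)(((r5 & 0x1Fu) << 11) |
                             ((g6 & 0x3Fu) << 5)  |
                              (b5 & 0x1Fu));
    out[0] = (uint8_t)(px >> 8);    /* rrrrrggg */
    out[1] = (uint8_t)(px & 0xFFu); /* gggbbbbb */
}

/* Hypothetical helper: the reverse, splitting two bytes into channels. */
static void rgb565_unpack_be(const uint8_t in[2],
                             uint8_t *r5, uint8_t *g6, uint8_t *b5)
{
    uint16_t px = (uint16_t)(((uint16_t)in[0] << 8) | in[1]);
    *r5 = (uint8_t)((px >> 11) & 0x1Fu); /* 0..31 */
    *g6 = (uint8_t)((px >> 5)  & 0x3Fu); /* 0..63 */
    *b5 = (uint8_t)( px        & 0x1Fu); /* 0..31 */
}
```

With this layout, the first pixel of the test image (0x53,0x9e) unpacks to red 10, green 28, blue 30.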
-
Thank you for your advice, but sadly Either way, I am not asking for help, and I am not trying to convince you of anything. I just found a 22% time saving for the cost of an attribute on a variable and I wanted to share it with you as a Thank You for your amazing work on this project :)
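For reference, the "attribute on a variable" can be checked on any GCC or Clang host. This is only a sketch: demo_data and is_dword_aligned() are illustrative names, PROGMEM is left out because it is platform specific, and the attribute syntax is a GCC/Clang extension.

```c
#include <stdint.h>

/* Same definition as the ESP8266 branch of paja_raw.h */
#define MEMALIGN __attribute__((aligned (4)))

/* Illustrative data only: a few bytes standing in for the image array */
static const unsigned char demo_data[] MEMALIGN = {
    0x53, 0x9e, 0x53, 0x9e, 0x53, 0x9e
};

/* Hypothetical helper: true when the address is on a 4-byte boundary */
static int is_dword_aligned(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}
```

Without the attribute a byte array has no alignment guarantee, so whether it happens to start on a DWord boundary depends on what the linker placed before it.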
-
Sorry, the LVGL example should be using draw16bitBeRGBBitmap().
startWrite();
writeAddrWindow(x, y, w, h);
_bus->writeBytes((uint8_t *)bitmap, (uint32_t)w * h * 2);
endWrite();
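A note on the (uint32_t) cast in that byte count: a full 320x240 frame at 2 bytes per pixel is 153,600 bytes, which does not fit in 16 bits. The sketch below assumes w and h are int16_t (an assumption, not confirmed in this thread), and the function names are made up for illustration. On the ESP8266 int is already 32 bits wide, so the cast mainly matters on targets with a 16-bit int such as AVR.

```c
#include <stdint.h>

/* Hypothetical helper: byte count computed in 32 bits, as in the snippet above */
static uint32_t bitmap_bytes(int16_t w, int16_t h)
{
    return (uint32_t)w * (uint32_t)h * 2u;
}

/* Hypothetical helper: what is left if the result is squeezed into
 * 16 bits, ie. the true count modulo 65536 */
static uint16_t bitmap_bytes_truncated(int16_t w, int16_t h)
{
    return (uint16_t)bitmap_bytes(w, h);
}
```

For 320x240 the first function gives 153600, while the truncated version gives 22528, which would send only part of the frame.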
-
All reads from flash memory on the ESP8266 (¿and ESP32?) MUST be 32-bit/DWord aligned, and ALWAYS return 32 bits of data (documentation).
The pgm_read_*(), memcpy_P() (etc.) functions hide what is happening, which is: if a requested Word or DWord falls over a DWord boundary, TWO read operations are made, and then some bit twiddling is done to give you the result you expect.
By adding __attribute__((aligned (4))) to my PROGMEM image data here ...I was able to reduce the display time [320x240x2] from ~114 ms down to ~68 ms ... more than 22% faster!