This is currently a collection of random things about very low level workings of AVR without any sort of organization. Once there's enough here to organize, I'll clean it up and provide some structure to it.
We all know that writing an ill-considered value to a hardware register can cause weird behavior, and we know about interrupt safety: read-modify-write sequences need interrupts disabled so that an interrupt and the main program code don't fight over what value the register should have. One thing you probably haven't considered is that just reading from a single register can present a problem in time- or flash-constrained conditions, simply because the register is volatile: each time you use its value, the compiler has to fetch it all the way from the data space with an LDS (or an LD through a pointer). When every byte matters, each use of a volatile byte (such as a register) imposes a minimum of 1 extra cycle penalty versus a non-volatile one, assuming the uses are close enough together that the compiler would otherwise still have the value in a working register (that is, it would not have written it back to RAM). The biggest benefit from reading the register once into a local variable is seen in things like a big long boolean expression that checks whether any of a bunch of permutations of ((register & bitmask) && (some_other_criteria)) are true.
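A minimal sketch of that trade-off, with the register passed in as a pointer so the same code compiles on and off target. The function names and the three bit masks are hypothetical, chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit masks, for illustration only. */
#define FLAG_A 0x01
#define FLAG_B 0x04
#define FLAG_C 0x40

/* Every mention of *reg is a separate load, because the target is volatile.
   Depending on short-circuiting, this performs one to three loads. */
bool any_match_naive(volatile uint8_t *reg, bool a, bool b, bool c) {
    return ((*reg & FLAG_A) && a) ||
           ((*reg & FLAG_B) && b) ||
           ((*reg & FLAG_C) && c);
}

/* One load into a plain local; the compiler is free to keep the snapshot
   in a working register for the whole expression. */
bool any_match_cached(volatile uint8_t *reg, bool a, bool b, bool c) {
    uint8_t snapshot = *reg;
    return ((snapshot & FLAG_A) && a) ||
           ((snapshot & FLAG_B) && b) ||
           ((snapshot & FLAG_C) && c);
}
```

The cached version only makes sense when you don't need to see the register change partway through the expression; it evaluates everything against a single snapshot.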
The GPIORs are registers - classic AVR parts have three, modern AVRs have four. They've changed the naming of these several times for different families of parts: Dx-series and Ex-series call them GPR.GPRn, modern tinyAVRs call them GPIORn, and classic AVRs and xmegas use either GPIOn or GPIORn. If using my cores, call them GPIORn for maximum compatibility - I provide defines as aliases if they aren't provided by the io headers. Like most registers, they hold 1 byte each - but unlike other registers they don't control anything in particular (GPR best describes them - General Purpose Register). As long as they are located in the so-called low I/O space (always the case on modern AVRs, where they're the last 4 registers in the 32-byte low I/O space, while classic AVRs sometimes only have GPIOR0 in the low I/O), IN, OUT, CBI, SBI, SBIS and SBIC instructions work on them. This is a BIG DEAL.
Thus GPIORn |= (1 << x)
and GPIORn &= ~(1 << x)
(where x = 0-7 and is compile-time known) are atomic, compile to a single instruction, and execute in one clock on AVRxt (modern AVR) and 2 on classic AVR. The test in if (GPIORn & (1 << x))
(where x = 0-7 and is compile-time known) compiles to a single instruction that executes in one clock (though it is usually followed by an rjmp or rcall, but you often need that anyway). And as long as the register is known at compile time, anything in the low and high I/O spaces (the first 64 bytes, though most of the high I/O space is usually unavailable, especially on modern AVRs) can be loaded and stored in 1 clock without needing a pointer (the value being stored must already be in a working register, though).
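The three operations above could be wrapped like this. It's a sketch, assuming GPIOR0 is defined by the io headers or the core and sits in the low I/O space; READY_BIT and the helper names are hypothetical. Off target, a plain volatile byte stands in for the register so the code still compiles. The instruction names in the comments are what I'd expect from avr-gcc with optimization on, not a guarantee.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#else
/* Host-side stand-in so the sketch compiles off target; not real hardware. */
static volatile uint8_t GPIOR0;
#endif

/* Hypothetical flag position; it must be a compile-time constant. */
#define READY_BIT 3

/* Expected to become a single SBI when GPIOR0 is in the low I/O space. */
static inline void set_ready(void) {
    GPIOR0 |= (uint8_t)(1u << READY_BIT);
}

/* Expected to become a single CBI. */
static inline void clear_ready(void) {
    GPIOR0 &= (uint8_t)~(1u << READY_BIT);
}

/* When used directly in an if(), expected to become SBIS or SBIC
   followed by the branch. */
static inline bool is_ready(void) {
    return (GPIOR0 & (1u << READY_BIT)) != 0;
}
```

If the bit number is only known at runtime, none of this applies: the compiler has to fall back to a load, a mask, and a store.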
As the datasheet suggests, they could be used for global flags that might be set, read, and unset by both interrupts and non-interrupt code - as long as you set and unset single bits at a time, there is no read-modify-write that could be interrupted when you change it, and no need to disable interrupts to prevent an ISR from changing it between the read and the write. Compared to a volatile uint8_t in SRAM, setting or unsetting a bit (from start to finish) takes 1 instruction word of flash and 1 or 2 clocks, instead of 7 instruction words.
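To make the comparison concrete, here is a sketch of the same flag kept in SRAM versus in GPIOR0. FLAG_TICK, the function names, and the irq_save/irq_restore helpers are all hypothetical; off target the helpers are no-ops and GPIOR0 is a plain volatile byte. On a real part, gpior_set_tick() is what you would call from the ISR.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#include <avr/interrupt.h>
/* Save SREG and disable interrupts; restore puts the I bit back as it was. */
static inline uint8_t irq_save(void) { uint8_t s = SREG; cli(); return s; }
static inline void irq_restore(uint8_t s) { SREG = s; }
#else
/* Host-side stand-ins so the sketch compiles off target. */
static volatile uint8_t GPIOR0;
static inline uint8_t irq_save(void) { return 0; }
static inline void irq_restore(uint8_t s) { (void)s; }
#endif

#define FLAG_TICK (1u << 0)   /* hypothetical flag bit */

static volatile uint8_t sram_flags;

/* SRAM flag: load, OR, store. An ISR could fire between the load and
   the store, so non-interrupt code has to guard the sequence. */
void sram_set_tick(void) {
    uint8_t saved = irq_save();
    sram_flags |= FLAG_TICK;
    irq_restore(saved);
}

/* GPIOR flag: a single-bit write with a constant mask, so no guard. */
void gpior_set_tick(void) {
    GPIOR0 |= FLAG_TICK;
}

/* Main-loop side: test the flag and clear it if it was set. */
bool gpior_take_tick(void) {
    if (GPIOR0 & FLAG_TICK) {
        GPIOR0 &= (uint8_t)~FLAG_TICK;
        return true;
    }
    return false;
}
```

Note that only single-bit changes get this treatment. Setting two bits at once, or assigning a computed value, goes back to being an ordinary read-modify-write.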
Because it is easily referenced, doesn't do anything on its own, and is almost never used in Arduino code, GPIOR0 is used by some versions of Micronucleus and Optiboot (including the ones distributed with all of my cores) to save the reset cause in case the application needs it, while still allowing the entry conditions to be honored by the bootloader - even if the application never clears MCUSR (which Arduino code almost never does - even if it were widely known, very few use cases would benefit from checking it). So the bootloader stashes the contents of MCUSR (or RSTCTRL.RSTFR, the modern AVR equivalent) in GPIOR0 and clears the reset flag register, so that bits set by previous resets don't cause it to enter the bootloader (and delay sketch startup) inappropriately. That it could be written with a single instruction was an added bonus in a bootloader where every byte of flash matters. megaTinyCore and DxCore do the same thing - this greatly reduces the ability of a software bug to result in a hang (most are converted to resets); see the mTC or DxC reset reference for more information.
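On the application side, picking up the stashed value might look roughly like this. It's a sketch for a modern AVR, assuming the bootloader or core has already copied RSTCTRL.RSTFR into GPIOR0 as described above; the function names are hypothetical, and the host-side mask values are stand-ins that mirror the RSTFR bit layout so the code compiles off target.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>   /* modern AVR assumed: provides the RSTCTRL_*_bm masks */
#else
/* Host-side stand-ins so the sketch compiles off target. */
static volatile uint8_t GPIOR0;
#define RSTCTRL_PORF_bm 0x01   /* power-on reset */
#define RSTCTRL_WDRF_bm 0x08   /* watchdog reset */
#define RSTCTRL_SWRF_bm 0x10   /* software reset */
#endif

static uint8_t reset_flags;

/* Call once, early in startup: copy the stashed flags out of GPIOR0 and
   zero it so the register is free for the application's own use. */
void capture_reset_cause(void) {
    reset_flags = GPIOR0;
    GPIOR0 = 0;
}

bool reset_was_power_on(void) { return (reset_flags & RSTCTRL_PORF_bm) != 0; }
bool reset_was_watchdog(void) { return (reset_flags & RSTCTRL_WDRF_bm) != 0; }
bool reset_was_software(void) { return (reset_flags & RSTCTRL_SWRF_bm) != 0; }
```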
As for what else is in the low I/O space: on classic AVR, see the datasheet - it's different for every family.
On modern AVR, the rest of the low I/O space only ever has the VPORT registers - 4 per port, VPORTx.IN, .OUT, .DIR, and .INTFLAGS. These mirror the registers of the same name in the PORTx structure, except that they fit in the low I/O space.
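A small sketch of pin access through VPORT, assuming a modern AVR. The pin assignments (LED on PA7, active-low button on PA2) and the function names are hypothetical. Off target, a struct with the same four members stands in for VPORTA.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#else
/* Host-side stand-in with the same four members; not real hardware. */
typedef struct {
    volatile uint8_t DIR, OUT, IN, INTFLAGS;
} VPORT_t;
static VPORT_t VPORTA;
#endif

#define LED_bm    (1u << 7)   /* hypothetical: LED on PA7 */
#define BUTTON_bm (1u << 2)   /* hypothetical: active-low button on PA2 */

/* Single-bit writes with constant masks; on target each is expected to
   become one SBI or CBI because VPORTA lives in the low I/O space. */
void led_init(void) { VPORTA.DIR |= LED_bm; }
void led_on(void)   { VPORTA.OUT |= LED_bm; }
void led_off(void)  { VPORTA.OUT &= (uint8_t)~LED_bm; }

/* Single-bit test; expected to become SBIS or SBIC when used in an if(). */
bool button_pressed(void) { return (VPORTA.IN & BUTTON_bm) == 0; }
```

The same statements written against PORTA.DIR, PORTA.OUT and PORTA.IN would work too, but those live outside the low I/O space and so cost a load, modify, and store each.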
There is one big catch: the register has to actually be in the low I/O space!!!
I have seen huge blocks of code written assuming that every SFR they were configuring was in the low I/O space, so setting up to 3 bits (or 6 if you'd need to disable interrupts) would have been faster than directly assigning the desired value - and would be automatically atomic. Unfortunately, this was on a classic AVR, and most parts didn't have those registers in the low I/O. The resulting code bloat was impressive.
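The pattern in question, sketched with a pointer parameter standing in for a register that is not in the low I/O space. The CFG_* masks and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical configuration bits, for illustration only. */
#define CFG_ENABLE (1u << 0)
#define CFG_MODE   (1u << 3)
#define CFG_IRQ    (1u << 6)

/* Bit-at-a-time setup. In the low I/O space each line could be one SBI;
   anywhere else each line is a full load, OR, and store, and none of
   them are atomic. */
void configure_bitwise(volatile uint8_t *reg) {
    *reg |= CFG_ENABLE;
    *reg |= CFG_MODE;
    *reg |= CFG_IRQ;
}

/* Direct assignment: load one constant, store it once. */
void configure_direct(volatile uint8_t *reg) {
    *reg = (uint8_t)(CFG_ENABLE | CFG_MODE | CFG_IRQ);
}
```

The two are only interchangeable when you know what the other bits in the register should be; the direct assignment overwrites all of them.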