Hello,

I am using an LM3S9B90 controller with a Cortex-M3 core. I wrote a simple test which toggles a port pin by accessing the appropriate bit-band alias. In the following example, EL_PF_3 is a bit-band alias to Port F bit 3 (address 0x424A7F8C, 1112178572 decimal).

In the assembler listing you can see that two 16-bit moves are used to load the address, and I don't understand why. I think this could be done with one 32-bit move.

    ssi.c **** EL_PF_3 = 0;
                    .loc 1 19 0
    0000 47F68C73   movw  r3, #:lower16:1112178572
    0004 C4F24A23   movt  r3, #:upper16:1112178572
    0008 0021       movs  r1, #0
    ssi.c **** EL_PF_3 = 1;
                    .loc 1 20 0
    000a 0122       movs  r2, #1
                    .loc 1 19 0
    000c 1960       str   r1, [r3, #0]
                    .loc 1 20 0
    000e 1A60       str   r2, [r3, #0]

I am using arm-elf-gcc, GCC version 4.4.0, with the following flags:

    -O3 -g3 -Wall -L lib -mcpu=cortex-m3 -mthumb

Does anyone have an idea what might be wrong?

Best regards
Arne
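For readers unfamiliar with bit-band aliases, here is a minimal sketch of how an alias address such as 0x424A7F8C is derived on a Cortex-M3. The helper name bitband_alias and the macro names are made up for illustration, and the register address 0x400253FC is an assumption obtained by working backward from the alias address in the post, not something the post states.

```c
#include <stdint.h>

/* Architectural Cortex-M3 addresses: peripheral bit-band region and its alias. */
#define PERIPH_BASE     0x40000000u
#define PERIPH_BB_BASE  0x42000000u

/* Hypothetical helper: returns the alias word address that maps to bit `bit`
 * of the byte at `byte_addr`. Each bit in the bit-band region gets its own
 * 32-bit word in the alias region, hence the factors 32 and 4. */
static inline uint32_t bitband_alias(uint32_t byte_addr, unsigned bit)
{
    return PERIPH_BB_BASE + (byte_addr - PERIPH_BASE) * 32u + bit * 4u;
}
```

With the assumed register address, bitband_alias(0x400253FCu, 3) gives 0x424A7F8C, the constant that the movw/movt pair loads into r3.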
I think it is because of the flash prefetch buffers: fetching two consecutive instructions is faster than fetching one instruction and then a 32-bit value located elsewhere in memory, which would require refilling the buffer and inserting wait states.

Giovanni
---
ChibiOS/RT http://chibios.sourceforge.net
arne wrote:
> In the assembler listing, you can see that two 16Bit moves are used to
> load the address,

There are two alternative ways to load a 32-bit constant: a PC-relative load from a constant pool, as the older ARM cores had to do, and the movw/movt pair shown here. The PC-relative load needs 6 bytes in total, but depending on the machine it can cost many clock cycles because of slow flash memory (~6 cycles on an LPC1700 at maximum frequency). Embedding the constant in the sequential instruction stream needs 8 bytes but runs in 2 clock cycles, since sequential fetches are largely unaffected by flash speed, for the reason given in the previous reply.

In GCC, which alternative is chosen depends on the optimization setting. With -Os you get the smaller code.
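On the original question of why a single move is not enough: a Thumb-2 instruction is at most 32 bits wide, so it cannot carry an opcode, a register number and a full 32-bit immediate at once. The sketch below models in plain C what the movw/movt pair does to the register. The function names lower16, upper16 and movw_movt are made up for illustration; they mirror the :lower16: and :upper16: operators in the listing.

```c
#include <stdint.h>

/* Halves of a 32-bit constant, as selected by :lower16: and :upper16:. */
static inline uint16_t lower16(uint32_t v) { return (uint16_t)(v & 0xFFFFu); }
static inline uint16_t upper16(uint32_t v) { return (uint16_t)(v >> 16); }

/* Models the register contents after "movw rX, #lo" then "movt rX, #hi". */
static inline uint32_t movw_movt(uint16_t lo, uint16_t hi)
{
    uint32_t r = lo;                           /* movw: sets the low half, clears the top half */
    r = (r & 0xFFFFu) | ((uint32_t)hi << 16);  /* movt: replaces the top half, keeps the low half */
    return r;
}
```

For the address in the post, lower16 yields 0x7F8C and upper16 yields 0x424A, and feeding both to movw_movt rebuilds 0x424A7F8C.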