# Forum: ARM programming with GCC/GNU tools AVR GCC writing a bit from one byte into another

 Author: Alex P. (alex_p) Posted on: 2012-06-27 10:05

Hello community,

Programming my Atmega8 I came across the task of doing the following
operation:

bit A of byte BUFFER = bit B of byte PORT

I have 2 questions, the first is about efficiency:
I'd like to read from a port with this operation (yes, it would be
better to do this with a PAL or interface chip), and therefore I only
have 30 or so clock cycles to do my bit operation.
I thought this should be plenty, but it wasn't. Looking at the
disassembly it became clear why. Is there a nicer or more efficient way
to do this?

Secondly, I'd like to have a timer interrupt (here I wrote the dummy
function start_timer()) which should interrupt the reading after a
certain period (for example if no more data come in - this is not shown
in the code for simplicity).
In Java I have this wonderful throw operation for interrupts. Is there a
possibility to do this here as well, or do I really need to check the
timer every iteration? What's the most efficient implementation?
In Assembler I imagine I could, within the timer interrupt routine,
change the return address, so after the timer interrupt the program
doesn't jump back to my loop but to the code after it. Is this possible
in C?

I'm using AVR Studio 5 with the maximum optimization option for my
ATmega8 running at 16 MHz. I'd like to read a data bus at 500 kHz, which
leaves 16 MHz / 500 kHz = 32 clock cycles per sample.

I'd be glad for any hints or book suggestions :)
Thank you very much in advance and best regards
Alex


    // C-CODE
    void myfunction() {
      start_timer(); // Starts a timer that should interrupt the loop
      for(uint16_t i=0; i<1000; i++) { // This is time critical -> I don't want to check the timer every loop iteration
        if(PORTD & (1<<DATA)) ...



    # DISASSEMBLY
    # void myfunction() {
    # start_timer(); // Starts a timer that should interrupt the loop
    # for(uint16_t i=0; i<1000; i++) { // This is time critical -> I
    # don't want to check the timer every loop iteration
    0000042C LDI R24,0x00     Load immediate
    0000042D LDI R25,0x00     Load immediate
    # if(PORTD & (1<<DATA)) ...
    ...
    # for(uint16_t i=0; i<1000; i++) { // This is time critical -> I
    # don't want to check the timer every loop iteration
    00000445 ADIW R24,0x01    Add immediate to word
    00000446 LDI R18,0x03     Load immediate
    00000447 CPI R24,0xE8     Compare with immediate
    00000448 CPC R25,R18      Compare with carry
    00000449 BRNE PC-0x19     Branch if not equal
    }
    0000044A RET              Subroutine return

 Author: Oliver (Guest) Posted on: 2012-06-27 10:36

A variable shift is a very expensive operation on an AVR. Better use
constant shifts.

    for(uint16_t i=0; i<1000; i++) { // This is time critical -> I don't want to check the timer every loop iteration
      switch (PORTD & (1<<DATA)) {
      ...

Oliver

 Author: Oliver (Guest) Posted on: 2012-06-27 10:39

>  switch (PORTD & (1<<DATA)) {

should be

>if (PORTD & (1<<DATA))
>  switch (i&0xFF)

but anyhow, you got the idea ;)
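
A minimal sketch of the constant-shift idea with the correction folded in. It assumes the byte index is `i >> 3` and the bit index is `i & 7`; `store_bit`, its parameters and the `level` argument (the already-tested port bit) are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: set bit (i % 8) of buffer[i / 8] when `level` is
 * non-zero. Every case uses a constant mask, so the compiler never has to
 * build a mask with a run-time shift loop. */
static void store_bit(uint8_t *buffer, uint16_t i, uint8_t level)
{
    if (!level)
        return;

    uint8_t *p = &buffer[i >> 3];   /* byte index: i / 8 */

    switch (i & 7) {                /* bit index: i % 8 */
    case 0: *p |= (1 << 0); break;
    case 1: *p |= (1 << 1); break;
    case 2: *p |= (1 << 2); break;
    case 3: *p |= (1 << 3); break;
    case 4: *p |= (1 << 4); break;
    case 5: *p |= (1 << 5); break;
    case 6: *p |= (1 << 6); break;
    case 7: *p |= (1 << 7); break;
    }
}
```

The switch costs a jump per sample, so whether it actually fits the cycle budget still has to be checked in the disassembly.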

Oliver

 Author: Rolf Magnus (Guest) Posted on: 2012-06-27 11:10

> have 30 or so clock cycles to do my bit operation.
> I thought this should be plenty, but it wasn't. Looking at the
> disassembly it became clear why.

The main problem here is that the AVR can only shift by one bit, so if
you get your shift width from a variable, the compiler has to do this as
a loop. In addition, since i is of type int, the shift operation is
done in 16 bit.
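
To illustrate the point, here is roughly what a variable shift amounts to on a core that can only shift by one position per instruction, written out in C. This is a sketch of the idea, not the code avr-gcc actually emits, and `mask_by_loop` is a made-up name.

```c
#include <stdint.h>

/* Builds 1 << n one position at a time: n loop iterations at run time. */
static uint8_t mask_by_loop(uint8_t n)
{
    uint8_t mask = 1;

    while (n--)
        mask <<= 1;     /* one single-bit shift per iteration */

    return mask;
}
```

The cost grows with n, whereas a shift by a constant is resolved at compile time.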

> Is there a possibility to do this here as well, or do I really need to check the
> timer every iteration? What's the most efficient implementation?

I would check it every iteration. Alternatively, you can make two nested
loops and only check once per iteration of the outer loop if it's ok to
have the loop run for a few more microseconds before it stops. That
would also have the advantage that your loop counters could be 8 bit and
you could lose the division.

> In Assembler I imagine I could, within the timer interrupt routine,
> change the return address, so after the timer interrupt the program
> doesn't jump back to my loop but to the code after it. Is this possible
> in C?

No. There is setjmp/longjmp, but that probably won't work from an ISR.
Even in assembler, I would consider it a dirty hack.

I'd suggest something like this:

    for (uint8_t i = 0; i < 125; i++) {
      uint8_t bitval = 1;
      for (uint8_t j = 0; j < 8; j++) {
        if(PORTD & (1<<DATA)) ...

When compiling this with -O3, the code will be quite long due to loop
unrolling, but since that code does 8 iterations, it should in fact be
faster than yours.
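
A complete sketch of the nested-loop idea, under several assumptions: `DATA` is given an arbitrary value, `timed_out` is a flag that a timer ISR (not shown) would set, and `fake_pind` with `read_port()` is a stand-in for reading the port's input register so that the code does not depend on a device header. As in the original post, synchronising to the clock line is left out.

```c
#include <stdint.h>

#define DATA 2                        /* assumed bit number of the data line */

/* Assumed to be set by a timer ISR elsewhere; volatile because it changes
 * outside the normal program flow. */
static volatile uint8_t timed_out;

/* Hypothetical stand-in for the port input register. */
static volatile uint8_t fake_pind;
static uint8_t read_port(void) { return fake_pind; }

/* Reads up to nbytes * 8 bits and returns the number of complete bytes. */
static uint8_t read_bits(uint8_t *buffer, uint8_t nbytes)
{
    uint8_t i;

    for (i = 0; i < nbytes; i++) {
        if (timed_out)                /* checked once per byte, not per bit */
            break;

        uint8_t byte = 0;
        uint8_t bitval = 1;

        for (uint8_t j = 0; j < 8; j++) {
            if (read_port() & (1 << DATA))
                byte |= bitval;
            bitval <<= 1;             /* constant single-bit shift */
        }
        buffer[i] = byte;
    }
    return i;
}
```

Both counters are 8 bit, the mask is advanced by a single-bit shift, and no division or variable shift is left in the inner loop.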

 Author: Oliver (Guest) Posted on: 2012-06-27 12:14

After reading your post a second time, there are some questions ;)

First of all,
>if(PORTD & (1<<DATA))
will not do what you want.

But:
What exactly do you want to achieve?
Is it necessary to do the bit shift during the measurement, or can it be
done later?
What is DATA? Where does it come from?
What happens if you have finished all 1000 loop iterations without a
timer interrupt?

Oliver

 Author: Alex P. (alex_p) Posted on: 2012-06-27 17:08

Awesome! Thanks a lot Rolf and Oliver, that was just what I was looking
for.

> I would check it every iteration.
What a pity! It would be awesome to have interrupts that can interrupt
loops or even functions without much computational effort.

> if(PORTD & (1<<DATA))
I should have explained, DATA is just the bitnumber where in PORTD the
data comes in. In fact, I had to test the clock as well and everything,
but I left it out for the sample here because that worked fine.
That's why I wanted the timer interrupt, because what if I'm waiting for
the clock and the sender doesn't want to send me any more data?
    while(!(PORTD & (1<<...)));
I would get stuck in that loop then.
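
One way out of that, sketched under assumptions: the wait loop also polls a flag that the timer ISR would set. `CLOCK`, `timed_out`, `fake_pind` and `wait_for_clock` are hypothetical names; `fake_pind` stands in for the port's input register so the example does not depend on a device header.

```c
#include <stdint.h>
#include <stdbool.h>

#define CLOCK 3                       /* assumed bit number of the clock line */

static volatile uint8_t timed_out;    /* assumed to be set by a timer ISR */
static volatile uint8_t fake_pind;    /* stand-in for the input register */

/* Returns true once the clock line is high, false if the timeout fired first. */
static bool wait_for_clock(void)
{
    while (!(fake_pind & (1 << CLOCK))) {
        if (timed_out)
            return false;
    }
    return true;
}
```

The extra test costs a load and a branch per poll, but only while waiting, so it does not eat into the cycles between clock edge and sample.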

But your suggestion with the switch statement looks much better in the
Disassembler:
    # case 5: buffer[i>>3] |= (1<<5); break;
    00000555 MOVW R30,R24     Copy register pair
    00000556 LSR R31          Logical shift right
    00000557 ROR R30          Rotate right through carry
    00000558 LSR R31          Logical shift right
    00000559 ROR R30          Rotate right through carry
    0000055A LSR R31          Logical shift right
    0000055B ROR R30          Rotate right through carry
    0000055C SUBI R30,0x6B    Subtract immediate
    0000055D SBCI R31,0xFE    Subtract immediate with carry
    0000055E LDD R18,Z+0      Load indirect with displacement
    0000055F ORI R18,0x20     Logical OR with immediate
    00000560 STD Z+0,R18      Store indirect with displacement
    00000561 RJMP PC-0x0012   Relative jump

 Author: Oliver (Guest) Posted on: 2012-06-28 08:48

>> if(PORTD & (1<<DATA))
>I should have explained, DATA is just the bitnumber where in PORTD the
>data comes in.

Well, in PORTD never anything will come in...

Again my question: Why can't you do the bitshifting stuff after the
measurement loop has finished? This would reduce the required cycles in
the measurement loop significantly.
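
A rough sketch of that split, assuming one raw port byte is stored per sample and the bit extraction happens afterwards. `DATA`, `fake_pind`, `record` and `decode` are hypothetical names; `fake_pind` stands in for the port's input register.

```c
#include <stdint.h>
#include <stddef.h>

#define DATA 2                        /* assumed bit number of the data line */

static volatile uint8_t fake_pind;    /* stand-in for the input register */

/* Time-critical part: just one load and one store per sample. */
static void record(uint8_t *raw, size_t n)
{
    for (size_t i = 0; i < n; i++)
        raw[i] = fake_pind;
}

/* Afterwards, with no deadline: pack bit DATA of each sample into out[]. */
static void decode(const uint8_t *raw, size_t n, uint8_t *out)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t mask = (uint8_t)(1u << (i & 7));

        if (raw[i] & (1u << DATA))
            out[i >> 3] |= mask;
        else
            out[i >> 3] &= (uint8_t)~mask;
    }
}
```

The price is a full byte of RAM per sample instead of one bit, which matters on a part with 1 KB of SRAM.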

Oliver

 Author: Alex P. (alex_p) Posted on: 2012-06-29 09:44

> Well, in PORTD never anything will come in...
Aaah yes, I meant PIND of course, sorry.

> Why can't you do the bitshifting stuff after the
> measurement loop has finished?
That's what I did in the end, but I think the much nicer solution would
have been to improve efficiency and do it directly instead of just
"recording" it in realtime and then getting the data out afterwards.

Alex

 Author: Joseph (Guest) Posted on: 2012-08-21 06:22

Hi all,
I've been using this macro for a long time.

    #define checkbit(byte,bit) ((byte) & (1 << (bit)))

But I haven't checked what assembly it compiles to.
I hope this helps.

 Author: gjl (Guest) Posted on: 2012-08-21 08:33

If you really need to squeeze out the last bit of performance, you can
use some GCC fu:

    /* In some header */

    #include <stdint.h>

    /* Same as 1 << (n % 8) */

    static __inline__ __attribute__((__always_inline__))
    uint8_t bitmask_asm (uint8_t n)
    {
        uint8_t mask;

        __asm__ ("ldi %0, 1 << 1 $"
                 "sbrs %1, 1$"
                 "clr %0 $"
                 "sbrc %0, 0$"
                 "lsl %0 $"
                 "sbrc %1, 2$"
                 "swap %0"
                 : "=&d" (mask)
                 : "r" (n));

        return mask;
    }

    /* Same as 1 << (n % 8) */

    static __inline__ __attribute__((__always_inline__))
    uint8_t bitmask (uint8_t n)
    {
        return __builtin_constant_p (n)
            ? (1 << (n % 8))
            : bitmask_asm (n);
    }

    /* In some module */

    #include <avr/io.h>

    void set (uint8_t data, uint8_t x)
    {
        extern uint8_t c;

        PORTB |= bitmask (1+2+3);
        PORTB |= bitmask (data);
        PORTB &= ~bitmask (1+2+3);

        if (PORTB & bitmask (data))
            c |= bitmask (x);
    }

Compiling this with an optimizing avr-gcc yields, here with version 4.7:

    set:
        sbi 0x18,6       ;  11  *sbi  [length = 1]
        in r25,0x18      ;  13  movqi_insn/4  [length = 1]
        mov r18,r24      ;  51  movqi_insn/1  [length = 1]
    /* #APP */
        ldi r24, 1 << 1
        sbrs r18, 1
        clr r24
        sbrc r24, 0
        lsl r24
        sbrc r18, 2
        swap r24
    /* #NOAPP */
        or r25,r24       ;  16  iorqi3/1  [length = 1]
        out 0x18,r25     ;  18  movqi_insn/3  [length = 1]
        cbi 0x18,6       ;  23  *cbi  [length = 1]
        in r25,0x18      ;  25  movqi_insn/4  [length = 1]
        and r25,r24      ;  28  andqi3/1  [length = 1]
        breq .L1         ;  30  branch  [length = 1]
    /* #APP */
        ldi r25, 1 << 1
        sbrs r22, 1
        clr r25
        sbrc r25, 0
        lsl r25
        sbrc r22, 2
        swap r25
    /* #NOAPP */
        lds r24,c        ;  34  movqi_insn/4  [length = 2]
        or r24,r25       ;  35  iorqi3/1  [length = 1]
        sts c,r24        ;  36  movqi_insn/3  [length = 2]
    .L1:
        ret              ;  54  return  [length = 1]

In the 1st and 3rd call of bitmask the argument is known at compile
time and the compiler can fold the expressions to SBI and CBI,
respectively.

If the argument to bitmask is not a compile time constant, then the
optimized asm sequence will be used.  That sequence takes 7 ticks, and
you need some additional ticks for IN and OUT.

Notice that in the latter case the port change is not atomic.

Moreover, the asm sequence is only expanded once for data and then
reused in the remainder.  That's the reason why the asm should not be
volatile:  The asm is const (like a function can be const) and thus has
no side effects and can be reused.
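
The dispatch pattern itself can be tried on any GCC-compatible compiler. The sketch below keeps the `__builtin_constant_p` selection but replaces the inline asm with a plain table lookup; that substitution, and the name `bitmask_table`, are mine and not part of the post above.

```c
#include <stdint.h>

/* Stand-in for the inline asm: a table lookup, NOT the sequence from the post. */
static inline uint8_t bitmask_table(uint8_t n)
{
    static const uint8_t masks[8] = {
        0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80
    };
    return masks[n % 8];
}

/* Same as 1 << (n % 8): a constant argument can be folded at compile time,
 * anything else goes through the run-time helper. Requires a compiler that
 * provides __builtin_constant_p (GCC, Clang). */
static inline uint8_t bitmask(uint8_t n)
{
    return __builtin_constant_p(n) ? (uint8_t)(1u << (n % 8))
                                   : bitmask_table(n);
}
```

Both branches return the same value, so the choice only affects the generated code, not the result.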

 Author: Hanno (Guest) Posted on: 2013-02-08 18:16

For the sake of completeness:

The AVR instruction set provides the BLD and BST operations which allow
easy transfer of single bits from one register to another, both at
arbitrary bit positions. They use the T-bit in the SREG which is
otherwise not used by gcc.

Using inline assembler a single bit can be transferred between two 8-bit
variables with two instructions in two cycles as simple as:

    asm volatile (
        "bst %[src], %[srcbit] \r\n"
        "bld %[dest], %[destbit] \r\n"
        : [dest] "+r" (dest)
        : [src] "r" (src), [srcbit] "n" (2), [destbit] "n" (7)
    );

Using BLD/BST, no shifts/rotates are needed, and no AND/OR operation
either. Only the designated bit in dest is affected.
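
For comparison, a plain C sketch of the same bit transfer. `copy_bit` is a hypothetical helper; whether the compiler turns it into BST/BLD or into a test-and-branch sequence is not guaranteed and would have to be checked in the generated code.

```c
#include <stdint.h>

/* Returns dest with bit dest_bit replaced by bit src_bit of src.
 * All other bits of dest are left unchanged. */
static inline uint8_t copy_bit(uint8_t dest, uint8_t dest_bit,
                               uint8_t src, uint8_t src_bit)
{
    uint8_t mask = (uint8_t)(1u << dest_bit);

    if (src & (1u << src_bit))
        return dest | mask;

    return dest & (uint8_t)~mask;
}
```

With the bit positions from the asm example, `copy_bit(dest, 7, src, 2)` moves bit 2 of src into bit 7 of dest.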
