Indirect Subroutine Call in Assembly for ARMv4T architecture

by Tat W. (Company: Universiti Sains Malaysia) (tcwan)

2010-11-19 10:46

Hi,

I'm trying to figure out the proper way to call a subroutine whose
address is held in a register in ARMv4T assembly.

In ARMv5 there is the BLX instruction which accepts a register operand.
Unfortunately this is not available on ARMv4T.

TIA

2011-02-07 11:30: Moved by Admin

Re: Indirect Subroutine Call in Assembly for ARMv4T architecture

by (prx) A. K. (prx)

2010-11-19 11:02

The traditional sequence without support for interworking (being able to 
call Thumb code from ARM and vice versa) is
   mov lr,pc
   mov pc,target -or- ldr pc,target

With interworking:
   mov lr,pc
   bx  target
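
A minimal sketch of the interworking form in ARM state, written for GNU
as. It assumes the callee returns with "bx lr"; the names
call_indirect_arm and target are hypothetical placeholders and not part
of the original post.

```asm
        .arm
        .text
        .align  2
call_indirect_arm:                @ hypothetical caller, ARM state
        stmfd   sp!, {r4, lr}     @ save r4 and our own return address
        ldr     r4, =target       @ r4 = address of the callee (hypothetical symbol)
        mov     lr, pc            @ pc reads as this instruction + 8
        bx      r4                @ located at +4; bit 0 of r4 selects ARM or Thumb
        ldmfd   sp!, {r4, lr}     @ located at +8; the callee returns here
        bx      lr                @ return to our caller, interworking-safe
```

Because lr has bit 0 clear here, a callee that returns with "bx lr"
comes back in ARM state, which is what the caller expects.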

Re: Indirect Subroutine Call in Assembly for ARMv4T architecture

by Tat W. (Company: Universiti Sains Malaysia) (tcwan)

2010-11-19 23:15

@prx:

Thanks! Is this because pc = current instruction address + 8, so when
executing "mov lr, pc", it will store the instruction address after the 
branch?

Also, is there any penalty in using bx on ARMv4T for doing ARM -> ARM or 
Thumb -> Thumb subroutine calls?

Re: Indirect Subroutine Call in Assembly for ARMv4T architecture

by (prx) A. K. (prx)

2010-11-20 08:58

Tat Wan wrote:

> Thanks! Is this because pc = current instruction address + 8, so when
> executing "mov lr, pc", it will store the instruction address after the
> branch?

Yes.

> Also, is there any penalty in using bx on ARMv4T for doing ARM -> ARM or
> Thumb -> Thumb subroutine calls?

You should look at ARM's core documentation to read about individual 
instruction timings.

Re: Indirect Subroutine Call in Assembly for ARMv4T architecture

by Tat W. (Company: Universiti Sains Malaysia) (tcwan)

2011-01-28 05:14

A. K. wrote:

> With interworking:
>    mov lr,pc
>    bx  target

To revisit this topic, I am trying to understand how Thumb -> ARM 
Interworking is supposed to work in ARMv4T.

The example given works for ARM -> Thumb because LR[0] is 0 for ARM.
However, when we want to call from Thumb -> ARM, we need LR[0] to be 1.

In Thumb mode, BL <label> works since LR[0] is set to 1 automatically.
I can't find any documentation which specifies the behavior of
'mov LR, PC' for Thumb mode which assures me that LR[0] := 1, such that

    mov lr, pc
    bx  target

will work for Thumb -> ARM.

All descriptions of interworking seem to gloss over this issue. There's 
mention of veneers, but those seem to be generated by the linker for C 
object files only? Can anyone please clarify how this should be solved 
when programming purely in Assembly Language?
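
One commonly used approach for an indirect call from Thumb state on
ARMv4T is to BL to a one-instruction trampoline that performs the BX.
In Thumb state the pc value read by "mov lr, pc" has bit 0 clear, so
that pair would leave a return address that a "bx lr" treats as ARM
code, whereas a Thumb BL sets bit 0 of lr itself. The sketch below
assumes GNU as; call_indirect_thumb, call_via_r4 and target are
hypothetical names. GCC's runtime library ships helpers of this shape
under names like _call_via_r4.

```asm
        .thumb
        .text
        .align  1
        .thumb_func
call_indirect_thumb:              @ hypothetical caller, Thumb state
        push    {r4, lr}          @ save r4 and our own return address
        ldr     r4, =target       @ r4 = callee address (hypothetical symbol)
        bl      call_via_r4       @ BL sets lr = next instruction, bit 0 set
        pop     {r4}              @ the callee's "bx lr" resumes here in Thumb state
        pop     {r1}              @ ARMv4T "pop {pc}" does not interwork,
        bx      r1                @ so return through BX instead

        .thumb_func
call_via_r4:                      @ hypothetical trampoline
        bx      r4                @ bit 0 of r4 selects ARM or Thumb state
```

The trampoline costs one extra branch per call, but it can be shared by
every indirect call site that passes the target in the same register.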

Re: Indirect Subroutine Call in Assembly for ARMv4T architecture

by Tat W. (Company: Universiti Sains Malaysia) (tcwan)

2011-01-28 08:57

Attached files:

Interwork.tgz (1.31 KB)
interwork.objdump (1.91 KB)

Following up to myself:
[My Toolchain is arm-none-eabi-binutils and ld --
GNU ld (Linux/GNU Binutils) 2.20.51.0.9.20100526
GNU assembler (Linux/GNU Binutils) 2.20.51.0.9.20100526]

I tried it out in an example project. By looking at the linker output, 
it seems that a veneer is automatically generated for Thumb->ARM calls.

However, veneers seem to be generated for ALL .global labels, i.e., 
even if the routine is a Thumb routine, a veneer is still generated 
(see the excerpt from the interwork.objdump file below). thumb_routine2 
is in a separate source file, so it needs to be declared .global. This 
results in an invalid veneer being generated for the 'BL routine3' in 
TEST_THUMB, as well as for the 'BL icall_TEST_ARM', which is 16-bit 
Thumb code, declared .global, that switches mode to ARM. Is there a way 
to suppress linker veneer code generation?

00000060 <icall_TEST_THUMB>:
  60:   200f            movs    r0, #15
  62:   f000 f811       bl      88 <__TEST_ARM_from_thumb>
  66:   f7ff fffb       bl      60 <icall_TEST_THUMB>
  6a:   f000 f811       bl      90 <__thumb_routine2_from_thumb>
  6e:   f000 f803       bl      78 <thumb_routine3>
  72:   f000 f805       bl      80 <__icall_TEST_ARM_from_thumb>
00000078 <thumb_routine3>:
0000007c <thumb_routine2>:
  7c:   3002            adds    r0, #2
00000080 <__icall_TEST_ARM_from_thumb>:
  82:   46c0            nop                     ; (mov r8, r8)
  84:   eaffffef        b       48 <icall_TEST_ARM>
00000088 <__TEST_ARM_from_thumb>:
  8a:   46c0            nop                     ; (mov r8, r8)
  8c:   eaffffef        b       50 <TEST_ARM>
00000090 <__thumb_routine2_from_thumb>:
  92:   46c0            nop                     ; (mov r8, r8)
  94:   eafffff8        b       7c <thumb_routine2>


Finally, I've created some macros in interwork.h for declaring ARM 
routines to support Thumb-to-ARM calls based on the veneer code. 
However, GAS adds an ARM NOP instruction after the veneer. This is not 
fatal, but it does consume 4 more bytes. Is there any reason why GAS 
inserts the extra 32-bit NOP?

00000048 <icall_TEST_ARM>:
  4a:   46c0            nop                     ; (mov r8, r8)
  4c:   e1a00000        nop                     ; (mov r0, r0)


Am I doing something wrong, or are these GAS quirks that I need to work 
around?

Re: Indirect Subroutine Call in Assembly for ARMv4T architecture

by Tat W. (Company: Universiti Sains Malaysia) (tcwan)

2011-02-07 03:41

Attached files:

interwork.h (429 Bytes)

Ok, after much futzing with the macros and code disassembly, I've come 
to the following conclusions:

1. All Interworked routines (ARM or Thumb) must be declared .global
2. All Thumb Interworked routines MUST have .thumb_func declared as well
   (This is critical for it to be recognized as a Thumb routine by the
    linker).
3. Interworking calls just use normal BL <interwork_routine>, the
   linker will handle the rest, and insert a veneer as necessary.
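
The three points above can be sketched roughly as follows for GNU as.
This is only an illustration under assumptions: arm_routine and
thumb_routine are hypothetical names, the two routines may just as well
live in separate source files, and the .type directives are an addition
that is not part of the conclusions above.

```asm
        .text

        .arm
        .align  2
        .global arm_routine             @ point 1: interworked routines are .global
        .type   arm_routine, %function  @ assumption: mark the symbol as a function
arm_routine:
        add     r0, r0, #1
        bx      lr                      @ interworking-safe return

        .thumb
        .align  1
        .global thumb_routine           @ point 1 again, for the Thumb side
        .thumb_func                     @ point 2: marks the symbol as Thumb code
        .type   thumb_routine, %function
thumb_routine:
        push    {lr}
        bl      arm_routine             @ point 3: plain BL, the linker adds a veneer
        pop     {r1}                    @ ARMv4T "pop {pc}" does not interwork,
        bx      r1                      @ so return through BX
```

Disassembling the linked image (for example with objdump -d) should
show whether a veneer was actually inserted for the BL.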

An updated macro file is included in case anyone is interested.
The arm_icall and thumb_icall macros are provided for programming 
clarity and just implement 'BL <routine>'.

It is not reliable to use manually generated veneers (based on the 
linker-generated code). The linker performs code block alignment, which 
may cause invalid NOPs (32-bit instructions instead of 16-bit 
instructions, and vice versa) to be inserted after the veneer and mess 
up the mode switching. Of course, it is possible to write veneers that 
implement mode switching regardless of inserted NOPs, but I don't think 
it is worth the effort (in terms of the number of instructions, and 
also the execution cycles lost to NOPs).
