EmbDev.net

Forum: ARM programming with GCC/GNU tools Tricky RAM alignment question


by jrmymllr j. (jrmymllr)


Well at least it seems tricky.  Here's the problem:

I'm using DMA on a Cortex-M3.  DMA requires an array in RAM of 32 
structures to be used by the CPU for housekeeping.  Each struct 
represents one DMA channel and consumes 16 bytes.  That's 512 
bytes, mostly wasted.  Why?

Of those 32 DMA channels, numbered 0 - 31, I'm using only one, channel 
29.  This means in that 512 byte array, members 0 - 28 and 30 - 31 are 
unused.  The datasheet says array members associated with unused DMA 
channels can be used for something else.  Easily said, not as easily 
done.

To make it worse, the start of the array must be 1024-byte aligned. 
This results in more wasted RAM because of the alignment.  Apparently 
GCC can't arrange other RAM assignments any better.

All that's required to set up this array is to declare it, then program 
the array's address into a specified CPU register.

So what I was thinking is this:  Declare only the 16-byte struct I need, 
and align it such that it's 464 bytes from a 1024-byte boundary.  I 
think there are some linker/makefile methods to use, like 
--section-start=.MySection=org along with __attribute__ ((section 
(".MySection"))), but this didn't work: .MySection showed up in 
the .map file, but other variables were placed in its space.
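
To make the numbers above concrete, here is a minimal C sketch of the table layout. The struct and its field names are hypothetical placeholders; only the 16-byte entry size, the 32 channels and the use of channel 29 come from the post.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-byte per-channel entry (four 32-bit words).
   Field names are placeholders, not taken from any TI header. */
typedef struct {
    uint32_t src_end;
    uint32_t dst_end;
    uint32_t control;
    uint32_t spare;
} dma_entry_t;

static_assert(sizeof(dma_entry_t) == 16, "entry must be 16 bytes");

enum { DMA_CHANNELS = 32, DMA_CHANNEL_USED = 29 };

/* Size of the full table: 32 entries of 16 bytes each. */
static uint32_t dma_table_size(void)
{
    return DMA_CHANNELS * (uint32_t)sizeof(dma_entry_t);
}

/* Byte offset of one channel's entry from the 1024-byte-aligned table base. */
static uint32_t dma_entry_offset(uint32_t channel)
{
    return channel * (uint32_t)sizeof(dma_entry_t);
}
```

With these definitions, dma_entry_offset(DMA_CHANNEL_USED) gives the 464-byte offset mentioned above.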

by (prx) A. K. (prx)


jrmymllr jrmymllr wrote:

> I'm using DMA on a Cortex-M3.

The Cortex-M3 does not have any DMA. The controller however may have 
both a Cortex-M3 core and DMA. So while it is nice to know the processor 
core, it would help more to know the type of controller.

by jrmymllr j. (jrmymllr)


It's a TI LM3S9B92; however, I didn't think even the core type made a 
difference since I'm only trying to locate a RAM struct in a specific 
location.

by Jim M. (turboj)


> To make it worse, the start of the array must be 1024-byte aligned.
> This results in more wasted RAM because of the alignment.  Apparently
> GCC can't arrange other RAM assignments any better.

Have you tried the gcc option "-fdata-sections"? The linker works only 
with whole sections IIRC.
How did you tell gcc about the 1024 Byte alignment?

by jrmymllr j. (jrmymllr)


I have not tried that yet, but will look into it.
The alignment was done with  __attribute__ ((aligned(1024)));
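
For reference, a declaration of that kind might look like the sketch below. The type and variable names are hypothetical; only the aligned(1024) attribute and the 32 entries of 16 bytes come from the thread. When the variable lands in the ordinary .bss section, the linker typically has to insert padding in front of it to reach the next 1024-byte boundary, which is the wasted RAM described earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-byte per-channel entry. */
typedef struct { uint32_t word[4]; } dma_entry_t;

static_assert(sizeof(dma_entry_t) == 16, "entry must be 16 bytes");

/* Full 32-entry table; GCC raises the alignment of this object to 1024 bytes. */
static dma_entry_t dma_table[32] __attribute__ ((aligned(1024)));

/* Returns nonzero when the low 10 address bits of the table are all zero. */
static int dma_table_is_aligned(void)
{
    return ((uintptr_t)dma_table & 1023u) == 0;
}
```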

by (prx) A. K. (prx)


You may have to use your own section and set up the linker configuration 
to place the section with the required alignment, and the data block 
within this section. Likely the regular data section does not have a 
1024 byte alignment and therefore the alignment attribute within the 
section cannot be satisfied.

In the linker script, you can subtract the 1K block from top of RAM 
assigned to normal RAM allocation and use this 1K block for the DMA 
section. Since your channel is located near the end of this 1K block, 
the block need not be configured for full 1K though, so you can reuse 
the lower channels space for top of regular RAM.

by jrmymllr j. (jrmymllr)


The problem here is that freeing up the lower channels is easy; I simply 
don't make the array any longer than I need to.  It's the upper channels 
that are the problem, and they use the most memory.

My plan was to choose a hard address that meets the requirement.  If I 
could get the linker to not allocate anything to a 16 byte block of my 
choosing, this would solve the problem.  For example, 464 bytes from the 
start of the RAM (0x2000 0000) would do. Those 464 bytes before, and the 
rest of the 96K after would be free to give the linker.
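
As a rough check of that plan, the sketch below computes the reserved 16-byte hole and the two free ranges around it. The SRAM origin and length are taken from the linker file in this post; the helper names and the range_t type are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SRAM_BASE   0x20000000u       /* ORIGIN of SRAM in the linker file */
#define SRAM_SIZE   (96u * 1024u)     /* LENGTH of SRAM in the linker file */
#define ENTRY_SIZE  16u               /* bytes per DMA channel entry */
#define CHANNEL     29u               /* the only channel in use */

/* The imaginary table would start at SRAM_BASE, which is 1024-byte aligned. */
static_assert((SRAM_BASE % 1024u) == 0, "table base must be 1024-byte aligned");

typedef struct { uint32_t start, end; } range_t;   /* half-open [start, end) */

/* The 16 bytes the linker must leave alone. */
static range_t reserved_entry(void)
{
    range_t r = { SRAM_BASE + CHANNEL * ENTRY_SIZE, 0 };
    r.end = r.start + ENTRY_SIZE;
    return r;
}

/* RAM below the hole that can go back to the linker. */
static range_t free_below(void)
{
    range_t r = { SRAM_BASE, reserved_entry().start };
    return r;
}

/* RAM above the hole that can go back to the linker. */
static range_t free_above(void)
{
    range_t r = { reserved_entry().end, SRAM_BASE + SRAM_SIZE };
    return r;
}
```

Under these assumptions the hole sits at 0x200001D0 to 0x200001DF, with 464 bytes free below it and the remainder of the 96K free above it.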

Problem is, I haven't figured out yet how to talk in the language of the 
linker file.

Below is my linker file, written by TI from an example:

MEMORY
{
    FLASH (rx) : ORIGIN = 0x00001000, LENGTH = 252K
    SRAM (rwx) : ORIGIN = 0x20000000, LENGTH = 96K
}

SECTIONS
{
    .text :
    {
        KEEP(*(.isr_vector))
        *(.text*)
        *(.rodata*)
        _etext = .;
    } > FLASH

    .data : AT (ADDR(.text) + SIZEOF(.text))
    {
        _data = .;
        *(vtable)
        *(.data*)
        _edata = .;
    } > SRAM

    .bss(NOLOAD) :
    {
        _bss = .;
        *(.bss*)
        *(COMMON)
        _ebss = .;
        . = ALIGN (8);
       _end = .;
    } > SRAM
}

/* end of allocated ram _end */
PROVIDE( _HEAP_START = _end );

/* end of the heap -> align 8 byte */
PROVIDE ( _HEAP_END = ALIGN(ORIGIN(SRAM) + LENGTH(SRAM),8) );

by (prx) A. K. (prx)


For a complete 1K DMA region:

MEMORY
{
    FLASH (rx) : ORIGIN = 0x00001000, LENGTH = 252K
    SRAM (rwx) : ORIGIN = 0x20000000, LENGTH = 95K
    DMA (rwx)  : ORIGIN = 0x20017C00, LENGTH = 1K
}

SECTIONS
{
    .text :
    {
        KEEP(*(.isr_vector))
        *(.text*)
        *(.rodata*)
        _etext = .;
    } > FLASH

    .data : AT (ADDR(.text) + SIZEOF(.text))
    {
        _data = .;
        *(vtable)
        *(.data*)
        _edata = .;
    } > SRAM

    .bss(NOLOAD) :
    {
        _bss = .;
        *(.bss*)
        *(COMMON)
        _ebss = .;
        . = ALIGN (8);
       _end = .;
    } > SRAM

    .dma(NOLOAD) :
    {
        *(.dma)
    } > DMA
}

And use __attribute__ ((section(".dma"))) to place the DMA block into section ".dma".
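
On the C side, that could look roughly like the sketch below. The type and variable names are hypothetical; the section name ".dma" matches the linker script above. Since the table is the only object placed in .dma, it should start at the ORIGIN of the DMA region, which is already 1024-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-byte per-channel entry. */
typedef struct { uint32_t word[4]; } dma_entry_t;

static_assert(sizeof(dma_entry_t) == 16, "entry must be 16 bytes");

/* Emitted into input section ".dma", which the script maps to the DMA region.
   The section is NOLOAD, so the startup code does not initialize it. */
dma_entry_t dma_table[32] __attribute__ ((section(".dma")));

/* Address to program into the DMA controller's table base register. */
static uintptr_t dma_table_base(void)
{
    return (uintptr_t)dma_table;
}
```

Because the section is not zeroed at startup, the entry for each channel in use has to be written explicitly before the channel is enabled.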

When only channels 28+ are used, you may use this method to allocate 128 
bytes instead of 1024 bytes at top of RAM:
    SRAM (rwx) : ORIGIN = 0x20000000, LENGTH = 98176
    DMA (rwx)  : ORIGIN = 0x20017F80, LENGTH = 128

In this case you have to mask the low order bits of the DMA block base 
address when configuring the DMA controller. Or calculate a 
linker variable in a way similar to _HEAP_END.
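
The masking step can be sketched in C as follows. The function names are made up for illustration; the 1024-byte block size and the address 0x20017F80 are the ones used in this post.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_BLOCK_ALIGN 1024u   /* required alignment of the control block */

/* Clear the low 10 bits to get the aligned base the controller expects. */
static uint32_t dma_block_base(uint32_t partial_block_addr)
{
    return partial_block_addr & ~(DMA_BLOCK_ALIGN - 1u);
}

/* Offset of the partial block within the aligned 1K block. */
static uint32_t dma_block_offset(uint32_t partial_block_addr)
{
    return partial_block_addr & (DMA_BLOCK_ALIGN - 1u);
}
```

For the 128-byte region at 0x20017F80, this yields a base of 0x20017C00 and an offset of 896 bytes into the 1K block.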

by jrmymllr j. (jrmymllr)


I was going to say the control structure is 512 bytes, not 1K, and this 
won't work.  However I believe I can specify the alternate control 
structure, which would make the entire DMA control block 1K.  The 
required DMA channel in this case would end only 32 bytes from the end 
of the 1K, meaning I waste only 32 bytes with not much hassle.  And, one 
of those two channels at the end I could potentially use in the future.
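
A quick sketch of that arithmetic, assuming the alternate control structure immediately follows the 512-byte primary one so the two together fill the 1K block (the names below are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

enum { ENTRY_SIZE = 16, CHANNELS = 32, BLOCK_SIZE = 1024 };

/* Assumption: the alternate table starts right after the primary table. */
static_assert(2 * CHANNELS * ENTRY_SIZE == BLOCK_SIZE, "two tables fill 1K");

/* Offset of a channel's alternate entry from the start of the 1K block. */
static uint32_t alt_entry_offset(uint32_t channel)
{
    return (uint32_t)(CHANNELS * ENTRY_SIZE) + channel * ENTRY_SIZE;
}

/* Bytes left between the end of that entry and the end of the 1K block. */
static uint32_t bytes_after_alt_entry(uint32_t channel)
{
    return BLOCK_SIZE - (alt_entry_offset(channel) + ENTRY_SIZE);
}
```

For channel 29 the alternate entry starts at offset 976, leaving 32 bytes after it, which is the space of the alternate entries for channels 30 and 31.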

Many thanks for the linker script, I will give this a try later.
