Hello,

I am working on a bare-metal ARM Cortex-A53 using GCC 4.8.2-3, and I am facing an issue with multicore boot under GCC. Multicore boot and test work fine when I use DS-5 with almost the same boot code. With GCC, a single core is OK, but multicore is not.

Through some debugging I found that my stack and heap were overlapping. To avoid this, and to make sure the heaps of the 4 cores are at different memory locations, I explicitly provided a different heap address for each of the 4 cores. The linker script for that is as follows:

    MEMORY
    {
      ram (rw)          : ORIGIN = 0,          LENGTH = 0x14000000
      uncachedram (rwx) : ORIGIN = 0x14000000, LENGTH = 0x10000000
      cachedram1 (rwx)  : ORIGIN = 0x15000000, LENGTH = 0x10000000
      cachedram2 (rwx)  : ORIGIN = 0x16000000, LENGTH = 0x2A000000
      ddr (rwx)         : ORIGIN = 0x40000000, LENGTH = 0xEAFFFFFF
    }

    /* Highest address of the user mode stack */
    _estack = 0x16000000; /* end of RAM */

    _Min_Heap_Size0  = 0x100000; /* required amount of heap */
    _Min_Stack_Size0 = 0x10000;  /* required amount of stack */
    _Min_Heap_Size1  = 0x100000; /* required amount of heap */
    _Min_Stack_Size1 = 0x10000;  /* required amount of stack */
    _Min_Heap_Size2  = 0x100000; /* required amount of heap */
    _Min_Stack_Size2 = 0x10000;  /* required amount of stack */
    _Min_Heap_Size3  = 0x100000; /* required amount of heap */
    _Min_Stack_Size3 = 0x10000;  /* required amount of stack */

    /* User_heap_stack section, used to check that there is enough RAM left */
    ._user_heap_stack :
    {
      . = ALIGN(4);
      PROVIDE ( end0 = . );
      PROVIDE ( _end_ = . );
      . = . + _Min_Heap_Size0;
      PROVIDE (_max_heap0 = . );
      . = . + _Min_Stack_Size0;
      . = ALIGN(4);
      PROVIDE (end1 = . );
      . = . + _Min_Heap_Size1;
      PROVIDE (_max_heap1 = . );
      . = . + _Min_Stack_Size1;
      . = ALIGN(4);
      PROVIDE (end2 = . );
      . = . + _Min_Heap_Size2;
      PROVIDE (_max_heap2 = . );
      . = . + _Min_Stack_Size2;
      . = ALIGN(4);
      PROVIDE (end3 = . );
      . = . + _Min_Heap_Size3;
      PROVIDE (_max_heap3 = . );
      . = . + _Min_Stack_Size3;
      . = ALIGN(4);
    } >ddr

The sbrk implementation is as follows:

    caddr_t _sbrk_r ( struct _reent *r, int incr )
    {
        unsigned int id;
        id = get_cpu_id();

        if(id == 0)
        {
            extern int end0;
            extern void* _max_heap0;
            static void *heap_end0;
            void *prev_heap_end0;

            if (heap_end0 == NULL)
                heap_end0 = (void *)&end0 ;
            prev_heap_end0 = heap_end0;
            if (heap_end0 + incr > &_max_heap0)
            {
                r->_errno = ENOMEM;
                return (caddr_t) -1;
            }
            heap_end0 += incr;
            return (caddr_t) prev_heap_end0;
        }
        else if(id==1)
        {
            extern int end1;
            extern void* _max_heap1;
            static void *heap_end1;
            void *prev_heap_end1;

            if (heap_end1 == NULL)
                heap_end1 = (void *)&end1 ;
            prev_heap_end1 = heap_end1;
            if (heap_end1 + incr > &_max_heap1)
            {
                r->_errno = ENOMEM;
                return (caddr_t) -1;
            }
            heap_end1 += incr;
            return (caddr_t) prev_heap_end1;
        }
        else if(id == 2)
        {
            extern int end2;
            extern void* _max_heap2;
            static void *heap_end2;
            void *prev_heap_end2;

            if (heap_end2 == NULL)
                heap_end2 = (void *)&end2 ;
            prev_heap_end2 = heap_end2;
            if (heap_end2 + incr > &_max_heap2)
            {
                r->_errno = ENOMEM;
                return (caddr_t) -1;
            }
            heap_end2 += incr;
            return (caddr_t) prev_heap_end2;
        }
        else
        {
            extern int end3;
            extern void* _max_heap3;
            static void *heap_end3;
            void *prev_heap_end3;

            if (heap_end3 == NULL)
                heap_end3 = (void *)&end3 ;
            prev_heap_end3 = heap_end3;
            if (heap_end3 + incr > &_max_heap3)
            {
                r->_errno = ENOMEM;
                return (caddr_t) -1;
            }
            heap_end3 += incr;
            return (caddr_t) prev_heap_end3;
        }
    }

I ran a test that calls malloc for buffer sizes 256, 512, 1024 and 8192, and printed the returned address along with the CPU ID. The results for GCC and DS-5 are below. The columns after Heap_Base give the address returned from malloc for that buffer size.

GCC

            Heap_Base   256          512          1024              8192
    Cpu0    40000000    0x40000518   0x40110210   NO result, HANG   NO result, HANG
    Cpu1    40110000    0x40000930   0x40110008   NO result, HANG   NO result, HANG
    Cpu2    40220000    0x40000828   0x40110820   0x40110e30        NO result, HANG
    Cpu3    40330000    0x40000410   0x40000620   0x40000a38        0x40330008

DS-5

            Heap_Base   256          512          1024         8192
    Cpu0    40000000    0x40000018   0x40021220   0x40042628   0x40063e30
    Cpu1    43C00000    0x43c00018   0x43c21220   0x43c42628   0x43c63e30
    Cpu2    47800000    0x47800018   0x47821220   0x47842628   0x47863e30
    Cpu3    4B400000    0x4b400018   0x4b421220   0x4b442628   0x4b463e30

Looking at these results, it seems that with GCC malloc reserves a big chunk of memory and then serves any further request from that already reserved chunk, whereas with DS-5 this is not the case. This could be due to the following:

1. With DS-5, the stack/heap setup is done at boot time, before the main test is called, using __user_initial_stackheap (in my case I overrode it using __user_setup_stackheap, where I reserved heap and stack for all 4 cores at different memory locations). The code for this is:

    __user_setup_stackheap
    0x00001840: e1a0400e .@.. MOV r4,lr
    0x00001844: ee100fb0 .... MRC p15,#0x0,r0,c0,c0,#5
    0x00001848: e2000003 .... AND r0,r0,#3
    0x0000184c: e3a0150f .... MOV r1,#0x3c00000
    0x00001850: e0030190 .... MUL r3,r0,r1
    0x00001854: e59f0018 .... LDR r0,[pc,#24] ; [0x1874] = 0x40000000
    0x00001858: e0800003 .... ADD r0,r0,r3
    0x0000185c: e59fd014 .... LDR sp,[pc,#20] ; [0x1878] = 0x43c00000
    0x00001860: e08dd003 .... ADD sp,sp,r3
    0x00001864: e59f1010 .... LDR r1,[pc,#16] ; [0x187c] = 0x43b00000
    0x00001868: e0833001 .0.. ADD r3,r3,r1
    0x0000186c: e1a02003 . .. MOV r2,r3
    0x00001870: e12fff14 ../. BX r4

2. With GCC, when I call malloc it in turn calls sbrk, where I define the heap section for each core. But if a memory chunk is already available, sbrk is not called and the memory is allocated from the available chunk. That is why, in the GCC table above, the allocated addresses are not in the range of the respective CPUs. For example, malloc first reserves a chunk of 12 K (0x3000) of memory from 0x40000000 (which is the cpu0 heap) and then starts allocating from it. Once this chunk is exhausted, it calls sbrk again and reserves memory from the heap of whichever CPU is making the call.

3. I traced the sbrk calls for the 4 cores and got the following:

            Heap_base    incr     Heap_base    incr     Heap_base    incr
    Cpu0    0x40000000   0x418    0x40000418   0xbe8    0x40001000   0x3000
    Cpu1    0x40110000   0x3000
    Cpu2    0x40220000   0x3000   0x40223000   0x1000
    Cpu3

This means that when a call does go to sbrk, the CPU gets a correct address within its reserved heap range, but when it does not go to sbrk, the address allocated might be from any other core's range.

Based on the above findings, my queries are:

1. Is there any way with GCC to tell the toolchain at boot time about the heap for all cores, as I was doing with DS-5 using "__user_setup_stackheap"?
2. What is the initialization sequence for GCC, like the one shown in the above diagram for DS-5?
3. Is there any compiler option through which this can be achieved?
4. Am I missing something in the GCC implementation?

Please provide your feedback on the above queries, and please correct me if you see any gap in my findings.

Thanks in advance for your help.

Thanks and Regards,
Monika
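For reference, the per-core bookkeeping that the four branches of the _sbrk_r above perform can be written once in table-driven form. The following is only a host-side sketch under assumptions: core_sbrk, in_core_heap, heap_mem and HEAP_SIZE are hypothetical names, and small static arrays stand in for the regions that the linker symbols end0.._max_heap3 delimit on the target.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CORES 4
#define HEAP_SIZE 0x1000u /* small simulated heap per core (assumption) */

/* Simulated per-core heap regions. On the target these bounds would come
   from linker symbols such as end0 and _max_heap0. */
static unsigned char heap_mem[NUM_CORES][HEAP_SIZE];

/* Current break of each core; NULL means "not initialised yet". */
static unsigned char *heap_end[NUM_CORES];

/* Hypothetical per-core sbrk: moves the break of `core` by `incr` bytes and
   returns the previous break, or (void *)-1 with *err set to ENOMEM. */
void *core_sbrk(unsigned core, ptrdiff_t incr, int *err)
{
    assert(core < NUM_CORES);

    unsigned char *base = heap_mem[core];
    unsigned char *limit = base + HEAP_SIZE;

    if (heap_end[core] == NULL)
        heap_end[core] = base;

    unsigned char *prev = heap_end[core];

    /* Compare against the remaining space instead of forming prev + incr,
       so no out-of-range pointer is ever computed. */
    if (incr > limit - prev || incr < base - prev) {
        *err = ENOMEM;
        return (void *)-1;
    }

    heap_end[core] = prev + incr;
    return prev;
}

/* Returns 1 if p lies inside the region reserved for `core`, else 0. */
int in_core_heap(unsigned core, const void *p)
{
    uintptr_t q = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)heap_mem[core];
    return q >= lo && q < lo + HEAP_SIZE;
}
```

Calling core_sbrk(0, 0x100, &err) and core_sbrk(1, 0x100, &err) returns addresses in two disjoint regions, which in_core_heap can confirm. Note that this only separates where fresh memory comes from. As the sbrk trace above suggests, it does not decide which core later receives memory that malloc is already holding.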