EmbDev.net

Forum: ARM programming with GCC/GNU tools How to create smallest binary?


by Joe D. (jdupre)


I have installed WinARM, and compiled the AT91SAM7S64_Atmel_Interupt
project in the examples.  And it runs.  Excellent!

In the makefile, OPT=s (for smallest code size), FORMAT=binary.  And it
creates a binary of 8196 bytes.

Doing the build of the Interupt demo project with the IAR compiler
yields a binary of only 2048 bytes.

I am hoping that I am just missing some step of the optimization or
stripping process.  Why such a huge difference in file size?  I'd
expect some difference, but not a factor of 4!

- Joe

by Martin Thomas (Guest)


Joe Dupre wrote:
> I have installed WinARM, and compiled the AT91SAM7S64_Atmel_Interupt
> project in the examples.  And it runs.  Excellent!
>
> In the makefile, OPT=s (for smallest code size), FORMAT=binary.  And it
> creates a binary of 8196 bytes.

This is because "__inline" for the AT91 library is not defined
correctly in this example, so all functions from the library are linked
even if they are not needed. "__inline" should be defined as "static
inline", as can be seen in the "newer" AT91SAM7S examples in the WinARM
package or on my "AT91-projects" page. Sorry, I forgot to update the
example code. I will upload an updated version to the web server next
week. Please monitor the page:
http://www.siwawi.arubi.uni-kl.de/avr_projects/arm_projects/index_at91.html
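
A minimal sketch of the corrected definition described above. The
register block and the function name are made up for illustration and
are not taken from the AT91 library; the faulty definition is not shown
in this thread, so only the "static inline" form appears here.

```c
/* Hypothetical stand-in for an AT91 library header. */
#define __inline static inline   /* the definition recommended above */

/* Hypothetical register block, reduced to one field for this sketch. */
typedef struct {
    volatile unsigned int PIO_SODR;  /* "set output data" register */
} DemoPio;

/* With "static inline" the function has internal linkage, so GCC does
   not have to emit a standalone copy in files that never call it. */
__inline void Demo_PIO_SetOutput(DemoPio *pio, unsigned int mask)
{
    pio->PIO_SODR = mask;
}
```

The define has to come before the library header that uses "__inline",
and it should not precede unrelated system headers.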

> Doing the build of the Interupt demo project with the IAR compiler
> yields a binary of only 2048 bytes.

With the updated code, compiled with the tools from the WinARM
20060606 collection (arm-elf-gcc 4.1.1 etc.):

section            size      addr
.text              2084   1048576
.rodata              96   1050660
.bss                 12   2097152
...

> I am hoping that I am just missing some step of the optimization or
> stripping process.  Why such a huge difference in file size?  I'd
> expect some difference, but not a factor of 4!

It is neither a problem with optimization nor with stripping (the
binaries for the target do not include debug information), just my
mistake with the __inline definition.

Martin Thomas

by Martin Thomas (Guest)


Martin Thomas wrote:
> Please monitor the page:
> http://www.siwawi.arubi.uni-kl.de/avr_projects/arm_projects/index_at91.html

The updated version of the example has been uploaded.

Martin Thomas

by Ralawa (Guest)


In my project, which uses libefsl and snprintf with float support, the
WinARM toolchain generates a binary file of 54276 bytes. In this case I
use the correct __inline definition.
For the same project, the IAR demo toolchain generates a bin file of
31744 bytes.

So the file size differs by a factor of about 1.7 between IAR and WinARM.

Does that seem OK to you? Or is it still possible to reduce the size?

Thank you.

by Martin Thomas (Guest)


Ralawa wrote:
> In my project, which uses libefsl and snprintf with float support, the
> WinARM toolchain generates a binary file of 54276 bytes. In this case I
> use the correct __inline definition.
> For the same project, the IAR demo toolchain generates a bin file of
> 31744 bytes.
>
> So the file size differs by a factor of about 1.7 between IAR and WinARM.
>
> Does that seem OK to you? Or is it still possible to reduce the size?

Yes, it is kind of "ok". The stdio functions in newlib (the libc used
in the WinARM collection and most other arm-elf-gcc collections)
need a fair amount of memory, especially when floating point is used.

Possibilities to reduce the size:

- Convert the floating-point values to strings or integers (int/frac
"fixed point") with other methods/functions and use the
*iprintf functions from newlib. The integer-only versions have a smaller
memory footprint. You may even use something like "my_putstring" so that
stdio is not needed at all. This is the method I usually use. I have not
tried all of the following suggestions myself.

- Try another libc, e.g. dietlibc. The stdio functions in other
libraries have a smaller memory demand. (Maybe a little difficult to
set up.)

- Try replacement functions, e.g. from the TRIO library (
http://daniel.haxx.se/projects/trio/ ) or the contributed rprintf with
FP support available at
http://www.siwawi.arubi.uni-kl.de/avr_projects/arm_projects/winarmtests/rprintf_fp_test.zip

- Try/buy Rowley CrossWorks. Rowley provides a "self-made" libc (not
newlib, dietlibc, ...) which may produce smaller binaries when using
stdio with FP.

- Stay with the IAR tools and buy the full version.
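
The first suggestion (int/frac conversion without stdio) could look
roughly like the sketch below. It assumes the value is already held as
an integer number of thousandths; format_milli is a hypothetical helper
and not a newlib function.

```c
#include <stddef.h>

/* Hypothetical helper: format a value held as thousandths,
   e.g. 23456 -> "23.456", without using stdio or floating point.
   Returns buf, or NULL if the buffer is too small. */
static char *format_milli(long milli, char *buf, size_t len)
{
    char tmp[24];                 /* integer digits in reverse order */
    size_t n = 0, pos = 0;
    unsigned long mag = (milli < 0) ? 0UL - (unsigned long)milli
                                    : (unsigned long)milli;
    unsigned long whole = mag / 1000UL;
    unsigned long frac = mag % 1000UL;

    do {                          /* least significant digit first */
        tmp[n++] = (char)('0' + whole % 10UL);
        whole /= 10UL;
    } while (whole != 0UL);

    /* sign + integer digits + '.' + 3 fraction digits + terminator */
    if (len < n + 5U + (milli < 0 ? 1U : 0U))
        return NULL;

    if (milli < 0)
        buf[pos++] = '-';
    while (n > 0)
        buf[pos++] = tmp[--n];
    buf[pos++] = '.';
    buf[pos++] = (char)('0' + frac / 100UL);
    buf[pos++] = (char)('0' + (frac / 10UL) % 10UL);
    buf[pos++] = (char)('0' + frac % 10UL);
    buf[pos] = '\0';
    return buf;
}
```

The resulting string can then be sent through whatever character output
routine the project already has (the "my_putstring" idea), so neither
printf nor the floating-point formatting code gets linked.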


Martin Thomas

by Ralawa (Guest)


Thank you very much, Thomas, for your suggestions.
I will continue my investigations.

by Andreas S. (andreas) (Admin)


Another reason why GCC often produces larger binaries than other
compilers is that unused functions are not removed. Since 4.0 the
compiler/linker can do this automatically if you add the following
options:

LDFLAGS:
-Wl,-Map=$(TARGET).map,--cref,--gc-section

CFLAGS:
-ffunction-sections -fdata-sections

@Martin: this should probably be included in the example Makefiles.

by Jim K. (ancaritha)


> LDFLAGS:
> -Wl,-Map=$(TARGET).map,--cref,--gc-section
>
> CFLAGS:
> -ffunction-sections -fdata-sections


I added these to my makefile and I get the following error while
linking.

1>Linking: .out/ArmExternalComm-0.1.0.elf
1>arm-elf-gcc -mthumb -mcpu=arm7tdmi -mthumb-interwork -I. -gdwarf-2
-DExternalComm -DEXT_BAUD_RATE=115200 -DALT_BUS_MASTER=3
-DArmExternalComm -DVERSION_MAJOR=0 -DVERSION_MINOR=1
-DVERSION_RELEASE=0 -DROM_RUN -DVECTORS_RAM -D__WinARM__  -O1 -Wall
-Wno-cast-align -Wimplicit -Wno-non-virtual-dtor -Wpointer-arith
-Wswitch -Wredundant-decls -Wreturn-type -Wshadow -Wunused
-I../../../../Root/Builds/BuildComps/ArmResources
-I../../../../Root/GenComps -I../../../../Root/SysComps
-I../../../../Root/AppComps -I../../../../Root/ControlComps
-I../../../../Root/HwDrivers/ARM -I../../../../Root/SystemConfigurations
-I../../../../Root/HwConfigurations -I../../../../Root/HwComponents
-ffunction-sections -fdata-sections -MD -MP -MF
.dep/ArmExternalComm-0.1.0.elf.d .out/SAM7A3Assembly.o  .out/SAM7Ainit.o
.out/WinARMsyscalls.o  .out/Arm7DemoBoardHwConfig.o
.out/ExternalCommPlatformConfig.o .out/AppMain.o
.out/ExtCommP2_1Config.o .out/ExtCommConfig.o .out/ExternalCommMgr.o
.out/ExternalCommLayer.o .out/TokenParser.o .out/AppMessages.o
.out/Messages.o .out/FcError.o .out/PmStatus.o .out/SysEnums.o
.out/AppEnums.o .out/InterruptServicer.o .out/USBserialPorts.o
.out/OCHardware.o .out/USBOut.o .out/USBIn.o .out/USBCtrl.o
.out/USBenumerate.o .out/eepromDD.o .out/serialPorts.o .out/EventMgr.o
.out/TimerMgr.o .out/FcAssert.o .out/BufferPool.o .out/SerialEncoder.o
.out/AccessControlLayer.o .out/InterSubSystemComm.o .out/Debug.o
.out/SysStats.o .out/AtomicLock.o .out/ObjectList.o .out/BaseObject.o
.out/ConfigObject.o .out/Heap.o   --output
.out/ArmExternalComm-0.1.0.elf -nostartfiles
-Wl,-Map=.out/ArmExternalComm-0.1.0.map,--cref,--gc-section -lc  -lm -lc
-lgcc  -lstdc++
-T../../../../Root/Builds/BuildComps/ArmResources/AT91SAM7A3-ROM.ld
1>c:\winarm\bin\..\lib\gcc\arm-elf\4.1.0\..\..\..\..\arm-elf\bin\ld.exe:
internal error c:/winarms/binutils-060330/ld/ldlang.c 4241
1>collect2: ld returned 1 exit status


My makefile has:

# Linker flags.
#  -Wl,...:     tell GCC to pass this to linker.
#    -Map:      create map file
#    --cref:    add cross reference to  map file
LDFLAGS = -nostartfiles -Wl,-Map=$(OBJDIR)/$(OUTTARGET).map,--cref,--gc-section
LDFLAGS += -lc
LDFLAGS += $(NEWLIBLPC) $(MATH_LIB)
LDFLAGS += -lc -lgcc
LDFLAGS += $(CPLUSPLUS_LIB)

ifdef USE_SMALL_PRINTF
LDFLAGS += $(PRINTF_LIB) $(SCANF_LIB) $(MATH_LIB)
endif

# Set Linker-Script Depending On Selected Memory
ifeq ($(RUN_MODE),RAM_RUN)
LDFLAGS +=-T$(PATH_TO_LINKSCRIPTS)$(SUBMDL)-RAM.ld
else
LDFLAGS +=-T$(PATH_TO_LINKSCRIPTS)$(SUBMDL)-ROM.ld
endif



# Flags for C and C++ (arm-elf-gcc/arm-elf-g++)
CFLAGS = -g$(DEBUG)
CFLAGS += $(CDEFS) $(CINCS)
CFLAGS += -O$(OPT)
CFLAGS += -Wall -Wcast-align -Wimplicit
CFLAGS += -Wpointer-arith -Wswitch
CFLAGS += -Wredundant-decls -Wreturn-type -Wshadow -Wunused
########CFLAGS += -Wa,-adhlns=$(subst $(suffix $<),.lst,$<)
CFLAGS += $(patsubst %,-I%,$(EXTRAINCDIRS))
CFLAGS += -ffunction-sections -fdata-sections

#AT91-lib warnings with:
##CFLAGS += -Wcast-qual

# flags only for C
CONLYFLAGS += -Wnested-externs
CONLYFLAGS += $(CSTANDARD)



# flags only for C++ (arm-elf-g++)
# CPPFLAGS = -fno-rtti -fno-exceptions
CPPFLAGS = -g$(DEBUG)
CPPFLAGS += $(CDEFS) $(CINCS)
CPPFLAGS += -O$(OPT)
CPPFLAGS += -Wall -Wno-cast-align -Wimplicit -Wno-non-virtual-dtor
CPPFLAGS += -Wpointer-arith -Wswitch
CPPFLAGS += -Wredundant-decls -Wreturn-type -Wshadow -Wunused
########CPPFLAGS += -Wa,-adhlns=$(subst $(suffix $<),.lst,$<)
CPPFLAGS += $(patsubst %,-I%,$(EXTRAINCDIRS))
CPPFLAGS += -ffunction-sections -fdata-sections
#AT91-lib warnings with:
##CPPFLAGS += -Wcast-qual



Any thoughts on what I might be doing wrong?  I checked the GCC
doc that came with WinARM for those flags but I couldn't find them.

by Martin Thomas (Guest)


A "multi-reply" and some additional information which might be
interesting for all readers:
---

-> Andreas
Thanks for the reminder. A makefile with these settings has been on the
todo-list for some time.
---


I have modified an example from the LPC213x/4x "driver collection".
Download:
http://www.siwawi.arubi.uni-kl.de/avr_projects/arm_projects/lpc2k_bundle_port/index.html
So far only the example "uart_polled" integrates a modified makefile.
---


--> Ralawa
I have extended the example to use stdio-(i)printf and collected the
binary-sizes for several setups. The basic configuration creates
thumb/thumb-interwork binaries. All settings can be seen in the makefile
of the example. It seems that you compile an ARM binary. Did you try to
build a Thumb binary so that the "thumb-libc" gets linked? Does the
IAR linker link with "thumb-libraries" or "ARM-libraries"?
---


For those who do not want to download the archive with the example, here
is a copy of the table in uart_polled/main.c. It's not the best example
for a "size benchmark" but should give an idea of the stdio memory usage.

/*
 20060726 - mt
   preserve MEMMAP and restore startup-value after memory-dump
 20060727 - mt
   - modified Makefile to enable gcc 4 "remove unused code"-feature
     added (.bss.*) to linker-scripts
   - demo for stdio: new file syscalls.c (USR stack-size had to be
     increased for floating-points -> startup.S)
*/

/*
   some "statistics" for this demo-application in revision 20060727 when
   using the tools from WinARM 6/06 (gcc 4.1.1 et al)

   r.u.c.(*) DEMO_STDIO   DEMO_STDIO_FP     .text    .data    .bss [bytes]

   no        not defined  not defined        2908      360       8
   yes       not defined  not defined        2700       "        "
                                      diff:   208

   no        defined      not defined       12156     2444      64
   yes       defined      not defined       11976       "        "
                                      diff:   180

   no        defined      defined           27172     2444      64
   yes       defined      defined           26992       "        " (#)
                                      diff:   180

    "OPT"-settings for configuration (#) (stdio/FP)
   OPT  .text    .data    .bss [bytes]
   s    26992     2444      64
   0    29108       "        "
   1    27296       "        "
   2    27272       "        "
   3    29360       "        "

(*) r.u.c = remove unused code with -ffunction-sections -fdata-sections
   for the compilation and --gc-sections for linking
   yes : -ffunction-sections -fdata-sections used in CFLAGS
   no  : -ffunction-sections -fdata-sections not used in CFLAGS
*/
---


-> Jim Kaz wrote
> Any thoughts on what I might be doing wrong?
Yes, at least an idea: please check your linker-script. The script from
the example mentioned above should give an idea. It should be rather
easy to port this to your AT91SAM7A linker-script.
---


Martin Thomas

by Andreas S. (andreas) (Admin)


Jim Kaz wrote:
>> LDFLAGS:
>> -Wl,-Map=$(TARGET).map,--cref,--gc-section

That should be "--gc-sections".

by Jim K. (ancaritha)


I'm a little confused about these flags....  In the GCC doc it says for
-ffunction-sections and -fdata-sections:

Only use these options when there are significant benefits from doing
so. When you specify these options, the assembler and linker will create
larger object and executable files and will also be slower.

I must be misunderstanding this, because to me it sounds like the files
should be getting larger, whereas in Martin's example they are clearly
getting smaller.  Also, does anyone know how much of a performance hit
an application may take from using these options? 1%, 50%?  A broad range
in between that varies from program to program and can only be checked by
myself?  I'm figuring the third one, but I thought I'd ask just in case
:)

Oh, and the link to the zip file containing the examples is a dead link.

by Martin Thomas (Guest)


Jim Kaz wrote:
> I'm a little confused about these flags....  In the GCC doc it says for
> -ffunction-sections and -fdata-sections:
>
> Only use these options when there are significant benefits from doing
> so. When you specify these options, the assembler and linker will create
> larger object and executable files and will also be slower.
>
> I must be misunderstanding this, because to me it sounds like the files
> should be getting larger, whereas in Martin's example they are clearly
> getting smaller.  Also, does anyone know how much of a performance hit
> an application may take from using these options? 1%, 50%?  A broad range
> in between that varies from program to program and can only be checked by
> myself?  I'm figuring the third one, but I thought I'd ask just in case
> :)
>

I have not read enough of the documentation, but this is how I understand
it so far: the idea behind these options (for compiler and linker) is to
eliminate unused code. Suppose you have placed several functions in one
source file, which gets compiled to one object. Linking is normally
(without the additional options) object-based, so the complete object
gets linked even if only one of the functions in it is used by the
application. With the "new" options of GCC 4/binutils, unused functions
are not linked even if they are in an object file which contains a used
function. The method seems to be: place every function in a separate
section (which should lead to an increased size of the object code
because of the additional information) and link "section-based" (not
object-based). I do not understand why this would lead to a "larger
executable" too. Maybe for an executable for a "real PC" which includes
debug information. I think the "will be slower" is not meant for the code
on the target but for the build process: the compiler, assembler and
linker have "more work to do" and will be slower.
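
To illustrate the idea, here is a small sketch of one source file with a
used and an unused function. The file name, function names and the
compile command in the comment are assumptions for illustration only;
I have not measured this on the target.

```c
/* demo.c: hypothetical source file with one used and one unused function.
   Assumed compile step for this sketch (flags as discussed in the thread):

     arm-elf-gcc -Os -ffunction-sections -fdata-sections -c demo.c

   The link step then adds -Wl,--gc-sections. With -ffunction-sections
   each function goes into its own input section (.text.used_helper,
   .text.unused_helper) instead of one shared .text section. */

int used_helper(int x)
{
    return x + 1;
}

/* Never referenced from the entry point: with section-based linking this
   function becomes a candidate for removal, although it lives in the
   same object file as used_helper(). */
int unused_helper(int x)
{
    return x * 1000;
}

/* Stand-in for main(): the only caller, and it needs used_helper() only. */
int app_entry(void)
{
    return used_helper(41);
}
```

Looking for unused_helper in the generated map file is one way to check
whether it has actually been dropped.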

As already written: my example is not a good "size benchmark". All
functions in the user code are rather small. So far I have not tried to
find out which unused code has been removed. The C and GCC libraries
should be implemented as "one function per object", so only used
functions of the library get linked even when using the "old" method.
This is why I don't think the new options will lead to a decrease in the
memory demand of functions from the library. But the libraries in WinARM
are not compiled with the new options anyway, which would be needed to
"measure" this correctly.


> Oh, and the link to the zip file containing the examples is a dead link.
Sorry. The link has been fixed.

Martin Thomas

by Ralawa (Guest)


Martin Thomas wrote:
> --> Ralawa
> I have extended the example to use stdio-(i)printf and collected the
> binary-sizes for several setups. The basic configuration creates
> thumb/thumb-interwork binaries. All settings can be seen in the makefile
> of the example. It seems that you compile an ARM binary. Did you try to
> build a Thumb binary so that the "thumb-libc" gets linked? Does the
> IAR linker link with "thumb-libraries" or "ARM-libraries"?

You are right, I made a mistake. libefsl was compiled with your makefile,
so it was compiled in thumb mode, but my source code was not
compiled in thumb mode. I used SRCARM instead of SRC to compile.

I have now put all my C files in the SRC variable, including the
interrupt service routines. It works fine, but I read that ISRs should
not be compiled in ARM mode. Can you explain why?

Now the size of the binary file has been reduced by approximately 3 kB,
but it is still 1.6 times bigger than the binary produced by the IAR
compiler.

I am trying to use CFLAGS += -ffunction-sections -fdata-sections and
LDFLAGS = -nostartfiles -Wl,-Map=$(TARGET).map,--cref,--gc-sections to
reduce the binary size, but I have the following error:
Linking: main.elf

arm-elf-gcc -mthumb -mcpu=arm7tdmi -mthumb-interwork -I. -gdwarf-2
-DROM_RUN -D__WinARM__  -Os -Wall -Wcast-align -Wimplicit
-Wpointer-arith -Wswitch -Wredundant-decls -Wreturn-type -Wshadow
-Wunused -Wa,-adhlns=../compil/SrcWinARM/Cstartup.lst
-I../../efsl-0.2.7/conf -I../../efsl-0.2.7/inc -I../compil/SrcWinARM
-I../.. -ffunction-sections -fdata-sections -MD -MP -MF .dep/main.elf.d
../compil/SrcWinARM/Cstartup.o   main.o ../compil/SrcWinARM/syscalls.o
uart0.o uart1.o commands.o one-wire.o nmea.o sdcard.o xmodem.o radars.o
s-cli.o clock.o ../../efsl-0.2.7/src/interfaces/at91_spi.o
../../efsl-0.2.7/src/interfaces/efsl_dbg_printf_arm.o
../../efsl-0.2.7/src/interfaces/sd.o ../../efsl-0.2.7/src/debug.o
../../efsl-0.2.7/src/dir.o ../../efsl-0.2.7/src/disc.o
../../efsl-0.2.7/src/efs.o ../../efsl-0.2.7/src/extract.o
../../efsl-0.2.7/src/fat.o ../../efsl-0.2.7/src/file.o
../../efsl-0.2.7/src/fs.o ../../efsl-0.2.7/src/ioman.o
../../efsl-0.2.7/src/ls.o ../../efsl-0.2.7/src/mkfs.o
../../efsl-0.2.7/src/partition.o ../../efsl-0.2.7/src/plibc.o
../../efsl-0.2.7/src/time.o ../../efsl-0.2.7/src/ui.o
../compil/SrcWinARM/Cstartup_SAM7.o     --output main.elf -nostartfiles
-Wl,-Map=main.map,--cref,--gc-sections   -lm -lc -lgcc
-T../compil/SrcWinARM/AT91SAM7S64-ROM.ld

c:/winarm/bin/../lib/gcc/arm-elf/4.1.1/../../../../arm-elf/bin/ld.exe:
internal error c:/winarms/binutils-060606/ld/ldlang.c 4260
collect2: ld returned 1 exit status

make.exe: *** [main.elf] Error 1

Do you have an idea about this error?

Thank you.

by Martin Thomas (Guest)


Ralawa wrote:

> I have now put all my C files in the SRC variable, including the
> interrupt service routines. It works fine, but I read that ISRs should
> not be compiled in ARM mode. Can you explain why?

The core switches to ARM mode on exceptions (see the technical manual of
the core architecture used). So the function to which an exception vector
points has to be built with ARM instructions. When the exception-vector
entry uses the VIC (or AIC or ...) vector address directly, the handlers
have to be built in ARM mode. If you are using a "wrapper", you can call
functions built with Thumb instructions from inside this wrapper (the
wrapper itself has to be ARM code). In most of the code based on the
examples provided by Atmel, a wrapper written in assembler is used (see
*startup.S), so the "user functions" can be compiled with thumb/thumb-iw.
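
The wrapper idea can be sketched on a host PC as follows. This is only a
simulation of the dispatch logic under stated assumptions: the variable
standing in for the interrupt controller's vector register and all
function names are hypothetical, and the real wrapper is the assembler
code in *startup.S, not C.

```c
#include <stddef.h>

typedef void (*irq_handler_t)(void);

/* Host-side stand-in for the interrupt controller's vector register
   (AIC_IVR on AT91 parts); on the target this is a hardware register. */
static irq_handler_t sim_vector_register = NULL;

static unsigned int tick_count = 0;

/* "User function": on the target it may be built as Thumb code, because
   it is only reached through the wrapper below. */
void timer_tick_handler(void)
{
    tick_count++;
}

/* Stand-in for the ARM-mode wrapper that the exception vector points to.
   The real wrapper also saves and restores context and calls the handler
   with an interworking branch; only the dispatch step is shown here. */
void irq_wrapper(void)
{
    irq_handler_t handler = sim_vector_register;
    if (handler != NULL)
        handler();
}
```

Only the wrapper has to be ARM code in this scheme; the handlers it calls
can stay in the thumb-compiled SRC list.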

> I am trying to use CFLAGS += -ffunction-sections -fdata-sections and
> LDFLAGS = -nostartfiles -Wl,-Map=$(TARGET).map,--cref,--gc-sections to
> reduce the binary size, but I have the following error:
> Linking: main.elf
> ...
> c:/winarm/bin/../lib/gcc/arm-elf/4.1.1/../../../../arm-elf/bin/ld.exe:
> internal error c:/winarms/binutils-060606/ld/ldlang.c 4260
> collect2: ld returned 1 exit status
> Do you have an idea about this error?

Please try to "rebuild": make clean; make all

Martin Thomas

by Ralawa (Guest)


Martin Thomas wrote:
> Please try to "rebuild": make clean; make all

I did, of course. It does not work.

by Martin Thomas (Guest)


Ralawa wrote:
> Martin Thomas wrote:
>> Please try to "rebuild": make clean; make all
>
> I did, of course. It does not work.

It might be an issue in the linker script. Please try to compare your
setup with the one in the code from
http://www.siwawi.arubi.uni-kl.de/avr_projects/arm_projects/lpc2k_bundle_port/index.html
(subdirectory uart_polled) and
http://www.siwawi.arubi.uni-kl.de/avr_projects/arm_projects/index_at91.html#at91_gamma
.
No warranty that everything is correct in the examples provided on my
pages but it seems to work as expected "here" and should at least give
an idea.

Martin Thomas
