Anna Kowalska wrote:
> Is this possible, that computing square root of 32-bit integer takes
> about 4000 cycles? I'm using sqrt function (math.h), and str912
> microcontroller (arm9 core). I tried that 4 cycle/bit C routine:
> http://www.finesse.demon.co.uk/steven/sqrt.html (just out of curiosity
> because I need floating point result anyway) but still, it got only 2
> times quicker.
> I'm checking number of cycles with timer, which I start just before
> calculating sqrt, and after that I read its value.
> Does it really take so long?
So if you are running this at maximum speed for that part (96MHz) it is
taking about 42 microseconds? That sounds about right to me. sqrt() is a
double precision function and you have no FPU on that part, so it will
always be slow.
If you use the single precision version - sqrtf() - you will probably
find that it is about twice as fast, because it has to move and process
half the amount of data, and a single precision value fits into a
single machine word, so fewer instructions are required to process
it. Note that if you use C++ and include <cmath> rather than <math.h>,
then sqrt() is overloaded and will use the single or double precision
implementation as determined by its argument type.
Note that Newlib is open source, so if you want to look at the sqrt or
sqrtf implementations they are available, and you can replace them with
your own. Simply link object code or a library
containing these functions ahead of the standard library and it will
override the standard implementation. Or you could of course just give
your function a different name, but overriding makes the code more
portable; on a platform with an FPU, for example, you can simply remove
your override to use the native library.
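To give an idea of what a replacement might look like, here is a minimal
sketch. It is not the Newlib code; it is a plain Newton-Raphson
iteration that I have named my_sqrtf (a hypothetical name) so that it
can sit alongside the library version for testing. Renaming it sqrtf and
linking it ahead of the standard library would make it the override. It
assumes IEEE 754 single precision floats, returns 0 for negative input
rather than NaN, and makes no attempt to handle denormals properly.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical replacement for sqrtf(): Newton-Raphson on a float. */
float my_sqrtf(float x)
{
    uint32_t bits;
    float y;

    if (x <= 0.0f) {
        return 0.0f;    /* sketch only: no NaN or errno for x < 0 */
    }

    /* Initial guess: halve the biased exponent by shifting the raw bit
     * pattern, then add back half of the exponent bias
     * (0x3F800000 / 2 = 0x1FC00000). This lands within a few percent
     * of the true root. */
    memcpy(&bits, &x, sizeof bits);
    bits = (bits >> 1) + 0x1FC00000u;
    memcpy(&y, &bits, sizeof y);

    /* Three Newton steps, y = (y + x/y) / 2; each step roughly squares
     * the relative error. */
    y = 0.5f * (y + x / y);
    y = 0.5f * (y + x / y);
    y = 0.5f * (y + x / y);

    return y;
}
```

Each Newton step needs a software floating point divide on a part with
no FPU, so whether this beats the library routine on your ARM9 is
something to measure with your timer rather than assume.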
If you have the option of changing the device, the NXP LPC3xxx
series includes VFP hardware, which will accelerate floating point
operations by a factor of 5 in any case, and it also has double and single
precision SQRT instructions, which will obviously be far faster than a
software sqrt algorithm. The Newlib library will not use them by default,
so you will have to provide your own sqrt functions that do. I posted
some code here to do that a while back, but it seems to have been
dropped from the forum. I can dig it out if it is of use to you. A
perhaps more expensive solution is the Freescale iMX31 ARM11 with VFP.