Hi all.
I tested my application with an oscilloscope and saw that my function takes 60 us to run. How can I run this function from RAM, or check whether the MAM is in use?

Regards
Sevc Dominik wrote:
> Hi all.
> I tested my application with an oscilloscope and saw that my function
> takes 60 us to run.
> How can I run this function from RAM, or check whether the MAM is in use?
>
> Regards

With respect to the MAM, read the manual:
http://www.standardics.nxp.com/support/documents/microcontrollers/pdf/user.manual.lpc2141.lpc2142.lpc2144.lpc2146.lpc2148.pdf
Chapter 3 is what you need. Basically you ensure it is switched off (the reset default) by writing to the MAMCR register, then set the timing in the MAMTIM register before switching it back on again. The manual has the timing details. I suggest that you do this in the C runtime start-up (normally crt0.s) to get it running as soon as possible after setting up the clock.

With this enabled you may find you need not run from RAM. It may make little difference for the extra complexity, and RAM is scarcer on your part. The MAM is designed to allow code in Flash to execute with very few wait states. It will have no effect on code that is run from RAM (unless the code fetches data from Flash).

Clifford
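For readers who want to see the MAM sequence described above as code, here is a minimal sketch, not taken from the manual. The struct `mam_regs_t`, the functions `mam_fetch_cycles` and `mam_enable`, and the macro names are hypothetical. The base address and the "3 cycles above 40 MHz" value follow the LPC214x user manual as I read it; the thresholds for lower clocks are my assumption and should be checked against Chapter 3.

```c
#include <stdint.h>

/* MAM registers in the order they appear on the LPC214x:
 * MAMCR at offset 0, MAMTIM at offset 4 (assumed layout). */
typedef struct {
    volatile uint32_t mamcr;   /* control: 0 = off, 1 = partial, 2 = full */
    volatile uint32_t mamtim;  /* Flash fetch cycles, 1..7 */
} mam_regs_t;

#define MAM_BASE_ADDR 0xE01FC000u  /* assumed LPC214x address, verify in the manual */
#define MAM_OFF       0u
#define MAM_FULL      2u

/* Pick a fetch-cycle count for a given core clock.
 * Thresholds below 40 MHz are an assumption, not quoted from the thread. */
uint32_t mam_fetch_cycles(uint32_t cclk_hz)
{
    if (cclk_hz < 20000000u) {
        return 1u;
    }
    if (cclk_hz <= 40000000u) {
        return 2u;
    }
    return 3u;
}

/* Switch the MAM off, set the timing, then switch it fully on. */
void mam_enable(mam_regs_t *mam, uint32_t cclk_hz)
{
    mam->mamcr  = MAM_OFF;                    /* timing may only change while off */
    mam->mamtim = mam_fetch_cycles(cclk_hz);
    mam->mamcr  = MAM_FULL;
}
```

On the target the call would look like `mam_enable((mam_regs_t *)MAM_BASE_ADDR, 60000000u);`, placed right after the PLL is set up. Passing the register block as a pointer is only there so the routine can be exercised on a PC against a dummy struct.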
How much faster does code execute from RAM?
In my code I do calculations with double variables, and that takes a long time. I must reduce the calculation.
I measured the time of this:

    IOSET0 = (1<<5);
    IOCLR0 = (1<<5);

These instructions take 250 ns. And this:

    IOSET0 = (1<<5);
    T0MR0 = T0TC + Ax_p_x.Timer;//INTERVAL_XuS;//
    IOCLR0 = (1<<5);

takes 1.12 us.

If I run this from RAM, how much faster will it be than from Flash?
At the moment my calculation uses high precision for the motion, but that costs more microseconds. The acceleration and deceleration follow a sinusoid; I think that is not a job for an ARM7.
If I use the toggle-on-match output I can save some time, but I don't know whether it would really save anything.

Regards
Sevc Dominik wrote:
> How much faster does code execute from RAM?
> In my code I do calculations with double variables, and that takes a long time.

As I said, the MAM probably makes the improvement marginal. Your part has no floating-point hardware, which makes floating-point calculations especially slow. In my experience a double-precision floating-point operation in software takes about ten times longer than an equivalent (or at least suitable) fixed-point calculation. I suggest that the greatest improvement can be made by removing the need for floating point.

A quick gain may be achieved by using single precision rather than double: because a single-precision value fits into a single machine word, it requires far fewer operations and memory accesses to manipulate. If you do this and are using library functions, make sure you use the single-precision versions. In C++ this is automatic when you include <cmath> because they are overloaded, but in C (<math.h>) you would, for example, have to use sinf() rather than sin().

Don't get hung up on the RAM execution thing; I think you will see little benefit. It is generally true that RAM execution is faster, but on your hardware specifically, the MAM is designed exactly to make that unnecessary. Note that the MAM is peculiar to the LPC2000 series, and if you were to port to another device you might need to run from RAM, but frankly I doubt even that: your use of floating point will swamp any performance hit from Flash execution.

> I must reduce the calculation.
> I measured the time of this:
> IOSET0 = (1<<5);
> IOCLR0 = (1<<5);
> These instructions take 250 ns.
> And this:
> IOSET0 = (1<<5);
> T0MR0 = T0TC + Ax_p_x.Timer;//INTERVAL_XuS;//
> IOCLR0 = (1<<5);
> takes 1.12 us.
>
> If I run this from RAM, how much faster will it be than from Flash?

Are those timings with or without the MAM fully enabled? The answer is easiest to determine by disassembling the generated code and counting the instructions.
At 60MHz you will achieve more-or-less 60 MIPS. However, bear in mind that I/O accesses are slower than RAM accesses as well, and insert wait states. Again the LPC architecture has 'acceleration' features to mitigate this. Again, read the manual, but to use the accelerated I/O you need to use the 'F' prefixed I/O registers. You are using the legacy code support registers, which are far slower. The 'F' registers include single-bit operations that are even faster than FIOSET accesses. See Chapter 8 of the manual. There really is no substitute for reading the documentation in these cases. That is what I am doing; I've never used the part!

> At the moment my calculation uses high precision for the motion, but
> that costs more microseconds. The acceleration and deceleration follow
> a sinusoid; I think that is not a job for an ARM7.

On the contrary: use fixed-point arithmetic (scaled integers) and create a sinusoid look-up table. It is very unlikely that you need double precision. Use a 2^n value for your scaling so that shift operations can be used rather than division (the compiler will do that for you even if you write a divide, provided the right-hand side is a power-of-2 constant).

The easiest way to create the look-up table is in a spreadsheet: export it as CSV and wrap it in

    int sin_table[] = { <insert CSV data here> } ;

or you might write your own code generator. Even if you persisted with floating point and made the look-up table type float, it would be faster than using the sin() function.

The size of the table will depend on the necessary precision and available memory. A 512-element look-up table will give better than one degree accuracy and take 2K of Flash (at four bytes per entry). You may not need that much resolution, since the motor will naturally interpolate between calculated points with a linear approximation. You can reduce the table size by encoding only one quadrant, and then using reflection and inversion to obtain values in the other quadrants.
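As a rough sketch of the one-quadrant table described above (not code from this thread): a 17-entry quarter-wave table scaled to 32767, expanded to a full cycle of 64 steps by reflection and inversion. The names `sin_quarter` and `sin_q15`, the table size and the scale are assumptions chosen to keep the listing short; a real table would more likely hold 128 or more entries per quadrant.

```c
#include <stdint.h>

#define QUARTER_STEPS 16u                   /* steps per quadrant, a power of 2 */
#define FULL_STEPS    (4u * QUARTER_STEPS)  /* steps per full cycle */

/* sin(k * 90/16 degrees) * 32767, rounded, for k = 0..16.
 * Declared const so the linker can place it in Flash. */
static const int16_t sin_quarter[QUARTER_STEPS + 1u] = {
        0,  3212,  6393,  9512, 12539, 15446, 18204, 20787,
    23170, 25329, 27245, 28898, 30273, 31356, 32137, 32609,
    32767
};

/* Sine of a phase given in table steps (FULL_STEPS steps per cycle).
 * Returns a value in the range -32767..32767. */
int16_t sin_q15(uint32_t phase)
{
    uint32_t p        = phase & (FULL_STEPS - 1u);    /* wrap to one cycle */
    uint32_t quadrant = p / QUARTER_STEPS;            /* 0..3 */
    uint32_t i        = p & (QUARTER_STEPS - 1u);     /* offset in quadrant */

    switch (quadrant) {
    case 0u:                                          /* rising, positive */
        return sin_quarter[i];
    case 1u:                                          /* reflected */
        return sin_quarter[QUARTER_STEPS - i];
    case 2u:                                          /* inverted */
        return (int16_t)-sin_quarter[i];
    default:                                          /* reflected and inverted */
        return (int16_t)-sin_quarter[QUARTER_STEPS - i];
    }
}
```

With this layout `sin_q15(8)` and `sin_q15(24)` both return 23170 (45 and 135 degrees), and `sin_q15(40)` returns -23170. Because the cycle length is a power of 2, wrapping the phase is a single AND rather than a modulo.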
If you are using degrees or radians, it may be simpler to make your table size a multiple of 360 or 3142 (PI*1000, for four-digit accuracy).

> If I use the toggle-on-match output I can save some time, but I don't
> know whether it would really save anything.

Then you may as well use the PWM as I suggested.

Anyway, my conclusion would have to be that you don't need RAM execution, a faster processor or an FPU; you merely need to adapt your coding practices to suit the hardware. That means reading the manual and learning to do without floating point (using fixed point and look-up tables). I can assure you that most commercial motion control, and even more computationally complex applications, do not use floating point.

Some resources:
http://www.embedded.com/98/9804fe2.htm
http://en.wikipedia.org/wiki/Fixed-point_arithmetic

If you were to post the expression you are trying to compute, I could perhaps give more specific advice.

Clifford
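To make the "scaled integers with a 2^n scale" advice concrete, here is a minimal sketch of a 16.16 fixed-point type. Everything in it (`fix16_t`, `FIX_ONE`, the helper names) is hypothetical and not from the thread, and the choice of 16 fractional bits is an assumption; pick the split that suits the range of your own values.

```c
#include <stdint.h>

#define FIX_SHIFT 16                          /* fractional bits: scale is 2^16 */
#define FIX_ONE   ((int32_t)1 << FIX_SHIFT)   /* the value 1.0 */

typedef int32_t fix16_t;                      /* value = raw / 65536 */

/* Integer to fixed point. Multiplying avoids left-shifting a negative value. */
static inline fix16_t fix_from_int(int32_t n)
{
    return n * FIX_ONE;
}

/* Fixed point to integer, truncating toward zero. */
static inline int32_t fix_to_int(fix16_t x)
{
    return x / FIX_ONE;
}

/* Multiply: the 64-bit intermediate holds the doubled scale, then one
 * division by the power-of-2 constant restores it. */
static inline fix16_t fix_mul(fix16_t a, fix16_t b)
{
    return (fix16_t)(((int64_t)a * b) / FIX_ONE);
}

/* Divide: pre-scale the numerator so the quotient keeps the scale.
 * The caller must ensure b is not zero. */
static inline fix16_t fix_div(fix16_t a, fix16_t b)
{
    return (fix16_t)(((int64_t)a * FIX_ONE) / b);
}
```

For example, `fix_mul(fix_from_int(3), FIX_ONE / 2)` gives 98304, which is 1.5 in this format. Addition and subtraction need no helpers because both operands share the same scale.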
Hi Clifford.
I measured the time of some functions. The basic settings are in the Startup.S file. These are the values I have set:

Crystal: 12MHz
VPB Div: 4
PLL M Multiplier: 5-1 > 4
PLL D Divider: 1 > 2
MAM: Fully Enabled, Timing 4

I measured this code; it is the Timer1 interrupt, every 10Ms:

    IOSET0 = (1<<5);
    T1IR = 1;          /* clear interrupt flag from MR0 */
    IENABLE;           /* handles nested interrupt */
    IDISABLE;
    VICVectAddr = 0;   /* Acknowledge Interrupt */
    IOCLR0 = (1<<5);

It takes 790 ns. With a higher multiplier:

Multiplier 6: 660 ns
Multiplier 7: 490 ns
Multiplier 8: 430 ns
Multiplier 9: 390 ns

I changed the multiplier back to 5, changed VPB Div to 0 and measured again:

Multiplier 5: 520 ns
Multiplier 6: 430 ns
Multiplier 7: 320 ns
Multiplier 8: 290 ns

With these values I don't know whether the MCU is stable; it runs hotter than with the normal settings.
I think it would be good to write some of the code in assembler rather than C.

Regards
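For context on the multiplier values above, here is a small sketch that works out the core clock and the PLL oscillator frequency for each setting. It assumes the usual LPC214x relations (cclk = Fosc x M, Fcco = cclk x 2 x P) and limits (cclk up to 60 MHz, Fcco between 156 and 320 MHz) as I read the user manual; check them against the PLL chapter. It also assumes that "D Divider 1 > 2" means P = 2. The type `pll_result_t` and function `pll_check` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define CCLK_MAX_HZ  60000000u   /* assumed maximum core clock */
#define FCCO_MIN_HZ 156000000u   /* assumed PLL oscillator range */
#define FCCO_MAX_HZ 320000000u

typedef struct {
    uint32_t cclk_hz;   /* core clock */
    uint32_t fcco_hz;   /* PLL current-controlled oscillator */
    bool     in_spec;   /* true if both are inside the assumed limits */
} pll_result_t;

/* fosc_hz: crystal frequency, m: multiplier (not the register value m-1),
 * p: divider (1, 2, 4 or 8). */
pll_result_t pll_check(uint32_t fosc_hz, uint32_t m, uint32_t p)
{
    pll_result_t r;

    r.cclk_hz = fosc_hz * m;
    r.fcco_hz = r.cclk_hz * 2u * p;
    r.in_spec = (r.cclk_hz <= CCLK_MAX_HZ) &&
                (r.fcco_hz >= FCCO_MIN_HZ) &&
                (r.fcco_hz <= FCCO_MAX_HZ);
    return r;
}
```

Under those assumptions a 12 MHz crystal with M = 5 and P = 2 gives 60 MHz and 240 MHz, which is in range. M = 6 already gives a 72 MHz core clock, and M = 9 gives 108 MHz with an oscillator frequency of 432 MHz, so the faster timings in the list come from running the part outside its rated clock.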
Sevc Dominik wrote:
>
> I think it would be good to write some of the code in assembler rather than C.
>

I think you are "sweating the small stuff". Why are you worried about a few hundred nanoseconds when elsewhere you are using floating-point operations that take far longer!? That code is so simple that I doubt very much that hand coding in assembler will make much difference.

Bear in mind that, given the way you are performing the timings, a significant amount of the time is actually being consumed by the output operations you are using to do the measurement! You earlier measured that at 250ns, which, since the MAM would not affect the output timing, suggests that the execution time between the output toggles is very short. You would have to at least time the toggle on its own for each scenario and subtract that from all your timings, and even then that does not account for all the timing instrumentation overhead.

I note that you are still using the slow I/O registers. Which makes me wonder why I bothered to investigate it for you!

Another reason I suggest that you are sweating "the small stuff" is that I suspect the interrupt latency is more significant than the execution time of the lines you posted, and you can probably do little about that. A better way to estimate overall interrupt execution time is to create a busy-loop in main() (after setting up the timer interrupt) that constantly toggles the output. When the interrupt executes, the activity will stop; time the period for which there is no activity to determine the true interrupt execution time. Your code is probably insignificant compared to the number of instructions required to preserve and restore registers and perform the processor mode switching to and from IRQ, and then there is the hardware latency as well. If your whole interrupt truly takes less than 10us I would be impressed. You have to measure correctly and consider the effect the act of measurement itself has on the timing.
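The two measurement ideas above (subtract the bare toggle time, and watch for the gap in a busy-loop) could be sketched roughly as follows. This is an illustration only: `probe_pin_t`, `probe_pulse`, `probe_busy_loop` and `net_time_ns` are hypothetical names, and the registers are passed in as pointers so that nothing here depends on a particular header file. On the target you would point them at the port's set and clear registers.

```c
#include <stdint.h>

/* One probe pin: the port's "set" and "clear" registers plus the pin's bit. */
typedef struct {
    volatile uint32_t *set_reg;
    volatile uint32_t *clr_reg;
    uint32_t           mask;
} probe_pin_t;

/* One high-then-low pulse. */
static void probe_pulse(const probe_pin_t *pin)
{
    *pin->set_reg = pin->mask;
    *pin->clr_reg = pin->mask;
}

/* Pulse continuously until *stop becomes non-zero. Called from main()
 * with interrupts running, the pulse train pauses for exactly as long
 * as each interrupt takes, entry and exit overhead included. */
void probe_busy_loop(const probe_pin_t *pin, volatile const uint32_t *stop)
{
    while (*stop == 0u) {
        probe_pulse(pin);
    }
}

/* Remove the cost of the bare toggle pair from a bracketed measurement.
 * Both values must come from the same clock and MAM settings. */
uint32_t net_time_ns(uint32_t measured_ns, uint32_t toggle_only_ns)
{
    return (measured_ns > toggle_only_ns) ? (measured_ns - toggle_only_ns) : 0u;
}
```

On the scope, the longest gap in the pulse train produced by `probe_busy_loop` is the figure of interest. `net_time_ns` only corrects the simpler bracketed measurement and, as noted above, still leaves out interrupt entry and exit.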
The user manual suggests a MAM timing value of 3 for clocks above 40MHz, so you might be pushing it. What is worth more: shaving off 100ns, or having your application reliable in the field?

What timing do you actually need to achieve (and why)? I am struggling to understand what you are trying to achieve. Stepping a motor and generating motion profiles is well within the capability of the part you are using, and I suspect that you are worrying about the wrong thing. Remember, it is only too slow if it fails to meet its deadlines. What are the deadlines? If you cannot quantify them you are wasting your time measuring, because you have no means of determining success. I would suggest that if success depends on a few hundred nanoseconds, then you will fail.

Clifford
Hi Clifford.
I have been working on this the whole time. I reduced the calculation for acceleration, deceleration, maximum speed, etc. Now my code takes 16 us, and less if I overclock the CPU. But I measured my stepper motor (with its driver) under Mach3 and I cannot turn it faster than 12000 steps per second. I was surprised; I thought I could reach more than 35K steps per second, but no. There is no point in generating more steps than the stepper motor can follow. When I test my electronics with the new code, it can step at more than 20K steps per second. My code is much faster than Mach3 on my PC.

OK, I need some reserve of step speed, and with this code I have it. A problem may appear when I fit three motors to the machine (a small CNC) and generate some motion; then I will see whether the motion is correct (to the minimum step resolution) or whether the error is more than one step. At the moment I use an M6 threaded rod. The motor has 200 steps per revolution and I use half-stepping, so one turn takes 400 steps. The minimum step is 0.0025 mm. If the error is smaller than this step, all is good; if not, I must calculate with much better precision.

Thanks for all your suggestions and help. I will write more once I have working code and have started testing.

Regards.

PS: My problem now is not the hardware (LPC2142) but the C language (I am not a good programmer in this language).
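The figures in this post can be cross-checked with a few lines of integer arithmetic. The sketch below assumes a 1 mm thread pitch for the M6 rod (the standard coarse pitch, and consistent with the 0.0025 mm figure) and assumes that the 16 us is spent once per step on each axis; neither is stated outright in the thread. The function names are hypothetical.

```c
#include <stdint.h>

/* Linear travel per step in nanometres, from thread pitch in micrometres. */
uint32_t step_size_nm(uint32_t pitch_um, uint32_t steps_per_rev)
{
    return (pitch_um * 1000u) / steps_per_rev;
}

/* Time available between two steps, in nanoseconds. */
uint32_t step_period_ns(uint32_t steps_per_sec)
{
    return 1000000000u / steps_per_sec;
}

/* Share of CPU time spent generating steps, in percent.
 * work_us * steps_per_sec * axes is microseconds of work per second;
 * dividing by 10000 turns that into a percentage of one second. */
uint32_t cpu_load_percent(uint32_t work_us, uint32_t steps_per_sec, uint32_t axes)
{
    return (work_us * steps_per_sec * axes) / 10000u;
}
```

Under those assumptions `step_size_nm(1000u, 400u)` gives 2500 nm, matching the 0.0025 mm above, and 20K steps per second leaves 50 us between steps. One axis at that rate would use about 32% of the CPU and three axes about 96%, which is worth keeping in mind when judging how much reserve there really is.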