I am a little bothered by the answers that I have received to some of my posts. Please do not think that I am ungrateful for the answers. It is just that, for now, the only reason I am interested in FPGAs is that I am trying to speed up a C program that runs fast, but not fast enough. I want to speed this program up by at least another order of magnitude. I think that I can, and I believe the way to do it is with FPGAs. Additional CPU cores and GPGPU programming have not been able to accomplish this. They speed the program up, but not enough. I guess there are other reasons to work with FPGAs, but this is my reason. Is there anything out there (on the internet or otherwise) that can help with this? I have found many documents on the internet, but nothing for someone new to FPGAs coming from another area of programming. Any help is greatly appreciated. Thanks in advance. Respectfully, Newport_j
From what I have read, the best way to speed up CPU designs is to partition the code into 4 parts and use 4 CPUs or DSPs. CPU designs in FPGAs are always a second-choice strategy.
newport_j wrote:
> I think that I can and I believe the way to do it is with fpgas.
This could be true, but given what you know about FPGAs so far, you should rather say "I hope the way to do it is with FPGAs". And as long as you don't give us more information, nobody can confirm or deny this.
newport_j wrote:
> but nothing for someone new
> to fpgas coming from another area of programming.
This is probably where you are most mistaken. FPGAs are not like microprocessors and are not programmed that way. FPGAs are a flexible piece of hardware and can implement a lot of adders, multipliers and so on, which then all work in parallel. They are, however, not so well suited to sorting data, or to anything where you have a large amount of data that does not fit completely in the FPGA.
Klaus Falser wrote:
> or
> something where you have a big amount of data which does not fit
> completely in the FPGA.
Not in general. It depends on the data access pattern that is required (FPGAs can of course interface to large external memory and can also stream huge amounts of data quite fast). But the rest is true. You do NOT program FPGAs, you design a hardware circuit (the code you write is just another way to draw a schematic ;-) ).
Hi, you can determine which operations eat up the most CPU time and then implement those in an FPGA. The FPGA will then be an external device, like a display: you have to write a sort of driver, write the data to the FPGA, start the calculation, read out the results and so on. It's like having some sort of co-processor. There is a lot of overhead involved. The logic has to be written in VHDL/Verilog, of course. If you can write such logic, you can of course instantiate it many times in parallel. I have done this for an atan calculation in a system with a slow µC and an FPGA. But it only pays off if you have a task that can be implemented as logic and is then faster than the CPU. In my case it was possible because I had to use an FPGA for other purposes anyway and there were a lot of free logic cells. Keep in mind that an FPGA is "slow" (100 MHz or so) and modern CPUs are very fast. The atan calculation also took a lot of clock cycles (more than 10, if I remember correctly), so a fast µC or DSP could perhaps do it in the same time. Most of the time it's cheaper to buy a faster processor. In that case you don't have to learn VHDL, which is difficult if you are "contaminated" with C ;-) It's... different.
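The co-processor trade-off described in this post can be put into numbers. What follows is a minimal sketch in Python, not a measurement: every timing is a made-up placeholder, and `offload_speedup` is a hypothetical helper name that does not come from the thread. It compares running a batch on the CPU with writing the data to the FPGA, computing there, and reading the results back.

```python
def offload_speedup(t_cpu, t_write, t_compute, t_read):
    """Speedup from offloading one batch to a co-processor.

    t_cpu      time the CPU alone needs for the batch (seconds)
    t_write    time to write the input data to the FPGA
    t_compute  time the FPGA logic needs for the batch
    t_read     time to read the results back
    """
    t_offload = t_write + t_compute + t_read
    return t_cpu / t_offload


# Hypothetical numbers, for illustration only.
# Case 1: the FPGA logic is 20x faster than the CPU, but the
# transfers take longer than the whole CPU computation.
small_batch = offload_speedup(t_cpu=1e-3, t_write=2e-3, t_compute=5e-5, t_read=1e-3)

# Case 2: same transfer cost and same 20x faster logic, but far
# more computation per byte moved.
large_batch = offload_speedup(t_cpu=1.0, t_write=2e-3, t_compute=5e-2, t_read=1e-3)

print(f"small batch: {small_batch:.2f}x")
print(f"large batch: {large_batch:.2f}x")
```

With these assumed numbers the first case ends up slower than the CPU alone, while the second gets close to the raw 20x of the logic. The point is only that the transfer overhead has to be small compared with the work done per transfer.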
Schreiberling wrote:
> Keep in mind, that a FPGA is "slow" (100MHz or so) and modern CPUs are
> very fast.
So the only gain may come from running a lot of "simple" calculations in parallel in an FPGA (like: one ant isn't that mighty at all, but millions of them in parallel are). That will only work if the calculation itself can be parallelized.
newport_j wrote:
> I think that I can and I believe the way to do it is with fpgas.
Why don't you simply ask an FAE at one of the FPGA suppliers? They should know whether your approach will work or not.
Schreiberling wrote:
>> Keep in mind, that a FPGA is "slow" (100MHz or so) and modern CPUs are
>> very fast.
Modern FPGAs can reach 200 MHz even for complex circuits ;-) For some things in a small area even 500 MHz. But you do not need a high clock speed in FPGAs... you trade off clock speed for parallelism.
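The trade of clock speed for parallelism can be illustrated with a simple cycle count. This is a rough sketch under stated assumptions: every unit is modelled as a fully pipelined datapath that accepts one item per clock, the 12-cycle latency and the lane counts are invented for the example, and a real CPU is of course not a single 3 GHz pipeline. The function names are hypothetical.

```python
import math


def cycles_needed(n_items, lanes, latency):
    """Clock cycles to push n_items through `lanes` identical pipelines.

    Each pipeline accepts one new item per clock and delivers its
    result `latency` clocks later (fully pipelined, no stalls).
    """
    if n_items == 0:
        return 0
    items_per_lane = math.ceil(n_items / lanes)
    # The first result appears after `latency` clocks, then one more
    # result per clock for every remaining item in the lane.
    return latency + items_per_lane - 1


def run_time(n_items, lanes, latency, clock_hz):
    """Wall-clock time in seconds for the same workload."""
    return cycles_needed(n_items, lanes, latency) / clock_hz


N = 1_000_000
# Assumed: one unit at 3 GHz versus 64 parallel units at 200 MHz.
fast_single = run_time(N, lanes=1, latency=12, clock_hz=3e9)
slow_parallel = run_time(N, lanes=64, latency=12, clock_hz=200e6)

print(f"1 lane   @ 3 GHz  : {fast_single * 1e6:.1f} us")
print(f"64 lanes @ 200 MHz: {slow_parallel * 1e6:.1f} us")
```

In this model the slower clock wins as soon as the number of lanes exceeds the clock ratio (15 here), provided the items really are independent.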
Hi, this problem, "how can I mix software and hardware in one design", is known as co-design. There are many lectures and a lot of material about it on the web. Try googling "hardware software co-design"; you will probably find your answer there. Regards
I posted a question over the 4th of July weekend about speeding up a program using an FPGA and received many answers. Thanks very much! However, I forgot to add the obvious question: what kind of program is subject to dramatic speedup by porting the C code to an FPGA, with all that entails? I mean, I think I know what a good candidate program for multicore speedup is and what a good candidate program for GPGPU speedup is. But what program is a good candidate for an FPGA speedup? Exactly what kind of program meets these criteria? Is it a program that cannot be sped up using the other two methods (multicore and GPGPU), or is something else required? Is FPGA speedup a last resort when the other two fail? Any help appreciated. Thanks in advance. Respectfully, Newport_j
A good candidate for FPGA computing looks like this: lots of "stupid", independent and parallelizable calculations. Look at it this way: in an FPGA you can implement a big bunch of "simple" processors. You then have a kind of multiprocessor system with e.g. 100 "cores". They will be simple ones: they won't be as fast as a cheap CPU, they won't be as mighty as a cheap CPU, but there will be plenty of them... BTW: I will join the two threads, so anyone digging in will also see the previous discussion.
Newport_j wrote:
> But what program is a good candidate for an fpga speedup? Exactly,
> what kind of program meets these criteria?
IMHO you are still missing the target, because you insist on bringing a C program to an FPGA. It's true that an FPGA can implement a lot of simple CPUs, and they can all run in parallel. But consider:
- The CPUs will be very simple.
- They will run at, say, 100 - 200 MHz. A high-end CPU runs at 2.0 - 3.0 GHz and is therefore roughly 15-20 times faster.
- A high-end CPU has several cores too.
- A high-end CPU is highly optimized; a lot of people have worked on it.
- You have a lot of overhead for synchronization.
I may be wrong, but I would expect that you need at least 50 CPUs in an FPGA to reach the same computing power. If you really want a speedup from an FPGA, you need to change your approach. You must stop trying to run the C program on a CPU inside the FPGA, and instead implement a logic block on the FPGA that does (part of) your computation.
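The "at least 50 CPUs" estimate can be checked with back-of-the-envelope arithmetic. The sketch below only multiplies clock rates and core counts; the 4-core CPU, the chosen clock values and the helper name `soft_cores_to_match` are assumptions for illustration. It also ignores the synchronization overhead from the list above, so it is, if anything, optimistic for the FPGA.

```python
import math


def soft_cores_to_match(cpu_clock_hz, cpu_cores, soft_clock_hz,
                        work_per_clock_ratio=1.0):
    """Simple FPGA soft cores needed to match a CPU's raw throughput.

    work_per_clock_ratio: work one CPU core does per clock relative
    to one soft core. 1.0 means equal, which flatters the soft core,
    since a highly optimized CPU core usually does more per clock.
    """
    cpu_throughput = cpu_clock_hz * cpu_cores * work_per_clock_ratio
    return math.ceil(cpu_throughput / soft_clock_hz)


# Assumed mid-range values from the post: 2.5 GHz CPU, 150 MHz soft cores.
print(soft_cores_to_match(2.5e9, 4, 150e6))

# Most favourable case for the FPGA within the quoted ranges.
print(soft_cores_to_match(2.0e9, 4, 200e6))
```

Under these assumptions the count lands in the range of several dozen soft cores, which is in line with the estimate in the post and supports its conclusion that copying the CPU model into the FPGA is the wrong approach.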
Example: all sorts of digital signal processing are very well suited to an FPGA (filters, FFT, feature extraction, ...). Most algorithms that run well on a GPU can also run well on an FPGA (but not all; it depends on memory subsystem usage and on whether floating point or other built-in features of the GPU are needed). Handling a complex database and making complex decisions (what do these extracted features tell us?) is not that great to run on an FPGA; this is better suited to a CPU. What kind of processing do you have?
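A filter shows why signal processing maps so well to hardware. Below is a minimal software model of a direct-form FIR filter, written in Python purely to show the structure; it is not HDL, and the function name and the example coefficients are chosen for illustration. The comments mark which parts would become parallel hardware.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR filter: y[n] = sum over k of coeffs[k] * x[n - k].

    Samples before the start of the input are treated as zero.
    """
    n_taps = len(coeffs)
    delay_line = [0.0] * n_taps  # in hardware: a shift register
    output = []
    for x in samples:
        # Shift the new sample in and drop the oldest one.
        delay_line = [x] + delay_line[:-1]
        # These products do not depend on each other. A CPU computes
        # them one after another; an FPGA can give every tap its own
        # multiplier and evaluate all of them in the same clock cycle.
        products = [c * d for c, d in zip(coeffs, delay_line)]
        output.append(sum(products))
    return output


# 4-tap moving average applied to a step input.
print(fir_filter([1.0, 1.0, 1.0, 1.0, 1.0], [0.25, 0.25, 0.25, 0.25]))
```

In software the cost per output sample grows with the number of taps. In a fully parallel hardware implementation the number of multipliers grows instead, while the rate of one output per clock stays the same.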
I am trying to sharply speed up a search algorithm involving underwater acoustic localization. So far, with GPUs, I have achieved a speedup of 8x, which is not enough. I want a speedup of at least 100x. I think this algorithm is suitable for an FPGA speedup, but the only way to tell is by trying it on an FPGA. I am also interested in using OpenCL on FPGAs and the Impulse C compiler. Any opinions on these tools? Thanks in advance. Respectfully, Newport_j
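Whether 100x is reachable at all can be bounded before touching any hardware. Going from the 8x already achieved to 100x requires a further factor of 12.5. The sketch below applies Amdahl's law, which is not mentioned in the thread and is brought in here only as a sanity check; the accelerated fractions and the 1000x kernel speedup are hypothetical values, not properties of the actual search algorithm.

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: whole-program speedup when only a fraction of the
    original run time is accelerated by `kernel_speedup`."""
    remaining = (1.0 - accelerated_fraction) + accelerated_fraction / kernel_speedup
    return 1.0 / remaining


def min_accelerated_fraction(target_speedup):
    """Smallest fraction of run time that must be accelerated to reach
    the target, even if the accelerated part became infinitely fast."""
    return 1.0 - 1.0 / target_speedup


print(f"fraction needed for 100x: {min_accelerated_fraction(100):.2%}")
for fraction in (0.90, 0.99, 0.999):
    result = overall_speedup(fraction, 1000)
    print(f"{fraction:.1%} accelerated 1000x -> {result:.1f}x overall")
```

So a 100x target is only possible if more than 99% of the original run time sits in the part that moves to the FPGA. Profiling the C program first shows whether that condition can be met.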
It is certainly possible to speed up search algorithms with FPGAs. But it's hard to say in general whether such a speedup can be achieved (it also depends on how much you want to spend on hardware). C-to-HDL tools can at least be used for design space exploration and to tell you whether it is feasible.