Hi guys! I've written another IIR implementation, following Direct Form 1, to realize a high-pass filter in my current design. The IIR will be combined with a CIC that reduces the sample rate from 10 MHz to ~79 kHz. My filter coefficients for the IIR are as follows:

> A1 = -1.93178
> A2 = 0.93403
> B0 = 0.96645
> B1 = -1.93291
> B2 = 0.96645

The coefficients are scaled by 2^30, and the products are divided by this value again after the multiplication. My implementation works with 4 guard bits, as you can see in the Idle state for nX0. Although my simulation looks OK (to me), it doesn't work in the real FPGA: frequencies below 1 kHz are still in the signal. How do I properly test my IIR, and is there maybe something I'm missing in my implementation?
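For reference, the scaling described above (multiply each coefficient by 2^30 and round to an integer) could be sketched in Python roughly like this. The names `quantize`, `COEFFS` and `QUANTIZED` are made up for this illustration; only the coefficient values and the factor 2^30 come from the post.

```python
# Sketch: convert the posted coefficients to Q30 fixed point.
Q = 30  # fractional bits, i.e. the 2^30 scaling mentioned in the post

COEFFS = {
    "B0": 0.96645, "B1": -1.93291, "B2": 0.96645,
    "A1": -1.93178, "A2": 0.93403,
}

def quantize(value, frac_bits=Q):
    """Round a real coefficient to the nearest integer multiple of 2^-frac_bits."""
    return round(value * (1 << frac_bits))

QUANTIZED = {name: quantize(c) for name, c in COEFFS.items()}

if __name__ == "__main__":
    for name, q in QUANTIZED.items():
        err = q / (1 << Q) - COEFFS[name]
        print(f"{name}: {q:>12d}  (quantization error {err:+.3e})")
```

Because |B1| and |A1| are just below 2, the integers need one bit more than a pure Q1.30 value, so a 32-bit signed constant is the smallest width that holds all five.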

Maybe I'm just not seeing it, but I'm missing a statement like "if rising_edge(iclk) then" or "wait until rising_edge(iclk);" in your VHDL...

process(iCLK,iRESET_N)
begin
  if(iRESET_N = '0') then
    nZX0 <= (others => '0');
    nZX1 <= (others => '0');
    nZX2 <= (others => '0');
    (...)
    nYOUT <= (others => '0');
  else -- >>> e.g.: if rising_edge(iclk) then
    case state is

examples here: https://stackoverflow.com/questions/32717040/wait-until-rising-edgeclk-vs-if-rising-edgeclk

Bernhard K. wrote:
> maybe I don't see it but somehow I miss a statement like:
> "if rising_edge(iclk) then" or " wait until rising_edge(iclk);"
> in your VHDL...

Holy crap, you're right! Some code-blindness >_> I'll compile again and report whether this was my (really) stupid mistake :D

So I've implemented the clocked version :D I tested the filter with a 1 V amplitude signal and a frequency sweep from 0 to 3000 Hz. It seems to filter a bit, but not nearly as well as in my simulation. It looks like I'm missing something: the curve suggests I'm on the right track, but is my division flawed at some point?
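To have something to hold the measured sweep against, the ideal magnitude response of the posted coefficients can be evaluated directly. The sketch below is only an estimate under two assumptions that are not confirmed in the thread: a sample rate of exactly 79 kHz (the post only says "~79 kHz"), and the sign convention H(z) = (b0 + b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2). The names `FS` and `magnitude` are made up for this example.

```python
import cmath
import math

FS = 79_000.0  # ASSUMED sample rate in Hz; the post only states "~79 kHz"
B = (0.96645, -1.93291, 0.96645)
A = (1.0, -1.93178, 0.93403)  # a0 = 1 is implied

def magnitude(freq_hz, fs=FS):
    """|H| at freq_hz for H(z) = (b0 + b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2)."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs)  # this is z^-1
    num = B[0] + B[1] * z1 + B[2] * z1 * z1
    den = A[0] + A[1] * z1 + A[2] * z1 * z1
    return abs(num / den)

if __name__ == "__main__":
    # Frequencies inside the 0..3000 Hz sweep used for the measurement.
    for f in (0, 100, 300, 600, 1000, 2000, 3000):
        mag = magnitude(f)
        db = 20.0 * math.log10(mag) if mag > 0 else float("-inf")
        print(f"{f:5d} Hz: {mag:.4f} ({db:7.2f} dB)")
```

If the measured curve deviates strongly from these numbers, the arithmetic in the FPGA is the more likely suspect than the coefficients themselves.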

Hi,

1. You should sum up the full multiplication results, not the truncated ones. Using the truncated values increases the rounding noise.
2. This line will probably lead to a timing violation:
nYOUT <= std_logic_vector(signed(nDX0)+signed(nDX1)+signed(nDX2)-signed(nDY1)-signed(nDY2));

Tom
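The difference between the two orderings in point 1 can be illustrated with plain integers. This is a minimal sketch, assuming 30 fractional bits and that truncation means dropping the low bits (an arithmetic right shift); the function names are invented for the example and do not appear in the posted VHDL.

```python
import random

Q = 30  # fractional bits of the coefficients

def truncate_each(b_products, a_products, q=Q):
    """Drop the fractional bits of every product, then add/subtract."""
    return sum(p >> q for p in b_products) - sum(p >> q for p in a_products)

def truncate_once(b_products, a_products, q=Q):
    """Add/subtract the full-width products, then drop the fractional bits once."""
    return (sum(b_products) - sum(a_products)) >> q

if __name__ == "__main__":
    rng = random.Random(1)
    worst = 0
    for _ in range(10_000):
        # Random stand-ins for the three feed-forward and two feedback products.
        b = [rng.randrange(-(1 << 50), 1 << 50) for _ in range(3)]
        a = [rng.randrange(-(1 << 50), 1 << 50) for _ in range(2)]
        worst = max(worst, abs(truncate_each(b, a) - truncate_once(b, a)))
    print("largest difference in LSBs:", worst)
```

With three added and two subtracted products, the two results can differ by up to 2 LSBs per sample, and in an IIR that error is fed back through the recursion.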

> 1. You should sum up the multiplication results and not the truncated
> multiplication results. Using the truncated values will increase
> rounding noise.

Done. The values are a bit different and the simulation looks OK; the curve looks identical. I'll try this in my FPGA. The new code and testbench are attached if someone wants a closer look.

> 2. probably this line will lead to a timing violation:
> nYOUT <=
> std_logic_vector(signed(nDX0)+signed(nDX1)+signed(nDX2)-signed(nDY1)-signed(nDY2));

According to Quartus it seems OK though; TimeQuest reports no timing problems.

So far I've made two versions of the IIR: one is more asynchronous (IIR_Biquad_II.vhd) and one uses a state machine (IIRDF1.vhd). The more asynchronous one was derived from a project on the internet; it seems to work as a high pass (but gives a weird curve as a low pass). I've implemented both in the FPGA: the asynchronous one gives a really nice curve, while the one with the state machine shows some weird effects. Can someone tell me why this is? I really have no clue what could cause this :-/ By the way, my own state machine implementation works fairly well as a low pass, as you can see in the picture with the two additional signals (2nd and 4th order LP).

"it seems to work as a high pass (but gives a weird curve as low pass)"

As before, I would recommend that you implement a testbench that checks every bit of the computation and compares it, for example, with an implementation in Lua, Java or whatever. Once everything is checked that way, there will be no more "it seems". It's a bit of work for the first filter, but afterwards you can easily check other filters.

Martin O. wrote:
> "it seems to work as a high pass (but gives a weird curve as low pass)"
>
> As I did before I would recommend that you implement a testbench to
> check every bit of the computation and compare it for example with an
> implementation in lua, java or what so ever.

That's what I did. The values look quite nice though, and it behaves like a filter should (from my understanding).

63336.30075 -126673.25685 63336.30075 -126599.2023 61211.65605 nil
K= 0 Input= 140737488355328 Output= -85901
K= 1 Input= 140737488355328 Output= -85906
K= 2 Input= 140737488355328 Output= -85906
K= 3 Input= 140737488355328 Output= -85906
K= 4 Input= 140737488355328 Output= -85906
K= 5 Input= 140737488355328 Output= -85906
K= 6 Input= 140737488355328 Output= -85906
K= 7 Input= 140737488355328 Output= -85906
K= 8 Input= 140737488355328 Output= -85906
K= 9 Input= 140737488355328 Output= -85906
K= 10 Input= 140737488355328 Output= -85906
K= 11 Input= 140737488355328 Output= -85906
K= 12 Input= 140737488355328 Output= -85906
K= 13 Input= 140737488355328 Output= -85906
K= 14 Input= 140737488355328 Output= -85906
K= 15 Input= 140737488355328 Output= -85906
K= 16 Input= 140737488355328 Output= -85906
K= 17 Input= 140737488355328 Output= -85906
K= 18 Input= 140737488355328 Output= -85906
K= 19 Input= 140737488355328 Output= -85906
K= 20 Input= 140737488355328 Output= -85906
K= 21 Input= 0 Output= -6
K= 22 Input= 0 Output= -1
K= 23 Input= 0 Output= -1
K= 24 Input= 0 Output= -1
K= 25 Input= 0 Output= -1
K= 26 Input= 0 Output= -1
K= 27 Input= 0 Output= -1
K= 28 Input= 0 Output= -1
K= 29 Input= 0 Output= -1
K= 30 Input= 0 Output= -1
K= 31 Input= 0 Output= -1
K= 32 Input= 0 Output= -1
K= 33 Input= 0 Output= -1
K= 34 Input= 0 Output= -1
K= 35 Input= 0 Output= -1
K= 36 Input= 0 Output= -1
K= 37 Input= 0 Output= -1
K= 38 Input= 0 Output= -1
K= 39 Input= 0 Output= -1
K= 40 Input= 140737488355328 Output= -85901
K= 41 Input= 140737488355328 Output= -85906
K= 42 Input= 140737488355328 Output= -85906
K= 43 Input= 140737488355328 Output= -85906
K= 44 Input= 140737488355328 Output= -85906
K= 45 Input= 140737488355328 Output= -85906
K= 46 Input= 140737488355328 Output= -85906
K= 47 Input= 140737488355328 Output= -85906
K= 48 Input= 140737488355328 Output= -85906
K= 49 Input= 140737488355328 Output= -85906
K= 50 Input= 140737488355328 Output= -85906
K= 51 Input= 140737488355328 Output= -85906
K= 52 Input= 140737488355328 Output= -85906
K= 53 Input= 140737488355328 Output= -85906
K= 54 Input= 140737488355328 Output= -85906
K= 55 Input= 140737488355328 Output= -85906
K= 56 Input= 140737488355328 Output= -85906
K= 57 Input= 140737488355328 Output= -85906
K= 58 Input= 140737488355328 Output= -85906
K= 59 Input= 140737488355328 Output= -85906
K= 60 Input= 140737488355328 Output= -85906

So I wrote a new VHDL testbench, and my simulation shows that the two implementations are not as equal as I had thought when I used the jumps as test values. I'm using a DDS at 100 kHz with full amplitude as input. As you can see in the picture, the DF1 output has some kind of sign problem, which would explain the behaviour of the filter. Now I need to find out why this is :-/

It's probably not sufficient to compare output values qualitatively. I know it's a pain, but sometimes it's the only cure: compare every bit of every arithmetic operation and variable for the first few steps.
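One way to get reference values for such a bit-by-bit comparison is a small integer model that prints the full accumulator for every sample. The sketch below is not the posted VHDL; it assumes a Direct Form 1 recursion, the posted coefficients rounded to Q30, one truncation per output, and unlimited integer width (so it does not model register overflow). All function names and the step stimulus are hypothetical.

```python
Q = 30
SCALE = 1 << Q
# Posted coefficients rounded to Q30 integers.
B0, B1, B2 = (round(c * SCALE) for c in (0.96645, -1.93291, 0.96645))
A1, A2 = (round(c * SCALE) for c in (-1.93178, 0.93403))

def df1_fixed(samples, trace=False):
    """Direct Form 1 biquad on integers; one truncation per output sample."""
    x1 = x2 = y1 = y2 = 0
    out = []
    for k, x0 in enumerate(samples):
        acc = B0 * x0 + B1 * x1 + B2 * x2 - A1 * y1 - A2 * y2  # full-width sum
        y0 = acc >> Q  # arithmetic shift: drop the 30 fractional bits
        if trace:
            print(f"k={k} x0={x0} acc={acc} y0={y0}")
        x2, x1 = x1, x0
        y2, y1 = y1, y0
        out.append(y0)
    return out

def df1_float(samples):
    """Same recursion in floating point with the same quantized coefficients."""
    b0, b1, b2, a1, a2 = (c / SCALE for c in (B0, B1, B2, A1, A2))
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x0
        y2, y1 = y1, y0
        out.append(y0)
    return out

if __name__ == "__main__":
    step = [1 << 20] * 8  # hypothetical stimulus: a step of 2^20
    df1_fixed(step, trace=True)
```

The `acc` and `y0` columns can then be compared against the accumulator and output signals in the HDL simulation for the same input sequence, while `df1_float` shows how far the truncation moves the result away from the ideal filter.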

It seems that an overflow is happening in the simulation, since the most negative part of the sine is folded to a positive value.
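The folding described here is what two's-complement wraparound looks like. A minimal sketch, using a 25-bit register width purely as an example; `wrap_signed` is a name invented for this illustration:

```python
def wrap_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value

if __name__ == "__main__":
    BITS = 25  # hypothetical register width for this illustration
    most_negative = -(1 << (BITS - 1))
    # The most negative value still fits ...
    print(wrap_signed(most_negative, BITS))
    # ... but one step further no longer fits and comes out positive.
    print(wrap_signed(most_negative - 1, BITS))
```

Applying such a function to every intermediate sum of a software model, with the same widths as the HDL signals, shows at which node the result first leaves the representable range.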

module biquad1 #(
    parameter coeffWidth   = 1,
    parameter productWidth = 1,
    parameter sig1Width    = 1
) (
    output wire        [sig1Width-1: 0]  IIRoutput_o ,
    input                                IIRclk_i ,
    input  wire                          modeReset_i ,
    input  wire signed [sig1Width-1: 0]  IIRinput_i ,
    input  wire signed [coeffWidth-1: 0] b0_i ,
    input  wire signed [coeffWidth-1: 0] b1_i ,
    input  wire signed [coeffWidth-1: 0] b2_i ,
    input  wire signed [coeffWidth-1: 0] NEGa1_i ,
    input  wire signed [coeffWidth-1: 0] NEGa2_i
);

reg  signed [sig1Width-1: 0] s1 ;
reg  signed [sig1Width-1: 0] s2 ;
wire signed [sig1Width-1: 0] vk ;
reg  signed [sig1Width-1: 0] vkReg ;
reg  signed [sig1Width-1: 0] xk ;
reg  signed [sig1Width-1: 0] v1 ;
reg  signed [sig1Width-1: 0] v2 ;
reg  signed [sig1Width-1: 0] ykReg ;

wire signed [productWidth-1: 0] productB0 ;
wire signed [productWidth-1: 0] productB1 ;
wire signed [productWidth-1: 0] productB2 ;
wire signed [productWidth-1: 0] productA1 ;
wire signed [productWidth-1: 0] productA2 ;

// Feedback path works on vk, feed-forward path on the registered vkReg.
assign vk = IIRinput_i+s1 ;
assign productA2 = NEGa2_i*vk ;
assign productA1 = NEGa1_i*vk ;
assign productB0 = b0_i*vkReg ;
assign productB1 = b1_i*vkReg ;
assign productB2 = b2_i*vkReg ;

wire sampleStrobe ;
assign sampleStrobe = 1 ;
assign IIRoutput_o = ykReg ;

always @(posedge IIRclk_i ) begin
    if ( sampleStrobe ) begin
        xk <= IIRinput_i ;
        if ( modeReset_i ) begin
            s1<=25'h0 ;
            s2<=25'h0 ;
            v1<=25'h0 ;
            v2<=25'h0 ;
            vkReg <= 0 ;
            ykReg <= 0 ;
        end
        else begin
            // Slices [40:16] drop 16 fractional bits and keep 25 bits.
            s1<=s2+ productA1[25-1+16:0+16] ;
            s2<= productA2[25-1+16:0+16] ;
            v1<=v2 + productB1[25-1+16:0+16] ;
            v2<= productB2[25-1+16:0+16] ;
            vkReg <= vk ;
            ykReg <= v1 + productB0[25-1+16:0+16] ;
        end
    end
end

endmodule

This is what my IIR biquad computation looks like in Verilog. One sample is computed on every clock cycle.

I did something like that before; it is still Direct Form 1, just with some asynchronous calculations, and the filter curves look the same as with my state machine implementation. Maybe I'll try your DF2 implementation, but as far as I can see there is no overflow in the registers; even with 8 guard bits there seems to be no issue. Can you provide a proper testbench for your filter?

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
use ieee.NUMERIC_STD.ALL;
use ieee.std_logic_signed.all;

entity IIRDF1 is
  generic (
    INPUT_WIDTH : integer := 64;
    QFORMAT     : integer := 30;
    B0          : integer := 409494;
    B1          : integer := 818988;
    B2          : integer := 409494;
    A1          : integer := -3954428;
    A2          : integer := 1398100
  );
  port (
    iCLK      : in  std_logic;
    iRESET_N  : in  std_logic;
    inewValue : in  std_logic; -- indicates a new input value
    iIIR_RX   : in  std_logic_vector (INPUT_WIDTH-1 downto 0); -- signed is expected
    oDone     : out std_logic; -- Done Flag for next Filter
    oIIR_TX   : out std_logic_vector (INPUT_WIDTH-1 downto 0) -- Output
  );
end entity IIRDF1;

architecture BEH_FixCoefficientIIR of IIRDF1 is
  constant cA1 : signed(QFORMAT+1 downto 0) := to_signed(A1,QFORMAT+2); -- A1
  constant cA2 : signed(QFORMAT+1 downto 0) := to_signed(A2,QFORMAT+2); -- A2
  constant cB0 : signed(QFORMAT+1 downto 0) := to_signed(B0,QFORMAT+2); -- B0
  constant cB1 : signed(QFORMAT+1 downto 0) := to_signed(B1,QFORMAT+2); -- B1
  constant cB2 : signed(QFORMAT+1 downto 0) := to_signed(B2,QFORMAT+2); -- B2

  signal productA1,productA2,productB1,productB2,productB0 : std_logic_vector(INPUT_WIDTH+1+QFORMAT+2 downto 0) := (others => '0');
  signal nZX0,nZX1,nZX2,nZY1,nZY2 : std_logic_vector(INPUT_WIDTH+1 downto 0) := (others => '0');
begin

  -- Combinational full-width products of coefficients and delay-line values
  productB0 <= std_logic_vector(cB0 * signed(nZX0));
  productB1 <= std_logic_vector(cB1 * signed(nZX1));
  productB2 <= std_logic_vector(cB2 * signed(nZX2));
  productA1 <= std_logic_vector(cA1 * signed(nZY1));
  productA2 <= std_logic_vector(cA2 * signed(nZY2));

  process(iCLK,iRESET_N)
  begin
    if(rising_edge(iCLK)) then
      if(iRESET_N = '0') then
        nZX0 <= (others => '0');
        nZX1 <= (others => '0');
        nZX2 <= (others => '0');
        nZY1 <= (others => '0');
        nZY2 <= (others => '0');
      else
        oDone <= '0';
        if(inewValue = '1') then
          -- sign-extend the input by two bits
          nZX0 <= iIIR_RX(iIIR_RX'left) & iIIR_RX(iIIR_RX'left) & iIIR_RX;
          nZX1 <= nZX0;
          nZX2 <= nZX1;
          nZY1 <= productB0(productB0'left-2 downto QFORMAT)+productB1(productB1'left-2 downto QFORMAT)+productB2(productB2'left-2 downto QFORMAT)-productA1(productA1'left-2 downto QFORMAT)-productA2(productA2'left-2 downto QFORMAT);
          oIIR_TX <= productB0(productB0'left-4 downto QFORMAT)+productB1(productB1'left-4 downto QFORMAT)+productB2(productB2'left-4 downto QFORMAT)-productA1(productA1'left-4 downto QFORMAT)-productA2(productA2'left-4 downto QFORMAT);
          nZY2 <= nZY1;
          oDone <= '1';
        end if;
      end if;
    end if;
  end process;

end architecture;

This is what my testbench looks like. The coefficients are deliberately simple, and I use them to test the basic arithmetic. Whether overflows occur is something I always test on the real hardware. Working that out theoretically in advance is difficult for me, because I don't know which input sequence is the "most dangerous" one.
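On the question of the "most dangerous" input: for a linear filter, the output for any input bounded by |x| <= 1 cannot exceed the sum of |h[n]| over the impulse response, and an input made of the signs of the time-reversed impulse response reaches that bound. The sketch below applies this to the high-pass coefficients from the first post. It is only an estimate: it covers the filter output, not every internal node, and the function names are made up for the illustration.

```python
import math

B = (0.96645, -1.93291, 0.96645)
A = (-1.93178, 0.93403)  # a1, a2; a0 = 1 is implied

def impulse_response(n_samples):
    """First n_samples of h[n] for y = b0*x0 + b1*x1 + b2*x2 - a1*y1 - a2*y2."""
    x1 = x2 = y1 = y2 = 0.0
    h = []
    for n in range(n_samples):
        x0 = 1.0 if n == 0 else 0.0
        y0 = B[0] * x0 + B[1] * x1 + B[2] * x2 - A[0] * y1 - A[1] * y2
        x2, x1 = x1, x0
        y2, y1 = y1, y0
        h.append(y0)
    return h

def worst_case_gain(n_samples=5000):
    """Sum of |h[n]|: upper bound on |y| for any input with |x| <= 1."""
    return sum(abs(v) for v in impulse_response(n_samples))

def worst_case_input(n_samples=5000):
    """Sequence of +/-1 values that drives the last output sample to that bound."""
    h = impulse_response(n_samples)
    return [1.0 if v >= 0 else -1.0 for v in reversed(h)]

if __name__ == "__main__":
    gain = worst_case_gain()
    print("sum |h[n]|:", gain)
    print("guard bits at the output:", max(0, math.ceil(math.log2(gain))))
```

Scaling `worst_case_input()` to full amplitude gives a stimulus for the HDL testbench that stresses the output width harder than a sine or a step.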

module IIR1_tb;

reg TBclk ;
parameter tck = 10; ///< clock tick
always #(tck/2) TBclk <= ~TBclk; // clocking device

integer fd ;

parameter productWidth = 42 ;
parameter coeffWidth   = 18 ;
parameter sig1Width    = 25 ;

reg  signed [sig1Width-1: 0] IIR1inPrepare ;
reg  signed [sig1Width-1: 0] IIR1in ;
wire signed [sig1Width-1: 0] IIR1out ;
reg         [ 32-1: 0]       debug1 ;

wire sampleStrobe ;
assign sampleStrobe = 1 ;

reg [ 8-1: 0] timer ;

always @(posedge TBclk) begin
    if ( sampleStrobe ) begin
        IIR1in <= IIR1inPrepare ;
        debug1 <= IIR1out ;
        timer  <= timer+1 ;
    end
end

reg signed [coeffWidth-1: 0] IIR1b0 ;
reg signed [coeffWidth-1: 0] IIR1b1 ;
reg signed [coeffWidth-1: 0] IIR1b2 ;
reg signed [coeffWidth-1: 0] IIR1a1 ;
reg signed [coeffWidth-1: 0] IIR1a2 ;
reg modeReset ;

// Note: module and port names here differ from the biquad1 module posted
// above (biquad1, NEGa1_i, NEGa2_i); align them before simulating.
red_pitaya_asg_biquad1 #(
    .coeffWidth   ( coeffWidth ),
    .productWidth (productWidth),
    .sig1Width    (sig1Width )
) biquadModule1 (
    .IIRoutput_o ( IIR1out ) ,
    .IIRclk_i    ( TBclk ) ,
    .modeReset_i ( modeReset ) ,
    .IIRinput_i  ( IIR1in ) ,
    .b0_i ( IIR1b0 ) ,
    .b1_i ( IIR1b1 ) ,
    .b2_i ( IIR1b2 ) ,
    .a1_i ( IIR1a1 ) ,
    .a2_i ( IIR1a2 )
) ;

/*
always @(posedge TBclk) begin
    $fwrite(fd, "TBdout1=%3d TBdout2=%3d TBrst1=%1d TBrst2=%1d\n", TBdout1,TBdout2,TBrst1,TBrst2) ;
end
*/

initial begin
    fd = $fopen("output.txt", "w");
    $dumpfile("test1.vcd");
    $dumpvars(-1, IIR1_tb);
    $monitor("%d in=%10d out=%10d ",timer, IIR1in ,IIR1out );
    //$monitor("%d in=%10d vk=%10d vkReg=%10d",timer, IIR1in ,vk,vkReg );
end

reg signed [sig1Width-1: 0] in1 ;
reg signed [sig1Width-1: 0] in2 ;
reg signed [sig1Width-1: 0] in3 ;

always @(posedge TBclk) begin
    in1<=IIR1in ;
    in2<=in1 ;
end

always @(posedge TBclk) begin
    in3<=in2 ;
end

real rho ;
real cosphi ;

// testbench actions
initial begin
    timer=0 ;
    // IIR1b0=1<<16 ; // 1.0
    // IIR1b1=( 0) ;
    // IIR1b2=( 0) ;
    IIR1b0=1<<16 ; // 1.0
    IIR1b1=( (1.0)*(2**16)) ;
    IIR1b2=( -(0.5)*(2**16)) ;
    // rho=0.5 ;
    // cosphi=-0.5 ;
    // IIR1a1= ((-2*cosphi*rho)*(2**16)) ;
    // IIR1a2= ((rho*rho)*(2**16)) ;
    IIR1a1= ((-1.0)*(2**16)) ; // cos phi=sqrt(2)/2
    IIR1a2= ((0.5)*(2**16)) ;  // rho=sqrt(2)/2
    // IIR1a1= 0 ;
    // IIR1a2= 0 ;
    IIR1inPrepare <= 0 ;
    TBclk = 0;
    modeReset = 0;
    #(tck);
    modeReset = 1;
    repeat(5) @(posedge TBclk);
    #(tck);
    modeReset = 0;
    repeat(5) @(posedge TBclk);
    @(negedge TBclk);
    // IIR1in <= (1<<23) ;
    IIR1inPrepare <= (1600000) ; // single-sample impulse
    @(negedge TBclk);
    IIR1inPrepare <= 0 ;
    repeat(50) @(posedge TBclk);
    $fclose(fd);
    $finish;
end

endmodule