Hi everyone, I tried to implement a VHDL design that adds two signed numbers. The algorithm is as follows: I receive a signed 32-bit value, say A. This value is added to the previous sum, so Result_now = A + Result_before. The first thing I do is resize A and Result_before to 33 bits in order to avoid overflow, so Result_now is 33 bits. I created a test bench to test my code, but I face a strange problem: the result is not as expected. For example, when I add the value -1.26562 to 8.06250, I get 1.49691e-038. Can you help me resolve this problem, please? The code is below:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

Library altera_mf;
USE altera_mf.all;

-- Add the library and use clauses before the design unit declaration
library altera;
use altera.altera_primitives_components.all;

Entity Sum_Position is
  Generic (
    Accu_lenght : integer -- in µs
  );
  port (
    Clk:          in  std_logic;
    Reset:        in  std_logic;
    Raz_position: in  std_logic;
    Position_In:  in  std_logic_vector(Accu_lenght-1 downto 0);
    Position_Out: out std_logic_vector(Accu_lenght-1 downto 0)
  );
end Sum_Position;

Architecture Arch_position of sum_Position is

  signal position_before: signed (Accu_lenght-1 downto 0):= (OTHERS => '0');

  -- both signals have one more bit than the original
  signal Position_s        : SIGNED(Accu_lenght downto 0):= (OTHERS => '0');
  signal Position_Before_s : SIGNED(Accu_lenght downto 0):= (OTHERS => '0');
  signal Sum_Pos_s         : SIGNED(Accu_lenght downto 0):= (OTHERS => '0');
  signal temp              : std_logic_vector(2 downto 0):= (OTHERS => '0');

Begin -- begin of architecture

  -- convert type and perform a sign-extension
  Position_s        <= resize(signed(Position_In), Position_s'length);
  Position_Before_s <= resize(signed(position_before), Position_Before_s'length);

  Sum_of_position: process(Clk, Reset)
  begin
    IF (Reset='0') THEN -- when reset is selected
      -- initialize all values
      Sum_Pos_s <= (OTHERS => '0');
    ELSIF (Clk'event and Clk = '1') then
      -- addition of two 33 bit values
      Sum_Pos_s <= Position_s + Position_Before_s;
    END IF;
  end process Sum_of_position;

  -- resize to require size and type conversion
  position_before <= (OTHERS => '0') WHEN Raz_position='1' else -- Reset to zero when Raz_position='1'
                     signed(resize(Sum_Pos_s, position_before'length));
  Position_Out    <= (OTHERS => '0') WHEN Raz_position='1' else -- Reset to zero when Raz_position='1'
                     std_logic_vector(resize(Sum_Pos_s, Position_Out'length));

end Arch_position;

And this is the test bench:

LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.NUMERIC_STD.ALL;

ENTITY TBH IS
END TBH;

ARCHITECTURE TBH_ARCH OF TBH IS

  COMPONENT Sum_Position -- to declare the block of ADC_CNL
    Generic (
      Accu_lenght : integer -- in µs
    );
    PORT(
      Clk:          in  std_logic;
      Reset:        in  std_logic;
      Raz_position: in  std_logic;
      Position_In:  in  std_logic_vector(Accu_lenght-1 downto 0);
      Position_Out: out std_logic_vector(Accu_lenght-1 downto 0)
    );
  END COMPONENT;

  CONSTANT Accu_size: integer := 32;

  SIGNAL CLK          : STD_LOGIC := '1';
  SIGNAL RESET        : STD_LOGIC := '0';
  SIGNAL Raz          : STD_LOGIC := '0';
  SIGNAL Position_In  : STD_LOGIC_VECTOR(Accu_size-1 DOWNTO 0);
  SIGNAL Position_Out : STD_LOGIC_VECTOR(Accu_size-1 DOWNTO 0);

BEGIN --to define the signal and the block's relationship

  Position : Sum_Position
    generic map (Accu_size)
    PORT MAP(
      Clk          => CLK, --COMPONENT PORT => ACTUAL SIGNAL
      Reset        => RESET,
      Raz_position => Raz,
      Position_In  => Position_In,
      Position_Out => Position_Out
    );

  PROCESS(CLK) --to produce CLK
  BEGIN
    CLK <= NOT CLK AFTER 10 NS;
  END PROCESS;

  PROCESS --to simulation the signal the AD timing
  BEGIN
    Position_In <= (OTHERS => '0');
    WAIT FOR 20 NS;
    RESET <='1';
    WAIT FOR 20 NS;
    Position_In <= X"bfa1ffd6"; -- -1,26562
    WAIT FOR 20 NS;
    Position_In <= X"41010000"; --8,0625
    WAIT FOR 20 NS;
    Position_In <= X"bf31003f"; -- -0,69141
    WAIT FOR 20 NS;
    Position_In <= X"3fb9ffd6"; -- +1,45312
    WAIT FOR 20 NS;
    Position_In <= X"c0b80000"; -- -5,75
    WAIT FOR 20 NS;
    Position_In <= X"c0f10000"; -- -7,53125
    WAIT ;--100 NS;
  END PROCESS;

END TBH_ARCH;

Thank you for taking time to help me. Best regards,

jeorges F. wrote:
> for example when i add the value -1,26562 to 8.06250, I get 1.49691e-038.

With the + operator on signed vectors you do not add float values; you add two's-complement integer values. And it seems to me you have some kind of very strange number format of your own. It looks as if the MSB alone is the negative sign:

X"bfa1ffd6"; -- -1,26562
X"3fb9ffd6"; -- +1,45312

The binary representations of those float values look almost the same; only the MSB is set, and the whole value becomes negative. That's not how two's-complement binary numbers work!!! Therefore you cannot use a two's-complement addition to add numbers in your own format!

So: what's your number format? How do you calculate the binary representation of those float numbers?

> for example when i add the value -1,26562 to 8.06250, I get 1.49691e-038.

Can you show those numbers in a test bench waveform? Or are those numbers only in your head or on a sheet of paper?

> can you help to resolve this problem please ?

The compiler does not read comments. It just does what the source code tells it. So it looks very much like the compiler interprets X"bfa1ffd6" differently from you...
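The effect described in this reply can be reproduced outside the simulator. Below is a minimal C sketch, assuming the test bench words are IEEE 754 single-precision patterns (the values in the test bench comments are consistent with that reading). The helper names bits_to_float and add_as_integers are made up for illustration and are not part of the posted design.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float is a 32-bit IEEE 754 single. */
static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Read a 32-bit pattern as a float without changing any bit. */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* What a two's-complement adder computes on the raw 32-bit words.
 * Unsigned wrap-around yields the same low 32 bits as a signed add. */
uint32_t add_as_integers(uint32_t a, uint32_t b)
{
    return a + b;
}
```

Under these assumptions, add_as_integers(0xbfa1ffd6, 0x41010000) yields 0x00a2ffd6, and bits_to_float of that word is about 1.49691e-38, the value reported in the first post. Adding the two words as floats gives about 6.79688 instead.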

Thank you, Lothar, for your answer. In fact my input comes from a Nios processor. The Nios does its computing in float, so Position_In is a float inside the Nios; it is then converted to a 32-bit integer and placed in my std_logic_vector(31 downto 0) variable. To get the float equivalent of a hex value, I use the Floating Point to Hex Converter. Please find attached the test bench waveform; the values are shown in hex, but when I change the radix I get the correct float equivalent. Best regards

jeorges F. wrote:
> To have the equivalent from hex to float number, i use the Floating
> Point to Hex Converter.

Of course you cannot add two float numbers with a two's-complement adder (which is what you are doing when you add two signed vectors).

> but when i change the radix, i get the correct float equivalent.

Read a few lines about IEEE 754 and you will see that you can never simply add two float values this way. You need to bring the two float values to a common exponent, then you can perform the addition, and then you must convert the result back to a valid (normalized) float number.
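The three steps named in this reply (align, add, renormalize) can be sketched in C on the raw bit patterns. This is only an illustration under stated assumptions: operands are zero or normal IEEE 754 single-precision numbers, subnormals count as zero, Inf/NaN are rejected, and the result is truncated rather than rounded to nearest even, so the last bit can differ from a real FPU. The name float_add_bits and the constant GUARD_BITS are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { GUARD_BITS = 3 };  /* extra low-order bits kept during alignment */

/* Add two single-precision bit patterns using integer operations only. */
uint32_t float_add_bits(uint32_t a, uint32_t b)
{
    /* Order the operands so that |a| >= |b|. */
    if ((a & 0x7fffffffu) < (b & 0x7fffffffu)) {
        uint32_t t = a; a = b; b = t;
    }
    uint32_t sign_a = a >> 31;
    uint32_t sign_b = b >> 31;
    int exp_a = (int)((a >> 23) & 0xffu);
    int exp_b = (int)((b >> 23) & 0xffu);
    assert(exp_a != 0xff);            /* Inf/NaN are outside this sketch */
    if (exp_b == 0)
        return a;                     /* b counts as zero: nothing to add */

    /* 1. Unpack: restore the hidden leading 1 and append guard bits. */
    uint32_t man_a = ((a & 0x007fffffu) | 0x00800000u) << GUARD_BITS;
    uint32_t man_b = ((b & 0x007fffffu) | 0x00800000u) << GUARD_BITS;

    /* 2. Align: shift the smaller operand right to the larger exponent. */
    int shift = exp_a - exp_b;
    man_b = (shift < 32) ? (man_b >> shift) : 0u;

    /* 3. Add or subtract the magnitudes, depending on the signs. */
    uint32_t man = (sign_a == sign_b) ? (man_a + man_b) : (man_a - man_b);
    if (man == 0u)
        return 0u;                    /* exact cancellation gives +0 */

    /* 4. Normalize: move the leading 1 back to the hidden-bit position. */
    const uint32_t hidden = 0x00800000u << GUARD_BITS;
    int exp = exp_a;
    if (man >= (hidden << 1)) { man >>= 1; exp++; }
    while (man < hidden)      { man <<= 1; exp--; }
    if (exp <= 0)
        return sign_a << 31;                     /* underflow: signed zero */
    if (exp >= 0xff)
        return (sign_a << 31) | 0x7f800000u;     /* overflow: infinity */

    /* 5. Pack sign, exponent and 23 fraction bits (guard bits dropped). */
    return (sign_a << 31) | ((uint32_t)exp << 23)
         | ((man >> GUARD_BITS) & 0x007fffffu);
}
```

With the first two test bench words, float_add_bits(0xbfa1ffd6, 0x41010000) gives 0x40d9800a, which reads as about 6.79688. A VHDL version would follow the same unpack, align, add, normalize and pack stages.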

Ok, I see, thank you Lothar. In my C code, I do this:

/* Variable for position */
union position {
    alt_32 I_position;
    float F_position;
};

When I calculate the position, I put it into F_position, and then I can read the decimal value from I_position. I_position is then used as a signed value in my VHDL code. So normally I am manipulating a decimal representation. Is that correct? Thanks in advance,

jeorges F. wrote:
> alt_32

Is this a 32-bit integer data type?

> When i calculate the position, I put it into F_position, then i can have
> the decimal value in I_position.

No chance! A union does not convert anything! It only tells the compiler to look at the very same bit pattern in a different way. Just as an example:

The bit pattern 0xbf31003f is the float number -0.69141
The same pattern 0xbf31003f is the integer number 3207659583 (read as unsigned)

But -0.69141 is -1 when rounded to an integer... Do you see the problem?
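The difference between reinterpreting and converting can be shown in a few lines of C. This is a sketch with hypothetical helper names, assuming a 32-bit IEEE 754 float. The fixed-point pair at the end (to_fixed, from_fixed, with 16 fractional bits chosen arbitrarily) is an option nobody in the thread proposes; it is included only to show one way of keeping the fraction while still feeding a two's-complement adder.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float is a 32-bit IEEE 754 single. */
static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

enum { FRAC_BITS = 16 };  /* hypothetical fixed-point scaling, not from the thread */

/* Build a float from an exact bit pattern (only used to feed the demo). */
float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* What the union does: the same 32 bits, read as a signed integer. */
int32_t reinterpret_bits(float f)
{
    int32_t i;
    memcpy(&i, &f, sizeof i);
    return i;
}

/* A real conversion: round to the nearest integer, the fraction is lost. */
int32_t convert_rounded(float f)
{
    return (int32_t)(f >= 0.0f ? f + 0.5f : f - 0.5f);
}

/* Scale by 2^FRAC_BITS before rounding, so the fraction survives as integer bits. */
int32_t to_fixed(float f)
{
    float scaled = f * (float)(1 << FRAC_BITS);
    return (int32_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Undo the scaling to read a fixed-point sum back as a float. */
float from_fixed(int32_t q)
{
    return (float)q / (float)(1 << FRAC_BITS);
}
```

For the pattern 0xbf31003f, reinterpret_bits returns -1087307713 (the signed reading of 3207659583), while convert_rounded returns -1. With the assumed scaling, to_fixed(8.0625f) + to_fixed(-5.75f) is 151552, and from_fixed(151552) is 2.3125, so plain integer addition gives the expected sum.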

alt_32 is a typedef for signed long. Yes, I see the problem, thank you. But what's the best approach to manipulate my float data?

Hi Lothar, I understand the issue now, thank you very much for your explanation. Now I have to focus on implementing floating-point addition. Do you know of any links where I can find examples or documentation on this subject? Thanks in advance, Best regards,

jeorges F. wrote:
> But, what's the best approach to manipulate my float data ?

Activate that thing between the ears... ;-) If that's not desired, then try Google. This search:
https://www.google.de/?#q=float+addition+bitwise
gets you:
https://www.cs.umd.edu/class/sum2003/cmsc311/Notes/BinMath/addFloat.html