EmbDev.net

Forum: FPGA, VHDL & Verilog
Signed Addition overflow in VHDL


by jeorges F. (Company: xlue) (khal1985)


Hi everyone,

I tried to implement a VHDL design that adds two signed numbers.
The algorithm is as follows: I receive a signed 32-bit value, let's say
A. This value is added to the previous sum, so the result is:
Result_now = A + Result_before.
So the first thing I do is resize A and Result_before to 33 bits in
order to avoid overflow; Result_now is 33 bits as well.
I created a test bench to test my code, but I face a strange problem:
the result is not as expected. For example, when I add the value
-1.26562 to 8.06250, I get 1.49691e-038.
Can you help me resolve this problem, please?

The code is below:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

Library altera_mf;
USE altera_mf.all;
-- Add the library and use clauses before the design unit declaration

library altera;
use altera.altera_primitives_components.all;

Entity Sum_Position is
  Generic ( Accu_lenght : integer  -- in µs
          );
  port
  (
    Clk: in std_logic;
    Reset: in std_logic;
    Raz_position: in std_logic;
    Position_In: in std_logic_vector(Accu_lenght-1 downto 0);
    Position_Out: out std_logic_vector(Accu_lenght-1 downto 0)
  );
end Sum_Position;


Architecture Arch_position of sum_Position is

signal position_before: signed (Accu_lenght-1 downto 0):= (OTHERS => '0');

-- both signals have one more bit than the original
signal Position_s          : SIGNED(Accu_lenght downto 0):= (OTHERS => '0');
signal Position_Before_s   : SIGNED(Accu_lenght downto 0):= (OTHERS => '0');
signal Sum_Pos_s           : SIGNED(Accu_lenght downto 0):= (OTHERS => '0');
signal temp                : std_logic_vector(2 downto 0):= (OTHERS => '0');

Begin  -- begin of architecture

-- convert type and perform a sign-extension

Position_s <= resize(signed(Position_In), Position_s'length);
Position_Before_s <= resize(signed(position_before), Position_Before_s'length);

Sum_of_position: process(Clk, Reset)

begin

  IF (Reset='0') THEN        -- when reset is selected
    -- initialize all values
    Sum_Pos_s <= (OTHERS => '0');
  ELSIF (Clk'event and Clk = '1') then
    -- addition of two 33 bit values
    Sum_Pos_s <= Position_s + Position_Before_s;

  END IF;

end process Sum_of_position;

-- resize to require size and type conversion
position_before <= (OTHERS => '0') WHEN Raz_position='1' else  -- Reset to zero when Raz_position='1'
                   signed(resize(Sum_Pos_s, position_before'length));
Position_Out    <= (OTHERS => '0') WHEN Raz_position='1' else  -- Reset to zero when Raz_position='1'
                   std_logic_vector(resize(Sum_Pos_s, Position_Out'length));
end Arch_position;

And the test Bench is this one:
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.NUMERIC_STD.ALL;

ENTITY TBH IS
END TBH;

ARCHITECTURE TBH_ARCH OF TBH IS
COMPONENT Sum_Position                          -- to declare the block of ADC_CNL
  Generic ( Accu_lenght : integer  -- in µs
          );

  PORT(
    Clk: in std_logic;
    Reset: in std_logic;
    Raz_position: in std_logic;
    Position_In: in std_logic_vector(Accu_lenght-1 downto 0);
    Position_Out: out std_logic_vector(Accu_lenght-1 downto 0)
  );
END COMPONENT;

CONSTANT Accu_size: integer := 32;
SIGNAL CLK   : STD_LOGIC := '1';

SIGNAL RESET : STD_LOGIC := '0';
SIGNAL Raz   : STD_LOGIC := '0';
SIGNAL Position_In   : STD_LOGIC_VECTOR(Accu_size-1 DOWNTO 0);
SIGNAL Position_Out  : STD_LOGIC_VECTOR(Accu_size-1 DOWNTO 0);

BEGIN
--to define the signal and the block's relationship
Position : Sum_Position generic map (Accu_size) PORT MAP(
  Clk          => CLK,                       --COMPONENT PORT => ACTUAL SIGNAL
  Reset        => RESET,
  Raz_position => Raz,
  Position_In  => Position_In,
  Position_Out => Position_Out
);

  PROCESS(CLK)        --to produce CLK
  BEGIN
    CLK <= NOT CLK AFTER 10 NS;
  END PROCESS;

  PROCESS             --to simulation the signal the AD timing
  BEGIN
    Position_In <= (OTHERS => '0');
    WAIT FOR 20 NS;
    RESET <='1';
    WAIT FOR 20 NS;
    Position_In <= X"bfa1ffd6";  -- -1,26562
    WAIT FOR 20 NS;
    Position_In <= X"41010000";  --8,0625
    WAIT FOR 20 NS;
    Position_In <= X"bf31003f";  -- -0,69141
    WAIT FOR 20 NS;
    Position_In <= X"3fb9ffd6";  -- +1,45312
    WAIT FOR 20 NS;
    Position_In <= X"c0b80000";  -- -5,75
    WAIT FOR 20 NS;
    Position_In <= X"c0f10000";  -- -7,53125
    WAIT  ;--100 NS;
  END PROCESS;
END TBH_ARCH;


Thank you for taking time to help me.
Best regards,

by Lothar M. (Company: Titel) (lkmiller) (Moderator)


jeorges F. wrote:
> For example, when I add the value -1.26562 to 8.06250, I get 1.49691e-038.
With the + operator on signed vectors you do not add float values;
you add two's-complement integer values.
And it seems to me you have some kind of very strange number format of
your own. Here it looks as if the MSB alone is the negative sign:
X"bfa1ffd6";  -- -1,26562
X"3fb9ffd6";  -- +1,45312
The binary representations of those float values look almost the same;
only the MSB is set, and the whole value becomes negative.
That's not how two's-complement binary numbers work! And therefore you
cannot use a two's-complement addition to add numbers in your own format!

So: what is your number format?
How do you calculate the binary representation of those float numbers?

> For example, when I add the value -1.26562 to 8.06250, I get 1.49691e-038.
Can you show those numbers in a test bench waveform?
Or do those numbers exist only in your head or on a sheet of paper?

> Can you help me resolve this problem, please?
The compiler does not read comments. It just does what the source code
tells it to do. So it looks very much as if the compiler interprets
X"bfa1ffd6" differently than you do...
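
A minimal C sketch of that difference, assuming a 32-bit IEEE 754 float.
The helper names float_from_bits, integer_add and float_add are made up
for illustration and do not come from the posted design. It contrasts
what a two's-complement adder does with the raw patterns against what an
addition of the decoded float values would give:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Read a 32-bit pattern as an IEEE 754 single-precision value. */
float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* What a 32-bit two's-complement adder does with the raw patterns:
   plain integer addition modulo 2^32. The bits are never decoded as floats. */
uint32_t integer_add(uint32_t a, uint32_t b)
{
    return a + b;
}

/* What the question expected: decode both patterns, then add the float values. */
float float_add(uint32_t a, uint32_t b)
{
    return float_from_bits(a) + float_from_bits(b);
}
```

For the two test bench values, integer_add(0xbfa1ffd6, 0x41010000) yields
the pattern 0x00a2ffd6. Displayed with a float radix, that pattern is
roughly 1.49691e-38, the value reported in the question, while float_add
on the same inputs gives roughly 6.797.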

by jeorges F. (Company: xlue) (khal1985)


Thank you, Lothar, for your answer.
In fact, my input comes from a Nios processor. The Nios does its
computing in float, so Position_In is a float inside the Nios; it is then
converted to a 32-bit integer and placed in my std_logic_vector(31
downto 0) variable.

To get the float equivalent of a hex value, I use the Floating
Point to Hex Converter.

Enclosed you will find the test bench waveform. The values are displayed
in hex, but when I change the radix, I get the correct float equivalent.

Best regards

by Lothar M. (Company: Titel) (lkmiller) (Moderator)


jeorges F. wrote:
> To get the float equivalent of a hex value, I use the Floating
> Point to Hex Converter.
Of course you cannot add two float numbers with a two's-complement
adder (which is what you are doing when you add two signed vectors).

> but when I change the radix, I get the correct float equivalent.
Read a few lines about IEEE 754 and you will see that you can never
simply add two float values this way. You need to align the two float
values to a common exponent first, then you can perform the addition,
and then you must normalize the result back into a valid float number.
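
Those three steps can be sketched in C on raw bit patterns. This is a
simplified illustration, not a complete IEEE 754 adder: it assumes a
32-bit float, handles only normal numbers and zero, and truncates the bits
shifted out during alignment instead of rounding them. All function names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Bit-pattern helpers (hypothetical names). */
uint32_t bits_from_float(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
float float_from_bits(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Simplified single-precision addition on raw bit patterns.
   Assumptions: inputs are normal numbers or zero (subnormals are treated
   as zero, infinities and NaNs are not handled), and alignment truncates. */
uint32_t add_float_bits(uint32_t a, uint32_t b)
{
    int32_t exp_a = (int32_t)((a >> 23) & 0xFF);
    int32_t exp_b = (int32_t)((b >> 23) & 0xFF);
    if (exp_a == 0) return b;
    if (exp_b == 0) return a;

    /* 24-bit significands with the hidden leading one restored. */
    int64_t sig_a = (int64_t)((a & 0x7FFFFFu) | 0x800000u);
    int64_t sig_b = (int64_t)((b & 0x7FFFFFu) | 0x800000u);

    /* Step 1: align the smaller operand to the larger exponent. */
    int32_t exp = exp_a > exp_b ? exp_a : exp_b;
    int32_t shift_a = exp - exp_a;
    int32_t shift_b = exp - exp_b;
    sig_a = shift_a < 24 ? sig_a >> shift_a : 0;
    sig_b = shift_b < 24 ? sig_b >> shift_b : 0;

    /* Step 2: apply the sign bits and add as two's-complement integers. */
    if (a >> 31) sig_a = -sig_a;
    if (b >> 31) sig_b = -sig_b;
    int64_t sum = sig_a + sig_b;
    if (sum == 0) return 0;

    uint32_t sign = 0;
    if (sum < 0) { sign = 0x80000000u; sum = -sum; }

    /* Step 3: normalize so that bit 23 is the leading one again. */
    while (sum >= (INT64_C(1) << 24)) { sum >>= 1; exp++; }
    while (sum <  (INT64_C(1) << 23)) { sum <<= 1; exp--; }

    if (exp <= 0)   return sign;                 /* underflow: flush to zero */
    if (exp >= 255) return sign | 0x7F800000u;   /* overflow: infinity */
    return sign | ((uint32_t)exp << 23) | ((uint32_t)sum & 0x7FFFFFu);
}
```

With the test bench patterns 0x41010000 (8.0625) and 0xc0b80000 (-5.75)
the sketch returns the pattern for 2.3125. Because of the truncation in
step 1, results that are not exactly representable can differ from a
proper IEEE 754 adder in the last bits.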

by jeorges F. (Company: xlue) (khal1985)


OK, I see. Thank you, Lothar.
In my C code, I do this:
/* Variable for position */
union position
{
    alt_32  I_position;
    float   F_position;
};
When I calculate the position, I put it into F_position; then I can read
the decimal value from I_position.
I_position is then used as a signed value in my VHDL code. So normally
I am manipulating a decimal representation. Is that correct?

Thanks in advance,

by Lothar M. (Company: Titel) (lkmiller) (Moderator)


jeorges F. wrote:
> alt_32
Is this a 32-bit integer data type?

> When I calculate the position, I put it into F_position; then I can read
> the decimal value from I_position.
No chance! A union does not convert anything! It only tells the compiler
to look at the very same bit pattern in a different way.

Just as an example:
The bit pattern  0xbf31003f is the float number -0.69141
The same pattern 0xbf31003f is the integer number 3207659583
(read as unsigned; read as a signed 32-bit integer it is -1087307713)
But -0.69141 is -1 when rounded to an integer...
Do you see the problem?
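
A short C sketch of the two operations, assuming a 32-bit IEEE 754 float.
The function names are hypothetical, and memcpy stands in for the union
because it performs the same bit-for-bit reinterpretation:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(int32_t), "sketch assumes a 32-bit float");

/* Build a float from a given 32-bit pattern (used to feed the examples). */
float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* What the union does: the bits stay unchanged and are merely read
   as a signed 32-bit integer. No numeric conversion takes place. */
int32_t reinterpret_bits(float f)
{
    int32_t i;
    memcpy(&i, &f, sizeof i);
    return i;
}

/* A real value conversion: round to the nearest integer, halves away
   from zero. No range check; the value must fit into 32 bits. */
int32_t round_to_int(float f)
{
    return (int32_t)(f < 0.0f ? f - 0.5f : f + 0.5f);
}
```

For the float with pattern 0xbf31003f (about -0.69141),
reinterpret_bits returns -1087307713, whereas round_to_int returns -1.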

Edited by Moderator
by jeorges F. (Company: xlue) (khal1985)


alt_32 is a typedef for signed long.
Yes, I see the problem, thank you. But what's the best approach to
manipulate my float data?

by jeorges F. (Company: xlue) (khal1985)


Hi Lothar,

I understand the issue now; thank you very much for your explanation.
Now I have to focus on implementing floating-point addition.

Do you know of any links where I can find examples or documentation on
this subject?

Thanks in advance,

Best regards,

by Lothar M. (Company: Titel) (lkmiller) (Moderator)


jeorges F. wrote:
> But what's the best approach to manipulate my float data?
Activate that thing between the ears...  ;-)
If that's not desired, then try Google:
This search: https://www.google.de/?#q=float+addition+bitwise
gets you:
https://www.cs.umd.edu/class/sum2003/cmsc311/Notes/BinMath/addFloat.html

by jeorges F. (Company: xlue) (khal1985)


Thank you very much.
Best regards,
