Computers represent real values in a form similar to that of scientific notation. Consider the value
1.23 x 10^4
The number has a sign (+ in this case).
The significand (1.23) is written with one non-zero digit to the left of the decimal point.
The base (radix) is 10.
The exponent (an integer value) is 4. It too must have a sign.
There are standards that define what the representation means, so that it is consistent across computers.
Note that this is not the only way to represent floating point numbers; it is just the IEEE standard way of doing it.
Here is what we do:
The representation has three fields:

     ----------------------------
     |  S  |    E    |    F     |
     ----------------------------

The decimal value represented is:

     (-1)^S x f x 2^e

where

     e = E - bias
     f = ( F/(2^n) ) + 1
For single precision representation (the emphasis in this class):
     n = 23 (there are 23 bits for the mantissa field)
     bias = 127 (there are 8 bits for the exponent field)
For double precision representation (a 64-bit representation):
     n = 52 (there are 52 bits for the mantissa field)
     bias = 1023 (there are 11 bits for the exponent field)
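As a rough sketch of how the formula is applied, the Python code below pulls S, E, and F out of a 32-bit single precision pattern and evaluates (-1)^S x f x 2^e. The names decode_single and machine_value are made up for this illustration and are not part of any standard. The formula as written only covers normalized values, so the sketch assumes E is neither all zeros nor all ones.

```python
import struct

N = 23       # bits in the F field (single precision)
BIAS = 127   # exponent bias (single precision)

def decode_single(bits):
    """Return (S, E, F, value) for a 32-bit pattern, assuming a normalized value."""
    S = (bits >> 31) & 0x1        # top bit is the sign
    E = (bits >> N) & 0xFF        # next 8 bits are the biased exponent
    F = bits & ((1 << N) - 1)     # low 23 bits are the stored mantissa
    e = E - BIAS
    f = F / 2**N + 1
    value = (-1) ** S * f * 2.0 ** e
    return S, E, F, value

def machine_value(bits):
    """Cross-check: let the machine reinterpret the same 32 bits as a float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

print(decode_single(0x3F800000))
print(machine_value(0x3F800000))
```

For the pattern 0x3F800000 the fields are S = 0, E = 127, F = 0, so e = 0 and f = 1, and both functions give the value 1.0.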
Now, what does all this mean?
An example: Put the decimal number 64.2 into the IEEE standard single precision floating point representation.
First step: Get a binary representation for 64.2.
To do this, get unsigned binary representations for the stuff to the left and right of the decimal point separately.

    64 is 1000000

.2 can be gotten using the algorithm:

    .2 x 2 = 0.4    0
    .4 x 2 = 0.8    0
    .8 x 2 = 1.6    1
    .6 x 2 = 1.2    1

    .2 x 2 = 0.4    0    now this whole pattern (0011) repeats.
    .4 x 2 = 0.8    0
    .8 x 2 = 1.6    1
    .6 x 2 = 1.2    1

So a binary representation for .2 is

       .001100110011. . .
        ____
    or .0011    (The bar over the top shows which bits repeat.)

Putting the halves back together again:

    64.2 is 1000000.0011001100110011. . .

Second step: Normalize the binary representation. (Make it look like scientific notation.)

    1.000000 00110011. . . x 2^6

Third step: 6 is the true exponent. For the standard form, it needs to be in 8-bit, biased-127 representation.

        6
    + 127
    -----
      133

133 in 8-bit, unsigned representation is 1000 0101.
This is the bit pattern used for E in the standard form.

Fourth step: The mantissa stored (F) is the stuff to the right of the radix point in the normalized form. We need 23 bits of it.

    000000 00110011001100110

Put it all together (and include the correct sign bit):

    S     E                F
    0  10000101  00000000110011001100110

The values are often given in hex, so here it is:

       0100 0010 1000 0000 0110 0110 0110 0110
    0x 4    2    8    0    6    6    6    6
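The four steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names fraction_bits and encode_single are invented here, the value must be positive with an integer part of at least 1, and the extra mantissa bits are simply cut off after 23 bits, as in the example. Real hardware rounds instead of truncating; for 64.2 the two happen to give the same pattern.

```python
from fractions import Fraction
import struct

def fraction_bits(frac, count):
    """Repeatedly double the fraction; the integer part of each product is the next bit."""
    bits = []
    for _ in range(count):
        frac *= 2
        bit = int(frac)          # 0 or 1
        bits.append(str(bit))
        frac -= bit              # keep only the fractional part
    return "".join(bits)

def encode_single(int_part, frac):
    """Build the 32-character bit string for a positive value int_part + frac."""
    int_bits = bin(int_part)[2:]                  # step 1: integer part in binary
    exponent = len(int_bits) - 1                  # step 2: normalize (true exponent)
    E = format(exponent + 127, "08b")             # step 3: 8-bit, biased-127 exponent
    tail = fraction_bits(frac, 23 - exponent)     # step 1 again: fraction part in binary
    F = (int_bits[1:] + tail)[:23]                # step 4: 23 bits right of the radix point
    return "0" + E + F                            # sign bit 0 (positive)

pattern = encode_single(64, Fraction(1, 5))       # 64.2 = 64 + 1/5, kept exact
print(pattern[0], pattern[1:9], pattern[9:])
print(hex(int(pattern, 2)))
print(struct.pack(">f", 64.2).hex())              # what the machine itself stores
```

Using Fraction keeps .2 exact, so the doubling loop produces the repeating 0011 pattern without any rounding error creeping in from Python's own floating point arithmetic.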
Some extra details:
We take the bit patterns 0x0000 0000 and 0x8000 0000 to represent the value 0.
(What floating point numbers cannot be represented because of this?)
Note that the hardware that does arithmetic on floating point numbers must be constantly checking to see if it needs to use a hidden bit of 1 or a hidden bit of 0 (for 0.0). The hidden bit is the digit to the left of the radix point in the normalized form; it is the + 1 in the formula for f, and it is not stored in F.
Values that are very close to 0.0, and would require the hidden bit to be a zero, are called denormalized or subnormal numbers.
                  S         E           F
    0.0           0 or 1    00000000    00000000000000000000000   (hidden bit is a 0)
    subnormal     0 or 1    00000000    not all zeros             (hidden bit is a 0)
    normalized    0 or 1    > 0         any bit pattern           (hidden bit is a 1)
                          S    E           F
    +infinity             0    11111111    00000...   (0x7f80 0000)
    -infinity             1    11111111    00000...   (0xff80 0000)
    NaN (Not a Number)    ?    11111111    ?????...   (S is either 0 or 1, E=0xff, and F is anything but all zeros)
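The two tables amount to a small decision procedure on the E and F fields. A minimal sketch in Python, for single precision only (the function name classify_single and the returned labels are chosen for this illustration):

```python
def classify_single(bits):
    """Name the kind of value a 32-bit single precision pattern encodes."""
    E = (bits >> 23) & 0xFF      # 8-bit exponent field
    F = bits & 0x7FFFFF          # 23-bit mantissa field
    if E == 0x00:
        # hidden bit is a 0
        return "zero" if F == 0 else "subnormal"
    if E == 0xFF:
        # reserved exponent: infinities and NaN
        return "infinity" if F == 0 else "NaN"
    # every other exponent: hidden bit is a 1
    return "normalized"

for pattern in (0x00000000, 0x80000000, 0x00000001,
                0x42806666, 0x7F800000, 0xFF800000, 0x7FC00000):
    print(format(pattern, "08x"), classify_single(pattern))
```

The sign bit S plays no part in the classification: it only distinguishes +0 from -0 and +infinity from -infinity.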
For double precision:
Copyright © Karen Miller, 2006