Floating Point Numbers - Yr 2 Only
Floating point numbers are a method of dynamic binary numerical representation, allowing range and accuracy to be traded against each other using the same number of bits. A floating point number consists of 2 parts: a mantissa, which holds the significant binary digits of the represented number, and an exponent, which says how far to shift the binary point according to the size of the number. Both parts are stored in two's complement. For a floating point number to be normalized and make the best use of available memory, the mantissa must begin with "0.1" for a positive number and "1.0" for a negative number; in other words, the sign bit and the bit after it must differ. Any deviation from this wastes bits, as the same number could be represented with a shorter mantissa.
For example, the number 32 could be represented by a floating point number with an 8 bit mantissa and a 5 bit exponent.
The mantissa would be as follows: 0.1000000
The mantissa on its own has a value of 0.5, so the exponent must shift the binary point 6 places to the right to give 32 (0.5 × 2^6 = 32). It must therefore have a value of 6: 00110
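The example above can be checked with a few lines of Python. This is only a minimal sketch: the helper name `is_normalised` and the choice to hold the bits as strings are assumptions made for illustration, and the arithmetic shown only handles a positive mantissa and a positive exponent.

```python
def is_normalised(mantissa: str) -> bool:
    """A two's complement mantissa is normalised when its first two bits differ
    (0.1... for a positive number, 1.0... for a negative number)."""
    return mantissa[0] != mantissa[1]


mantissa = "01000000"  # 0.1000000 with the binary point after the sign bit
exponent = "00110"     # 6 in denary

# The point sits after the first bit, so the 8 bits read as an integer
# must be divided by 2^7 to get the fractional value (0.5 here).
fraction = int(mantissa, 2) / 2 ** (len(mantissa) - 1)

# Moving the point right by n places is the same as multiplying by 2^n.
value = fraction * 2 ** int(exponent, 2)

print(is_normalised(mantissa), fraction, value)
```

A mantissa such as 0.0100000 with an exponent of 7 would also give 32, but it fails the `is_normalised` check because the leading 0 after the point is a wasted bit.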
Converting from Binary to Denary
- Write down the mantissa, with the binary point inserted after the sign bit. (Leave off trailing 0s.)
- If the mantissa is negative (sign bit = 1), find the two's complement of the mantissa and remember that the final answer will be negative.
- If the exponent is negative (sign bit = 1), find the two's complement of the exponent to get its size.
- Calculate the value of the exponent in denary.
- If the exponent is positive, move the point in the mantissa to the right by the number of places given by the exponent.
- If the exponent is negative, move the point in the mantissa to the left by the number of places given by the exponent.
- Convert the mantissa to denary to obtain the answer, adding a minus sign if the mantissa was negative.
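The steps above can be sketched in Python. This is an illustrative sketch, not a definitive implementation: the function names `twos_complement_value` and `float_to_denary` are made up for this example, and the mantissa and exponent are assumed to be given as two's complement bit strings. Rather than physically moving the point, the code uses the equivalent arithmetic: moving the point n places to the right multiplies by 2^n, and moving it n places to the left divides by 2^n.

```python
def twos_complement_value(bits: str) -> int:
    """Read a bit string as a two's complement signed integer."""
    value = int(bits, 2)
    if bits[0] == "1":            # sign bit set, so the number is negative
        value -= 1 << len(bits)   # subtract 2^(number of bits)
    return value


def float_to_denary(mantissa: str, exponent: str) -> float:
    """Convert a two's complement mantissa and exponent to a denary value."""
    m = twos_complement_value(mantissa)
    e = twos_complement_value(exponent)
    # The binary point sits after the sign bit, so the mantissa read as an
    # integer is 2^(len - 1) times too large; fold that into the shift.
    return m * 2.0 ** (e - (len(mantissa) - 1))


print(float_to_denary("01000000", "00110"))  # 0.1000000, exponent 6
print(float_to_denary("10100000", "00011"))  # 1.0100000, exponent 3
print(float_to_denary("01000000", "11111"))  # 0.1000000, exponent -1
```

The three calls cover a positive number, a negative mantissa, and a negative exponent, which are the three cases the procedure distinguishes.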