Floating Point Numbers - Yr 2 Only

Floating point is a binary number representation that lets the same number of bits be split between range and precision as required. A floating point number consists of 2 parts: a mantissa, which holds the significant binary digits of the represented number, and an exponent, which says how many places to move the binary point according to the size of the number. Both parts are stored in two's complement. For a floating point number to be normalized and make the best use of the available bits, the mantissa must begin with "0.1" for a positive number and "1.0" for a negative number. Any deviation from this wastes bits, as the same number could be represented with a shorter mantissa.
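The normalization rule can be checked mechanically: the first two bits of the mantissa must differ. The sketch below assumes the mantissa is given as a string of bits with the point left out; the function name is only illustrative.

```python
def is_normalized(mantissa: str) -> bool:
    """Return True if the mantissa starts 01 (positive) or 10 (negative)."""
    # A normalized mantissa reads 0.1... or 1.0..., so its sign bit
    # and the bit immediately after the point are always different.
    return len(mantissa) >= 2 and mantissa[0] != mantissa[1]
```

For instance, is_normalized("01000000") is True, while is_normalized("00100000") is False because the leading 0 after the point is wasted.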

For example, the number 32 could be represented by a floating point number with an 8 bit mantissa and a 5 bit exponent.

The mantissa would be as follows: 0.1000000

The exponent must move the binary point 6 places to the right, so that the 1 after the point ends up in the column worth 32 (0.1 becomes 100000.0). It must therefore have a value of 6: 00110
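The same encoding can be reproduced with a short sketch. It assumes a positive value only, returns the bit patterns without the point written in, and uses Python's math.frexp, which happens to return a fraction between 0.5 and 1 (the "0.1..." normalized form). The function name and default sizes are illustrative, taken from the 8 bit mantissa and 5 bit exponent in the example.

```python
import math

def encode_positive(value, mantissa_bits=8, exponent_bits=5):
    """Return (mantissa, exponent) bit strings for a positive value."""
    if value <= 0:
        raise ValueError("this sketch only handles positive values")
    # value == fraction * 2**exponent, with 0.5 <= fraction < 1
    fraction, exponent = math.frexp(value)
    # Keep (mantissa_bits - 1) bits after the point; extra precision is cut off.
    scaled = int(fraction * 2 ** (mantissa_bits - 1))
    lowest = -(2 ** (exponent_bits - 1))
    highest = 2 ** (exponent_bits - 1) - 1
    if not lowest <= exponent <= highest:
        raise OverflowError("exponent does not fit in the exponent field")
    mantissa = format(scaled, f"0{mantissa_bits}b")
    # The modulo turns a negative exponent into its two's complement pattern.
    exponent_field = format(exponent % 2 ** exponent_bits, f"0{exponent_bits}b")
    return mantissa, exponent_field
```

encode_positive(32) returns ("01000000", "00110"), matching the mantissa 0.1000000 and exponent 00110 above.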

Converting from Binary to Denary

1. Write down the mantissa, with the binary point inserted after the sign bit. (Trailing 0s can be left off.)
2. If the mantissa is negative (sign bit = 1) then
   1. find the two's complement of the mantissa to get its magnitude, and note that the final answer will be negative
3. If the exponent is negative (sign bit = 1) then
   1. find the two's complement of the exponent to get its magnitude
4. Calculate the value of the exponent in denary
5. If the exponent is positive then
   1. move the point in the mantissa to the right the number of places given by the exponent
   2. else (the exponent is negative) move the point in the mantissa to the left the number of places given by the exponent
6. Convert the mantissa to denary to obtain the answer, putting a minus sign in front if the mantissa was negative
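As a cross-check on hand conversions, the procedure can be sketched in code. This version does not move the point step by step; it uses the arithmetic equivalent, reading both fields as two's complement integers and multiplying by a power of two. The function names are illustrative, and the inputs are assumed to be bit strings with the point left out.

```python
def twos_complement_value(bits: str) -> int:
    """Read a bit string as a signed two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":
        # A leading 1 means negative: subtract 2 to the power of the width.
        value -= 1 << len(bits)
    return value

def decode(mantissa: str, exponent: str) -> float:
    """Convert a mantissa and exponent bit pattern to a denary value."""
    m = twos_complement_value(mantissa)
    e = twos_complement_value(exponent)
    # The point sits after the sign bit, so the mantissa integer is
    # already scaled up by 2**(len(mantissa) - 1); undo that here.
    return m * 2.0 ** (e - (len(mantissa) - 1))
```

decode("01000000", "00110") gives 32.0, and decode("10100000", "00011") gives -6.0, since 1.0100000 is -0.75 and the exponent moves the point 3 places to the right.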
