Floating Point Numbers - Yr 2 Only

From TRCCompSci - AQA Computer Science
Revision as of 11:43, 16 June 2017 by Admin (talk | contribs) (Converting from Denary to Binary)
Jump to: navigation, search

Floating point numbers are a method of dynamic binary numerical representation, allowing for a customizable range and accuracy using the same number of digits. Floating point consists of 2 parts, a mantissa which contains the binary value of the represented number, and the exponent which shifts the decimal point according to the size of the number. For a floating point number to be normalized and make the best use of available memory, it must begin with "0.1" for a positive number and "1.0" for a negative number. Any deviation with this could be a waste of bits, as the same number could be represented with a smaller mantissa.

For example, the number 32 could be represented by a floating point number with an 8 bit mantissa and a 5 bit exponent.

The mantissa would be as follows: 0.1000000

The exponent must shift the decimal point to shift 1 into the value of 32, it must therefore have a value of 6: 00110

Converting from Binary to Denary

  1. . Write down the mantissa, with the point inserted after the sign bit. (Miss off trailing 0’s)
  2. . If the mantissa is negative (sign bit = 1) then
    1. .find the twos complement of the mantissa
  3. . If the exponent is negative (sign bit = 1) then
    1. .find the twos complement of the exponent
  4. . Calculate the value of the exponent in denary
  5. . If the exponent is positive then
    1. .move the point in the mantissa to the right the number of places given by the exponent
    2. .else {if the exponent is negative}
    3. .move the point in the mantissa to the left the number of places given by the exponent
  6. . Convert the mantissa to denary to obtain the answer

Converting from Denary to Binary

  1. . Convert the denary number to an unsigned binary number (the mantissa)
  2. . Normalise this (move the point to in front of the leading 1)
  3. . If the number is negative then
    1. .represent it as its twos complement equivalent
  4. . Count the number of places the point has been moved to give exponent
  5. .If point moved left then
      1. . exponent is positive
    1. .else {if point moved right}
      1. . exponent is negative
  6. . Convert exponent to twos complement binary (6-bits in this case)
  7. . Add 0’s to the mantissa if necessary (to give 10 bits in this case)