• Representation of infinity in IEEE 754 format

    Infinity is classified as positive or negative depending on the sign bit.

    For a bit pattern to represent infinity, the exponent field must be all ones and the mantissa (fractional part) must be all zeroes.
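
    As a quick check of the bit pattern, here is a minimal Python sketch. The helper names `float_to_bits` and `is_infinity_pattern` are my own and not part of any library; the sketch uses the standard `struct` module to obtain the 32-bit single-precision pattern.

```python
import struct

def float_to_bits(value):
    """Return the 32-bit IEEE 754 single-precision pattern of value as a bit string."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    return f"{raw:032b}"

def is_infinity_pattern(bits):
    """True when the exponent field is all ones and the mantissa field is all zeroes."""
    exponent, mantissa = bits[1:9], bits[9:]
    return exponent == "1" * 8 and mantissa == "0" * 23

for name, value in (("+inf", float("inf")), ("-inf", float("-inf"))):
    bits = float_to_bits(value)
    # Print sign, exponent and mantissa fields separately
    print(name, bits[0], bits[1:9], bits[9:], is_infinity_pattern(bits))
```

    Only the sign bit differs between the two patterns.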

  • Evaluating a Binary number in IEEE 754 format

    In this video, you will see how to manually evaluate a binary number represented in IEEE 754 format.
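
    The manual evaluation can be sketched in Python as follows. This is only an illustration: the function name `decode_single` is my own, and it handles normal numbers only (zero, subnormals, infinity and NaN are left out to keep it short).

```python
def decode_single(bits):
    """Evaluate a 32-character bit string as a normal single-precision number."""
    if len(bits) != 32 or set(bits) - {"0", "1"}:
        raise ValueError("expected 32 binary digits")
    sign = int(bits[0])
    biased_exponent = int(bits[1:9], 2)
    fraction_field = int(bits[9:], 2)
    if not 1 <= biased_exponent <= 254:
        raise ValueError("not a normal number")
    # The implicit leading 1 is restored in front of the stored fraction
    significand = 1 + fraction_field / 2**23
    return (-1) ** sign * significand * 2.0 ** (biased_exponent - 127)

# sign = 0, biased exponent = 129 (actual exponent 2), fraction = .01 -> 1.25 * 4
pattern = "0" + "10000001" + "01" + "0" * 21
print(decode_single(pattern))  # 5.0
```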

  • Express a number in IEEE-754 single-precision floating-point format

    Sign

    0 = positive number
    1 = negative number

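    Biased Exponent (E)

    A bias of 127 is added to the actual exponent.
    For normal numbers the actual exponent ranges from -126 to +127, so the biased exponent ranges from 1 to 254,
    which fits in 8 bits. The biased values 0 and 255 are reserved for zero/subnormal numbers and for infinity/NaN.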

    Mantissa (F)

    There is an implicit 1 bit to the left of the binary point; it is not stored in the 23-bit field.
    Note that if the fraction is shorter than 23 bits, zeroes are padded on the right to fill the 23-bit field.
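
    To see the three fields for a concrete number, here is a small Python sketch. The helper name `single_fields` is hypothetical; the sketch lets the standard `struct` module do the conversion and then splits the 32-bit word into sign, biased exponent and mantissa.

```python
import struct

def single_fields(value):
    """Split the single-precision encoding of value into (sign, biased exponent, mantissa)."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    sign = raw >> 31                      # 1 bit
    biased_exponent = (raw >> 23) & 0xFF  # 8 bits
    mantissa = raw & 0x7FFFFF             # 23 bits
    return sign, biased_exponent, mantissa

# -6.5 = -1.101 (binary) * 2^2, so the biased exponent is 2 + 127 = 129
sign, exponent, mantissa = single_fields(-6.5)
print(sign, f"{exponent:08b}", f"{mantissa:023b}")
```

    For -6.5 the mantissa field is 101 followed by 20 padding zeroes; the leading 1 of 1.101 is not stored.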

  • Boolean Algebra

    • A + 0 = A
    • A + 1 = 1
    • A . 0 = 0
    • A . 1 = A
    • A + A’ = 1
    • A . A’ = 0
    • Commutative Law
      A + B = B + A
    • Associative Law
      A + ( B + C ) = (A + B) + C
    • Distributive Law
      A . (B+C) = A.B + A.C
    • (A’)’ = A
    • De Morgan Theorem
      (A + B)’ = A’ . B’
      (A.B)’ = A’ + B’
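
    These identities can be checked by brute force over every combination of inputs. The sketch below is only an illustration in Python; the dictionary `identities` and the helper `holds` are names I made up, with `|` standing for +, `&` for . and `1 - a` for A’.

```python
from itertools import product

def holds(law):
    """True when law(a, b, c) is true for every combination of 0/1 inputs."""
    return all(law(a, b, c) for a, b, c in product((0, 1), repeat=3))

identities = {
    "A + 0 = A": lambda a, b, c: (a | 0) == a,
    "A + 1 = 1": lambda a, b, c: (a | 1) == 1,
    "A . 0 = 0": lambda a, b, c: (a & 0) == 0,
    "A . 1 = A": lambda a, b, c: (a & 1) == a,
    "A + A' = 1": lambda a, b, c: (a | (1 - a)) == 1,
    "A . A' = 0": lambda a, b, c: (a & (1 - a)) == 0,
    "Commutative": lambda a, b, c: (a | b) == (b | a),
    "Associative": lambda a, b, c: (a | (b | c)) == ((a | b) | c),
    "Distributive": lambda a, b, c: (a & (b | c)) == ((a & b) | (a & c)),
    "(A')' = A": lambda a, b, c: (1 - (1 - a)) == a,
    "De Morgan 1": lambda a, b, c: (1 - (a | b)) == ((1 - a) & (1 - b)),
    "De Morgan 2": lambda a, b, c: (1 - (a & b)) == ((1 - a) | (1 - b)),
}

results = {name: holds(law) for name, law in identities.items()}
for name, ok in results.items():
    print(f"{name:12} {ok}")
```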

  • IEEE-754 32-bit Single Precision Floating Point Representation

    IEEE 754 format for the representation of 32 bit single-precision floating-point numbers

    In an ARM Cortex-M4F processor-based microcontroller such as the STM32L476VG, floating-point numbers are stored in accordance with IEEE 754.

    In the above video, I have written a small program using IAR Embedded Workbench. When I debugged the program in IAR, I observed that the compiler converts the floating-point number into IEEE 754 format and that it is stored in the S# registers of the STM32L476VG. These registers belong to the FPU (Floating Point Unit), also called the VFP (Vector Floating Point) unit.

    If you want to go into the details, here is the abstract of the standard:

    Abstract: This standard specifies interchange and arithmetic formats and methods for binary and decimal floating-point arithmetic in computer programming environments. This standard specifies exception conditions and their default handling. An implementation of a floating-point system conforming to this standard may be realized entirely in software, entirely in hardware, or in any combination of software and hardware. For operations specified in the normative part of this standard, numerical results and exceptions are uniquely determined by the values of the input data, sequence of operations, and destination formats, all under user control.

    “IEEE Standard for Floating-Point Arithmetic,” IEEE Std 754-2008, pp. 1-70, 29 Aug. 2008, doi: 10.1109/IEEESTD.2008.4610935.

    A great tool to have is the online IEEE 754 converter:

    https://www.h-schmidt.net/FloatConverter/IEEE754.html

  • Logic Gates

    Logic gates are the basic building blocks of any digital electronic computer.

    The AND, OR and NOT gates can be built with technologies such as TTL, MOSFETs, FETs, or vacuum tubes.

    AND Gate

    A  B  Output (Y = A.B)
    0  0  0
    0  1  0
    1  0  0
    1  1  1
    The truth table for the AND gate

    OR Gate

    A  B  Output (Y = A + B)
    0  0  0
    0  1  1
    1  0  1
    1  1  1
    The truth table for the OR gate

    NOT Gate

    A  Output (Y = A’)
    0  1
    1  0
    The truth table for the NOT gate

    NAND Gate

    A  B  Output (Y = (A.B)’)
    0  0  1
    0  1  1
    1  0  1
    1  1  0
    The truth table for the NAND gate

    The NAND gate is known as a universal gate: every other logic gate can be built from NAND gates alone.
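
    To illustrate what universal means, the Python sketch below builds NOT, AND, OR and XOR using nothing but a `nand` function. The function names are my own, and these are just one common set of constructions, not the only possible ones.

```python
def nand(a, b):
    """Two-input NAND on 0/1 values."""
    return 1 - (a & b)

def not_(a):
    return nand(a, a)                   # NAND with both inputs tied together

def and_(a, b):
    return not_(nand(a, b))             # invert the NAND output

def or_(a, b):
    return nand(not_(a), not_(b))       # De Morgan: A + B = (A'.B')'

def xor_(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n)) # four-NAND XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_(a, b), or_(a, b), xor_(a, b))
```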

    NOR Gate

    A  B  Output (Y = (A+B)’)
    0  0  1
    0  1  0
    1  0  0
    1  1  0
    The truth table for the NOR gate

    The NOR gate is also known as a universal gate: every other logic gate can be built from NOR gates alone.

    XOR Gate

    A  B  Output (Y = A’B + AB’)
    0  0  0
    0  1  1
    1  0  1
    1  1  0
    The truth table for the XOR gate

    XNOR Gate

    A  B  Output (Y = AB + A’B’)
    0  0  1
    0  1  0
    1  0  0
    1  1  1
    The truth table for the XNOR gate
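
    The two-input truth tables above can also be generated instead of written by hand. Here is a minimal Python sketch; the `GATES` dictionary and the `truth_table` helper are illustrative names of my own.

```python
from itertools import product

# Each gate is a function of two 0/1 inputs
GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR":  lambda a, b: 1 - (a | b),
    "XOR":  lambda a, b: a ^ b,
    "XNOR": lambda a, b: 1 - (a ^ b),
}

def truth_table(gate):
    """Return the rows (A, B, Y) for a two-input gate, in the order 00, 01, 10, 11."""
    return [(a, b, gate(a, b)) for a, b in product((0, 1), repeat=2)]

for name, gate in GATES.items():
    print(name)
    for a, b, y in truth_table(gate):
        print(" ", a, b, y)
```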
  • Subtraction using 2’s complement method

    S.No.  Decimal  Sign Magnitude          One’s Complement        Two’s Complement
    0       0       0000                    0000                    0000
    1       1       0001                    0001                    0001
    2       2       0010                    0010                    0010
    3       3       0011                    0011                    0011
    4       4       0100                    0100                    0100
    5       5       0101                    0101                    0101
    6       6       0110                    0110                    0110
    7       7       0111                    0111                    0111
    8      -0       1000                    1111                    (0000)’ + 1 = 1111 + 1 = 0000 (discard MSB)
    9      -1       1001                    1110                    (0001)’ + 1 = 1110 + 1 = 1111
    10     -2       1010                    1101                    (0010)’ + 1 = 1101 + 1 = 1110
    11     -3       1011                    1100                    (0011)’ + 1 = 1100 + 1 = 1101
    12     -4       1100                    1011                    (0100)’ + 1 = 1011 + 1 = 1100
    13     -5       1101                    1010                    (0101)’ + 1 = 1010 + 1 = 1011
    14     -6       1110                    1001                    (0110)’ + 1 = 1001 + 1 = 1010
    15     -7       1111                    1000                    (0111)’ + 1 = 1000 + 1 = 1001
    16     -8       (not allowed in 4-bit)  (not allowed in 4-bit)  (1000)’ + 1 = 0111 + 1 = 1000
    Table showing decimal number representations in sign-magnitude, one’s complement and two’s complement
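
    The rows of the table can be reproduced with a few lines of Python. This is a sketch with function names of my own choosing. Python integers have no negative zero, so the -0 row cannot be generated this way.

```python
def sign_magnitude(n, bits=4):
    """Sign bit followed by the magnitude in the remaining bits."""
    if abs(n) > (1 << (bits - 1)) - 1:
        raise ValueError("not allowed in this width")
    return f"{1 if n < 0 else 0}{abs(n):0{bits - 1}b}"

def ones_complement(n, bits=4):
    """Positive numbers as-is; negative numbers as the inverted magnitude."""
    if abs(n) > (1 << (bits - 1)) - 1:
        raise ValueError("not allowed in this width")
    mask = (1 << bits) - 1
    pattern = n if n >= 0 else ~(-n) & mask
    return f"{pattern:0{bits}b}"

def twos_complement(n, bits=4):
    """Negative numbers as inverted magnitude plus one (same as n modulo 2**bits)."""
    if not -(1 << (bits - 1)) <= n < (1 << (bits - 1)):
        raise ValueError("not allowed in this width")
    return f"{n & ((1 << bits) - 1):0{bits}b}"

for n in (5, -1, -5, -7):
    print(f"{n:3} {sign_magnitude(n)} {ones_complement(n)} {twos_complement(n)}")
print(" -8 (n/a) (n/a)", twos_complement(-8))
```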

    Microcontrollers and processors perform subtraction using the 2’s complement method.

    To perform subtraction using the 2’s complement method, follow these steps.

    step 1: Find the 2’s complement of the subtrahend.

    step 2: Add the 2’s complement of the subtrahend to the minuend.

    step 3: Check the result for a carry out of the MSB.

    If a carry is generated,
    the result is positive and no further processing is needed.

    If no carry is generated,
    the result is negative and you have to take the 2’s complement of the result to get its magnitude.
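
    The three steps can be sketched in Python as below. This is only an illustration under one assumption that I am stating explicitly: both operands are treated as unsigned 4-bit magnitudes (0 to 15), which is the case where the carry rule above applies directly. The function name `subtract_twos_complement` is hypothetical.

```python
def subtract_twos_complement(minuend, subtrahend, bits=4):
    """Compute minuend - subtrahend for unsigned operands using the carry rule."""
    mask = (1 << bits) - 1
    if not (0 <= minuend <= mask and 0 <= subtrahend <= mask):
        raise ValueError("operands must fit in the given width")
    # step 1: 2's complement of the subtrahend (invert, then add 1)
    complement = (~subtrahend & mask) + 1
    # step 2: add it to the minuend
    total = minuend + complement
    carry = total >> bits
    result = total & mask
    # step 3: a carry means the result is positive; discard the carry
    if carry:
        return result
    # no carry: the result is negative, so take its 2's complement for the magnitude
    return -((~result + 1) & mask)

print(subtract_twos_complement(5, 2))  # 3
print(subtract_twos_complement(2, 5))  # -3
```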

    NOTE:
    When doing subtraction using 2’s complement, pay close attention to the signs of the subtrahend, the minuend and the result.

    EXAMPLE:

    Ques.) 5 - 2 = ?
    Ans.) 5 - 2 = 3 (in decimal)

    Using the 2’s complement method:
    (-2) in 2’s complement form = 1110

    (5) in 2’s complement form = 0101

      0101
    + 1110
    ------
    1 0011 = 3 (after discarding the leading 1, which is the carry)
    A carry is generated, so the result is positive and no further processing is required.
    We can discard the carry.

    Ques.) -8 - 2 = ? in a 4-bit system
    Ans.) -8 - 2 = -10 (in decimal)
    Remember, we have only 4 bits to store the number, and -10 lies outside the 4-bit range of -8 to +7.
    Using the 2’s complement method:
    (-2) in 2’s complement form = 1110

    (-8) in 2’s complement form = 1000

      1000
    + 1110
    ------
    1 0110 = 6 (after discarding the leading 1, which is the carry)
    A carry is generated, so by the rule above the result would be positive with no further processing required.
    But the result obtained is wrong: when we add two numbers of the same sign, the result must have the same sign. This condition is an overflow.

    Overflow in 2’s complement subtraction

    When two numbers of the same sign are added together and produce a result with the opposite sign, an overflow has occurred and the result is not valid.
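
    The overflow rule is easy to express in code. Below is a small Python sketch that adds two 4-bit two’s complement patterns and flags an overflow when both operands share a sign that the result does not. The function name `add_signed` is my own.

```python
def add_signed(a, b, bits=4):
    """Add two two's complement bit patterns; return (result pattern, overflow flag)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    total = (a + b) & mask  # the carry out of the MSB is discarded
    same_sign_inputs = (a & sign) == (b & sign)
    overflow = same_sign_inputs and (total & sign) != (a & sign)
    return total, overflow

# 5 + (-2): signs differ, so overflow is impossible
print(add_signed(0b0101, 0b1110))  # (3, False)
# (-8) + (-2): both negative, but the result pattern 0110 looks positive
print(add_signed(0b1000, 0b1110))  # (6, True)
```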

  • Convert Negative Decimal Number to Signed Binary number

    I have explained how negative decimal numbers are converted into their signed binary equivalents.

  • Subtraction using 1’s complement method

    In this video, you will see the subtraction of two numbers using the 1’s complement method. In this method, the addition can produce a carry out of the MSB (an end-around carry), which is added back to the result. If there is no carry after the addition, the result is negative and you have to complement it again to get its magnitude.
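
    As a rough sketch of the method (not taken from the video), here is a Python version that assumes both operands are unsigned 4-bit magnitudes. The function name `subtract_ones_complement` is hypothetical.

```python
def subtract_ones_complement(minuend, subtrahend, bits=4):
    """Compute minuend - subtrahend for unsigned operands using 1's complement."""
    mask = (1 << bits) - 1
    if not (0 <= minuend <= mask and 0 <= subtrahend <= mask):
        raise ValueError("operands must fit in the given width")
    # Add the 1's complement (bitwise inverse) of the subtrahend
    total = minuend + (~subtrahend & mask)
    if total >> bits:
        # Carry out of the MSB: add it back in (end-around carry); result is positive
        return (total & mask) + 1
    # No carry: the result is negative; complement again to get the magnitude
    return -(~total & mask)

print(subtract_ones_complement(5, 2))  # 3
print(subtract_ones_complement(2, 5))  # -3
```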