Can you do Question #5 for me, please?

Give the shortest algorithm for calculating (4A + 6B).

Given the datapath in Q4, develop an algorithm for the calculation of C = A – B using 2’s complement arithmetic.

(i) Write down the fastest algorithm for the above calculation.

(ii) State how many clock cycles are needed for the result to be output.

(iii) Draw the timing diagram to show the calculation of A – B for the input data values of A = 6 and B = 3.

(iv) Develop the algorithm for calculating C = 2A – B.

(v) How many clock cycles are needed to calculate the value in (iv)? Give reasons.

NB: Normal operation requires M = 0 to be set. However, if M = 1 is set, the next state will be D'[3:0] = (IN1[3:0])’, i.e. the 1’s complement of the input value will be stored in Register A at the next clock cycle instead of D[3:0].

## Expert Answer

Single-cycle datapath: The single-cycle datapath is not used in modern processors because it is inefficient. The critical path (the longest propagation sequence through the datapath) passes through five components for the load instruction. The cycle time t_c is limited by the settling time t_s of these components. For a circuit with no feedback loops, t_c > 5t_s. In practice, t_c = 5kt_s, with a large proportionality constant k, due to feedback loops, delayed settling caused by circuit noise, and so on. The critical-path information also gives the required execution time for each instruction class: the load instruction takes five units of time, while the store and R-format instructions take four units. All other instruction types that the datapath is designed to execute run faster, requiring three units of time.

Penalizing addition, subtraction, and comparison operations to accommodate loads and stores leads one to ask whether multiple cycles of a much faster clock could be used for each part of the fetch-decode-execute cycle. In practice, this technique is employed in CPU design and implementation. In a multicycle datapath design, datapath actions can be interleaved in time to yield a potentially fast implementation of the fetch-decode-execute cycle, which is formalized in a technique called pipelining.

Multicycle datapath: We use the single-cycle datapath components to create a multicycle datapath, in which each step in the fetch-decode-execute sequence takes one cycle. This approach has two advantages over the single-cycle datapath:

- Each functional unit (e.g., Register File, Data Memory, ALU) can be used more than once in the course of executing an instruction, which saves hardware (and, thus, reduces cost); and
- Each instruction step takes one cycle, so different instructions have different execution times. In contrast, the single-cycle datapath requires every instruction to complete in one long cycle, so all instructions run at the speed of the slowest.
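
The second advantage can be illustrated with a rough calculation. The sketch below uses the per-class time units quoted above (load 5, store 4, R-format 4, others 3); the instruction mix is hypothetical and chosen only for illustration, and the model assumes an idealized multicycle clock of one time unit per step with no extra overhead.

```python
# Time units per instruction class, taken from the text above.
UNITS = {"load": 5, "store": 4, "r_format": 4, "other": 3}

# Hypothetical instruction mix (instruction counts), for illustration only.
MIX = {"load": 25, "store": 10, "r_format": 45, "other": 20}


def single_cycle_time(mix, units):
    """Every instruction takes one long cycle sized for the slowest class."""
    return sum(mix.values()) * max(units.values())


def multicycle_time(mix, units):
    """Each instruction takes as many one-unit cycles as its class needs."""
    return sum(count * units[name] for name, count in mix.items())


single = single_cycle_time(MIX, UNITS)
multi = multicycle_time(MIX, UNITS)
print(single, multi, round(single / multi, 2))  # prints: 500 405 1.23
```

Under these assumptions the multicycle version finishes the same 100 instructions in 405 time units instead of 500. A real design would see a smaller gain, because the steps rarely divide the work evenly and each cycle adds some register overhead.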

ii) Timing diagram:
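
The values that such a diagram would carry for A = 6 and B = 3 can be checked with a short sketch of 4-bit 2's complement subtraction, A – B = A + (1's complement of B) + 1. This only verifies the arithmetic: the Q4 datapath is not shown here, so the order of operations and the number of clock cycles are not modelled, and the function names are illustrative.

```python
WIDTH = 4                  # register width used in the question ([3:0])
MASK = (1 << WIDTH) - 1    # 0b1111, keeps results to 4 bits


def ones_complement(x):
    """Bitwise inverse within 4 bits, as stored when M = 1."""
    return ~x & MASK


def subtract(a, b):
    """a - b modulo 16, computed as a + ~b + 1 (2's complement)."""
    return (a + ones_complement(b) + 1) & MASK


a, b = 6, 3
not_b = ones_complement(b)            # 1100
result = subtract(a, b)               # 0011, i.e. 3
double = subtract((2 * a) & MASK, b)  # 2A - B = 1001, i.e. 9 if read as unsigned
print(f"{not_b:04b} {result:04b} {double:04b}")  # prints: 1100 0011 1001
```

Note that 2A – B = 9 fits in four bits only when the result is read as unsigned; in signed 4-bit 2's complement the pattern 1001 would represent -7.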