Sunday 5 August 2012

Microprocessors: how they work, and the concepts of pipelining and superscalar execution

My last article covered microprocessors, the current and future trends in multi-core microprocessors, and their drawbacks. One of my readers pointed out that without a proper understanding of how microprocessors work, and of the concepts of pipelining and superscalar execution, it is hard to understand the shift from single-core to multi-core microprocessors. So that is what this article is about. I will first describe how a microprocessor works, and then we will discuss pipelining and superscalar architecture.

Let’s first talk about how a microprocessor works. A basic microprocessor-based system has a CPU, a memory storage unit and I/O devices, which are connected through a data bus and an address bus. The CPU contains an ALU, a control unit and a clock, which sets the frequency at which instructions are processed. The execution of a single machine instruction can be divided into a sequence of operations called the instruction execution cycle. Before it can execute, a program is loaded into memory. The instruction pointer (IP) contains the address of the next instruction. The instruction queue holds the group of instructions that are about to be executed. When the next instruction is fetched, its address is taken from the IP, and the IP is then updated to point to the instruction that follows. Executing a machine instruction requires three basic steps: fetch -> decode -> execute. When an instruction uses data from memory, two more steps are needed: fetch operands and store output operands. Counting the move to the next instruction, there are therefore six steps in the instruction execution cycle for a single machine instruction:
Fetch instruction -> decode -> fetch operands -> execute -> store output operands -> next instruction -> fetch instruction again, and so on.



Instruction Execution Cycle
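
The six steps can be sketched as a small loop. This is only a toy model to make the cycle concrete, not how a real CPU is built: the instruction format (opcode, two sources, one destination), the dictionary used as memory and the function name run_program are all assumptions made for this illustration.

```python
# Toy model of the instruction execution cycle.
# The instruction tuples and run_program are hypothetical, for illustration only.
def run_program(program, memory):
    """Run (opcode, src_a, src_b, dest) instructions, logging each step."""
    ops = {"ADD": lambda a, b: a + b, "SUB": lambda a, b: a - b}
    ip = 0        # instruction pointer: index of the next instruction
    trace = []    # names of the steps performed, in order
    while ip < len(program):
        instruction = program[ip]                 # 1. fetch instruction
        trace.append("fetch instruction")
        opcode, src_a, src_b, dest = instruction  # 2. decode
        trace.append("decode")
        a, b = memory[src_a], memory[src_b]       # 3. fetch operands
        trace.append("fetch operands")
        result = ops[opcode](a, b)                # 4. execute
        trace.append("execute")
        memory[dest] = result                     # 5. store output operand
        trace.append("store output operands")
        ip += 1                                   # 6. move to next instruction
        trace.append("next instruction")
    return memory, trace


memory = {"x": 7, "y": 5, "sum": 0, "diff": 0}
program = [("ADD", "x", "y", "sum"), ("SUB", "x", "y", "diff")]
final_memory, trace = run_program(program, memory)
print(final_memory["sum"], final_memory["diff"])  # 12 2
print(len(trace))                                 # 12 (2 instructions x 6 steps)
```

Note that each instruction here finishes all six steps before the next one starts, which is exactly the non-pipelined behaviour discussed next.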

Once the working of the microprocessor is understood, it is easier to understand the concepts of pipelining and superscalar architecture. What is pipelining? Pipelining is an implementation technique used to make the CPU run faster, in which multiple instructions are overlapped in execution. It takes advantage of parallelism where the instructions are not dependent on each other. A six-stage non-pipelined execution takes n*k cycles, where n is the number of instructions and k is the number of execution stages. A six-stage pipelined execution, on the other hand, takes k + (n-1) cycles: the first instruction needs k cycles, and each later instruction finishes one cycle after the one before it. The concept can be further explained using the following diagrams.
Six Stage non-pipelined Execution


Six Stage pipelined Execution
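
To see the difference in numbers, the two formulas above can be written out as a short sketch. The function names are my own, and the chart assumes an ideal pipeline in which every stage takes exactly one cycle and no instruction ever has to wait for another.

```python
def non_pipelined_cycles(n, k):
    """Each instruction runs through all k stages before the next one starts."""
    return n * k


def pipelined_cycles(n, k):
    """The first instruction takes k cycles; each later one finishes 1 cycle after."""
    return k + (n - 1)


def pipeline_chart(n, k):
    """Text chart: one row per cycle, one column per stage (ideal pipeline)."""
    rows = []
    for cycle in range(1, pipelined_cycles(n, k) + 1):
        cells = []
        for stage in range(1, k + 1):
            instr = cycle - stage + 1  # instruction occupying this stage, if any
            cells.append(f"I{instr}" if 1 <= instr <= n else "--")
        rows.append(f"cycle {cycle}: " + " ".join(cells))
    return rows


print(non_pipelined_cycles(2, 6))  # 12
print(pipelined_cycles(2, 6))      # 7
for row in pipeline_chart(2, 6):
    print(row)
```

For two instructions the saving is 12 cycles against 7. For 100 instructions it is 600 cycles against 105, so the longer the program, the closer the pipeline gets to finishing one instruction per cycle.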

A superscalar CPU architecture implements a form of parallelism called instruction-level parallelism within a single processor. It can be thought of as having two pipelines, so that more than one instruction can be in the execution stage at the same time. It therefore provides higher CPU throughput than would otherwise be possible at a given clock rate. The throughput of an instruction pipeline is the rate at which instructions exit the pipeline. In a single pipeline, if the execution stage requires more than one cycle, the instructions behind it have to wait and cycles are wasted. In general, for k stages (where one stage requires 2 cycles), n instructions require k + 2n - 1 cycles. In a six-stage superscalar processor, on the other hand, these wasted cycles are removed, so n instructions can be executed in k + n cycles, where k is the number of stages. This can be further explained using the following diagrams.
Six stage single pipelined non-scalar execution
Superscalar six-stage pipelined execution
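
The two cycle counts above can be checked with a small simulation. This is a simplified sketch under my own assumptions: the execute stage is stage 4 and takes 2 cycles, every other stage takes 1 cycle, instructions are independent, and in the superscalar case instructions simply alternate between two execute units. The function name and its parameters are hypothetical.

```python
def total_cycles(n, k, execute_stage=4, execute_cycles=2, execute_units=1):
    """Cycle on which the last of n instructions leaves a k-stage pipeline.

    Simplified model: only the execute stage can cause waiting.
    """
    unit_free = [1] * execute_units  # first cycle each execute unit is available
    finish = 0
    for i in range(n):
        # Earliest cycle instruction i could reach the execute stage.
        ready = i + execute_stage
        unit = i % execute_units     # alternate between the execute units
        start = max(ready, unit_free[unit])
        end = start + execute_cycles - 1
        unit_free[unit] = end + 1
        # The stages after execute take one cycle each.
        finish = end + (k - execute_stage)
    return finish


k = 6
for n in (1, 2, 3, 10):
    single = total_cycles(n, k)                       # one execute unit
    superscalar = total_cycles(n, k, execute_units=2)  # two execute units
    print(n, single, k + 2 * n - 1, superscalar, k + n)
```

Each printed row shows the simulated count next to the formula, and the pairs agree: with one execute unit every instruction after the first waits one extra cycle, while with two units the wait disappears.
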
The concept of parallel computation dates back to 1958, when IBM researchers first discussed the use of parallelism in numerical calculations. The difference between multi-core and superscalar processors is that a superscalar architecture has multiple fetch, decode and execute blocks within the same processor, whereas a multi-core processor is a single computing component with two or more independent processors (called cores). These multiple cores can run multiple programs at the same time, increasing the overall speed of the computer. This is the main reason for the shift from single-core to multi-core microprocessors, which was discussed in the previous article.
Written by:
- Ahmed Ahsan Khan.
