Wednesday, December 14

CHAPTER 10: PARALLEL PROCESSING

PIPELINED?

Instruction pipelining is a technique that implements a form of parallelism called instruction-level parallelism within a single processor.

It therefore allows faster CPU throughput (the number of instructions that can be executed in a unit of time) than would otherwise be possible at a given clock rate.

PIPELINING LESSONS


  • Pipelining doesn’t help the latency of a single task; it helps the throughput of the entire workload.
  • Multiple tasks operate simultaneously using different resources.
  • Potential speedup = number of stages
  • Time to “fill” the pipeline and time to “drain” it reduce the speedup.


PIPELINE SPEEDUP

1.      If all stages are balanced
Time between instructions (pipelined) = Time between instructions (non-pipelined) / Number of stages
2.      If the stages are not balanced, the speedup is less.
3.      The speedup is due to increased throughput; the latency of each instruction does not decrease.
4.      Ideally, a five-stage pipeline should offer a nearly fivefold improvement over the 800 ps non-pipelined time.
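
To see how filling and draining eat into the ideal speedup, here is a minimal sketch. It assumes five perfectly balanced stages of 160 ps each (800 ps / 5), which is a made-up split for illustration; the function names are hypothetical too.

```python
def pipelined_time(n_instructions, n_stages, cycle_ps):
    """Total time when a new instruction enters the pipeline every cycle."""
    # The first instruction needs n_stages cycles to get through (the fill);
    # every later instruction finishes one cycle after the one before it.
    return (n_stages + n_instructions - 1) * cycle_ps


def non_pipelined_time(n_instructions, instruction_ps):
    """Total time when each instruction must finish before the next starts."""
    return n_instructions * instruction_ps


def speedup(n_instructions, n_stages=5, instruction_ps=800):
    # Assumption: balanced stages, so each stage takes 800 / 5 = 160 ps.
    cycle_ps = instruction_ps / n_stages
    slow = non_pipelined_time(n_instructions, instruction_ps)
    fast = pipelined_time(n_instructions, n_stages, cycle_ps)
    return slow / fast


for n in (1, 5, 100, 1_000_000):
    print(n, round(speedup(n), 2))
```

With a single instruction there is no speedup at all, and the ratio only creeps towards 5 as the number of instructions grows, which matches the first and last lessons in the list above.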

SINGLE CYCLE STAGES

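Five stages, one step per stage:

1. IF: instruction fetch from memory
2. ID: instruction decode and register read
3. EX: execute the operation or calculate an address
4. MEM: access a memory operand
5. WB: write the result back to a register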

PIPELINE REGISTER


DATAPATH WALKTHROUGHS


e.g:
    slti $t2, $t1, 17  # if (t1 < 17) t2 = 1 else t2 = 0

STAGE 1: fetch this instruction, increment PC
STAGE 2: decode to determine it is an slti, then read register $t1
STAGE 3: compare value retrieved in Stage 2 with the integer 17
STAGE 4: idle
STAGE 5: write the result of Stage 3 (1 if the source register was less than the signed immediate, 0 otherwise) into register $t2
      
e.g:
    sw $t3, 17($t1)  # Mem[t1 + 17] = t3

STAGE 1: fetch this instruction, increment PC
STAGE 2: decode to determine it is a sw, then read registers $t1 and $t3
STAGE 3: add 17 to the value in register $t1 (retrieved in Stage 2) to compute the address
STAGE 4: write the value in register $t3 (retrieved in Stage 2) into the memory address computed in Stage 3
STAGE 5: idle (nothing to write into a register)
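
The two walkthroughs can be mimicked with a toy model. This is only a sketch: the register contents, the `regs` and `memory` dictionaries, and the function names are invented for illustration, and Stage 1 (fetch) is not modelled at all.

```python
# Toy model of the five stages for the two instructions walked through above.
# Register values are made up; $t1 = 3 is chosen so that 3 + 17 = 20.
regs = {"$t1": 3, "$t2": 99, "$t3": 42}
memory = {}


def run_slti(rt, rs, imm):
    # Stage 1 (fetch) is skipped: the instruction arrives as arguments.
    value = regs[rs]                     # Stage 2: decode, read source register
    result = 1 if value < imm else 0     # Stage 3: ALU compares with immediate
    # Stage 4: idle, slti does not touch data memory
    regs[rt] = result                    # Stage 5: write result into register


def run_sw(rt, offset, rs):
    base, value = regs[rs], regs[rt]     # Stage 2: decode, read both registers
    address = base + offset              # Stage 3: ALU computes the address
    memory[address] = value              # Stage 4: write the value to memory
    # Stage 5: idle, nothing to write into a register


run_slti("$t2", "$t1", 17)
run_sw("$t3", 17, "$t1")
print(regs["$t2"], memory)
```

Each instruction skips exactly one stage here, just like in the walkthroughs: slti has nothing to do in Stage 4 and sw has nothing to do in Stage 5.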

WHY FIVE STAGES?

Could we have a different number of stages?
= Yes

So why does MIPS have five if instructions tend to idle for at least one stage?
= The five stages are the union of all the operations needed by all the instructions.


SINGLE CYCLE PERFORMANCE

Assume time for stages is
  • 100ps for register read or write
  • 200ps for other stages

Compare pipelined datapath with single-cycle datapath
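
The comparison can be worked out from the stage times above. In this sketch the stage names IF, ID, EX, MEM and WB are the usual abbreviations for fetch, decode/register read, ALU, memory access and register write, and the table of which instruction class uses which stages is my assumption, consistent with the walkthroughs earlier in this post.

```python
# Stage times from the assumptions above: 100 ps for register read/write,
# 200 ps for the other stages.
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# Assumed stage usage per instruction class (not taken from a datasheet).
USES = {
    "lw": ["IF", "ID", "EX", "MEM", "WB"],
    "sw": ["IF", "ID", "EX", "MEM"],
    "R-format": ["IF", "ID", "EX", "WB"],
    "beq": ["IF", "ID", "EX"],
}

# Time each instruction class actually needs.
instr_time = {
    name: sum(STAGE_PS[stage] for stage in stages)
    for name, stages in USES.items()
}

# Single cycle: the clock must be long enough for the slowest instruction.
single_cycle_clock = max(instr_time.values())

# Pipelined: the clock must be long enough for the slowest stage.
pipelined_clock = max(STAGE_PS.values())

print(instr_time)
print(single_cycle_clock, pipelined_clock, single_cycle_clock / pipelined_clock)
```

Under these assumptions the single-cycle clock comes out at 800 ps and the pipelined clock at 200 ps, so the steady-state speedup is 4 rather than 5, because the stages are not balanced.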



PIPELINED HAZARD

A pipeline hazard refers to a situation in which a correct program stops working correctly because the processor is implemented with a pipeline.
There are 3 fundamental types of hazard:

1.      Data Hazard
Arises when an instruction depends on the result of a previous instruction in a way that is exposed by the overlapping of instructions in the pipeline.

2.      Structural Hazard
Arises from resource conflicts when the hardware can’t support all possible combinations of overlapping instructions.

3.      Control Hazard
Arises from the pipelining of branches and other instructions that change the PC (Program Counter).
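
To make the data hazard concrete, here is a rough sketch that scans a short program for read-after-write dependences between nearby instructions. The `Instr` type, the `raw_hazards` function and the window of 2 instructions are all assumptions for illustration (the window roughly models a five-stage pipeline with no forwarding); real hardware detects hazards differently.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    text: str                  # the assembly text, for display only
    dest: Optional[str]        # register written, or None (e.g. sw)
    sources: Tuple[str, ...]   # registers read


def raw_hazards(program, window=2):
    """Return (producer index, consumer index, register) triples where the
    consumer reads a register that a nearby earlier instruction writes."""
    found = []
    for i, producer in enumerate(program):
        if producer.dest is None:
            continue
        for j in range(i + 1, min(i + 1 + window, len(program))):
            for reg in program[j].sources:
                if reg == producer.dest:
                    found.append((i, j, reg))
            if program[j].dest == producer.dest:
                break  # overwritten: later readers depend on instruction j
    return found


program = [
    Instr("add $t0, $t1, $t2", "$t0", ("$t1", "$t2")),
    Instr("sub $t3, $t0, $t4", "$t3", ("$t0", "$t4")),
    Instr("sw  $t3, 17($t1)", None, ("$t3", "$t1")),
]

for i, j, reg in raw_hazards(program):
    print(f"{program[j].text!r} needs {reg} from {program[i].text!r}")
```

Here sub needs $t0 before add has written it back, and sw needs $t3 before sub has written it back, so both pairs are exposed by the overlap.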

Tuesday, December 13

CHAPTER 7: THE PROCESSOR

SINGLE CYCLE CPU DATAPATH

That sure sounds complicated. So let's try making it sound as simple as we possibly can.

Before that, you need to know... what is a combinational circuit??



A combinational circuit is one whose outputs depend only on its current inputs. It is built from combinational elements such as logic gates.

Introducing the Combinational Logic - half adder & full adder.

Half Adder
- Two inputs and two outputs (carry and sum)


One output will represent sum and the other will represent carry.

Full Adder
- Three inputs and two outputs (carry and sum)
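
Both adders can be sketched with bitwise operators. The function names are mine; the gate-level construction (XOR for the sum, AND for the carry, and a full adder made from two half adders plus an OR) is the standard textbook one.

```python
def half_adder(a, b):
    """Add two 1-bit inputs. Returns (sum, carry)."""
    return a ^ b, a & b   # sum is XOR, carry is AND


def full_adder(a, b, carry_in):
    """Add three 1-bit inputs. Returns (sum, carry_out)."""
    partial_sum, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial_sum, carry_in)
    return total, carry1 | carry2   # a carry from either half adder


# Print the full truth table of the full adder.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, full_adder(a, b, c))
```

Notice there is no stored state anywhere: the same inputs always give the same outputs, which is exactly what makes these circuits combinational.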



Now, something not to be confused with the above - the sequential circuit.

What is a sequential circuit??


A sequential logic circuit is a combinational logic circuit with the addition of state (memory).

In sequential logic, we have 2 types - synchronous and asynchronous logic. I'm gonna be focusing on the synchronous logic circuit since most CPUs are that type. In synchronous logic, a sequence of repetitive pulses known as the clock signal is distributed to all the memory elements in the circuit. There is a technique known as pipelining, whose purpose is to improve throughput; splitting the work into shorter stages also allows a higher clock rate.

Now, you need to know CPU clocking. Keep in mind that

single cycle = one instruction per clock cycle

The clock cycle is typically long to ensure that every instruction can complete all its stages without interruption. So, every instruction takes the same amount of time (one cycle) to execute, regardless of its complexity. This is a big disadvantage of the single-cycle CPU: it operates at the speed of the slowest instruction, since even that one must complete its execution in one clock tick.

A single-cycle CPU executes each instruction in the following steps:

Step 1: Instruction fetch (fetched from memory)
Step 2: Decode/ Register read (read the opcode to determine instruction types and field lengths)
Step 3: Arithmetic-Logic Unit (the real work; arithmetic, shifting, logic and comparisons)
Step 4: Memory Access (only for the load and store instructions as the others remain idle)
Step 5: Register write (write the result of some computation into a register)


A multi-cycle CPU, on the other hand, requires multiple cycles to execute a single instruction. The number of cycles needed to execute a certain instruction is flexible, depending on the complexity of the instruction.

In the CPU, there are 2 other parts, which are the Control Unit and the Datapath.

Control Unit
It does not execute any program instructions itself; instead, it tells the datapath what needs to be done. It also directs the ALU and memory.

Datapath
It performs the operations required by the program instructions.


To summarize, a processor has 4 main functions, which are fetch, decode, execute and write back. And the basic elements of a processor are:

1. Arithmetic Logic Unit (ALU)
- which is the core to arithmetic and logic operations

2. Floating Point Unit (FPU)
- which specializes in floating-point arithmetic, doing it quicker than basic microprocessor circuitry can

3. L1 and L2 cache memory
- saves time compared to retrieving data from the RAM

Briefly, the processor is most commonly referred to as the Central Processing Unit (CPU) but

DO YOU KNOW?

The CPU isn't the only processor in a computer. There is also the Graphics Processing Unit (GPU) and other hardware performing some processing independently within a computer!

The main competitors in the current processor market are Intel and AMD.

Pheww! Hope you find this post to be somewhat helpful. If there seems to be any mistake, don't hesitate to hit me up! Til then, ALL THE BEST!!


Glossary:

state - In computer science, the state of a digital logic circuit is a term used for all stored information.
clock rate - The frequency at which the processor's clock ticks, measured in clock cycles per second, or hertz (Hz) in SI units. It is an indicator of the processor's speed.
clock cycle - One tick of the processor's clock; its length is the inverse of the clock rate.
throughput - Number of instructions completed per second.
CPI - Cycles per instruction



Thursday, December 8

CHAPTER 5: MIPS SIMULATOR

As we know, MIPS is a processor architecture with its own assembly language. But how do we run programs written in this language?

Just as a pen needs a piece of paper to write on, MIPS code needs a platform to run on. Without MIPS hardware at hand, that platform is a piece of software: a simulator.

5.1 SPIM
   
Spim is a self-contained simulator that runs MIPS32 programs. It reads and executes assembly language (MIPS) programs written for this processor and provides a simple debugger and a minimal set of operating system services.

There are a few versions of Spim, and the one that we will learn is QtSpim.

List of Spim
1. spimsal (older version of spim)
2. Spim
3. xSpim
4. PCSpim
5. QtSpim

5.2 QtSpim

QtSpim is the latest version of Spim and is widely used nowadays. It is simpler and easier to use than the previous versions. Let's watch a video on how to use the QtSpim simulator!

Hopefully you guys find this information useful. That's all for me, thank you.
