presentazione del corso -...
TRANSCRIPT
![Page 1: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/1.jpg)
Maurizio Palesi 1
PipeliningPipelining**
Maurizio Palesi
* Adapted from David A. Patterson’s CS252 lecture slides,
http://www.cs.berkeley/~pattrsn/252S98/index.html
Copyright 1998 UCB
![Page 2: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/2.jpg)
Maurizio Palesi 2
ReferencesReferencesJohn L. Hennessy and David A. Patterson,
Computer Architecture a Quantitative Approach, second edition, Morgan KaufmannChapter 3
![Page 3: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/3.jpg)
Maurizio Palesi 3
Basic Steps of Execution in DLXBasic Steps of Execution in DLX
1. Instruction fetch step (IF) 2. Instruction decode/register fetch step (ID) 3. Execution/effective address step (EX) 4. Memory access/branch completion step (MEM) 5. Register write-back step (WB)
![Page 4: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/4.jpg)
Maurizio Palesi 4
DLX DatapathDLX Datapath
![Page 5: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/5.jpg)
Maurizio Palesi 5
DLX Instruction FormatDLX Instruction Format
Op
0 5
Rs1 Rd Immediate
31161511106
Op
0 5
Target
316
Op
0 5
Rs1 Rs2 Opx
31161511106
Rd
Register-Register – Register ALU operations
26272120
Register-Immediate – Load/Store, ALU immed, Jump
Op
0 5
Rs1 Rs2 Immediate
31161511106
Branch
Jump / Call
R-Type
I-Type
J-Type
![Page 6: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/6.jpg)
Maurizio Palesi 6
Basic Steps of ExecutionBasic Steps of Execution1. Instruction fetch (IF) IR ← Mem[PC], NPC ← PC + 4
![Page 7: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/7.jpg)
Maurizio Palesi 7
Basic Steps of ExecutionBasic Steps of Execution2. Instruction decode/register fetch (ID) A ← Regs[IR(6:10)], B ← Regs[IR(11:15)]
Imm ← (IR(16)16#IR(16:31))
![Page 8: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/8.jpg)
Maurizio Palesi 8
Basic Steps of ExecutionBasic Steps of Execution3. Execution/effective adddress (EX) ALU output ← A + Imm
ALU output ← A func BALU output ← A op ImmALU output ← NPC + Imm; Cond ← A op 0
![Page 9: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/9.jpg)
Maurizio Palesi 9
Basic Steps of ExecutionBasic Steps of Execution4. Memory access/Branch completion (MEM)
- PC ← NPC, LMD ← Mem[ALU output] or Mem[ALU output] ← B- if (cond) PC ← ALU ouput
![Page 10: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/10.jpg)
Maurizio Palesi 10
Basic Steps of ExecutionBasic Steps of Execution5. Register write-back (WB) Regs[IR(16:20)] ← ALU output
Regs[IR(11:15)] ← ALU outputRegs[IR(11:15)] ← LMD
![Page 11: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/11.jpg)
Maurizio Palesi 11
DiscussionDiscussion Branch & Store instructions: 4 cycles All other instructions: 5 cycles Example
Branch 12%, Store 5%CPI = (12% + 5%)*4 + 83%*5 = 4.83
ImprovementComplete ALU instruction during MEMEx., 47% ALU instructions
CPI = (12% + 5% + 47%)*4 + 36%*5 = 4.36Speedup = 4.83/4.36 = 1.1
Any other attempts to decrease CPI may increase the clock cycle time
![Page 12: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/12.jpg)
Maurizio Palesi 12
Multicycle ImplementationMulticycle Implementation A simple FSM could be used to implement the control logic
Microcode control could be used for a much more complex machine
Hardware redoundancies Two ALU
Separate instruction and data memories
![Page 13: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/13.jpg)
Maurizio Palesi 13
Pipelining: Its Natural!Pipelining: Its Natural!Laundry ExampleAnn, Brian, Cathy, Dave
each have one load of clothes to wash, dry, and fold
Washer takes 30 minutesDryer takes 40 minutes“Folder” takes 20 minutes
A B C D
![Page 14: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/14.jpg)
Maurizio Palesi 14
Sequential LaundrySequential Laundry
Sequential laundry takes 6 hours for 4 loads If they learned pipelining, how long would laundry take?
A
B
C
D
30 40 2030 40 2030 40 2030 40 20
6 PM 7 8 9 10 11 Midnight
Task
Order
Time
![Page 15: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/15.jpg)
Maurizio Palesi 15
Pipelined Laundry Start work ASAPPipelined Laundry Start work ASAP
Pipelined laundry takes 3.5 hours for 4 loads
A
B
C
D
6 PM 7 8 9 10 11 Midnight
Task
Order
Time
30 40 40 40 40 20
![Page 16: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/16.jpg)
Maurizio Palesi 16
Pipelining LessonsPipelining Lessons Pipelining doesn’t help
latency of single task, it helps throughput of entire workload
Pipeline rate limited by slowest pipeline stage
Multiple tasks operating simultaneously
Potential speedup = Number pipe stages
Unbalanced lengths of pipe stages reduces speedup
Time to “fill” pipeline and time to “drain” it reduces speedup
A
B
C
D
6 PM 7 8 9
Task
Order
Time
30 40 40 40 40 20
![Page 17: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/17.jpg)
Maurizio Palesi 17
PipeliningPipelining Technique for having multiple instructions execute
in an overlapped manner Assembly Line Throughput
Determined by how often an instruction exits the pipeline
Time to move one step is machine cycle timeDetermined by the slowest step in the pipelineOften it is clock cycle though clocks may have multiple
phases Length of Pipe - No of stages
Determine latency
![Page 18: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/18.jpg)
Maurizio Palesi 18
Pipeline PerformancePipeline PerformanceUnder ideal conditions
Time per instruction = Time per instruction on non-pipelined machine/Number of pipe stages
Speedup = Number of stages
But…Stages are not balancedOverhead (10%)
![Page 19: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/19.jpg)
Maurizio Palesi 19
Basic Performance IssuesBasic Performance IssuesPipeline increases instruction throughput
It does not decrease the time of execution of any single instructionIt may increase it!
Stage imbalance may yield further inefficiencies
Overheads due toRegister and latches adding to delays and
clock skew
![Page 20: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/20.jpg)
Maurizio Palesi 20
Pipeline Example: 3 stagesPipeline Example: 3 stages
10 16 14 2 10 16 14 2
16 210 2 14 2
16 210 2 14 2
18 ns 18 ns 18 ns 18 ns
Assume 2 nsec latch delay
Unpipelined (mono cycle)
Pipelined
Latency = 42 nsec, Throughput = 1/42
Latency = (3 x 18) = 54 nsec, Throughput = 1/18 (2.3x, not 3x)
![Page 21: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/21.jpg)
Maurizio Palesi 21
CPU TimeCPU TimeCPU Time = # Instructions × CPI × TCK
Strategy # instructions CPI Tck
Single long clock cycle = 1 big
Multiple cycles = many small
Pipelining = 1 (ideal) small
![Page 22: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/22.jpg)
Maurizio Palesi 22
Basic Pipeline DLXBasic Pipeline DLXClock cycle
Instruction number 1 2 3 4 5 6 7 8
Instruction i IF ID EX MEM WB
Instruction i+1 IF ID EX MEM WB
Instruction i+2 IF ID EX MEM WB
Instruction i+3 IF ID EX MEM WB
Instruction i+4 IF ID EX MEM
![Page 23: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/23.jpg)
Maurizio Palesi 23
Making Pipeline WorkMaking Pipeline WorkDetermine what happens at each clock tickAssure no resource conflicts
Can not use one ALU for address calculation and ALU functions
All operations in a pipe stage must complete in one clock cycleMay have to elongate the clock cycle to
accommodate this
![Page 24: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/24.jpg)
Maurizio Palesi 24
DLX Pipeline: DatapathDLX Pipeline: Datapath
![Page 25: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/25.jpg)
Maurizio Palesi 25
Pipeline StructurePipeline Structure Major functional units are used in different cycles Three observations
Basic datapath uses separate instruction and data memoryHave separate instruction and data cachesMemory bandwidth increases in a pipelined system
Register file is used in two stages - ID and WBTo start a new instruction every clock we must
increment and store the PC every clock cycle during IF stageWhat happens when branches occur
![Page 26: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/26.jpg)
Maurizio Palesi 26
Pipeline StructurePipeline Structure Every pipe stage is active in every cycle
Values passed from one stage to next must be placed in registersUse pipeline registers or pipeline latches between stages
Registers used to transfer information from one stage to the next
Pipeline registers carry both control and data from stage to stage
Values may have to be copied from one to the next stage if it is needed at a later stage
An instruction is active in exactly one stage of the pipe at any moment
![Page 27: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/27.jpg)
Maurizio Palesi 27
DLX Datapath w/o PipeliningDLX Datapath w/o Pipelining
![Page 28: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/28.jpg)
Maurizio Palesi 28
DLX Pipeline StagesDLX Pipeline Stages
![Page 29: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/29.jpg)
Maurizio Palesi 29
Events in DLX PipeEvents in DLX Pipe IF Stage
IF/ID.IR ← Mem[PC] IF/ID.NPC, PC ← if (EX/MEM.Opcode == Branch && EX/MEM.cond) ← EX/MEM.ALU output
else ← PC+4
ID Stage ID/EX.A ← Regs[IF/ID.IR(6:10)]; ID/EX.B ← Regs[IF/ID.IR(11:15)]; ID/EX.NPC ← IF/ID.NPC; ID/EX.IR ← IF/ID.IR; ID/EX.Imm ← (IF/ID.IR(16))16#IF/ID.IR(16:31)
![Page 30: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/30.jpg)
Maurizio Palesi 30
Events in DLX PipeEvents in DLX Pipe
![Page 31: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/31.jpg)
Maurizio Palesi 31
Events in DLX PipeEvents in DLX Pipe
![Page 32: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/32.jpg)
Maurizio Palesi 32
Events in DLX PipeEvents in DLX Pipe
![Page 33: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/33.jpg)
Maurizio Palesi 33
Pipeline Hazards: Major HurdlesPipeline Hazards: Major Hurdles
Structural HazardsResource conflicts
Data HazardsData dependencies
Control HazardsBranches and other instructions that change PC
![Page 34: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/34.jpg)
Maurizio Palesi 34
Pipeline Hazards: Major HurdlesPipeline Hazards: Major Hurdles
Structural HazardsResource conflicts
Data HazardsData dependencies
Control HazardsBranches and other instructions that change PC
![Page 35: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/35.jpg)
Maurizio Palesi 35
Structural HazardStructural HazardSome combination of instructions result in
resource conflictsSome functional units are not fully pipelinedSome resource is not duplicated enough
Register file write portSingle memory pipeline for instruction and data
Stall pipeline for one cycleAlso called a Bubble
![Page 36: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/36.jpg)
Maurizio Palesi 36
Structural HazardStructural Hazard
![Page 37: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/37.jpg)
Maurizio Palesi 37
Pipeline StallPipeline Stall
Clock cycleInstruction number 1 2 3 4 5 6 7 8 9
Load instruction IF ID EX MEM WB
Instruction i+1 IF ID EX MEM WB
Instruction i+2 IF ID EX MEM WB
Instruction i+3 Stall IF ID EX MEM WB
Instruction i+4 IF ID EX MEM
Single memory for data and instructions
![Page 38: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/38.jpg)
Maurizio Palesi 38
Structural Hazard and BubblesStructural Hazard and Bubbles
![Page 39: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/39.jpg)
Maurizio Palesi 39
Solutions to Structural HazardSolutions to Structural Hazard
Resource DuplicationExample
Separate I and D caches for memory access conflictTime-multiplexed or multiport register file for register
file access conflict
![Page 40: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/40.jpg)
Maurizio Palesi 40
Pipeline Hazards: Major HurdlesPipeline Hazards: Major Hurdles
Structural HazardsResource conflicts
Data HazardsData dependencies
Control HazardsBranches and other instructions that change PC
![Page 41: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/41.jpg)
Maurizio Palesi 41
Data HazardsData HazardsOrder of access to operands is changed by
the pipeline vs the normal orderADD R1, R2, R3SUB R4, R1, R5
Clock cycleInstruction number 1 2 3 4 5 6
ADD R1, R2, R3 IF ID EX MEM WB
SUB R4, R1, R5 IF ID EX MEM WB
![Page 42: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/42.jpg)
Maurizio Palesi 42
Types of Data HazardTypes of Data Hazard RAW (read after write)
j tries to read a source before i writes it, so j may get wrong value
WAR (write after read)j tries to write a destination before it is read by iAs reads are early this can not happen in our examplesAutoincrement addressing can create it
WAW (write after write)j tries to write an operand before it is written by i. Wrong
value may remain in the registerOccurs in pipes which write in more than one stage
![Page 43: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/43.jpg)
Maurizio Palesi 43
Data HazardData Hazard WAR
SW 0(R1),R2 IF ID EX MEM1 MEM2 WBADD R2,R3,R4 IF ID EX WB
WAWLW R1,0(R2) IF ID EX MEM1 MEM2 WBADD R1,R2,R3 IF ID EX WB
![Page 44: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/44.jpg)
Maurizio Palesi 44
Solutions to Data HazardSolutions to Data Hazard(Internal) Forwarding
Extra hardware
Freezing the pipelineStalls/bubbles; pipeline interlock
Compiler (instruction) schedulingDelay slot
![Page 45: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/45.jpg)
Maurizio Palesi 45
Data HazardData Hazard
ADD R1, R2, R3
SUB R4, R5, R1
AND R6, R1, R7
OR R8, R1, R9
XOR R10, R1, R11
![Page 46: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/46.jpg)
Maurizio Palesi 46
Data HazardData Hazard
![Page 47: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/47.jpg)
Maurizio Palesi 47
Forwarding PathsForwarding Paths
![Page 48: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/48.jpg)
Maurizio Palesi 48
Another ExampleAnother Example ADD R1, R2, R3LW R4, 0(R1)SW 12(R1), R4
![Page 49: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/49.jpg)
Maurizio Palesi 49
ForwardingForwarding
![Page 50: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/50.jpg)
Maurizio Palesi 50
Data Hazards Requiring StallsData Hazards Requiring Stalls
Situations were forwarding is not possibleLoad interlock
LW R1, 0(R2)SUB R4, R1, R5AND R6, R1, R7OR R8, R1, R9
Pipeline interlock hardwareDetects HazardStalls the pipe until hazard is cleared
![Page 51: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/51.jpg)
Maurizio Palesi 51
Load cannot Bypass Results to SUBLoad cannot Bypass Results to SUB
![Page 52: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/52.jpg)
Maurizio Palesi 52
Pipeline Interlock SolutionPipeline Interlock Solution
![Page 53: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/53.jpg)
Maurizio Palesi 53
Load InterlocksLoad Interlocks
Matching operand fieldsID/EX.IR(0:5) IF/ID.IR(0:5)
Load Reg-reg ALU ID/EX.IR(11:15)==IF/ID.IR(6:10)Load Reg-reg ALU ID/EX.IR(11:15)==IF/ID.IR(11:15)
Load ID/EX.IR(11:15)==IF/ID.IR(6:10)
Opcode field of ID/EX
Opcode field of IF/ID
Load, Store, ALU imm, Branch
![Page 54: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/54.jpg)
Maurizio Palesi 54
Pipeline Hazards: Major HurdlesPipeline Hazards: Major Hurdles
Structural HazardsResource conflicts
Data HazardsData dependencies
Control HazardsBranches and other instructions that change PC
![Page 55: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/55.jpg)
Maurizio Palesi 55
Control HazardsControl Hazards Can cause a greater performance loss than do data hazards
Recall that if instruction i is a taken branchPC is not changed until the end of MEM
![Page 56: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/56.jpg)
Maurizio Palesi 56
Stalling the PipelineStalling the Pipeline The simplest method to deal with branches is stall the
pipeline
We do not want to stall the pipeline until we know that the instruction is a branchThe stall does not occur until after the ID stage
3 cycles penalty
![Page 57: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/57.jpg)
Maurizio Palesi 57
Reducing the Branch PenaltyReducing the Branch PenaltyThe number of clock cycles in a branch stall
can be reduced in two stepsFind out whether the branch is taken or not
taken earlier in the pipelineCompute the taken PC (i.e., the address of the
branch target) earlier
![Page 58: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/58.jpg)
Maurizio Palesi 58
Reducing the Branch PenaltyReducing the Branch Penalty
![Page 59: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/59.jpg)
Maurizio Palesi 59
Reducing the Branch PenaltyReducing the Branch Penalty
![Page 60: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/60.jpg)
Maurizio Palesi 60
Reducing the Branch PenaltyReducing the Branch PenaltyWith a separate adder and a branch decision
made during ID, there is only 1 clock cycle stall on branches
1 cycle penalty
![Page 61: Presentazione del Corso - utenti.dieei.unict.itutenti.dieei.unict.it/users/smonteleone/corsi/aa2017-18/ce/slides/... · Maurizio Palesi 25 Pipeline Structure Major functional units](https://reader031.vdocumenti.com/reader031/viewer/2022022722/5c6800ba09d3f22d638cd10c/html5/thumbnails/61.jpg)
Maurizio Palesi 61
Reducing the Branch PenaltyReducing the Branch Penalty Predict-not-taken scheme Treat every branch as not taken
Allowing the hw to continue as if the branch were not executed If the branch is taken, it needs to turn the fetched instruction into a nop
Branch taken
Branch not taken