SlideShare a Scribd company logo
1 of 20
VLIW PROCESSORS
Department of E &TC, MITCOE, Pu
Introduction
o Very long instruction word or VLIW refers to a
processor architecture designed to take advantage of
instruction level parallelism
o Instruction of a VLIW processor consists of multiple
independent operations grouped together.
o There are Multiple Independent Functional Units in
VLIW processor architecture.
o Each operation in the instruction is aligned to a
functional unit.
o All functional units share the use of a common large
register file.
o This type of processor architecture is intended to allow
higher performance without the inherent complexity of
Different Approaches
Other approaches to improving performance in processor architectures
:
o Pipelining
Breaking up instructions into sub-steps so that instructions can be
executed partially at the same time
o Superscalar architectures
Dispatching individual instructions to be executed completely
independently in different parts of the processor
o Out-of-order execution
Executing instructions in an order different from the program
Instruction Level Parallelism (ILP )
o Instruction-level parallelism (ILP) is a measure of how
many of the operations in a computer program can be
performed simultaneously.
o The overlap among instructions is called instruction level
parallelism.
o Ordinary programs are typically written under a sequential
execution model where instructions execute one after the
other and in the order specified by the programmer.
o Goal of compiler and processor designers implementing ILP
is to identify and take advantage of as much ILP as possible.
What is ILP? (Example)
Consider the following program:
op 1 e = a + b
op2 f = c + d
op3 m = e * f
o Operation 3 depends on the results of operations 1 and 2, so
it cannot be calculated until both of them are completed
o However, operations 1 and 2 do not depend on any other
operation, so they can be calculated simultaneously
o If we assume that each operation can be completed in one
unit of time then these three instructions can be completed
in a total of two units of time
o giving an ILP of 3/2.
VLIW Compiler
o Compiler is responsible for static scheduling of instructions in
VLIW processor.
o Compiler finds out which operations can be executed in parallel
in the program.
o It groups together these operations in single instruction which is
the very large instruction word.
o Compiler ensures that an operation is not issued before its
operands are ready.
VLIW Instruction
o One VLIW instruction word encodes multiple operations which
allows them to be initiated in a single clock cycle.
o The operands and the operation to be performed by the various
functional units are specified in the instruction itself.
o One instruction encodes at least one operation for each execution
unit of the device.
o So length of the instruction increases with the number of execution
units
o To accommodate these operation fields, VLIW instructions are
usually at least 64 bits wide, and on some architectures are much
wider up to 1024 bits.
VLIW Instruction
Add r1,r2,r3; Sub r4,r5,r6; Ld r7,data; St r8,data
REGESTER FILES
ALU ALU
LOAD
/STORE
LOAD
/STORE
ILP in VLIW
o Consider the computation of y = a1x1 + a2x2 + a3x3
On a sequential processor On the VLIW processor with
2 load/store units, 1 multiply unit
and 1 add unit
cycle 1: load a1
cycle 2: load x1
cycle 3: load a2
cycle 4: load x2
cycle 5: multiply z1 a1 x1
cycle 6: multiply z2 a2 x2
cycle 7: add y z1 z2
cycle 8: load a3
cycle 9: load x3
cycle 10: multiply z1 a3 x3
cycle 11: add y y z2
cycle 1: load a1
load x1
cycle 2: load a2
load x2
Multiply z1 a1 x1
cycle 3: load a3
load x3
Multiply z2 a2 x2
cycle 4: multiply z3 a3 x3
add y z1 z2
cycle 5: add y y z3
requires 11 cycles. requires 5 cycles.
Block Diagram
Diagram (Conceptual Instruction
Execution)
Working
o Long instruction words are fetched from the memory
o A common multi-ported register file for fetching the operands
and storing the results.
o Parallel random access to the register file is possible through
the read/write cross bar.
o Execution in the functional units is carried out concurrently
with the load/store operation of data between RAM and the
register file.
o One or multiple register files for FX and FP data.
o Rely on compiler to find parallelism and schedule dependency
free program code.
Difference Between VLIW &
Superscalar Architecture
VLIW vs. Superscalar
Architecture
o Instruction formulation
 Superscalar:
⁻ Receive conventional instructions conceived for sequential
processors.
 VLIW:
⁻ Receive long instruction words, each comprising a field (or
opcode) for each execution unit.
⁻ Instruction word length depends number of execution units and
code length to control each unit (such as opcode length,
registers).
⁻ Typical word length is 64 – 1024 bits, much longer than
conventional machine word length.
VLIW vs. Superscalar
Architecture
o Instruction scheduling
 Superscalar:
⁻ Done dynamically at run-time by the hardware.
⁻ Data dependency is checked and resolved in
hardware.
⁻ Need a look ahead hardware window for instruction
fetch.
 VLIW:
⁻ Done statically at compile time by compiler.
⁻ Data dependency is checked by compiler.
⁻ In case of un-filled opcodes in a VLIW, memory
space and instruction bandwidth are wasted.
Comparison: CISC, RISC,
VLIW
ARCHITECTURE
CHARACTERIST
C
CISC RISC VLIW
Instruction Size Varies One size, usually 32 bits One size
Instruction
Semantics
Varies from simple to
complex; possibly
many
dependent operations
per instruction
Almost always one
simple
operation
Many simple,
independent
operations
Registers Few, sometimes
special
Many, general-purpose Many, general-
purpose
Hardware Design Exploit microcode
implementations
Exploit implementations
with one pipeline and &
no microcode
Exploit
implementations with
multiple pipelines, no
microcode &
no complex dispatch
logic
Advantages of VLIW
o Dependencies are determined by compiler and used to
schedule according to function unit latencies .
o Function units are assigned by compiler and correspond
to the position within the instruction packet.
o Reduces hardware complexity.
• Tasks such as decoding, data dependency detection,
instruction issues etc. becoming simple.
• Ensures potentially higher Clock Rate.
• Ensures Low power consumption
Disadvantages of VLIW
o Higher complexity of the compiler
o Compatibility across implementations : Compiler optimization
needs to consider technology dependent parameters such as
latencies and load-use time of cache.
o Unscheduled events (e.g. cache miss) stall entire processor .
o Code density: In case of un-filled opcodes in a VLIW, memory
space and instruction bandwidth are wasted i.e. low slot utilization.
o Code expansion: Causes high power consumption
Applications
o VLIW architecture is suitable for Digital Signal Processing
applications.
o Processing of media data like compression/decompression
of Image and speech data.
Examples of VLIW processor
o VLIW Mini supercomputers:
Multiflow TRACE 7/300, 14/300, 28/300
Multiflow TRACE /500
Cydrome Cydra 5
IBM Yorktown VLIW Computer
o Single-Chip VLIW Processors:
Intel iWarp, Philip’s LIFE Chips
o Single-Chip VLIW Media (through-put) Processors:
Trimedia, Chromatic, Micro-Unity
o DSP Processors (TI TMS320C6x )

More Related Content

What's hot

Computer architecture pipelining
Computer architecture pipeliningComputer architecture pipelining
Computer architecture pipeliningMazin Alwaaly
 
multiprocessors and multicomputers
 multiprocessors and multicomputers multiprocessors and multicomputers
multiprocessors and multicomputersPankaj Kumar Jain
 
Embedded firmware
Embedded firmwareEmbedded firmware
Embedded firmwareJoel P
 
SOC Processors Used in SOC
SOC Processors Used in SOCSOC Processors Used in SOC
SOC Processors Used in SOCA B Shinde
 
Unit 5 Advanced Computer Architecture
Unit 5 Advanced Computer ArchitectureUnit 5 Advanced Computer Architecture
Unit 5 Advanced Computer ArchitectureBalaji Vignesh
 
Pipeline hazard
Pipeline hazardPipeline hazard
Pipeline hazardAJAL A J
 
SOC Verification using SystemVerilog
SOC Verification using SystemVerilog SOC Verification using SystemVerilog
SOC Verification using SystemVerilog Ramdas Mozhikunnath
 
Multithreading computer architecture
 Multithreading computer architecture  Multithreading computer architecture
Multithreading computer architecture Haris456
 
Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) A B Shinde
 
Real time Operating System
Real time Operating SystemReal time Operating System
Real time Operating SystemTech_MX
 
Pipelining powerpoint presentation
Pipelining powerpoint presentationPipelining powerpoint presentation
Pipelining powerpoint presentationbhavanadonthi
 
Network layer - design Issues
Network layer - design IssuesNetwork layer - design Issues
Network layer - design Issuesقصي نسور
 
Flynns classification
Flynns classificationFlynns classification
Flynns classificationYasir Khan
 
INSTRUCTION LEVEL PARALLALISM
INSTRUCTION LEVEL PARALLALISMINSTRUCTION LEVEL PARALLALISM
INSTRUCTION LEVEL PARALLALISMKamran Ashraf
 
Peephole optimization techniques in compiler design
Peephole optimization techniques in compiler designPeephole optimization techniques in compiler design
Peephole optimization techniques in compiler designAnul Chaudhary
 

What's hot (20)

DMA and DMA controller
DMA and DMA controllerDMA and DMA controller
DMA and DMA controller
 
Computer architecture pipelining
Computer architecture pipeliningComputer architecture pipelining
Computer architecture pipelining
 
Loops in flow
Loops in flowLoops in flow
Loops in flow
 
multiprocessors and multicomputers
 multiprocessors and multicomputers multiprocessors and multicomputers
multiprocessors and multicomputers
 
Embedded firmware
Embedded firmwareEmbedded firmware
Embedded firmware
 
SOC Processors Used in SOC
SOC Processors Used in SOCSOC Processors Used in SOC
SOC Processors Used in SOC
 
Unit 5 Advanced Computer Architecture
Unit 5 Advanced Computer ArchitectureUnit 5 Advanced Computer Architecture
Unit 5 Advanced Computer Architecture
 
Pipeline hazard
Pipeline hazardPipeline hazard
Pipeline hazard
 
SOC Verification using SystemVerilog
SOC Verification using SystemVerilog SOC Verification using SystemVerilog
SOC Verification using SystemVerilog
 
Multithreading computer architecture
 Multithreading computer architecture  Multithreading computer architecture
Multithreading computer architecture
 
Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism)
 
Real time Operating System
Real time Operating SystemReal time Operating System
Real time Operating System
 
Pipelining powerpoint presentation
Pipelining powerpoint presentationPipelining powerpoint presentation
Pipelining powerpoint presentation
 
Network layer - design Issues
Network layer - design IssuesNetwork layer - design Issues
Network layer - design Issues
 
Flynns classification
Flynns classificationFlynns classification
Flynns classification
 
FPGA
FPGAFPGA
FPGA
 
Pipelining
PipeliningPipelining
Pipelining
 
INSTRUCTION LEVEL PARALLALISM
INSTRUCTION LEVEL PARALLALISMINSTRUCTION LEVEL PARALLALISM
INSTRUCTION LEVEL PARALLALISM
 
Peephole optimization techniques in compiler design
Peephole optimization techniques in compiler designPeephole optimization techniques in compiler design
Peephole optimization techniques in compiler design
 
Arm instruction set
Arm instruction setArm instruction set
Arm instruction set
 

Similar to VLIW Processors

Vliw and superscaler
Vliw and superscalerVliw and superscaler
Vliw and superscalerRafi Dar
 
Shown below is a VLIW system in which each long instruction word gen.pdf
Shown below is a VLIW system in which each long instruction word gen.pdfShown below is a VLIW system in which each long instruction word gen.pdf
Shown below is a VLIW system in which each long instruction word gen.pdfARCHANASTOREKOTA
 
Advanced Techniques for Exploiting ILP
Advanced Techniques for Exploiting ILPAdvanced Techniques for Exploiting ILP
Advanced Techniques for Exploiting ILPA B Shinde
 
Advanced processor principles
Advanced processor principlesAdvanced processor principles
Advanced processor principlesDhaval Bagal
 
Parallel Computing
Parallel ComputingParallel Computing
Parallel ComputingMohsin Bhat
 
VLIW(Very Long Instruction Word)
VLIW(Very Long Instruction Word)VLIW(Very Long Instruction Word)
VLIW(Very Long Instruction Word)Pragnya Dash
 
SOC System Design Approach
SOC System Design ApproachSOC System Design Approach
SOC System Design ApproachA B Shinde
 
Crussoe proc
Crussoe procCrussoe proc
Crussoe proctyadi
 
Processor Verification Using Open Source Tools and the GCC Regression Test Suite
Processor Verification Using Open Source Tools and the GCC Regression Test SuiteProcessor Verification Using Open Source Tools and the GCC Regression Test Suite
Processor Verification Using Open Source Tools and the GCC Regression Test SuiteDVClub
 
Fpga based 128 bit customised vliw processor for executing dual scalarvector ...
Fpga based 128 bit customised vliw processor for executing dual scalarvector ...Fpga based 128 bit customised vliw processor for executing dual scalarvector ...
Fpga based 128 bit customised vliw processor for executing dual scalarvector ...eSAT Publishing House
 
Pipeline Mechanism
Pipeline MechanismPipeline Mechanism
Pipeline MechanismAshik Iqbal
 
Alley vsu functional_coverage_1f
Alley vsu functional_coverage_1fAlley vsu functional_coverage_1f
Alley vsu functional_coverage_1fObsidian Software
 
Algoritmi e Calcolo Parallelo 2012/2013 - OpenMP
Algoritmi e Calcolo Parallelo 2012/2013 - OpenMPAlgoritmi e Calcolo Parallelo 2012/2013 - OpenMP
Algoritmi e Calcolo Parallelo 2012/2013 - OpenMPPier Luca Lanzi
 

Similar to VLIW Processors (20)

Vliw and superscaler
Vliw and superscalerVliw and superscaler
Vliw and superscaler
 
Lec1 final
Lec1 finalLec1 final
Lec1 final
 
Shown below is a VLIW system in which each long instruction word gen.pdf
Shown below is a VLIW system in which each long instruction word gen.pdfShown below is a VLIW system in which each long instruction word gen.pdf
Shown below is a VLIW system in which each long instruction word gen.pdf
 
1.My Presentation.pptx
1.My Presentation.pptx1.My Presentation.pptx
1.My Presentation.pptx
 
Advanced Techniques for Exploiting ILP
Advanced Techniques for Exploiting ILPAdvanced Techniques for Exploiting ILP
Advanced Techniques for Exploiting ILP
 
Advanced processor principles
Advanced processor principlesAdvanced processor principles
Advanced processor principles
 
Parallel Computing
Parallel ComputingParallel Computing
Parallel Computing
 
VLIW(Very Long Instruction Word)
VLIW(Very Long Instruction Word)VLIW(Very Long Instruction Word)
VLIW(Very Long Instruction Word)
 
SOC System Design Approach
SOC System Design ApproachSOC System Design Approach
SOC System Design Approach
 
Assembly p1
Assembly p1Assembly p1
Assembly p1
 
Cisc(a022& a023)
Cisc(a022& a023)Cisc(a022& a023)
Cisc(a022& a023)
 
Crussoe proc
Crussoe procCrussoe proc
Crussoe proc
 
Difficulties in Pipelining
Difficulties in PipeliningDifficulties in Pipelining
Difficulties in Pipelining
 
Processor Verification Using Open Source Tools and the GCC Regression Test Suite
Processor Verification Using Open Source Tools and the GCC Regression Test SuiteProcessor Verification Using Open Source Tools and the GCC Regression Test Suite
Processor Verification Using Open Source Tools and the GCC Regression Test Suite
 
Fpga based 128 bit customised vliw processor for executing dual scalarvector ...
Fpga based 128 bit customised vliw processor for executing dual scalarvector ...Fpga based 128 bit customised vliw processor for executing dual scalarvector ...
Fpga based 128 bit customised vliw processor for executing dual scalarvector ...
 
Pipeline Mechanism
Pipeline MechanismPipeline Mechanism
Pipeline Mechanism
 
Alley vsu functional_coverage_1f
Alley vsu functional_coverage_1fAlley vsu functional_coverage_1f
Alley vsu functional_coverage_1f
 
Algoritmi e Calcolo Parallelo 2012/2013 - OpenMP
Algoritmi e Calcolo Parallelo 2012/2013 - OpenMPAlgoritmi e Calcolo Parallelo 2012/2013 - OpenMP
Algoritmi e Calcolo Parallelo 2012/2013 - OpenMP
 
Vliw or epic
Vliw or epicVliw or epic
Vliw or epic
 
W04505116121
W04505116121W04505116121
W04505116121
 

More from Sudhanshu Janwadkar

Keypad Interfacing with 8051 Microcontroller
Keypad Interfacing with 8051 MicrocontrollerKeypad Interfacing with 8051 Microcontroller
Keypad Interfacing with 8051 MicrocontrollerSudhanshu Janwadkar
 
ASIC design Flow (Digital Design)
ASIC design Flow (Digital Design)ASIC design Flow (Digital Design)
ASIC design Flow (Digital Design)Sudhanshu Janwadkar
 
Fpga architectures and applications
Fpga architectures and applicationsFpga architectures and applications
Fpga architectures and applicationsSudhanshu Janwadkar
 
Introduction to 8051 Timer/Counter
Introduction to 8051 Timer/CounterIntroduction to 8051 Timer/Counter
Introduction to 8051 Timer/CounterSudhanshu Janwadkar
 
Architecture of the Intel 8051 Microcontroller
Architecture of the Intel 8051 MicrocontrollerArchitecture of the Intel 8051 Microcontroller
Architecture of the Intel 8051 MicrocontrollerSudhanshu Janwadkar
 
Introduction to Embedded Systems
Introduction to Embedded SystemsIntroduction to Embedded Systems
Introduction to Embedded SystemsSudhanshu Janwadkar
 
Interconnects in Reconfigurable Architectures
Interconnects in Reconfigurable ArchitecturesInterconnects in Reconfigurable Architectures
Interconnects in Reconfigurable ArchitecturesSudhanshu Janwadkar
 
Design and Implementation of a GPS based Personal Tracking System
Design and Implementation of a GPS based Personal Tracking SystemDesign and Implementation of a GPS based Personal Tracking System
Design and Implementation of a GPS based Personal Tracking SystemSudhanshu Janwadkar
 
Embedded Logic Flip-Flops: A Conceptual Review
Embedded Logic Flip-Flops: A Conceptual ReviewEmbedded Logic Flip-Flops: A Conceptual Review
Embedded Logic Flip-Flops: A Conceptual ReviewSudhanshu Janwadkar
 

More from Sudhanshu Janwadkar (20)

DSP Processors versus ASICs
DSP Processors versus ASICsDSP Processors versus ASICs
DSP Processors versus ASICs
 
Keypad Interfacing with 8051 Microcontroller
Keypad Interfacing with 8051 MicrocontrollerKeypad Interfacing with 8051 Microcontroller
Keypad Interfacing with 8051 Microcontroller
 
ASIC design Flow (Digital Design)
ASIC design Flow (Digital Design)ASIC design Flow (Digital Design)
ASIC design Flow (Digital Design)
 
Fpga architectures and applications
Fpga architectures and applicationsFpga architectures and applications
Fpga architectures and applications
 
LCD Interacing with 8051
LCD Interacing with 8051LCD Interacing with 8051
LCD Interacing with 8051
 
Interrupts in 8051
Interrupts in 8051Interrupts in 8051
Interrupts in 8051
 
Serial Communication in 8051
Serial Communication in 8051Serial Communication in 8051
Serial Communication in 8051
 
SPI Bus Protocol
SPI Bus ProtocolSPI Bus Protocol
SPI Bus Protocol
 
I2C Protocol
I2C ProtocolI2C Protocol
I2C Protocol
 
Introduction to 8051 Timer/Counter
Introduction to 8051 Timer/CounterIntroduction to 8051 Timer/Counter
Introduction to 8051 Timer/Counter
 
Intel 8051 Programming in C
Intel 8051 Programming in CIntel 8051 Programming in C
Intel 8051 Programming in C
 
Hardware View of Intel 8051
Hardware View of Intel 8051Hardware View of Intel 8051
Hardware View of Intel 8051
 
Architecture of the Intel 8051 Microcontroller
Architecture of the Intel 8051 MicrocontrollerArchitecture of the Intel 8051 Microcontroller
Architecture of the Intel 8051 Microcontroller
 
Introduction to Embedded Systems
Introduction to Embedded SystemsIntroduction to Embedded Systems
Introduction to Embedded Systems
 
CMOS Logic
CMOS LogicCMOS Logic
CMOS Logic
 
Interconnects in Reconfigurable Architectures
Interconnects in Reconfigurable ArchitecturesInterconnects in Reconfigurable Architectures
Interconnects in Reconfigurable Architectures
 
Introduction to FPGAs
Introduction to FPGAsIntroduction to FPGAs
Introduction to FPGAs
 
Design and Implementation of a GPS based Personal Tracking System
Design and Implementation of a GPS based Personal Tracking SystemDesign and Implementation of a GPS based Personal Tracking System
Design and Implementation of a GPS based Personal Tracking System
 
Embedded Logic Flip-Flops: A Conceptual Review
Embedded Logic Flip-Flops: A Conceptual ReviewEmbedded Logic Flip-Flops: A Conceptual Review
Embedded Logic Flip-Flops: A Conceptual Review
 
Pass Transistor Logic
Pass Transistor LogicPass Transistor Logic
Pass Transistor Logic
 

Recently uploaded

KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 

Recently uploaded (20)

KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 

VLIW Processors

  • 1. VLIW PROCESSORS Department of E &TC, MITCOE, Pu
  • 2. Introduction o Very long instruction word or VLIW refers to a processor architecture designed to take advantage of instruction level parallelism o Instruction of a VLIW processor consists of multiple independent operations grouped together. o There are Multiple Independent Functional Units in VLIW processor architecture. o Each operation in the instruction is aligned to a functional unit. o All functional units share the use of a common large register file. o This type of processor architecture is intended to allow higher performance without the inherent complexity of
  • 3. Different Approaches Other approaches to improving performance in processor architectures : o Pipelining Breaking up instructions into sub-steps so that instructions can be executed partially at the same time o Superscalar architectures Dispatching individual instructions to be executed completely independently in different parts of the processor o Out-of-order execution Executing instructions in an order different from the program
  • 4. Instruction Level Parallelism (ILP ) o Instruction-level parallelism (ILP) is a measure of how many of the operations in a computer program can be performed simultaneously. o The overlap among instructions is called instruction level parallelism. o Ordinary programs are typically written under a sequential execution model where instructions execute one after the other and in the order specified by the programmer. o Goal of compiler and processor designers implementing ILP is to identify and take advantage of as much ILP as possible.
  • 5. What is ILP? (Example) Consider the following program: op 1 e = a + b op2 f = c + d op3 m = e * f o Operation 3 depends on the results of operations 1 and 2, so it cannot be calculated until both of them are completed o However, operations 1 and 2 do not depend on any other operation, so they can be calculated simultaneously o If we assume that each operation can be completed in one unit of time then these three instructions can be completed in a total of two units of time o giving an ILP of 3/2.
  • 6. VLIW Compiler o Compiler is responsible for static scheduling of instructions in VLIW processor. o Compiler finds out which operations can be executed in parallel in the program. o It groups together these operations in single instruction which is the very large instruction word. o Compiler ensures that an operation is not issued before its operands are ready.
  • 7. VLIW Instruction o One VLIW instruction word encodes multiple operations which allows them to be initiated in a single clock cycle. o The operands and the operation to be performed by the various functional units are specified in the instruction itself. o One instruction encodes at least one operation for each execution unit of the device. o So length of the instruction increases with the number of execution units o To accommodate these operation fields, VLIW instructions are usually at least 64 bits wide, and on some architectures are much wider up to 1024 bits.
  • 8. VLIW Instruction Add r1,r2,r3; Sub r4,r5,r6; Ld r7,data; St r8,data REGESTER FILES ALU ALU LOAD /STORE LOAD /STORE
  • 9. ILP in VLIW o Consider the computation of y = a1x1 + a2x2 + a3x3 On a sequential processor On the VLIW processor with 2 load/store units, 1 multiply unit and 1 add unit cycle 1: load a1 cycle 2: load x1 cycle 3: load a2 cycle 4: load x2 cycle 5: multiply z1 a1 x1 cycle 6: multiply z2 a2 x2 cycle 7: add y z1 z2 cycle 8: load a3 cycle 9: load x3 cycle 10: multiply z1 a3 x3 cycle 11: add y y z2 cycle 1: load a1 load x1 cycle 2: load a2 load x2 Multiply z1 a1 x1 cycle 3: load a3 load x3 Multiply z2 a2 x2 cycle 4: multiply z3 a3 x3 add y z1 z2 cycle 5: add y y z3 requires 11 cycles. requires 5 cycles.
  • 12. Working o Long instruction words are fetched from the memory o A common multi-ported register file for fetching the operands and storing the results. o Parallel random access to the register file is possible through the read/write cross bar. o Execution in the functional units is carried out concurrently with the load/store operation of data between RAM and the register file. o One or multiple register files for FX and FP data. o Rely on compiler to find parallelism and schedule dependency free program code.
  • 13. Difference Between VLIW & Superscalar Architecture
  • 14. VLIW vs. Superscalar Architecture o Instruction formulation  Superscalar: ⁻ Receive conventional instructions conceived for sequential processors.  VLIW: ⁻ Receive long instruction words, each comprising a field (or opcode) for each execution unit. ⁻ Instruction word length depends number of execution units and code length to control each unit (such as opcode length, registers). ⁻ Typical word length is 64 – 1024 bits, much longer than conventional machine word length.
  • 15. VLIW vs. Superscalar Architecture o Instruction scheduling  Superscalar: ⁻ Done dynamically at run-time by the hardware. ⁻ Data dependency is checked and resolved in hardware. ⁻ Need a look ahead hardware window for instruction fetch.  VLIW: ⁻ Done statically at compile time by compiler. ⁻ Data dependency is checked by compiler. ⁻ In case of un-filled opcodes in a VLIW, memory space and instruction bandwidth are wasted.
  • 16. Comparison: CISC, RISC, VLIW ARCHITECTURE CHARACTERIST C CISC RISC VLIW Instruction Size Varies One size, usually 32 bits One size Instruction Semantics Varies from simple to complex; possibly many dependent operations per instruction Almost always one simple operation Many simple, independent operations Registers Few, sometimes special Many, general-purpose Many, general- purpose Hardware Design Exploit microcode implementations Exploit implementations with one pipeline and & no microcode Exploit implementations with multiple pipelines, no microcode & no complex dispatch logic
  • 17. Advantages of VLIW o Dependencies are determined by compiler and used to schedule according to function unit latencies . o Function units are assigned by compiler and correspond to the position within the instruction packet. o Reduces hardware complexity. • Tasks such as decoding, data dependency detection, instruction issues etc. becoming simple. • Ensures potentially higher Clock Rate. • Ensures Low power consumption
  • 18. Disadvantages of VLIW o Higher complexity of the compiler o Compatibility across implementations : Compiler optimization needs to consider technology dependent parameters such as latencies and load-use time of cache. o Unscheduled events (e.g. cache miss) stall entire processor . o Code density: In case of un-filled opcodes in a VLIW, memory space and instruction bandwidth are wasted i.e. low slot utilization. o Code expansion: Causes high power consumption
  • 19. Applications o VLIW architecture is suitable for Digital Signal Processing applications. o Processing of media data like compression/decompression of Image and speech data.
  • 20. Examples of VLIW processor o VLIW Mini supercomputers: Multiflow TRACE 7/300, 14/300, 28/300 Multiflow TRACE /500 Cydrome Cydra 5 IBM Yorktown VLIW Computer o Single-Chip VLIW Processors: Intel iWarp, Philip’s LIFE Chips o Single-Chip VLIW Media (through-put) Processors: Trimedia, Chromatic, Micro-Unity o DSP Processors (TI TMS320C6x )