Submit Search
Upload
Cell Technology for Graphics and Visualization
•
1 like
•
407 views
S
Slide_N
Follow
Cell Technology for Graphics and Visualization
Read less
Read more
Technology
Report
Share
Report
Share
1 of 44
Download now
Download to read offline
Recommended
Early Successes Debugging with TotalView on the Intel Xeon Phi Coprocessor
Early Successes Debugging with TotalView on the Intel Xeon Phi Coprocessor
Intel IT Center
Intro to Cell Broadband Engine for HPC
Intro to Cell Broadband Engine for HPC
Slide_N
The Best Programming Practice for Cell/B.E.
The Best Programming Practice for Cell/B.E.
Slide_N
101 cd 1415-1445
101 cd 1415-1445
Chiou-Nan Chen
Cell Broadband EngineTM: and Cell/B.E. based blade technology
Cell Broadband EngineTM: and Cell/B.E. based blade technology
Slide_N
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
Heiko Joerg Schick
Synergistic processing in cell's multicore architecture
Synergistic processing in cell's multicore architecture
Michael Gschwind
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning Acceleration
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning Acceleration
Intel® Software
Recommended
Early Successes Debugging with TotalView on the Intel Xeon Phi Coprocessor
Early Successes Debugging with TotalView on the Intel Xeon Phi Coprocessor
Intel IT Center
Intro to Cell Broadband Engine for HPC
Intro to Cell Broadband Engine for HPC
Slide_N
The Best Programming Practice for Cell/B.E.
The Best Programming Practice for Cell/B.E.
Slide_N
101 cd 1415-1445
101 cd 1415-1445
Chiou-Nan Chen
Cell Broadband EngineTM: and Cell/B.E. based blade technology
Cell Broadband EngineTM: and Cell/B.E. based blade technology
Slide_N
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
Heiko Joerg Schick
Synergistic processing in cell's multicore architecture
Synergistic processing in cell's multicore architecture
Michael Gschwind
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning Acceleration
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning Acceleration
Intel® Software
SoC: System On Chip
SoC: System On Chip
Santosh Verma
Technology (1)
Technology (1)
firstnameoRZLPPq3F lastnameoRZLPPq3F
Sakar jain
Sakar jain
Obsidian Software
Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...
Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...
Slide_N
Intel Knights Landing Slides
Intel Knights Landing Slides
Ronen Mendezitsky
COST/BENEFIT CASE FOR IBM SYSTEM STORAGE DS8700: COMPARISONS WITH EMC SYMMETR...
COST/BENEFIT CASE FOR IBM SYSTEM STORAGE DS8700: COMPARISONS WITH EMC SYMMETR...
IBM India Smarter Computing
Scalable Elastic Systems Architecture (SESA)
Scalable Elastic Systems Architecture (SESA)
Eric Van Hensbergen
Intel tools to optimize HPC systems
Intel tools to optimize HPC systems
Intel Software Brasil
System on chip architectures
System on chip architectures
A B Shinde
AMulti-coreSoftwareHardwareCo-DebugPlatform_Final
AMulti-coreSoftwareHardwareCo-DebugPlatform_Final
Alan Su
Z4R: Intro to Storage and DFSMS for z/OS
Z4R: Intro to Storage and DFSMS for z/OS
Tony Pearson
Ibm cell
Ibm cell
Vyanktesh Dorlikar
Preparing Codes for Intel Knights Landing (KNL)
Preparing Codes for Intel Knights Landing (KNL)
AllineaSoftware
Methods and practices to analyze the performance of your application with Int...
Methods and practices to analyze the performance of your application with Int...
Intel Software Brasil
Introduction to BeagleBoard-xM
Introduction to BeagleBoard-xM
SysPlay eLearning Academy for You
Abstract chameleon chip
Abstract chameleon chip
Anugrah James
Ph.D. Thesis presentation
Ph.D. Thesis presentation
Guillermo Talavera
CCNxCon2012: Session 4: Caesar: a Content Router for High Speed Forwarding
CCNxCon2012: Session 4: Caesar: a Content Router for High Speed Forwarding
PARC, a Xerox company
Lakefield: Hybrid Cores in 3D Package
Lakefield: Hybrid Cores in 3D Package
inside-BigData.com
OMI - The Missing Piece of a Modular, Flexible and Composable Computing World
OMI - The Missing Piece of a Modular, Flexible and Composable Computing World
Allan Cantle
Hardware and Software Architectures for the CELL BROADBAND ENGINE processor
Hardware and Software Architectures for the CELL BROADBAND ENGINE processor
Slide_N
Enterprise power systems transition to power7 technology
Enterprise power systems transition to power7 technology
solarisyougood
More Related Content
What's hot
SoC: System On Chip
SoC: System On Chip
Santosh Verma
Technology (1)
Technology (1)
firstnameoRZLPPq3F lastnameoRZLPPq3F
Sakar jain
Sakar jain
Obsidian Software
Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...
Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...
Slide_N
Intel Knights Landing Slides
Intel Knights Landing Slides
Ronen Mendezitsky
COST/BENEFIT CASE FOR IBM SYSTEM STORAGE DS8700: COMPARISONS WITH EMC SYMMETR...
COST/BENEFIT CASE FOR IBM SYSTEM STORAGE DS8700: COMPARISONS WITH EMC SYMMETR...
IBM India Smarter Computing
Scalable Elastic Systems Architecture (SESA)
Scalable Elastic Systems Architecture (SESA)
Eric Van Hensbergen
Intel tools to optimize HPC systems
Intel tools to optimize HPC systems
Intel Software Brasil
System on chip architectures
System on chip architectures
A B Shinde
AMulti-coreSoftwareHardwareCo-DebugPlatform_Final
AMulti-coreSoftwareHardwareCo-DebugPlatform_Final
Alan Su
Z4R: Intro to Storage and DFSMS for z/OS
Z4R: Intro to Storage and DFSMS for z/OS
Tony Pearson
Ibm cell
Ibm cell
Vyanktesh Dorlikar
Preparing Codes for Intel Knights Landing (KNL)
Preparing Codes for Intel Knights Landing (KNL)
AllineaSoftware
Methods and practices to analyze the performance of your application with Int...
Methods and practices to analyze the performance of your application with Int...
Intel Software Brasil
Introduction to BeagleBoard-xM
Introduction to BeagleBoard-xM
SysPlay eLearning Academy for You
Abstract chameleon chip
Abstract chameleon chip
Anugrah James
Ph.D. Thesis presentation
Ph.D. Thesis presentation
Guillermo Talavera
CCNxCon2012: Session 4: Caesar: a Content Router for High Speed Forwarding
CCNxCon2012: Session 4: Caesar: a Content Router for High Speed Forwarding
PARC, a Xerox company
Lakefield: Hybrid Cores in 3D Package
Lakefield: Hybrid Cores in 3D Package
inside-BigData.com
OMI - The Missing Piece of a Modular, Flexible and Composable Computing World
OMI - The Missing Piece of a Modular, Flexible and Composable Computing World
Allan Cantle
What's hot
(20)
SoC: System On Chip
SoC: System On Chip
Technology (1)
Technology (1)
Sakar jain
Sakar jain
Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...
Multiple Cores, Multiple Pipes, Multiple Threads – Do we have more Parallelis...
Intel Knights Landing Slides
Intel Knights Landing Slides
COST/BENEFIT CASE FOR IBM SYSTEM STORAGE DS8700: COMPARISONS WITH EMC SYMMETR...
COST/BENEFIT CASE FOR IBM SYSTEM STORAGE DS8700: COMPARISONS WITH EMC SYMMETR...
Scalable Elastic Systems Architecture (SESA)
Scalable Elastic Systems Architecture (SESA)
Intel tools to optimize HPC systems
Intel tools to optimize HPC systems
System on chip architectures
System on chip architectures
AMulti-coreSoftwareHardwareCo-DebugPlatform_Final
AMulti-coreSoftwareHardwareCo-DebugPlatform_Final
Z4R: Intro to Storage and DFSMS for z/OS
Z4R: Intro to Storage and DFSMS for z/OS
Ibm cell
Ibm cell
Preparing Codes for Intel Knights Landing (KNL)
Preparing Codes for Intel Knights Landing (KNL)
Methods and practices to analyze the performance of your application with Int...
Methods and practices to analyze the performance of your application with Int...
Introduction to BeagleBoard-xM
Introduction to BeagleBoard-xM
Abstract chameleon chip
Abstract chameleon chip
Ph.D. Thesis presentation
Ph.D. Thesis presentation
CCNxCon2012: Session 4: Caesar: a Content Router for High Speed Forwarding
CCNxCon2012: Session 4: Caesar: a Content Router for High Speed Forwarding
Lakefield: Hybrid Cores in 3D Package
Lakefield: Hybrid Cores in 3D Package
OMI - The Missing Piece of a Modular, Flexible and Composable Computing World
OMI - The Missing Piece of a Modular, Flexible and Composable Computing World
Similar to Cell Technology for Graphics and Visualization
Hardware and Software Architectures for the CELL BROADBAND ENGINE processor
Hardware and Software Architectures for the CELL BROADBAND ENGINE processor
Slide_N
Enterprise power systems transition to power7 technology
Enterprise power systems transition to power7 technology
solarisyougood
M. Gschwind, A novel SIMD architecture for the Cell heterogeneous chip multip...
M. Gschwind, A novel SIMD architecture for the Cell heterogeneous chip multip...
Michael Gschwind
Cell/B.E. Servers: A Platform for Real Time Scalable Computing and Visualization
Cell/B.E. Servers: A Platform for Real Time Scalable Computing and Visualization
Slide_N
Visão geral do hardware do servidor System z e Linux on z - Concurso Mainframe
Visão geral do hardware do servidor System z e Linux on z - Concurso Mainframe
Anderson Bassani
Multi Processor Architecture for image processing
Multi Processor Architecture for image processing
ideas2ignite
Features of modern intel microprocessors
Features of modern intel microprocessors
Krunal Siddhapathak
Track A-Compilation guiding and adjusting - IBM
Track A-Compilation guiding and adjusting - IBM
chiportal
Вадим Сухомлинов _Платформы Intel(r) Atom(tm) – новые возможности для социаль...
Вадим Сухомлинов _Платформы Intel(r) Atom(tm) – новые возможности для социаль...
Ingria. Technopark St. Petersburg
Track F- Designing the kiler soc - sonics
Track F- Designing the kiler soc - sonics
chiportal
iPhone Architecture - Review
iPhone Architecture - Review
Abdelrahman Hosny
System_on_Chip_SOC.ppt
System_on_Chip_SOC.ppt
zahixdd
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
Red_Hat_Storage
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
Colleen Corrice
IBM HPC Transformation with AI
IBM HPC Transformation with AI
Ganesan Narayanasamy
resume
resume
Ender Dai
Hosseini sv07
Hosseini sv07
Obsidian Software
Power 7 Overview
Power 7 Overview
lambertt
The Cell Processor
The Cell Processor
Heiko Joerg Schick
chameleon chip
chameleon chip
Sucharita Bohidar
Similar to Cell Technology for Graphics and Visualization
(20)
Hardware and Software Architectures for the CELL BROADBAND ENGINE processor
Hardware and Software Architectures for the CELL BROADBAND ENGINE processor
Enterprise power systems transition to power7 technology
Enterprise power systems transition to power7 technology
M. Gschwind, A novel SIMD architecture for the Cell heterogeneous chip multip...
M. Gschwind, A novel SIMD architecture for the Cell heterogeneous chip multip...
Cell/B.E. Servers: A Platform for Real Time Scalable Computing and Visualization
Cell/B.E. Servers: A Platform for Real Time Scalable Computing and Visualization
Visão geral do hardware do servidor System z e Linux on z - Concurso Mainframe
Visão geral do hardware do servidor System z e Linux on z - Concurso Mainframe
Multi Processor Architecture for image processing
Multi Processor Architecture for image processing
Features of modern intel microprocessors
Features of modern intel microprocessors
Track A-Compilation guiding and adjusting - IBM
Track A-Compilation guiding and adjusting - IBM
Вадим Сухомлинов _Платформы Intel(r) Atom(tm) – новые возможности для социаль...
Вадим Сухомлинов _Платформы Intel(r) Atom(tm) – новые возможности для социаль...
Track F- Designing the kiler soc - sonics
Track F- Designing the kiler soc - sonics
iPhone Architecture - Review
iPhone Architecture - Review
System_on_Chip_SOC.ppt
System_on_Chip_SOC.ppt
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
IBM HPC Transformation with AI
IBM HPC Transformation with AI
resume
resume
Hosseini sv07
Hosseini sv07
Power 7 Overview
Power 7 Overview
The Cell Processor
The Cell Processor
chameleon chip
chameleon chip
More from Slide_N
SpursEngine A High-performance Stream Processor Derived from Cell/B.E. for Me...
SpursEngine A High-performance Stream Processor Derived from Cell/B.E. for Me...
Slide_N
Parallel Vector Tile-Optimized Library (PVTOL) Architecture-v3.pdf
Parallel Vector Tile-Optimized Library (PVTOL) Architecture-v3.pdf
Slide_N
Experiences with PlayStation VR - Sony Interactive Entertainment
Experiences with PlayStation VR - Sony Interactive Entertainment
Slide_N
SPU-based Deferred Shading for Battlefield 3 on Playstation 3
SPU-based Deferred Shading for Battlefield 3 on Playstation 3
Slide_N
Filtering Approaches for Real-Time Anti-Aliasing
Filtering Approaches for Real-Time Anti-Aliasing
Slide_N
Chip Multiprocessing and the Cell Broadband Engine.pdf
Chip Multiprocessing and the Cell Broadband Engine.pdf
Slide_N
Cell Today and Tomorrow - IBM Systems and Technology Group
Cell Today and Tomorrow - IBM Systems and Technology Group
Slide_N
New Millennium for Computer Entertainment - Kutaragi
New Millennium for Computer Entertainment - Kutaragi
Slide_N
Sony Transformation 60 - Kutaragi
Sony Transformation 60 - Kutaragi
Slide_N
Sony Transformation 60
Sony Transformation 60
Slide_N
Moving Innovative Game Technology from the Lab to the Living Room
Moving Innovative Game Technology from the Lab to the Living Room
Slide_N
Industry Trends in Microprocessor Design
Industry Trends in Microprocessor Design
Slide_N
Translating GPU Binaries to Tiered SIMD Architectures with Ocelot
Translating GPU Binaries to Tiered SIMD Architectures with Ocelot
Slide_N
Cellular Neural Networks: Theory
Cellular Neural Networks: Theory
Slide_N
Network Processing on an SPE Core in Cell Broadband EngineTM
Network Processing on an SPE Core in Cell Broadband EngineTM
Slide_N
Deferred Pixel Shading on the PLAYSTATION®3
Deferred Pixel Shading on the PLAYSTATION®3
Slide_N
Developing Technology for Ratchet and Clank Future: Tools of Destruction
Developing Technology for Ratchet and Clank Future: Tools of Destruction
Slide_N
NVIDIA Tesla Accelerated Computing Platform for IBM Power
NVIDIA Tesla Accelerated Computing Platform for IBM Power
Slide_N
The Visual Computing Revolution Continues
The Visual Computing Revolution Continues
Slide_N
MLAA on PS3
MLAA on PS3
Slide_N
More from Slide_N
(20)
SpursEngine A High-performance Stream Processor Derived from Cell/B.E. for Me...
SpursEngine A High-performance Stream Processor Derived from Cell/B.E. for Me...
Parallel Vector Tile-Optimized Library (PVTOL) Architecture-v3.pdf
Parallel Vector Tile-Optimized Library (PVTOL) Architecture-v3.pdf
Experiences with PlayStation VR - Sony Interactive Entertainment
Experiences with PlayStation VR - Sony Interactive Entertainment
SPU-based Deferred Shading for Battlefield 3 on Playstation 3
SPU-based Deferred Shading for Battlefield 3 on Playstation 3
Filtering Approaches for Real-Time Anti-Aliasing
Filtering Approaches for Real-Time Anti-Aliasing
Chip Multiprocessing and the Cell Broadband Engine.pdf
Chip Multiprocessing and the Cell Broadband Engine.pdf
Cell Today and Tomorrow - IBM Systems and Technology Group
Cell Today and Tomorrow - IBM Systems and Technology Group
New Millennium for Computer Entertainment - Kutaragi
New Millennium for Computer Entertainment - Kutaragi
Sony Transformation 60 - Kutaragi
Sony Transformation 60 - Kutaragi
Sony Transformation 60
Sony Transformation 60
Moving Innovative Game Technology from the Lab to the Living Room
Moving Innovative Game Technology from the Lab to the Living Room
Industry Trends in Microprocessor Design
Industry Trends in Microprocessor Design
Translating GPU Binaries to Tiered SIMD Architectures with Ocelot
Translating GPU Binaries to Tiered SIMD Architectures with Ocelot
Cellular Neural Networks: Theory
Cellular Neural Networks: Theory
Network Processing on an SPE Core in Cell Broadband EngineTM
Network Processing on an SPE Core in Cell Broadband EngineTM
Deferred Pixel Shading on the PLAYSTATION®3
Deferred Pixel Shading on the PLAYSTATION®3
Developing Technology for Ratchet and Clank Future: Tools of Destruction
Developing Technology for Ratchet and Clank Future: Tools of Destruction
NVIDIA Tesla Accelerated Computing Platform for IBM Power
NVIDIA Tesla Accelerated Computing Platform for IBM Power
The Visual Computing Revolution Continues
The Visual Computing Revolution Continues
MLAA on PS3
MLAA on PS3
Recently uploaded
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
Radu Cotescu
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
Safe Software
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
Puma Security, LLC
🐬 The future of MySQL is Postgres 🐘
🐬 The future of MySQL is Postgres 🐘
RTylerCroy
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Drew Madelung
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
Principled Technologies
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
HampshireHUG
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
Rafal Los
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
hans926745
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
Sinan KOZAK
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Miguel Araújo
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
Martijn de Jong
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
Delhi Call girls
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
Gabriella Davis
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
V3cube
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Katpro Technologies
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
Allon Mureinik
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
Malak Abu Hammad
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
Delhi Call girls
Recently uploaded
(20)
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
🐬 The future of MySQL is Postgres 🐘
🐬 The future of MySQL is Postgres 🐘
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
Cell Technology for Graphics and Visualization
1.
© 2002 IBM
CorporationAugust 11, 2005 Cell Technology for Graphics and VisualizationCell Technology for Graphics and Visualization Graphics Hardware Workshop 2005 Bruce D’Amora Emerging Systems Software IBM T.J. Watson Research Center http://www.research.ibm.com
2.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation2 AgendaAgenda Introduction Architecture Programming Graphics & Visualization Demos Questions
3.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation3 IntroductionIntroduction
4.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation4 What is Cell?What is Cell? Sony, Toshiba, IBM alliance established in March, 2001 − “STI” Design Center is located in Austin, Texas Goals − Outstanding performance, especially for game/multimedia − Real time responsiveness to the user and the network − Applicable to a wide range of platforms and applications Design Concepts − Compatibility with 64-bit Power Architecture™ − Increased efficiency and performance − Non-homogeneous architecture
5.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation5 ArchitectureArchitecture
6.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation6 Cell Processor ArchitectureCell Processor Architecture SPU LS SPU LS SPU LS SPU LS SPU LS SPU LS SPU LS EIB (up to 96B/cycle) SPU LS 16B/cycle 16B/cycle SPE0 SPE1 SPE2 SPE3 SPE4 SPE5 SPE6 SPE7 L2 L1 PPU 32B/cycle 16B/cycle MIC BIC 16B/cycle(2x)16B/cycle Dual XDR™ I/O I/O 64-bit Power Architecture w/VMX PPE AUC MFC MFC MFC MFC MFC MFC MFC MFC AUC AUC AUC AUC AUC AUC AUC
7.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation7 Element Interconnect BusElement Interconnect Bus EIB data ring for internal communication − Four 16 byte data rings, supporting multiple transfers − 96B/cycle peak bandwidth − Over 100 outstanding requests
8.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation8 Power Processor ElementPower Processor Element PPE handles operating system and control tasks − 64-bit Power ArchitectureTM with VMX − In-order, 2-way hardware symmetrical multi-threading (SMT) − Load/Store with 32KB I & D L1 and 512KB L2
9.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation9 Synergistic Processor ElementSynergistic Processor Element SPE provides computational performance − Dual issue, up to 16-way 128-bit SIMD − Dedicated resources: 128 128-bit register file, 256KB Local Store − Each can be dynamically configured to protect resources − Dedicated DMA engine: Up to 16 outstanding requests per SPE
10.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation10 SPE HighlightsSPE Highlights User-mode architecture − No translation/protection within SPU − DMA is full Power Architecture protect/x-late RISC like structures − 32 bit fixed instructions − Clean design – unified Register file VMX-like SIMD dataflow − Broad set of operations (8 / 16 / 32 Byte) − Graphics SP-Float − IEEE DP-Float Unified register file − 128 entry x 128 bit 256KB Local Store − Combined I & D − 16B/cycle L/S bandwidth − 128B/cycle DMA bandwidth LS LS LS LS GPR FXU ODD FXU EVN SFP DP CONTROL CHANNEL DMA SMM ATO SBI RTB BEB FWD 14.5mm2 (90nm SOI)
11.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation11 I/O and Memory InterfacesI/O and Memory Interfaces I/O Provides wide bandwidth − Two configurable interfaces − Up to 25.6 GB/s memory B/W – Up to 70+ GB/s I/O B/W – Practical ~ 50GB/s
12.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation12 ConfigurabilityConfigurability Direct Attach XDR Two I/O interfaces – Configurable number of Bytes – Coherent or I/O Protection CELL Processor XDRtm XDRtm IOIF0 IOIF1 CELL Processor XDRtm XDRtm IOIF BIF CELL Processor XDRtm XDRtm IOIF CELL Processor XDRtm XDRtm IOIF BIF CELL Processor XDRtm XDRtm IOIF CELL Processor XDRtm XDRtm IOIF BIF CELL Processor XDRtm XDRtm IOIF SW
13.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation13 ProgrammingProgramming
14.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation14 BE Features Exploited by SoftwareBE Features Exploited by Software Keeping Intermediate/Control Data on-Chip − MMU – SLBs, TLBs − DMA from L2 cache-> LS − LS to LS DMA − Cache <-> Cache transfers (atomic update) − SPE Signalling Registers − SPE <-> PPE Mailboxes Resource Reservation and Allocation − PPE can be shared across logical partitions − SPEs can be assigned to logical partitions − SPEs independently or Group Allocated PowerPC (PPE) L2 Cache DMA with Intervention Hardware Managed Cache Coherency BE Chip System Memory I/O BIF/IOIF MFC Local Store SPU AUC MFC Local Store SPU AUC MFC Local Store SPU AUC MFC Local Store SPU AUC MFC Local Store SPU AUC MFC Local Store SPU AUC Local Store to Local Store DMA Atomic Update Cache
15.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation15 Models for Programming the CellModels for Programming the Cell Function Offload Application Specific Accelerators Computation-Acceleration Streaming Shared Memory Multi-processor Heterogeneous Thread Runtime Model Single Source Compiler
16.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation16 Function OffloadFunction Offload Easiest way to port applications to Cell PPE can call procedure located on the SPE as if it were a local PPE function PPE and SPE use “stubs” as placeholders for local and remote procedures Main() image f1() PPE stub f2() image f1() SPE stub f1() image PPE SPE
17.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation17 Function Offload via IDLFunction Offload via IDL .idl file IDL compiler Program.c Function.cSPE_stub.cStub.hPPE_stub.c SPE compiler & linker SPE binary PPE compiler linker PPE binary RPC run-time library IDL compiler facilitates the RPC function offload – Defines interfaces between PPE main program and SPE RPC program Function loaded onto an SPE only once − PPE can make multiple calls to that procedure without having to reload it.
18.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation18 OS allocates and initializes SPE resources N N N N N N N Special Case: Application Specific AcceleratorsSpecial Case: Application Specific Accelerators OS services invokes SPE function System Memory SPE Function Graphics MPEGSSL Vertex shaders Encryption Decryption Realtime MPEG Encoding MPEG DecodeMPEG Decode PPE OS Application Services PPE_mpeg_encode() SPE_mpeg_encode()
19.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation19 Programming Models Computational AccelerationComputational Acceleration PowerPC (PPE) SPU Puts Results PPE Puts Initial Text Static Data Parameters System Memory SPC Independently Stages Text & Intermediate Data Transfers while executing PPE SPU Local Store MFC PPE Gets Results PPE passes: Text Static Data Parameters SPU executes User created RPC libraries − User acceleration routines − User compiles SPE code Local Data − Data and Parameters passed in call Global Data − Data and Parameters passed in call − Code manages global data SPU Local Store MFC
20.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation20 Streaming ModelStreaming Model SPE initiated DMA Input/output/kernel streams –DMA’d from system memory->LS or LS->LS PPE SPE1 Kernel() SPE0 Kernel() SPE7 Kernel() XDR In . I7 I6 I5 I4 I3 I2 I1 I0 On . O7 O6 O5 O4 O3 O2 O1 O0 ….. Data-parallel PPE SPE1 Kernel1() SPE0 Kernel0() SPE7 Kernel7() XDR In . . I6 I5 I4 I3 I2 I1 I0 On . . O6 O5 O4 O3 O2 O1 O0 ….. DMA DMA Task-parallel SPE7 Ki SPE1 Ki SPE0 Ki XDR In . I7 I6 I5 I4 I3 I2 I1 I0 ….. Data&Kernel Streams Kn . . K6 K5 K4 K3 K2 K1 K0 SPE7 Ki PPE On . O7 O6 O5 O4 O3 O2 O1 O0 SPE7 Ki
21.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation21 SharedShared--memory Multiprocessormemory Multiprocessor The Cell Processor can be programmed as a shared-memory multiprocessor, using two different instruction sets The SPEs and the PPE fully inter-operate in a cache-coherent Shared-Memory Multiprocessor Model – All DMA operations in the SPEs are cache-coherent – The DMA operations use an effective address that is common to all PPE and SPEs – Shared-memory store instructions are replaced by a store from the register file to the LS, followed by a DMA operation from LS to shared memory. A compiler or interpreter could manage part of the LS as a local cache for instructions and data obtained from shared memory.
22.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation22 Application Source & Libraries PPE object files SPE object files SPESPE SPE SPE Physical SPUs SPESPE SPE SPE Cell AwareOS ( Linux) SPE Virtualization / Scheduling Layer (m->n spc threads) New SPE tasks/threadsExisting PPE tasks/threads Physical PPE PPE MT1 MT2 PPE Threads, SPE Threads Shared effective address space SPE Virtualization in debug mode only Heterogeneous MultiHeterogeneous Multi--Threading ModelThreading Model
23.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation23 Single source approach to programming Cell Single Source Compiler − Auto parallelization ( treat target Cell as an SMP ) − Auto SIMD-ization ( SIMD-vectorization ) − Compiler management of Local Store as 2nd level register file / SW managed cache (I&D) − Most Cell unique piece Optimization − OpenMP pragmas − Vector.org SIMD intrinsics − Data/Code partitioning − Streaming / pre-specifying code/data use Prototype Single Source Compiler Developed in IBM Research
24.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation24 Graphics & VisualizationGraphics & Visualization
25.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation25 Graphics and Visualization on CellGraphics and Visualization on Cell Terrain Rendering Engine Multi-resolution Subdivision Surfaces Geometry Processing Courtesy of Alias Systems Cloth Simulation
26.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation26 The Cell and GPUsThe Cell and GPUs Extension to shader pipeline on GPUs − Vertex shaders for geometric modeling − NURBS − Subdivision surfaces − Continuous level of detail − Vertex shaders for a additional lighting models Physically Based Modeling − Soft-body dynamics − Rigid-body dynamics Image Processing − Background Subtraction for Video Surveillance Global Illumination – Raytracing − Raycasting Terrains − Rendering of DEMs − Surface shaders for lighting, shadows
27.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation27 The Cell and GPUsThe Cell and GPUs SPU LS MFC SPE 6 PPU L1 L2 PPE SPU LS MFC SPE 4 SPU LS MFC SPE2 SPU LS MFC SPE 0 SPU LS MFC SPE 7 SPU LS MFC SPE 5 SPU LS MFC SPE3 SPU LS MFC SPE 1 Physically-based modeling • Soft/rigid body • Fluid dynamics Vertex Shaders • Lighting models • Subdivision App Cell I/O&MemoryInterfaces GPU Pixel Processing Frame Buffer Vertex Processing Local or network connection Runtime generated textures, maps, vertices, etc. Geometry, textures, maps, etc. AI System Memory Terrain Generation • Incl. depth maps D ata/code flow Control Meshes Tesselated geom. Maps, Textures, etc.
28.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation28 Image Processing on CellImage Processing on Cell Video Surveillance Background subtraction compares the current image (left) with a reference image (middle) to find the changed regions (right) corresponding to objects of interest.
29.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation29 BGS Application Structure AGC & AWB compensation Kalman smoothing motion energy shadow & highlights pixel differences texture differences Image Background Mask Salience quiescent area addition slow update blending stationary healing determine add / remove statistical thresholding SIGNAL contour smoothing component pruning per Jonathan Connell
30.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation30 Alternative 1Alternative 1 –– Single SPE for Video StreamSingle SPE for Video Stream Each SPE dedicated to one or more video streams Code overlaying may be needed to overcome the local memory limitation Seems most compatible with existing application structure No code or data partitioning across SPEs SPE 0 Video Stream 1 SPE 1 Video Stream 2 SPE 7 Video Stream N
31.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation31 Alternative 2Alternative 2 –– Image Pipeline for TaskImage Pipeline for Task--levellevel ParallelismParallelism Functions are divided into groups Groups set up in pipelined fashion Each SPE dedicated to only one group Outputs from one group passed SPE-to-SPE as inputs to the next Code is partitioned – Data is not partitioned Efficient utilization of SPEs SPE 0 Video Streams SPE 1 SPE 7
32.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation32 Alternative 3Alternative 3 –– DataData--level Parallelismlevel Parallelism For a given function, each SPE processes part of a frame Not easily applied to all BGS functions 00 00 00 00 00 00 00 07 18 2D 43 5B 74 8D A2 B3 C5 D7 EC FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FA DE C5 AC 94 7B 62 4A 31 1C 0A 00 00 SPE 0 SPE 1 SPE 2 SPE 3
33.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation33 MultiMulti--resolution Subdivisionresolution Subdivision Detail vector Base Mesh Decompose Into 1-rings PPE SPE1 Kernel() SPE0 Kernel() SPE7 Kernel()….. Data-parallel System Memory N 1-rings … … Rn . R2 R1 R0 . D6 D5 D4 D3 D2 D1 D0 Tj . . T6 T5 T4 T3 T2 T1 T0 Store as Ring stream Triangles to GPU GPU Pixel Processing Frame Buffer Vertex Processing Local or network connection Dm
34.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation34 DemonstrationsDemonstrations Raytracing Terrain Rendering (TRE) Cloth Simulation
35.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation35 Demo Platform: Cell Blade PrototypeDemo Platform: Cell Blade Prototype Chassis Blade Deskside Unit Blade Server – Dual Cell Processors (SMP), Support Logic, Memory, Storage – PCI Express 4X option port – BladeCenter Interface ( Based on IBM JS20) – Infiniband 4x (10Gbps) interconnect Chassis – Standard IBM BladeCenter form factor with: – 7 Blades (1 blade/2 slots) – 2 internal switches (1Gb Ethernet) with 4 external ports each – Separate, external Infiniband Switch with optional FC port Software – Linux OS – Tool chain and compilation support – Additional IBM Software Development Tools and Cluster Management software – Client Systems – IBM T41 Thinkpad (1.7 Ghz Pentium-M) – Apple Mac-dual G5@2Ghz Blade BladeCenter Interface PCI-Exp Cell Processor South Bridge Memory Cell Processor South Bridge Memory IB 4X IB 4X HDD (HDD) GbE GbE Rack
36.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation36 Camera Shader Example: Raytracing ArchitectureExample: Raytracing Architecture OpenRT Generate Tiles Spatial Subdivision K-d tree Traversal Material Shading Intersection Testing Generate Rays View transorms Recursive/Adaptive Sorting Spatial Subdivision Structure K-d Tree Per object & top-level System Memory •framebuffer System Memory •Texture Maps •Run-time shaders Tile pixels Triangle-Ray pairs Rays SecondaryRays PU Code Potential execution on SPE SPU code Object Data SPE iPPE If intersect false true
37.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation37 Cell Optimized Ray Caster: Terrain Rendering EngineCell Optimized Ray Caster: Terrain Rendering Engine 30+ frames per second with only one Cell processor30+ frames per second with only one Cell processor –– No graphics adapter assistNo graphics adapter assist –– 1280x720 (HD 720P) resolution1280x720 (HD 720P) resolution HD 1080P at 30+ frames per second via 2 way SMP CellHD 1080P at 30+ frames per second via 2 way SMP Cell Advanced SPE shader functionAdvanced SPE shader function –– Ray/Terrain intersection computationRay/Terrain intersection computation –– Texture FilteringTexture Filtering –– Normal computationNormal computation –– Bump map computationBump map computation –– Diffuse + Ambient lighting modelDiffuse + Ambient lighting model –– Perlin Noise based cloudsPerlin Noise based clouds –– Atmosphere computation (haze, sun, halo)Atmosphere computation (haze, sun, halo) –– Dynamic multiDynamic multi--sampling (4sampling (4 –– 16 samples per pixel)16 samples per pixel) –– Image based input (16 bit height + 16 bit texture)Image based input (16 bit height + 16 bit texture) –– 29 KB of SPE object code29 KB of SPE object code –– 224 KB of SPE local store data224 KB of SPE local store data MM--JPEG compression via SPEJPEG compression via SPE Performance scales linearly with number of available SPEsPerformance scales linearly with number of available SPEs Written completely in C with intrinsicsWritten completely in C with intrinsics Currently up and running on bring up systems!Currently up and running on bring up systems!
38.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation38 Vertical Ray CoherenceVertical Ray Coherence
39.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation39 Algorithm Mapped to Cell
40.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation40 Physically Based Modeling: Alias research prototypePhysically Based Modeling: Alias research prototype Cell implementation of Alias cloth solver – Compute the displacement of 3-D cloth mesh vertices in response to various gravitational, collision and constaints – Each simulation can occur on a separate SPE Render results on commodity graphics adapter – locally or over network Courtesy of Alias Systems™ SPEN . . . input output SPE3 input output SPE2 input output SPE1 input output SPE0 input output System Memory PPE Main SocketMgr Create SPE Treads DMAIN DMAOUT VertexDisplacementsout Position, orientation input Triangle output Forcevectorsin
41.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation41 Cell beyond the PS3Cell beyond the PS3 IBM support for custom systems and applications based on Cell − Through IBM Engineering and Technology Services Planned release (next several months) of architecture, simulator, and compiler to enable Cell software development and evaluation in a large community − Linux OS, compilers, debugger, full system simulator − Open source and IBM alphaworks Prototype system implementations and evaluation of markets beyond console − Cell Processor Based Blades (acceleration and other apps.) − CE devices (HDTV, home media server) − Standard products for embedded applications − Prototype reference designs for smaller systems
42.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation42 SummarySummary Cell ushers in a new era of leading edge processors optimized for digital media and entertainment New levels of performance and power efficiency beyond what is achieved by PC processors Step towards HPC / game processor convergence
43.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation43 AcknowledgementsAcknowledgements Sony Computer Entertainment Toshiba Corporation IBM STI Design Center: Peter Hofstee Barry Minor Mark Nutter Gerhard Stenzel, IBM Boeblingen Development Labs Philipp Slusallek, Saarland University Alias Systems: Joyce Janczyn Francesco Iorio Jos Stam
44.
IBM T.J. Watson
Research Center Cell Technology for Graphics and Visualization © 2005 IBM Corporation44 Thank youThank you
Download now