SlideShare a Scribd company logo
1 of 20
Download to read offline
© 2013 IBM Corporation
PSS
A Prototype Storage Subsystem based on PCM
IBM Research – Zurich
Ioannis Koltsidas, Roman Pletka, Peter Mueller,
Thomas Weigold, Evangelos Eleftheriou
University of Patras
Maria Varsamou, Athina Ntalla, Elina Bougioukou,
Aspasia Palli, Theodore Antonakopoulos
© 2013 IBM Corporation
Phase Change Memory (PCM)
 Based on the thermal threshold switching effect of chalcogenidic meterials
 Two Phases:
2
Set
Reset
Amorphous Phase Crystalline Phase
 Phases have very different electrical resistances (ratio of 1:100 to 1:1000)
 Transition between phases by controlled heating and cooling
 Read time: 100-300 nsec
 Program time: 10-150 μsec
 PCM cells can be reprogrammed at least 106 times
 Performance and price characteristics between DRAM and Flash
© 2013 IBM Corporation
Placing PCM in Servers and Storage Systems
Server System Storage System
RAID ctrl
HBA
CPUs
DRAM
Cache
I/O
bus
RAID ctrl
CPUs
I/O
bus
HA
PCM
PCM
Hybrid PCIe
attached SSD
Hybrid
SAS/SATA
attached SSD
PCM
Hybrid PCIe
attached SSD
Hybrid
SAS/SATA
attached SSD
PCM
DRAM DRAM
PCM
DRAM
Cache PCMPCM
PCM PCM
3
PCMPCM
HDDSSD
All-Flash
Array
Metadata
Tables
© 2013 IBM Corporation
PSS: A PCM-based PCI-e Prototype Card
4
Goal:
Architect a PCM-based device and implement
a fully-functional, high-performance PCI-e card
 Take advantage of the characteristics of PCM and
mitigate its limitations
 Target is workloads dominated by 4kB requests
 Simple, lightweight hardware design
 System integration of multiple cards through
software
 Emphasis on consistently low, predictable latency
PCM PCI-e cardHost
PCM
chips
PCM
Channels
Lean
PCM
Controller
Multi-lane
PCI-e bus
Block
Device
Driver
PCM
pipe #1
PCM
pipe #n
512B
4kB
 Limited in capacity due to the density of commercially available PCM parts (as of early 2013)
 Use cases:
– Caching device
– Metadata store
– Backend for low-latency Key-Value store
– Tiered storage device in a hybrid configuration with Flash
© 2013 IBM Corporation
PCM Parts: Micron P5Q
 90nm technology node
 128 Mbit devices (NP5Q128AE3ESFC0E)
 SPI bus compatible serial interface
 Maximum clock frequency: 66 MHz
 64-byte write buffer
– 120 μsec average program time
– about 0.5 MB/s write bandwidth
5
Block transfer time: 8.24 usecs (64+4 Bytes)
Sector I/O (512B + 64B):
Write: 1.15 msecs (0.86 kIOPs)
Read: 75.24 usecs (13.29 kIOPs)
Very asymmetric
read / write performance
© 2013 IBM Corporation
2D PCM Channel Architecture
6
PCM
(1,NI)
PCM
(1, NI -1)
PCM
(1,1)
k
PCM
(2, NI)
PCM
(2, NI -1)
PCM
(2,1)
k
PCM
(NS, NI)
PCM
(NS, NI -1)
PCM
(NS,1)
k
FSM
Data Buffer
FSM FSMFSM
Sub-channel
PCM
Channel
Controller
Sub-bank
© 2013 IBM Corporation
Read vs. Write Performance Trade-Offs
 A high degree of pipelining:
– Increases the write performance
• By having the long programming times overlap
– May reduce the read performance
• Read times are anyway very short
• Less parallelism due to fewer I/O pins
 For a given budget of I/O pins:
– More sub-channels  Better read performance
– More sub-banks  Better write performance
 Application needs should drive the configuration
– Channels with different geometry in the same device
possible
 We chose a configuration that minimizes write latency
without severely penalizing reads
7
© 2013 IBM Corporation
3x3 Channel Architecture
8
PCM
(1,3)
PCM
(1, 2)
PCM
(1,1)
k
PCM
(2, 3)
PCM
(2, 2)
PCM
(2,1)
k
PCM
(3, 3)
PCM
(3, 2)
PCM
(3,1)
k
FSM
Data Buffer
FSM FSMFSM
PCM Channel Controller
1 Block = 64 bytes
3 blocks
3 blocks
3 blocks
For each user sector (512b= 8x64), we store 64bytes of metadata, i.e., 9 blocks in total
P
C
M
B
a
n
k
© 2013 IBM Corporation
PSS Channel Card
One PCM channel per side
Two PCM channels per card
Channel card
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
[1,1]
[1,3]
[3,1]
[3,3]
[1,2]
[2,1]
[2,3]
[2,2] [3,2]
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
[1,1]
[1,3]
[3,1]
[3,3]
[1,2]
[2,1]
[2,3]
[2,2] [3,2]
CS[3] CS[2] CS[1]
CLK D1[1:0] D2[1:0] D3[1:0]DIR
C B A Y0
Y1
Y2Y3Y4Y5
3 to 8
DECODER
PCM channel specs
Data transfer Rate: 49.5 MBps
Sector read time: 13.8 usecs
Sector read rate: 61.6 ksectors/sec
Sector write time: 133.8 usecs
Sector write rate: 14.8 ksectors/sec
1 PCM Channel = 2 Banks (2x3x3)
© 2013 IBM Corporation
PSS PCI-e Card
10
PCMChannelmodule
Xilinx Zynq-7045 FPGA Board
2XHost-attachedPSSCards
 Error correction based on simple BCH codes
o 6 BCH codewords per 512 bytes sector with 4 bits error correction capability per codeword.
 Wear leveling using a Start-Gap scheme
 512MB of DRAM, mostly used as a write cache
 Support for cached writes, direct writes with early completion, direct writes with late completion
8 PCM channels with pipeline support per card
© 2013 IBM Corporation
PSS Experimental Results
Throughput versus Offered Load (4kB pages)
© 2013 IBM Corporation
Random Read
PSS Experimental Results
I/O Completion Latency Distribution
PCM technology programming timeRandom Write (cached)
© 2013 IBM Corporation
Latency distribution comparison to Flash-based devices
 Devices
– PSS PCI-e Card
– MLC Flash PCI-e SSD 1
– MLC Flash PCI-e SSD 2
– TLC Flash SATA SSD
 Experiment
Per-I/O latency measurements for 2 hours of uniformly random 4kB writes at QD=1
(after 12 hours of preconditioning with the same workload)
13
© 2013 IBM Corporation
Latency Profile up to 1msec
PSS MLC1
MLC2 TLC
© 2013 IBM Corporation
Latency Profile up to 10msec
PSS MLC1
MLC2 TLC
© 2013 IBM Corporation
Total Latency Profile
PSS MLC1
MLC2 TLC
© 2013 IBM Corporation
PCM Endurance Measurements
17
minimum write time per sector
maximum write time per sector
mean write time per sector
The effect of PCM aging on
write time and BER
Experimental parameters:
• Random data
• 32 sectors per write cycle
• 4 PCM channels
• 8 PCM banks
• Pipeline is active
Experiment
- Perform 10K write cycles
with random data (x32
sectors)
- Write, read and compare a
set of 32 sectors (single
write cycle)
PCM Write Latency Distribution
Experimental parameters:
• Random data
• 32 sectors per write cycle
• 4 PCM channels
• 8 PCM banks/controllers
• Pipeline is active
Experiment
- Perform 10K write cycles with random data
(x32 sectors)
- Write, read and compare a set of 32 sectors
(single write cycle)
© 2013 IBM Corporation
Conclusions
 PCM is a promising new memory technology
 PSS is a PCI-e attached subsystem that mitigates the limitations of current PCM technology
 The 2D Channel Architecture allows the designer to trade-off read performance for write
performance and vice-versa
 PSS achieved good performace
– 65k Read IOPS @ 35 μsec
– 15k Write IOPS @ 61 μsec
 PSS achieved consistently low write latency
– 99.9% of the requests completed within 240 μsec
• 12x and 275x lower than MLC and TLC Flash SSDs, respectively
– Highest observed latency was 2 msec
• 7x and 61x lower than MLC and TLC Flash SSDs, respectively
19
© 2013 IBM Corporation20

More Related Content

What's hot

Track e low voltage sram - adam teman bgu
Track e   low voltage sram - adam teman bguTrack e   low voltage sram - adam teman bgu
Track e low voltage sram - adam teman bguchiportal
 
2012 benjamin klenk-future-memory_technologies-presentation
2012 benjamin klenk-future-memory_technologies-presentation2012 benjamin klenk-future-memory_technologies-presentation
2012 benjamin klenk-future-memory_technologies-presentationSaket Vihari
 
IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...
IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...
IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...IBM Research
 
Multiprocessor Architecture for Image Processing
Multiprocessor Architecture for Image ProcessingMultiprocessor Architecture for Image Processing
Multiprocessor Architecture for Image Processingmayank.grd
 
Hybrid Memory Cube: Developing Scalable and Resilient Memory Systems
Hybrid Memory Cube: Developing Scalable and Resilient Memory SystemsHybrid Memory Cube: Developing Scalable and Resilient Memory Systems
Hybrid Memory Cube: Developing Scalable and Resilient Memory SystemsMicronTechnology
 
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive Advantage
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive AdvantageHybrid Memory Cubes offer Smart Designers and Buyers a Competitive Advantage
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive AdvantageBill Kohnen
 
An introduction to the Design of Warehouse-Scale Computers
An introduction to the Design of Warehouse-Scale ComputersAn introduction to the Design of Warehouse-Scale Computers
An introduction to the Design of Warehouse-Scale ComputersAlessio Villardita
 
Memory map selection of real time sdram controller using verilog full project...
Memory map selection of real time sdram controller using verilog full project...Memory map selection of real time sdram controller using verilog full project...
Memory map selection of real time sdram controller using verilog full project...rahul kumar verma
 
Memory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memory Interfaces & Controllers - Sandeep Kulkarni, LatticeMemory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memory Interfaces & Controllers - Sandeep Kulkarni, LatticeFPGA Central
 
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...In-Memory Computing Summit
 
Multicore processor by Ankit Raj and Akash Prajapati
Multicore processor by Ankit Raj and Akash PrajapatiMulticore processor by Ankit Raj and Akash Prajapati
Multicore processor by Ankit Raj and Akash PrajapatiAnkit Raj
 
Reliability and yield
Reliability and yield Reliability and yield
Reliability and yield rohitladdu
 
customization of a deep learning accelerator, based on NVDLA
customization of a deep learning accelerator, based on NVDLAcustomization of a deep learning accelerator, based on NVDLA
customization of a deep learning accelerator, based on NVDLAShien-Chun Luo
 
Multi-IMA Partition Scheduling for Global I/O Synchronization
Multi-IMA Partition Scheduling for Global I/O SynchronizationMulti-IMA Partition Scheduling for Global I/O Synchronization
Multi-IMA Partition Scheduling for Global I/O Synchronizationrtsljekim
 

What's hot (20)

Track e low voltage sram - adam teman bgu
Track e   low voltage sram - adam teman bguTrack e   low voltage sram - adam teman bgu
Track e low voltage sram - adam teman bgu
 
SRAM
SRAMSRAM
SRAM
 
2012 benjamin klenk-future-memory_technologies-presentation
2012 benjamin klenk-future-memory_technologies-presentation2012 benjamin klenk-future-memory_technologies-presentation
2012 benjamin klenk-future-memory_technologies-presentation
 
Multicore Processors
Multicore ProcessorsMulticore Processors
Multicore Processors
 
IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...
IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...
IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...
 
Multiprocessor Architecture for Image Processing
Multiprocessor Architecture for Image ProcessingMultiprocessor Architecture for Image Processing
Multiprocessor Architecture for Image Processing
 
Hybrid Memory Cube: Developing Scalable and Resilient Memory Systems
Hybrid Memory Cube: Developing Scalable and Resilient Memory SystemsHybrid Memory Cube: Developing Scalable and Resilient Memory Systems
Hybrid Memory Cube: Developing Scalable and Resilient Memory Systems
 
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive Advantage
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive AdvantageHybrid Memory Cubes offer Smart Designers and Buyers a Competitive Advantage
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive Advantage
 
An introduction to the Design of Warehouse-Scale Computers
An introduction to the Design of Warehouse-Scale ComputersAn introduction to the Design of Warehouse-Scale Computers
An introduction to the Design of Warehouse-Scale Computers
 
Memory map selection of real time sdram controller using verilog full project...
Memory map selection of real time sdram controller using verilog full project...Memory map selection of real time sdram controller using verilog full project...
Memory map selection of real time sdram controller using verilog full project...
 
Memory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memory Interfaces & Controllers - Sandeep Kulkarni, LatticeMemory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memory Interfaces & Controllers - Sandeep Kulkarni, Lattice
 
Sram technology
Sram technologySram technology
Sram technology
 
Lecture2
Lecture2Lecture2
Lecture2
 
DDR2 SDRAM
DDR2 SDRAMDDR2 SDRAM
DDR2 SDRAM
 
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
 
Multicore processor by Ankit Raj and Akash Prajapati
Multicore processor by Ankit Raj and Akash PrajapatiMulticore processor by Ankit Raj and Akash Prajapati
Multicore processor by Ankit Raj and Akash Prajapati
 
2020 icldla-updated
2020 icldla-updated2020 icldla-updated
2020 icldla-updated
 
Reliability and yield
Reliability and yield Reliability and yield
Reliability and yield
 
customization of a deep learning accelerator, based on NVDLA
customization of a deep learning accelerator, based on NVDLAcustomization of a deep learning accelerator, based on NVDLA
customization of a deep learning accelerator, based on NVDLA
 
Multi-IMA Partition Scheduling for Global I/O Synchronization
Multi-IMA Partition Scheduling for Global I/O SynchronizationMulti-IMA Partition Scheduling for Global I/O Synchronization
Multi-IMA Partition Scheduling for Global I/O Synchronization
 

Viewers also liked

PHASE CHANGE MEMORY
PHASE CHANGE MEMORYPHASE CHANGE MEMORY
PHASE CHANGE MEMORYSrinivas H V
 
Opening slides to Warsaw Scala FortyFives on Testing tools
Opening slides to Warsaw Scala FortyFives on Testing toolsOpening slides to Warsaw Scala FortyFives on Testing tools
Opening slides to Warsaw Scala FortyFives on Testing toolsJacek Laskowski
 
Multi Model Machine Learning by Maximo Gurmendez and Beth Logan
Multi Model Machine Learning by Maximo Gurmendez and Beth LoganMulti Model Machine Learning by Maximo Gurmendez and Beth Logan
Multi Model Machine Learning by Maximo Gurmendez and Beth LoganSpark Summit
 
Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...
Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...
Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...Spark Summit
 
Beneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek LaskowskiBeneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek LaskowskiSpark Summit
 
A Scaleable Implementation of Deep Learning on Spark -Alexander Ulanov
A Scaleable Implementation of Deep Learning on Spark -Alexander UlanovA Scaleable Implementation of Deep Learning on Spark -Alexander Ulanov
A Scaleable Implementation of Deep Learning on Spark -Alexander UlanovSpark Summit
 
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaDeep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaSpark Summit
 
CaffeOnSpark: Deep Learning On Spark Cluster
CaffeOnSpark: Deep Learning On Spark ClusterCaffeOnSpark: Deep Learning On Spark Cluster
CaffeOnSpark: Deep Learning On Spark ClusterJen Aman
 
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...Helena Edelson
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Helena Edelson
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsAnyscale
 

Viewers also liked (11)

PHASE CHANGE MEMORY
PHASE CHANGE MEMORYPHASE CHANGE MEMORY
PHASE CHANGE MEMORY
 
Opening slides to Warsaw Scala FortyFives on Testing tools
Opening slides to Warsaw Scala FortyFives on Testing toolsOpening slides to Warsaw Scala FortyFives on Testing tools
Opening slides to Warsaw Scala FortyFives on Testing tools
 
Multi Model Machine Learning by Maximo Gurmendez and Beth Logan
Multi Model Machine Learning by Maximo Gurmendez and Beth LoganMulti Model Machine Learning by Maximo Gurmendez and Beth Logan
Multi Model Machine Learning by Maximo Gurmendez and Beth Logan
 
Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...
Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...
Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...
 
Beneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek LaskowskiBeneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek Laskowski
 
A Scaleable Implementation of Deep Learning on Spark -Alexander Ulanov
A Scaleable Implementation of Deep Learning on Spark -Alexander UlanovA Scaleable Implementation of Deep Learning on Spark -Alexander Ulanov
A Scaleable Implementation of Deep Learning on Spark -Alexander Ulanov
 
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaDeep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
 
CaffeOnSpark: Deep Learning On Spark Cluster
CaffeOnSpark: Deep Learning On Spark ClusterCaffeOnSpark: Deep Learning On Spark Cluster
CaffeOnSpark: Deep Learning On Spark Cluster
 
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
 

Similar to A Prototype Storage Subsystem based on Phase Change Memory

Microelectronics U4.pptx.ppt
Microelectronics U4.pptx.pptMicroelectronics U4.pptx.ppt
Microelectronics U4.pptx.pptPavikaSharma3
 
Decoupling Compute from Memory, Storage and IO with OMI
Decoupling Compute from Memory, Storage and IO with OMIDecoupling Compute from Memory, Storage and IO with OMI
Decoupling Compute from Memory, Storage and IO with OMIAllan Cantle
 
Ics21 workshop decoupling compute from memory, storage & io with omi - ...
Ics21 workshop   decoupling compute from memory, storage & io with omi - ...Ics21 workshop   decoupling compute from memory, storage & io with omi - ...
Ics21 workshop decoupling compute from memory, storage & io with omi - ...Vaibhav R
 
Kiến trúc máy tính - COE 301 - Memory.ppt
Kiến trúc máy tính - COE 301 - Memory.pptKiến trúc máy tính - COE 301 - Memory.ppt
Kiến trúc máy tính - COE 301 - Memory.pptTriTrang4
 
Automotive network and gateway simulation
Automotive network and gateway simulationAutomotive network and gateway simulation
Automotive network and gateway simulationDeepak Shankar
 
Performance analysis of 3D Finite Difference computational stencils on Seamic...
Performance analysis of 3D Finite Difference computational stencils on Seamic...Performance analysis of 3D Finite Difference computational stencils on Seamic...
Performance analysis of 3D Finite Difference computational stencils on Seamic...Joshua Mora
 
Heterogeneous Integration with 3D Packaging
Heterogeneous Integration with 3D PackagingHeterogeneous Integration with 3D Packaging
Heterogeneous Integration with 3D PackagingAMD
 
CXL Memory Expansion, Pooling, Sharing, FAM Enablement, and Switching
CXL Memory Expansion, Pooling, Sharing, FAM Enablement, and SwitchingCXL Memory Expansion, Pooling, Sharing, FAM Enablement, and Switching
CXL Memory Expansion, Pooling, Sharing, FAM Enablement, and SwitchingMemory Fabric Forum
 
A Simplied Bit-Line Technique for Memory Optimization
A Simplied Bit-Line Technique for Memory OptimizationA Simplied Bit-Line Technique for Memory Optimization
A Simplied Bit-Line Technique for Memory Optimizationijsrd.com
 
Q1 Memory Fabric Forum: Breaking Through the Memory Wall
Q1 Memory Fabric Forum: Breaking Through the Memory WallQ1 Memory Fabric Forum: Breaking Through the Memory Wall
Q1 Memory Fabric Forum: Breaking Through the Memory WallMemory Fabric Forum
 
Microchip: CXL Use Cases and Enabling Ecosystem
Microchip: CXL Use Cases and Enabling EcosystemMicrochip: CXL Use Cases and Enabling Ecosystem
Microchip: CXL Use Cases and Enabling EcosystemMemory Fabric Forum
 
Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...
Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...
Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...Codemotion
 
PIC32MX Microcontroller Family
PIC32MX Microcontroller FamilyPIC32MX Microcontroller Family
PIC32MX Microcontroller FamilyPremier Farnell
 
VJITSk 6713 user manual
VJITSk 6713 user manualVJITSk 6713 user manual
VJITSk 6713 user manualkot seelam
 
CPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCoburn Watson
 
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptxQ1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptxMemory Fabric Forum
 
Semiconductor overview
Semiconductor overviewSemiconductor overview
Semiconductor overviewNabil Chouba
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5Steen Larsen
 

Similar to A Prototype Storage Subsystem based on Phase Change Memory (20)

7 eti pres
7 eti pres7 eti pres
7 eti pres
 
Microelectronics U4.pptx.ppt
Microelectronics U4.pptx.pptMicroelectronics U4.pptx.ppt
Microelectronics U4.pptx.ppt
 
Decoupling Compute from Memory, Storage and IO with OMI
Decoupling Compute from Memory, Storage and IO with OMIDecoupling Compute from Memory, Storage and IO with OMI
Decoupling Compute from Memory, Storage and IO with OMI
 
Ics21 workshop decoupling compute from memory, storage & io with omi - ...
Ics21 workshop   decoupling compute from memory, storage & io with omi - ...Ics21 workshop   decoupling compute from memory, storage & io with omi - ...
Ics21 workshop decoupling compute from memory, storage & io with omi - ...
 
Kiến trúc máy tính - COE 301 - Memory.ppt
Kiến trúc máy tính - COE 301 - Memory.pptKiến trúc máy tính - COE 301 - Memory.ppt
Kiến trúc máy tính - COE 301 - Memory.ppt
 
Automotive network and gateway simulation
Automotive network and gateway simulationAutomotive network and gateway simulation
Automotive network and gateway simulation
 
Performance analysis of 3D Finite Difference computational stencils on Seamic...
Performance analysis of 3D Finite Difference computational stencils on Seamic...Performance analysis of 3D Finite Difference computational stencils on Seamic...
Performance analysis of 3D Finite Difference computational stencils on Seamic...
 
Heterogeneous Integration with 3D Packaging
Heterogeneous Integration with 3D PackagingHeterogeneous Integration with 3D Packaging
Heterogeneous Integration with 3D Packaging
 
CXL Memory Expansion, Pooling, Sharing, FAM Enablement, and Switching
CXL Memory Expansion, Pooling, Sharing, FAM Enablement, and SwitchingCXL Memory Expansion, Pooling, Sharing, FAM Enablement, and Switching
CXL Memory Expansion, Pooling, Sharing, FAM Enablement, and Switching
 
A Simplied Bit-Line Technique for Memory Optimization
A Simplied Bit-Line Technique for Memory OptimizationA Simplied Bit-Line Technique for Memory Optimization
A Simplied Bit-Line Technique for Memory Optimization
 
Q1 Memory Fabric Forum: Breaking Through the Memory Wall
Q1 Memory Fabric Forum: Breaking Through the Memory WallQ1 Memory Fabric Forum: Breaking Through the Memory Wall
Q1 Memory Fabric Forum: Breaking Through the Memory Wall
 
Microchip: CXL Use Cases and Enabling Ecosystem
Microchip: CXL Use Cases and Enabling EcosystemMicrochip: CXL Use Cases and Enabling Ecosystem
Microchip: CXL Use Cases and Enabling Ecosystem
 
Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...
Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...
Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...
 
PIC32MX Microcontroller Family
PIC32MX Microcontroller FamilyPIC32MX Microcontroller Family
PIC32MX Microcontroller Family
 
VJITSk 6713 user manual
VJITSk 6713 user manualVJITSk 6713 user manual
VJITSk 6713 user manual
 
CPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performance
 
Zynq ultrascale
Zynq ultrascaleZynq ultrascale
Zynq ultrascale
 
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptxQ1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
 
Semiconductor overview
Semiconductor overviewSemiconductor overview
Semiconductor overview
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5
 

More from IBM Research

IBM Research - Zurich Celebrates 60 Years of Science and Innovation
IBM Research - Zurich Celebrates 60 Years of Science and InnovationIBM Research - Zurich Celebrates 60 Years of Science and Innovation
IBM Research - Zurich Celebrates 60 Years of Science and InnovationIBM Research
 
IBM/ASTRON DOME 64-bit Hot Water Cooled Microserver
IBM/ASTRON DOME  64-bit Hot Water Cooled MicroserverIBM/ASTRON DOME  64-bit Hot Water Cooled Microserver
IBM/ASTRON DOME 64-bit Hot Water Cooled MicroserverIBM Research
 
IBM and ASTRON 64bit μServer for DOME
IBM and ASTRON 64bit μServer for DOMEIBM and ASTRON 64bit μServer for DOME
IBM and ASTRON 64bit μServer for DOMEIBM Research
 
The Dilemmas of Innovation Management
The Dilemmas of Innovation ManagementThe Dilemmas of Innovation Management
The Dilemmas of Innovation ManagementIBM Research
 
Big Data and the Future of Storage
Big Data and the Future of StorageBig Data and the Future of Storage
Big Data and the Future of StorageIBM Research
 
The New Era of Cognitive Computing
The New Era of Cognitive ComputingThe New Era of Cognitive Computing
The New Era of Cognitive ComputingIBM Research
 
Das IBM Forschungslabor als Arbeitgeber
Das IBM Forschungslabor als ArbeitgeberDas IBM Forschungslabor als Arbeitgeber
Das IBM Forschungslabor als ArbeitgeberIBM Research
 
Nano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - Zurich
Nano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - ZurichNano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - Zurich
Nano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - ZurichIBM Research
 
Dechema Conference: Istanbul
Dechema Conference: IstanbulDechema Conference: Istanbul
Dechema Conference: IstanbulIBM Research
 
IBM Research: IBM 2010 Investor Briefing
IBM Research: IBM 2010 Investor BriefingIBM Research: IBM 2010 Investor Briefing
IBM Research: IBM 2010 Investor BriefingIBM Research
 

More from IBM Research (11)

IBM Research - Zurich Celebrates 60 Years of Science and Innovation
IBM Research - Zurich Celebrates 60 Years of Science and InnovationIBM Research - Zurich Celebrates 60 Years of Science and Innovation
IBM Research - Zurich Celebrates 60 Years of Science and Innovation
 
IBM/ASTRON DOME 64-bit Hot Water Cooled Microserver
IBM/ASTRON DOME  64-bit Hot Water Cooled MicroserverIBM/ASTRON DOME  64-bit Hot Water Cooled Microserver
IBM/ASTRON DOME 64-bit Hot Water Cooled Microserver
 
IBM and ASTRON 64bit μServer for DOME
IBM and ASTRON 64bit μServer for DOMEIBM and ASTRON 64bit μServer for DOME
IBM and ASTRON 64bit μServer for DOME
 
The Dilemmas of Innovation Management
The Dilemmas of Innovation ManagementThe Dilemmas of Innovation Management
The Dilemmas of Innovation Management
 
Big Data and the Future of Storage
Big Data and the Future of StorageBig Data and the Future of Storage
Big Data and the Future of Storage
 
The New Era of Cognitive Computing
The New Era of Cognitive ComputingThe New Era of Cognitive Computing
The New Era of Cognitive Computing
 
Das IBM Forschungslabor als Arbeitgeber
Das IBM Forschungslabor als ArbeitgeberDas IBM Forschungslabor als Arbeitgeber
Das IBM Forschungslabor als Arbeitgeber
 
Nano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - Zurich
Nano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - ZurichNano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - Zurich
Nano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - Zurich
 
Meet IBM Research
Meet IBM ResearchMeet IBM Research
Meet IBM Research
 
Dechema Conference: Istanbul
Dechema Conference: IstanbulDechema Conference: Istanbul
Dechema Conference: Istanbul
 
IBM Research: IBM 2010 Investor Briefing
IBM Research: IBM 2010 Investor BriefingIBM Research: IBM 2010 Investor Briefing
IBM Research: IBM 2010 Investor Briefing
 

Recently uploaded

Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxAna-Maria Mihalceanu
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...Karmanjay Verma
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 

Recently uploaded (20)

Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 

A Prototype Storage Subsystem based on Phase Change Memory

  • 1. © 2013 IBM Corporation PSS A Prototype Storage Subsystem based on PCM IBM Research – Zurich Ioannis Koltsidas, Roman Pletka, Peter Mueller, Thomas Weigold, Evangelos Eleftheriou University of Patras Maria Varsamou, Athina Ntalla, Elina Bougioukou, Aspasia Palli, Theodore Antonakopoulos
  • 2. © 2013 IBM Corporation Phase Change Memory (PCM)  Based on the thermal threshold switching effect of chalcogenidic meterials  Two Phases: 2 Set Reset Amorphous Phase Crystalline Phase  Phases have very different electrical resistances (ratio of 1:100 to 1:1000)  Transition between phases by controlled heating and cooling  Read time: 100-300 nsec  Program time: 10-150 μsec  PCM cells can be reprogrammed at least 106 times  Performance and price characteristics between DRAM and Flash
  • 3. © 2013 IBM Corporation Placing PCM in Servers and Storage Systems Server System Storage System RAID ctrl HBA CPUs DRAM Cache I/O bus RAID ctrl CPUs I/O bus HA PCM PCM Hybrid PCIe attached SSD Hybrid SAS/SATA attached SSD PCM Hybrid PCIe attached SSD Hybrid SAS/SATA attached SSD PCM DRAM DRAM PCM DRAM Cache PCMPCM PCM PCM 3 PCMPCM HDDSSD All-Flash Array Metadata Tables
  • 4. © 2013 IBM Corporation PSS: A PCM-based PCI-e Prototype Card 4 Goal: Architect a PCM-based device and implement a fully-functional, high-performance PCI-e card  Take advantage of the characteristics of PCM and mitigate its limitations  Target is workloads dominated by 4kB requests  Simple, lightweight hardware design  System integration of multiple cards through software  Emphasis on consistently low, predictable latency PCM PCI-e cardHost PCM chips PCM Channels Lean PCM Controller Multi-lane PCI-e bus Block Device Driver PCM pipe #1 PCM pipe #n 512B 4kB  Limited in capacity due to the density of commercially available PCM parts (as of early 2013)  Use cases: – Caching device – Metadata store – Backend for low-latency Key-Value store – Tiered storage device in a hybrid configuration with Flash
  • 5. © 2013 IBM Corporation PCM Parts: Micron P5Q  90nm technology node  128 Mbit devices (NP5Q128AE3ESFC0E)  SPI bus compatible serial interface  Maximum clock frequency: 66 MHz  64-byte write buffer – 120 μsec average program time – about 0.5 MB/s write bandwidth 5 Block transfer time: 8.24 usecs (64+4 Bytes) Sector I/O (512B + 64B): Write: 1.15 msecs (0.86 kIOPs) Read: 75.24 usecs (13.29 kIOPs) Very asymmetric read / write performance
  • 6. © 2013 IBM Corporation 2D PCM Channel Architecture 6 PCM (1,NI) PCM (1, NI -1) PCM (1,1) k PCM (2, NI) PCM (2, NI -1) PCM (2,1) k PCM (NS, NI) PCM (NS, NI -1) PCM (NS,1) k FSM Data Buffer FSM FSMFSM Sub-channel PCM Channel Controller Sub-bank
  • 7. © 2013 IBM Corporation Read vs. Write Performance Trade-Offs  A high degree of pipelining: – Increases the write performance • By having the long programming times overlap – May reduce the read performance • Read times are anyway very short • Less parallelism due to fewer I/O pins  For a given budget of I/O pins: – More sub-channels  Better read performance – More sub-banks  Better write performance  Application needs should drive the configuration – Channels with different geometry in the same device possible  We chose a configuration that minimizes write latency without severely penalizing reads 7
  • 8. © 2013 IBM Corporation 3x3 Channel Architecture 8 PCM (1,3) PCM (1, 2) PCM (1,1) k PCM (2, 3) PCM (2, 2) PCM (2,1) k PCM (3, 3) PCM (3, 2) PCM (3,1) k FSM Data Buffer FSM FSMFSM PCM Channel Controller 1 Block = 64 bytes 3 blocks 3 blocks 3 blocks For each user sector (512b= 8x64), we store 64bytes of metadata, i.e., 9 blocks in total P C M B a n k
  • 9. © 2013 IBM Corporation PSS Channel Card One PCM channel per side Two PCM channels per card Channel card CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout [1,1] [1,3] [3,1] [3,3] [1,2] [2,1] [2,3] [2,2] [3,2] CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout [1,1] [1,3] [3,1] [3,3] [1,2] [2,1] [2,3] [2,2] [3,2] CS[3] CS[2] CS[1] CLK D1[1:0] D2[1:0] D3[1:0]DIR C B A Y0 Y1 Y2Y3Y4Y5 3 to 8 DECODER PCM channel specs Data transfer Rate: 49.5 MBps Sector read time: 13.8 usecs Sector read rate: 61.6 ksectors/sec Sector write time: 133.8 usecs Sector write rate: 14.8 ksectors/sec 1 PCM Channel = 2 Banks (2x3x3)
  • 10. © 2013 IBM Corporation PSS PCI-e Card 10 PCMChannelmodule Xilinx Zynq-7045 FPGA Board 2XHost-attachedPSSCards  Error correction based on simple BCH codes o 6 BCH codewords per 512 bytes sector with 4 bits error correction capability per codeword.  Wear leveling using a Start-Gap scheme  512MB of DRAM, mostly used as a write cache  Support for cached writes, direct writes with early completion, direct writes with late completion 8 PCM channels with pipeline support per card
  • 11. © 2013 IBM Corporation PSS Experimental Results Throughput versus Offered Load (4kB pages)
  • 12. © 2013 IBM Corporation Random Read PSS Experimental Results I/O Completion Latency Distribution PCM technology programming timeRandom Write (cached)
  • 13. © 2013 IBM Corporation Latency distribution comparison to Flash-based devices  Devices – PSS PCI-e Card – MLC Flash PCI-e SSD 1 – MLC Flash PCI-e SSD 2 – TLC Flash SATA SSD  Experiment Per-I/O latency measurements for 2 hours of uniformly random 4kB writes at QD=1 (after 12 hours of preconditioning with the same workload) 13
  • 14. © 2013 IBM Corporation Latency Profile up to 1msec PSS MLC1 MLC2 TLC
  • 15. © 2013 IBM Corporation Latency Profile up to 10msec PSS MLC1 MLC2 TLC
  • 16. © 2013 IBM Corporation Total Latency Profile PSS MLC1 MLC2 TLC
  • 17. © 2013 IBM Corporation PCM Endurance Measurements 17 minimum write time per sector maximum write time per sector mean write time per sector The effect of PCM aging on write time and BER Experimental parameters: • Random data • 32 sectors per write cycle • 4 PCM channels • 8 PCM banks • Pipeline is active Experiment - Perform 10K write cycles with random data (x32 sectors) - Write, read and compare a set of 32 sectors (single write cycle)
  • 18. PCM Write Latency Distribution Experimental parameters: • Random data • 32 sectors per write cycle • 4 PCM channels • 8 PCM banks/controllers • Pipeline is active Experiment - Perform 10K write cycles with random data (x32 sectors) - Write, read and compare a set of 32 sectors (single write cycle)
  • 19. © 2013 IBM Corporation Conclusions  PCM is a promising new memory technology  PSS is a PCI-e attached subsystem that mitigates the limitations of current PCM technology  The 2D Channel Architecture allows the designer to trade-off read performance for write performance and vice-versa  PSS achieved good performace – 65k Read IOPS @ 35 μsec – 15k Write IOPS @ 61 μsec  PSS achieved consistently low write latency – 99.9% of the requests completed within 240 μsec • 12x and 275x lower than MLC and TLC Flash SSDs, respectively – Highest observed latency was 2 msec • 7x and 61x lower than MLC and TLC Flash SSDs, respectively 19
  • 20. © 2013 IBM Corporation20

Editor's Notes

  1. PCM is emerging as one of the most promising NVM technologies and is becoming more and more matureThe active material in PCM is a chalcogenidic alloy which is placed between two electrically conducting electrodes in each cell of the memory array.A low current is then applied that causes a phase transition.
  2. We focus on a PCM-based system that uses PCM in a persistent storage device attached on the I/O bus of the system
  3. One of the main concerns regarding the use of PCM for high-performance applications is its relatively high write latency. The purpose of this work has been to architect a PCM-based device and implement a fully functional, high-performance PCI-e card that takes advantage of the characteristics of PCM and mitigates its limitations. \The design of PSS targets storage workloads that are dominated by small (e.g., 4 kB) random I/O operations and aims to achieve low, predictable latency with reads and writes
  4. Was commercially available when the project started (end of 2012)