Performance Improvement Techniques for Software
Distributed Shared Memory
Speaker: ZongYing Lyu (呂宗螢)
Adviser: Prof. 梁文耀
Date: 2007/3/9
Embedded and Parallel Systems Lab
Paper
• Byung-Hyun Yu, P. Werstein, M. Purvis, and S. Cranefield,
  "Performance Improvement Techniques for Software Distributed
  Shared Memory," 11th International Conference on Parallel and
  Distributed Systems, 2005. Proceedings, Volume 1, 20-22 July 2005,
  pp. 119-125.
Reference
• L. Iftode, J.P. Singh, and K. Li, "Scope Consistency: A Bridge
  between Release Consistency and Entry Consistency," in Proc.
  of the 8th Annual ACM Symposium on Parallel Algorithms and
  Architectures, 1996.
Outline
• Introduction
• Implementation of ScC model
• Diff Integration Technique
• Dynamic Home Migration
• Performance Evaluation Environment
• Performance Evaluation
Introduction
• Implementing parallel algorithms with shared variables is more
  convenient than message passing, in which the programmer
  explicitly sends and receives data between nodes.
• Distributed shared memory (DSM) has not attracted much of the
  parallel computing community because of its slow performance.
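To make the contrast concrete, here is a minimal sketch of the same parallel sum written in both styles. It is an illustration only: threads in one Python process stand in for DSM nodes, and the function names `sum_shared` and `sum_message_passing` are hypothetical, not taken from the paper.

```python
import queue
import threading


def sum_shared(chunks):
    """Shared-variable style: every worker adds into one shared total."""
    total = [0]                      # the shared variable
    lock = threading.Lock()

    def worker(chunk):
        partial = sum(chunk)
        with lock:                   # synchronize, but never "send" anything
            total[0] += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def sum_message_passing(chunks):
    """Message-passing style: every worker explicitly sends its partial sum."""
    inbox = queue.Queue()

    def worker(chunk):
        inbox.put(sum(chunk))        # explicit send

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    total = sum(inbox.get() for _ in chunks)   # explicit receive, one per worker
    for t in threads:
        t.join()
    return total
```

In the shared-variable version the programmer only marks where synchronization happens. In the message-passing version the programmer also decides what data moves and when, which is the extra burden the slide refers to.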
• Lazy home-based (LHB) protocol
• Scope consistency (ScC)
• A diff integration technique that can solve most diff
  accumulation problems
• A dynamic home migration protocol that solves the static home
  assignment problem of the original home-based protocol
• The techniques are evaluated with well-known DSM benchmark
  applications.
Implementation of ScC model
• The LHB protocol does not send diffs to home nodes between two
  consecutive barriers.
• It uses the update protocol during lock synchronization and the
  invalidation protocol for the global scope during barrier
  synchronization.
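The two bullets above can be sketched as a toy simulation. This is one plausible reading, not the paper's implementation: each page holds a single value instead of a byte range, there is one writer per page, and the names `Node`, `pass_lock`, and `barrier` are hypothetical.

```python
class Node:
    """A DSM node holding a local copy of every page (toy model)."""

    def __init__(self, name, num_pages):
        self.name = name
        self.pages = {p: 0 for p in range(num_pages)}     # page -> value
        self.valid = {p: True for p in range(num_pages)}  # is the copy usable?
        self.dirty = set()          # pages written since the last barrier

    def write(self, page, value):
        self.pages[page] = value
        self.dirty.add(page)


def pass_lock(releaser, acquirer, cs_pages):
    """Update protocol: the lock hand-over carries the new data itself."""
    for p in cs_pages:
        acquirer.pages[p] = releaser.pages[p]
        acquirer.valid[p] = True
    # Note: nothing is sent to the home node here (the "lazy" part).


def barrier(nodes, home):
    """Invalidation protocol: homes are updated, stale copies are invalidated."""
    notices = {n.name: set(n.dirty) for n in nodes}
    for n in nodes:                       # diffs reach the home only now
        for p in n.dirty:
            home[p] = n.pages[p]
    for n in nodes:
        for other, dirty in notices.items():
            if other != n.name:
                for p in dirty:
                    n.valid[p] = False    # refetch from home on next access
        n.dirty.clear()
```

For example, if node `a` writes page 0 and passes the lock to `b`, then `b` sees the new value immediately while the home copy stays unchanged until the next call to `barrier`.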
Diff Integration Technique
• Twinning occurs before diff application rather than after a
  write page fault.
• In this way, all previous diffs on the same page made in the
  same critical section are preserved and merged into a single
  integrated diff.
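The effect of moving the twinning point can be sketched as follows. This is a simplified illustration under stated assumptions: a page is a short list of integers, a diff is a dictionary from offset to new value, and `critical_section` and `run_chain` are hypothetical helper names rather than functions from the paper.

```python
def make_diff(twin, page):
    """Record every offset whose value differs from the twin."""
    return {i: new for i, (old, new) in enumerate(zip(twin, page)) if old != new}


def apply_diff(page, diff):
    for i, value in diff.items():
        page[i] = value


def critical_section(page, incoming_diffs, writes, integrate):
    """Run one critical section; return the diffs handed to the next lock owner."""
    if integrate:
        twin = list(page)              # twin BEFORE applying received diffs
        for d in incoming_diffs:
            apply_diff(page, d)
    else:
        for d in incoming_diffs:
            apply_diff(page, d)
        twin = list(page)              # twin at the write fault, after application
    for i, value in writes.items():
        page[i] = value
    own = make_diff(twin, page)
    # With integration, `own` already contains the earlier changes.
    return [own] if integrate else incoming_diffs + [own]


def run_chain(integrate):
    """Three nodes take the same lock in turn, each writing one offset."""
    diffs, page = [], None
    for writes in ({0: 1}, {1: 2}, {2: 3}):
        page = [0, 0, 0, 0]            # each node starts from an old copy
        diffs = critical_section(page, diffs, writes, integrate)
    return diffs, page
```

Without integration the third lock owner passes on three separate diffs, and the list keeps growing with every hand-over. With integration it passes on one diff covering all three writes, while the final page contents are the same in both cases.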
Dynamic Home Migration
• The home-based protocol has a weakness when a home node is
  assigned pages that it does not access, or that it accesses less
  frequently than other nodes do.
• Previously proposed home migration techniques provide a solution
  only for single-writer DSM applications.
• Migrating homes at the time of lock synchronization (acquire
  and release)
• This paper proposes a home migration technique that can choose
  optimal home nodes for multiple-writer applications as well as
  single-writer applications.
• It uses the barrier process, in which the best home nodes are
  piggybacked with other coherence-related data, thus minimizing
  the home-finding and data communication overheads.
1. All nodes record their dirty pages between two
consecutive barriers.
2. Upon arrival at a barrier, all nodes create final
NCS diffs.
3. All nodes except the barrier manager node
send their invalidation notices including each
dirty page diff size to the manager node.
4. Barrier manager receives a barrier arrival
notice including a dirty page list and the size of
each dirty page diff from every node.
5. Whenever the manager receives a notice, it accumulates the
dirty pages into a global dirty page list and, for each dirty
page, selects as home the node with the maximum diff size.
6. On receiving the new home node list, all nodes update the home
nodes by sending their diffs to the corresponding homes.
• Note that only the last lock owner updates the home nodes with
the integrated diffs it made during lock synchronization, and
only if the last lock owner is not the home of the CS diff.
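The manager's decision in steps 4 to 6 can be sketched as below. This is a minimal sketch, not the paper's code: barrier notices are modeled as a dictionary from node name to `{page: diff_size}`, ties go to the first node seen (an assumption, since the slides do not specify tie-breaking), and `choose_homes` and `diffs_to_send` are hypothetical names.

```python
def choose_homes(notices, current_homes):
    """Pick, for each dirty page, the node that reported the largest diff.

    notices:       node -> {page: diff_size}, as collected at the barrier
    current_homes: page -> node, the home assignment before the barrier
    """
    best = {}                                   # page -> (diff_size, node)
    for node, dirty in notices.items():
        for page, size in dirty.items():
            if page not in best or size > best[page][0]:
                best[page] = (size, node)       # assumption: first node wins ties
    new_homes = dict(current_homes)             # clean pages keep their home
    for page, (_, node) in best.items():
        new_homes[page] = node
    return new_homes


def diffs_to_send(node, notices, new_homes):
    """Step 6: the diffs `node` still has to send, as page -> destination home."""
    return {page: new_homes[page]
            for page in notices[node]
            if new_homes[page] != node}
```

Because the new home is the node holding the largest diff for a page, that diff never has to cross the network; only the smaller diffs from the other writers are sent.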
Performance Evaluation Environment
• TM: TreadMarks, a homeless lazy release consistency (LRC) system
• CHBLRC: conventional home-based LRC (eager, no diff
  integration, static homes)
• LHB (or LHB ScC): lazy home-based scope consistency
• The network has 32 nodes
• 100 Mbit switched Ethernet
• 350 MHz Pentium II CPU
• 192 MB of memory
• Gentoo Linux with gcc 3.3.2
• PNN: parallel neural network application (lock & barrier)
• Barnes-Hut: Barnes-Hut N-body algorithm (barrier)
• IS: integer sort (barrier)
• Water: water molecular dynamics simulation (lock & barrier)
• SOR: successive over-relaxation (barrier)
Performance Evaluation
• Effect of diff integration on PNN and Water
Thank you!

More Related Content

What's hot

The Silence of the Canaries
The Silence of the CanariesThe Silence of the Canaries
The Silence of the CanariesKernel TLV
 
Process synchronization in Operating Systems
Process synchronization in Operating SystemsProcess synchronization in Operating Systems
Process synchronization in Operating SystemsRitu Ranjan Shrivastwa
 
Comparing Write-Ahead Logging and the Memory Bus Using
Comparing Write-Ahead Logging and the Memory Bus UsingComparing Write-Ahead Logging and the Memory Bus Using
Comparing Write-Ahead Logging and the Memory Bus Usingjorgerodriguessimao
 
RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsBrendan Gregg
 
Synopsis on "ANALYZING THE EFFECTIVENESS OF THE ADVANCED ENCRYPTION STANDARD ...
Synopsis on "ANALYZING THE EFFECTIVENESS OF THE ADVANCED ENCRYPTION STANDARD ...Synopsis on "ANALYZING THE EFFECTIVENESS OF THE ADVANCED ENCRYPTION STANDARD ...
Synopsis on "ANALYZING THE EFFECTIVENESS OF THE ADVANCED ENCRYPTION STANDARD ...Nikhil Jain
 
Nondeterminism is unavoidable, but data races are pure evil
Nondeterminism is unavoidable, but data races are pure evilNondeterminism is unavoidable, but data races are pure evil
Nondeterminism is unavoidable, but data races are pure evilracesworkshop
 
Real time operating systems (rtos) concepts 5
Real time operating systems (rtos) concepts 5Real time operating systems (rtos) concepts 5
Real time operating systems (rtos) concepts 5Abu Bakr Ramadan
 
Introduction to Raft algorithm
Introduction to Raft algorithmIntroduction to Raft algorithm
Introduction to Raft algorithmmuayyad alsadi
 
Linux BPF Superpowers
Linux BPF SuperpowersLinux BPF Superpowers
Linux BPF SuperpowersBrendan Gregg
 
Distributed System by Pratik Tambekar
Distributed System by Pratik TambekarDistributed System by Pratik Tambekar
Distributed System by Pratik TambekarPratik Tambekar
 
Analytical Modeling of End-to-End Delay in OpenFlow Based Networks
Analytical Modeling of End-to-End Delay in OpenFlow Based NetworksAnalytical Modeling of End-to-End Delay in OpenFlow Based Networks
Analytical Modeling of End-to-End Delay in OpenFlow Based NetworksAzeem Iqbal
 
Operating system Q/A
Operating system Q/AOperating system Q/A
Operating system Q/AAbdul Munam
 
Tasklet vs work queues (Deferrable functions in linux)
Tasklet vs work queues (Deferrable functions in linux)Tasklet vs work queues (Deferrable functions in linux)
Tasklet vs work queues (Deferrable functions in linux)RajKumar Rampelli
 
IPC mechanisms in windows
IPC mechanisms in windowsIPC mechanisms in windows
IPC mechanisms in windowsVinoth Raj
 
IRQs: the Hard, the Soft, the Threaded and the Preemptible
IRQs: the Hard, the Soft, the Threaded and the PreemptibleIRQs: the Hard, the Soft, the Threaded and the Preemptible
IRQs: the Hard, the Soft, the Threaded and the PreemptibleAlison Chaiken
 
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...eXascale Infolab
 
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s going
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s goingKernel Recipes 2016 - Kernel documentation: what we have and where it’s going
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s goingAnne Nicolas
 

What's hot (20)

The Silence of the Canaries
The Silence of the CanariesThe Silence of the Canaries
The Silence of the Canaries
 
Process synchronization in Operating Systems
Process synchronization in Operating SystemsProcess synchronization in Operating Systems
Process synchronization in Operating Systems
 
Comparing Write-Ahead Logging and the Memory Bus Using
Comparing Write-Ahead Logging and the Memory Bus UsingComparing Write-Ahead Logging and the Memory Bus Using
Comparing Write-Ahead Logging and the Memory Bus Using
 
RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance Results
 
Synopsis on "ANALYZING THE EFFECTIVENESS OF THE ADVANCED ENCRYPTION STANDARD ...
Synopsis on "ANALYZING THE EFFECTIVENESS OF THE ADVANCED ENCRYPTION STANDARD ...Synopsis on "ANALYZING THE EFFECTIVENESS OF THE ADVANCED ENCRYPTION STANDARD ...
Synopsis on "ANALYZING THE EFFECTIVENESS OF THE ADVANCED ENCRYPTION STANDARD ...
 
Nondeterminism is unavoidable, but data races are pure evil
Nondeterminism is unavoidable, but data races are pure evilNondeterminism is unavoidable, but data races are pure evil
Nondeterminism is unavoidable, but data races are pure evil
 
Real time operating systems (rtos) concepts 5
Real time operating systems (rtos) concepts 5Real time operating systems (rtos) concepts 5
Real time operating systems (rtos) concepts 5
 
RTX Kernal
RTX KernalRTX Kernal
RTX Kernal
 
Introduction to Raft algorithm
Introduction to Raft algorithmIntroduction to Raft algorithm
Introduction to Raft algorithm
 
Linux BPF Superpowers
Linux BPF SuperpowersLinux BPF Superpowers
Linux BPF Superpowers
 
Distributed System by Pratik Tambekar
Distributed System by Pratik TambekarDistributed System by Pratik Tambekar
Distributed System by Pratik Tambekar
 
Mastering Real-time Linux
Mastering Real-time LinuxMastering Real-time Linux
Mastering Real-time Linux
 
Analytical Modeling of End-to-End Delay in OpenFlow Based Networks
Analytical Modeling of End-to-End Delay in OpenFlow Based NetworksAnalytical Modeling of End-to-End Delay in OpenFlow Based Networks
Analytical Modeling of End-to-End Delay in OpenFlow Based Networks
 
Lock free programming- pro tips
Lock free programming- pro tipsLock free programming- pro tips
Lock free programming- pro tips
 
Operating system Q/A
Operating system Q/AOperating system Q/A
Operating system Q/A
 
Tasklet vs work queues (Deferrable functions in linux)
Tasklet vs work queues (Deferrable functions in linux)Tasklet vs work queues (Deferrable functions in linux)
Tasklet vs work queues (Deferrable functions in linux)
 
IPC mechanisms in windows
IPC mechanisms in windowsIPC mechanisms in windows
IPC mechanisms in windows
 
IRQs: the Hard, the Soft, the Threaded and the Preemptible
IRQs: the Hard, the Soft, the Threaded and the PreemptibleIRQs: the Hard, the Soft, the Threaded and the Preemptible
IRQs: the Hard, the Soft, the Threaded and the Preemptible
 
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
 
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s going
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s goingKernel Recipes 2016 - Kernel documentation: what we have and where it’s going
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s going
 

Viewers also liked

提高 Code 品質心得
提高 Code 品質心得提高 Code 品質心得
提高 Code 品質心得ZongYing Lyu
 
Parallel program design
Parallel program designParallel program design
Parallel program designZongYing Lyu
 
Three Post - Media Production Capabilities
Three Post - Media Production CapabilitiesThree Post - Media Production Capabilities
Three Post - Media Production CapabilitiesThree Post
 
Confidentiality
ConfidentialityConfidentiality
ConfidentialityMariek71
 
PKN tentang Rakyat :)
PKN tentang Rakyat :)PKN tentang Rakyat :)
PKN tentang Rakyat :)ichaa17
 
Tata cara perijinan pendakian g
Tata cara perijinan pendakian gTata cara perijinan pendakian g
Tata cara perijinan pendakian gUlfann
 
Niels Vink - Effectiviteit van massa media in digitale tijdperk
Niels Vink - Effectiviteit van massa media in digitale tijdperk Niels Vink - Effectiviteit van massa media in digitale tijdperk
Niels Vink - Effectiviteit van massa media in digitale tijdperk experiannederland
 
English research report
English research reportEnglish research report
English research reportJıa Yıı
 
Programme on Quality Improvement For Cooperative Banking & Non Banking Organi...
Programme on Quality Improvement For Cooperative Banking & Non Banking Organi...Programme on Quality Improvement For Cooperative Banking & Non Banking Organi...
Programme on Quality Improvement For Cooperative Banking & Non Banking Organi...vamnicom123
 
James bond essay
James bond essayJames bond essay
James bond essayVay Lu
 

Viewers also liked (20)

Cvs
CvsCvs
Cvs
 
提高 Code 品質心得
提高 Code 品質心得提高 Code 品質心得
提高 Code 品質心得
 
Parallel program design
Parallel program designParallel program design
Parallel program design
 
Vue.js
Vue.jsVue.js
Vue.js
 
Creative Business Development Briefing - January 2015
Creative Business Development Briefing - January 2015Creative Business Development Briefing - January 2015
Creative Business Development Briefing - January 2015
 
Creative Business Development Briefing - February 2015
Creative Business Development Briefing - February 2015Creative Business Development Briefing - February 2015
Creative Business Development Briefing - February 2015
 
Digital business briefing September 2014
Digital business briefing   September 2014Digital business briefing   September 2014
Digital business briefing September 2014
 
Three Post - Media Production Capabilities
Three Post - Media Production CapabilitiesThree Post - Media Production Capabilities
Three Post - Media Production Capabilities
 
Psy final (1)
Psy final (1)Psy final (1)
Psy final (1)
 
Pelota
PelotaPelota
Pelota
 
Confidentiality
ConfidentialityConfidentiality
Confidentiality
 
PKN tentang Rakyat :)
PKN tentang Rakyat :)PKN tentang Rakyat :)
PKN tentang Rakyat :)
 
Tata cara perijinan pendakian g
Tata cara perijinan pendakian gTata cara perijinan pendakian g
Tata cara perijinan pendakian g
 
Niels Vink - Effectiviteit van massa media in digitale tijdperk
Niels Vink - Effectiviteit van massa media in digitale tijdperk Niels Vink - Effectiviteit van massa media in digitale tijdperk
Niels Vink - Effectiviteit van massa media in digitale tijdperk
 
Cs437 lecture 09
Cs437 lecture 09Cs437 lecture 09
Cs437 lecture 09
 
English research report
English research reportEnglish research report
English research report
 
Programme on Quality Improvement For Cooperative Banking & Non Banking Organi...
Programme on Quality Improvement For Cooperative Banking & Non Banking Organi...Programme on Quality Improvement For Cooperative Banking & Non Banking Organi...
Programme on Quality Improvement For Cooperative Banking & Non Banking Organi...
 
Digital business briefing January 2015
Digital business briefing January 2015Digital business briefing January 2015
Digital business briefing January 2015
 
Digital business briefing August 2014
Digital business briefing   August 2014Digital business briefing   August 2014
Digital business briefing August 2014
 
James bond essay
James bond essayJames bond essay
James bond essay
 

Similar to Performance improvement techniques for software distributed shared memory

An Overview of Distributed Debugging
An Overview of Distributed DebuggingAn Overview of Distributed Debugging
An Overview of Distributed DebuggingAnant Narayanan
 
Systems Support for Many Task Computing
Systems Support for Many Task ComputingSystems Support for Many Task Computing
Systems Support for Many Task ComputingEric Van Hensbergen
 
LOCK-FREE PARALLEL ACCESS COLLECTIONS
LOCK-FREE PARALLEL ACCESS COLLECTIONSLOCK-FREE PARALLEL ACCESS COLLECTIONS
LOCK-FREE PARALLEL ACCESS COLLECTIONSijdpsjournal
 
Lock free parallel access collections
Lock free parallel access collectionsLock free parallel access collections
Lock free parallel access collectionsijdpsjournal
 
Cluster Computing
Cluster Computing Cluster Computing
Cluster Computing Shobha Rani
 
| IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 4 | April 2014 ...
    | IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 4 | April 2014 ...    | IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 4 | April 2014 ...
| IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 4 | April 2014 ...IJMER
 
Naveen nimmu sdn future of networking
Naveen nimmu sdn   future of networkingNaveen nimmu sdn   future of networking
Naveen nimmu sdn future of networkingOpenSourceIndia
 
Naveen nimmu sdn future of networking
Naveen nimmu sdn   future of networkingNaveen nimmu sdn   future of networking
Naveen nimmu sdn future of networkingsuniltomar04
 
Parallelization of Coupled Cluster Code with OpenMP
Parallelization of Coupled Cluster Code with OpenMPParallelization of Coupled Cluster Code with OpenMP
Parallelization of Coupled Cluster Code with OpenMPAnil Bohare
 
CIF16: Rethinking Foundations for Zero-devops Clouds (Maxim Kharchenko, Cloud...
CIF16: Rethinking Foundations for Zero-devops Clouds (Maxim Kharchenko, Cloud...CIF16: Rethinking Foundations for Zero-devops Clouds (Maxim Kharchenko, Cloud...
CIF16: Rethinking Foundations for Zero-devops Clouds (Maxim Kharchenko, Cloud...The Linux Foundation
 
Parallel_and_Cluster_Computing.ppt
Parallel_and_Cluster_Computing.pptParallel_and_Cluster_Computing.ppt
Parallel_and_Cluster_Computing.pptMohmdUmer
 
ZCloud Consensus on Hardware for Distributed Systems
ZCloud Consensus on Hardware for Distributed SystemsZCloud Consensus on Hardware for Distributed Systems
ZCloud Consensus on Hardware for Distributed SystemsGokhan Boranalp
 
Clustering by AKASHMSHAH
Clustering by AKASHMSHAHClustering by AKASHMSHAH
Clustering by AKASHMSHAHAkash M Shah
 
Parallelization using open mp
Parallelization using open mpParallelization using open mp
Parallelization using open mpranjit banshpal
 
Microx - A Unix like kernel for Embedded Systems written from scratch.
Microx - A Unix like kernel for Embedded Systems written from scratch.Microx - A Unix like kernel for Embedded Systems written from scratch.
Microx - A Unix like kernel for Embedded Systems written from scratch.Waqar Sheikh
 
ICCT2017: A user mode implementation of filtering rule management plane using...
ICCT2017: A user mode implementation of filtering rule management plane using...ICCT2017: A user mode implementation of filtering rule management plane using...
ICCT2017: A user mode implementation of filtering rule management plane using...Ruo Ando
 

Similar to Performance improvement techniques for software distributed shared memory (20)

An Overview of Distributed Debugging
An Overview of Distributed DebuggingAn Overview of Distributed Debugging
An Overview of Distributed Debugging
 
Systems Support for Many Task Computing
Systems Support for Many Task ComputingSystems Support for Many Task Computing
Systems Support for Many Task Computing
 
Open shmem
Open shmemOpen shmem
Open shmem
 
LOCK-FREE PARALLEL ACCESS COLLECTIONS
LOCK-FREE PARALLEL ACCESS COLLECTIONSLOCK-FREE PARALLEL ACCESS COLLECTIONS
LOCK-FREE PARALLEL ACCESS COLLECTIONS
 
Lock free parallel access collections
Lock free parallel access collectionsLock free parallel access collections
Lock free parallel access collections
 
Cluster Computing
Cluster Computing Cluster Computing
Cluster Computing
 
Clustering
ClusteringClustering
Clustering
 
| IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 4 | April 2014 ...
    | IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 4 | April 2014 ...    | IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 4 | April 2014 ...
| IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 4 | April 2014 ...
 
Naveen nimmu sdn future of networking
Naveen nimmu sdn   future of networkingNaveen nimmu sdn   future of networking
Naveen nimmu sdn future of networking
 
Naveen nimmu sdn future of networking
Naveen nimmu sdn   future of networkingNaveen nimmu sdn   future of networking
Naveen nimmu sdn future of networking
 
Parallelization of Coupled Cluster Code with OpenMP
Parallelization of Coupled Cluster Code with OpenMPParallelization of Coupled Cluster Code with OpenMP
Parallelization of Coupled Cluster Code with OpenMP
 
CIF16: Rethinking Foundations for Zero-devops Clouds (Maxim Kharchenko, Cloud...
CIF16: Rethinking Foundations for Zero-devops Clouds (Maxim Kharchenko, Cloud...CIF16: Rethinking Foundations for Zero-devops Clouds (Maxim Kharchenko, Cloud...
CIF16: Rethinking Foundations for Zero-devops Clouds (Maxim Kharchenko, Cloud...
 
Parallel_and_Cluster_Computing.ppt
Parallel_and_Cluster_Computing.pptParallel_and_Cluster_Computing.ppt
Parallel_and_Cluster_Computing.ppt
 
SDN_Gustaf_Nilstadius
SDN_Gustaf_NilstadiusSDN_Gustaf_Nilstadius
SDN_Gustaf_Nilstadius
 
ZCloud Consensus on Hardware for Distributed Systems
ZCloud Consensus on Hardware for Distributed SystemsZCloud Consensus on Hardware for Distributed Systems
ZCloud Consensus on Hardware for Distributed Systems
 
CLUSTER COMPUTING
CLUSTER COMPUTINGCLUSTER COMPUTING
CLUSTER COMPUTING
 
Clustering by AKASHMSHAH
Clustering by AKASHMSHAHClustering by AKASHMSHAH
Clustering by AKASHMSHAH
 
Parallelization using open mp
Parallelization using open mpParallelization using open mp
Parallelization using open mp
 
Microx - A Unix like kernel for Embedded Systems written from scratch.
Microx - A Unix like kernel for Embedded Systems written from scratch.Microx - A Unix like kernel for Embedded Systems written from scratch.
Microx - A Unix like kernel for Embedded Systems written from scratch.
 
ICCT2017: A user mode implementation of filtering rule management plane using...
ICCT2017: A user mode implementation of filtering rule management plane using...ICCT2017: A user mode implementation of filtering rule management plane using...
ICCT2017: A user mode implementation of filtering rule management plane using...
 

More from ZongYing Lyu

Device Driver - Chapter 6字元驅動程式的進階作業
Device Driver - Chapter 6字元驅動程式的進階作業Device Driver - Chapter 6字元驅動程式的進階作業
Device Driver - Chapter 6字元驅動程式的進階作業ZongYing Lyu
 
Device Driver - Chapter 3字元驅動程式
Device Driver - Chapter 3字元驅動程式Device Driver - Chapter 3字元驅動程式
Device Driver - Chapter 3字元驅動程式ZongYing Lyu
 
Web coding principle
Web coding principleWeb coding principle
Web coding principleZongYing Lyu
 
Consistency protocols
Consistency protocolsConsistency protocols
Consistency protocolsZongYing Lyu
 
Compiler optimization
Compiler optimizationCompiler optimization
Compiler optimizationZongYing Lyu
 
MPI use c language
MPI use c languageMPI use c language
MPI use c languageZongYing Lyu
 

More from ZongYing Lyu (9)

Device Driver - Chapter 6字元驅動程式的進階作業
Device Driver - Chapter 6字元驅動程式的進階作業Device Driver - Chapter 6字元驅動程式的進階作業
Device Driver - Chapter 6字元驅動程式的進階作業
 
Device Driver - Chapter 3字元驅動程式
Device Driver - Chapter 3字元驅動程式Device Driver - Chapter 3字元驅動程式
Device Driver - Chapter 3字元驅動程式
 
Web coding principle
Web coding principleWeb coding principle
Web coding principle
 
SCRUM
SCRUMSCRUM
SCRUM
 
Consistency protocols
Consistency protocolsConsistency protocols
Consistency protocols
 
Compiler optimization
Compiler optimizationCompiler optimization
Compiler optimization
 
MPI use c language
MPI use c languageMPI use c language
MPI use c language
 
MPI
MPIMPI
MPI
 
OpenMP
OpenMPOpenMP
OpenMP
 

Recently uploaded

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 

Recently uploaded (20)

DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Story boards and shot lists for my a level piece

Performance improvement techniques for software distributed shared memory

  • 1. Performance Improvement Techniques for Software Distributed Shared Memory
      Speaker: 呂宗螢
      Adviser: Prof. 梁文耀
      Date: 2007/3/9
  • 2. Embedded and Parallel Systems Lab. Paper
      • Byung-Hyun Yu, P. Werstein, M. Purvis, S. Cranefield, "Performance improvement techniques for software distributed shared memory," 11th International Conference on Parallel and Distributed Systems, 2005. Proceedings, Vol. 1, 20-22 July 2005, pp. 119-125.
  • 3. Reference
      • L. Iftode, J.P. Singh and K. Li, "Scope Consistency: A Bridge between Release Consistency and Entry Consistency," in Proc. of the 8th Annual ACM Symposium on Parallel Algorithms and Architectures, 1996.
  • 4. Outline
      • Introduction
      • Implementation of ScC model
      • Diff Integration Technique
      • Dynamic Home Migration
      • Performance Evaluation Environment
      • Performance Evaluation
  • 5. Introduction
      • Implementing parallel algorithms with shared variables is more convenient than message passing, in which the programmer explicitly sends and receives data between nodes.
      • DSM has not been a major attraction to the parallel computing community because of its slow performance.
  • 6. Introduction
      • Lazy home-based (LHB) protocol
      • Scope consistency (ScC)
      • A diff integration technique that can solve most diff accumulation problems
      • A dynamic home migration protocol that solves the static home assignment problem of the original home-based protocol
      • The techniques are evaluated using well-known DSM benchmark applications.
  • 7. Implementation of ScC model
      • The LHB protocol does not send diffs to home nodes between two consecutive barriers.
      • Uses the update protocol during lock synchronization and the invalidation protocol for global scope during barrier synchronization.
  • 8. Implementation of ScC model
  • 9. Diff Integration Technique
      • Twinning occurs before diff application and not after a write page fault.
      • In this way, all previous diffs on the same page made in the same critical section are preserved and integrated into a single integrated diff.
  • 10. Diff Integration Technique
  • 11. Dynamic Home Migration
      • The home-based protocol has a weakness when a node is made the home of pages that it does not access, or accesses less frequently than other nodes do.
      • Previously proposed home migration techniques provide a solution only for single-writer DSM applications.
      • To migrate homes at the time of lock synchronization (acquire and release)
  • 12. Dynamic Home Migration
      • This paper proposes a home migration technique that can decide the optimum home nodes for multiple-writer applications as well as single-writer applications.
      • It uses a barrier process in which the best home nodes are piggybacked on other coherence-related data, thus minimizing the home-finding and data communication overheads.
  • 13. Dynamic Home Migration
  • 14. Dynamic Home Migration
      1. All nodes record their dirty pages between two consecutive barriers.
      2. Upon arrival at a barrier, all nodes create final NCS diffs.
      3. All nodes except the barrier manager node send their invalidation notices, including the diff size of each dirty page, to the manager node.
      4. The barrier manager receives a barrier arrival notice from every node, including a dirty page list and the size of each dirty page diff.
  • 15. Dynamic Home Migration
      5. Whenever the manager receives a notice, it accumulates the dirty pages into a global dirty page list and, for each dirty page, sets the home to the node with the maximum diff size.
      6. On receiving the new home node list, all nodes update the home nodes by sending their diffs to the corresponding homes.
      • Note that only the last lock owner updates the home nodes with its integrated diffs made in the lock synchronization, and only if the last lock owner is not the home of the CS diff.
  • 16. Performance Evaluation Environment
      • TM: TreadMarks, a homeless LRC system
      • CHBLRC: conventional home-based LRC (eager, no diff integration, static homes)
      • LHB (or LHB ScC): lazy home-based scope consistency
      • Network of 32 nodes
      • 100 Mbit switched Ethernet
      • 350 MHz Pentium II CPU
      • 192 MB of memory
      • Gentoo Linux with gcc 3.3.2
  • 17. Performance Evaluation Environment
      • PNN: parallel neural network application (lock and barrier)
      • Barnes-Hut: Barnes-Hut N-body algorithm (barrier)
      • IS: integer sort (barrier)
      • Water: water molecular dynamics simulation (lock and barrier)
      • SOR: successive over-relaxation (barrier)
  • 18. Performance Evaluation
  • 19. Performance Evaluation
  • 20. Performance Evaluation
      • Diff integration effect on PNN and Water
  • 21. Thank you!
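
The diff integration idea from slide 9 can be illustrated with a toy model. This is only a sketch under stated assumptions, not the paper's implementation: a page is a small `bytearray`, a diff is a dictionary from byte offset to new value, and the helper names (`make_diff`, `apply_diff`, `critical_section`) are hypothetical. It compares taking the twin before applying a diff received from the previous lock owner with taking it afterwards.

```python
def make_diff(twin: bytes, page: bytearray) -> dict[int, int]:
    """Toy diff: map every byte offset that changed to its new value."""
    return {i: page[i] for i in range(len(page)) if page[i] != twin[i]}


def apply_diff(page: bytearray, diff: dict[int, int]) -> None:
    """Write each (offset, value) pair of a diff into the page."""
    for offset, value in diff.items():
        page[offset] = value


def critical_section(page, received_diff, local_writes, twin_before_apply):
    """Return the diff this node would hand to the next lock owner.

    twin_before_apply=True models the twinning order described on slide 9;
    False models twinning only after the received diff is already applied.
    """
    if twin_before_apply:
        twin = bytes(page)                 # twin taken BEFORE diff application
        apply_diff(page, received_diff)
    else:
        apply_diff(page, received_diff)
        twin = bytes(page)                 # twin taken after (assumed baseline)
    apply_diff(page, local_writes)         # this node's writes in the same CS
    return make_diff(twin, page)


# Hypothetical data: the previous owner's diff and this node's own writes.
received = {1: 0xAA, 2: 0xBB}
local = {2: 0xCC, 5: 0xDD}

integrated = critical_section(bytearray(8), received, local, True)
own_only = critical_section(bytearray(8), received, local, False)
print("integrated diff:", integrated)
print("own-writes-only diff:", own_only)
```

In this toy model the first variant yields one diff that carries both the earlier owner's changes and the local ones, so the next acquirer needs a single diff. The second variant yields only the local changes, so the earlier diff would have to be forwarded as well.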
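
The home selection in steps 4 to 6 of the dynamic home migration slides can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `choose_homes`, the shape of the notices (node id mapped to `{page: diff size}`), and the tie-breaking rule (keep the current home when diff sizes are equal) are all assumptions made for the example.

```python
def choose_homes(notices: dict[int, dict[int, int]],
                 current_homes: dict[int, int]) -> dict[int, int]:
    """Pick, for every dirty page, the node that reported the largest diff.

    notices: node id -> {page id: diff size in bytes}, as gathered by the
             barrier manager from the barrier arrival notices.
    current_homes: page id -> current home node id.
    Pages that no node reported as dirty keep their current home.
    """
    best: dict[int, tuple[int, int]] = {}   # page -> (largest size, node)
    for node, dirty_pages in notices.items():
        for page, size in dirty_pages.items():
            if page not in best:
                best[page] = (size, node)
                continue
            best_size, _ = best[page]
            # Assumed tie-break: on equal sizes prefer the current home,
            # which avoids a migration that gains nothing.
            is_current_home = current_homes.get(page) == node
            if size > best_size or (size == best_size and is_current_home):
                best[page] = (size, node)

    new_homes = dict(current_homes)
    for page, (_, node) in best.items():
        new_homes[page] = node
    return new_homes


# Hypothetical barrier arrival notices from three nodes.
notices = {
    0: {10: 120, 11: 40},
    1: {10: 300},
    2: {11: 40, 12: 8},
}
current_homes = {10: 0, 11: 2, 12: 0, 13: 1}
new_homes = choose_homes(notices, current_homes)
print(new_homes)
```

After the manager distributes such a list, each node would send its diff for a page only when it is not that page's new home, so the node with the largest diff sends nothing for that page.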