Tobias Fuchs
tobias.fuchs@nm.ifi.lmu.de
LMU Munich, MNM Team
www.mnm-team.org
Expressing and Exploiting
Multi-Dimensional Locality
in DASH
SPPEXA Symposium 2016
Background
DASH
• Vision: “C++ standard template library for HPC”.
• Provides n-dim array abstraction for stencil- and dense matrix
operations.
• Realization of the PGAS (partitioned global address space)
programming model.
PGAS and Locality
• Combine distributed memory into virtual global memory space.
• Strong sense of data ownership:
private, shared local, shared global
int p = 42;          // private: visible to the owning unit only
dash::Array<T> a;    // shared: elements are distributed across units
a.local[4] = p;      // shared local: write to the unit's own portion of a
p = a[40];           // shared global: read by global index, possibly remote
• Locality (access distance to data) is the predominant factor for efficiency.
L = (local accesses) / (total accesses)
• Access pattern on data depends on implementation of algorithm.
• Complexity of maintaining locality increases exponentially with the number
of data dimensions.
Objective and Approach
Objective
Portable efficiency by automatic deduction of optimal data distribution.
Approach
1. Identify distribution properties that allow well-defined specification of
any data distribution.
2. Let algorithms specify soft / hard constraints on distribution properties.
3. Derive optimal distribution for a given set of constraints.
→ Automatic deduction of optimal data distribution
Distribution Properties
Property Categories
Mappings in data distribution can be categorized by their stages:
Partitioning: decomposing the index domain into blocks
Mapping: assigning blocks to units
Layout: storage order of block elements in units’ local memory
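The three stages can be traced for a single element with a minimal sketch: rectangular blocks (partitioning), round-robin assignment of blocks to units (mapping), and row-major storage within each block (layout). `Pattern2DSketch`, `LocationSketch` and `locate` are hypothetical names, and the concrete choices at each stage are assumptions made for illustration; DASH's own pattern classes are not shown here.

```cpp
#include <cstddef>

// Hypothetical illustration; not DASH's implementation.
struct LocationSketch {
    std::size_t unit;    // unit that owns the element
    std::size_t offset;  // position in that unit's local memory
};

struct Pattern2DSketch {
    std::size_t rows, cols;    // extents of the global index domain
    std::size_t brows, bcols;  // block extents (assumed to divide rows/cols)
    std::size_t units;         // number of units (assumed >= 1)

    LocationSketch locate(std::size_t r, std::size_t c) const {
        // Partitioning: decompose the index domain into rectangular blocks.
        const std::size_t blocks_per_row = cols / bcols;
        const std::size_t block_id = (r / brows) * blocks_per_row + (c / bcols);

        // Mapping: assign blocks to units round-robin.
        const std::size_t unit = block_id % units;
        const std::size_t local_block = block_id / units;

        // Layout: a unit stores its blocks one after another,
        // elements in row-major order within each block.
        const std::size_t in_block = (r % brows) * bcols + (c % bcols);
        return {unit, local_block * brows * bcols + in_block};
    }
};
```

With `Pattern2DSketch p{4, 4, 2, 2, 2}`, element (3,3) lies in block 3, which maps to unit 1 at local offset 7. Changing only one stage (for example the mapping) leaves the other two untouched, which is what makes the categories independently specifiable.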
Example: Morton Order Distribution
Category       Properties
Partitioning   balanced, regular, rectangular
Mapping        balanced, minimal, neighbor
Layout         blocked, linear, canonical
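As a sketch of what a Morton (Z-order) arrangement means for block coordinates, the function below interleaves the bits of a 2-D block coordinate into a single index. This is the standard bit-interleaving formulation; the slides do not state how DASH computes the order, and `spread_bits` and `morton_index` are hypothetical names.

```cpp
#include <cstdint>

// Spread the lower 16 bits of v so that bit k moves to bit 2k.
inline std::uint32_t spread_bits(std::uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Morton index of block (x, y): bits of x occupy the even positions,
// bits of y the odd positions. Supports coordinates below 2^16.
inline std::uint32_t morton_index(std::uint32_t x, std::uint32_t y) {
    return spread_bits(x) | (spread_bits(y) << 1);
}
```

Blocks that are close in both dimensions tend to receive nearby indices: the four blocks (0,0), (1,0), (0,1), (1,1) are numbered 0 to 3, which is the neighbor-preserving behavior the mapping properties refer to.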
Use Cases
Automatic Deduction of Optimal Data Distribution
“Find a data distribution that fulfills a set of properties.”
// Deduces pattern type, initializes pattern instance:
auto pattern =
  make_pattern<
    // Compile-time deduction via C++11 generic template metaprogramming:
    partitioning_properties< balanced, regular >,
    mapping_properties< neighbor >,
    layout_properties< blocked, row_major >
  >(
    // Run-time deduction:
    Size<2>(10000,10000),
    Team<2>(24,24));
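The split between compile-time and run-time deduction can be sketched with a few lines of C++11 template metaprogramming. Everything below is hypothetical: the tag structs, `TiledSketch`, `BlockedSketch`, `deduce_pattern` and `make_pattern_sketch` are illustrative stand-ins, the property list is flattened rather than grouped by category, and the rule "neighbor implies a tiled pattern" is an assumption, not DASH's actual deduction logic.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical property tags; NOT the DASH API.
struct balanced {};
struct regular {};
struct neighbor {};
struct blocked {};

// Hypothetical pattern types, distinguished only by their type.
struct TiledSketch   { std::size_t rows, cols; };
struct BlockedSketch { std::size_t rows, cols; };

// has_tag<T, Tags...>::value is true if T occurs in Tags...
template <typename T, typename... Tags>
struct has_tag : std::false_type {};

template <typename T, typename Head, typename... Tail>
struct has_tag<T, Head, Tail...>
    : std::conditional<std::is_same<T, Head>::value,
                       std::true_type,
                       has_tag<T, Tail...>>::type {};

// Compile time: select the pattern type from the requested properties.
template <typename... Props>
struct deduce_pattern {
    using type = typename std::conditional<has_tag<neighbor, Props...>::value,
                                           TiledSketch,
                                           BlockedSketch>::type;
};

// Run time: initialize the selected pattern type with the extents.
template <typename... Props>
typename deduce_pattern<Props...>::type
make_pattern_sketch(std::size_t rows, std::size_t cols) {
    using Pattern = typename deduce_pattern<Props...>::type;
    return Pattern{rows, cols};
}
```

A call such as `make_pattern_sketch<balanced, regular, neighbor>(10000, 10000)` has return type `TiledSketch`; the type is fixed by the compiler, and only the extents are supplied when the program runs.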
Automatic Deduction of Optimal Data Distribution
“Find a data distribution that is optimal for a given algorithm.”
// Deduce pattern from algorithm constraints:
auto pattern = dash::make_pattern< dash::summa_pattern_constraints >(
Size<2>(10000,10000),
Team<2>(24,24));
dash::Matrix<double, 2> matrix_a(pattern);
dash::Matrix<double, 2> matrix_b(pattern);
dash::Matrix<double, 2> matrix_c(pattern);
dash::summa(matrix_a, matrix_b, matrix_c);
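For readers unfamiliar with SUMMA, the loop structure it imposes on a block-distributed matrix product can be emulated serially. This is a minimal sketch under stated assumptions: a `std::vector<double>` stands in for `dash::Matrix`, the "processes" are plain loop indices, and the broadcasts of the real algorithm are replaced by direct reads. It shows the access pattern the pattern constraints must serve, not the implementation of `dash::summa`.

```cpp
#include <cstddef>
#include <vector>

// Row-major n x n matrix; a plain vector stands in for dash::Matrix.
using Mat = std::vector<double>;

// Serial emulation of the SUMMA loop structure on a grid x grid arrangement
// of blocks. Assumes n is divisible by grid and c is zero-initialized.
inline void summa_emulated(const Mat& a, const Mat& b, Mat& c,
                           std::size_t n, std::size_t grid) {
    const std::size_t bs = n / grid;  // block extent
    for (std::size_t k = 0; k < grid; ++k) {             // step k: block panel k
        for (std::size_t pi = 0; pi < grid; ++pi) {      // "process" row
            for (std::size_t pj = 0; pj < grid; ++pj) {  // "process" column
                // In SUMMA, block A(pi,k) is broadcast along process row pi
                // and block B(k,pj) along process column pj; the emulation
                // reads them directly and updates the local block C(pi,pj).
                for (std::size_t i = pi * bs; i < (pi + 1) * bs; ++i)
                    for (std::size_t kk = k * bs; kk < (k + 1) * bs; ++kk)
                        for (std::size_t j = pj * bs; j < (pj + 1) * bs; ++j)
                            c[i * n + j] += a[i * n + kk] * b[kk * n + j];
            }
        }
    }
}
```

Each "process" only ever writes its own block of C, and in every step it needs exactly one block of A from its row and one block of B from its column, which is why a balanced, rectangular, two-dimensional block distribution suits this algorithm.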
Automatic Deduction of Optimal Algorithm
“Find algorithm variant that is optimal for a given data distribution.”
// Specify how data is distributed in global memory:
auto pattern = dash::TilePattern<2>(10000,10000, TILED(100,100));
dash::Matrix<double, 2> matrix_a(pattern);
dash::Matrix<double, 2> matrix_b(pattern);
dash::Matrix<double, 2> matrix_c(pattern);
// Selects matrix product algorithm variant that is optimal for the given
// pattern:
dash::multiply(matrix_a, matrix_b, matrix_c);
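One common way to select an algorithm variant from the distribution of its operands is overload resolution on a pattern tag. The sketch below assumes that approach; the slides do not say how `dash::multiply` performs the selection. All names (`tiled_tag`, `row_blocked_tag`, `MatrixSketch`, `multiply_sketch`) are hypothetical, and the variant names returned are placeholders.

```cpp
#include <string>

// Hypothetical stand-ins for illustration; not DASH types.
struct tiled_tag {};
struct row_blocked_tag {};

template <typename PatternTag>
struct MatrixSketch {
    using pattern_tag = PatternTag;  // distribution known at compile time
};

// One overload per algorithm variant. Each returns the name of the variant
// instead of performing the multiplication.
inline std::string multiply_variant(tiled_tag)       { return "summa"; }
inline std::string multiply_variant(row_blocked_tag) { return "row-panel"; }

// Entry point: forwards to the variant that matches the operands' pattern.
template <typename Matrix>
std::string multiply_sketch(const Matrix&, const Matrix&, Matrix&) {
    return multiply_variant(typename Matrix::pattern_tag{});
}
```

The caller writes the same `multiply_sketch(a, b, c)` for every distribution; which overload runs is decided by the compiler from the matrix type, so no run-time branching is needed.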
Automatic Deduction of Optimal Algorithm
“Find data distribution for the most efficient algorithm variant.”
// Use constraints of most efficient algorithm, usually SUMMA for DGEMM:
auto pattern = dash::make_pattern< dash::multiply_pattern_constraints >(
Size<2>(10000,10000),
Team<2>(24,24));
dash::Matrix<double, 2> matrix_a(pattern);
dash::Matrix<double, 2> matrix_b(pattern);
dash::Matrix<double, 2> matrix_c(pattern);
// Calls dash::summa
dash::multiply(matrix_a, matrix_b, matrix_c);
Evaluation: DGEMM
MKL multithreaded vs. DASH MPI (GFLOP/s)
DASH: automatic distribution of matrix elements to MPI processes,
each using serial MKL for block matrix multiplication (SUMMA).
MKL: OpenMP threads, matrix initialization in master thread.
MKL multithreaded vs. DASH MPI (Speedup)
DASH: High locality due to optimal data distribution,
massive communication overhead (MPI, no shared windows).
MKL: Low locality (first touch issues), no communication.
→ DASH beats MKL for bigger N and higher degrees of parallelism.
Speedup = DASH_GFLOPS / MKL_GFLOPS
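The two quantities on the evaluation slides can be related with a short worked sketch. It assumes the conventional operation count of 2·N³ floating-point operations for a product of two N×N matrices; the slides do not state how their GFLOP/s figures were measured, and the function names are hypothetical.

```cpp
// GFLOP/s of an N x N matrix product that took `seconds`, using the
// conventional count of 2 * N^3 floating-point operations (an assumption;
// the measurement method is not given in the slides).
inline double gemm_gflops(double n, double seconds) {
    return 2.0 * n * n * n / seconds / 1e9;
}

// Speedup as defined above: DASH throughput relative to MKL throughput.
inline double speedup(double dash_gflops, double mkl_gflops) {
    return dash_gflops / mkl_gflops;
}
```

Because both runs multiply matrices of the same size, the speedup reduces to the ratio of the two run times; a value above 1 means the DASH run finished first.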
Evaluation: SGEMM
MKL multithreaded vs. DASH MPI (GFLOP/s)
DASH: automatic distribution of matrix elements to MPI processes,
each using serial MKL for block matrix multiplication (SUMMA).
MKL: OpenMP threads, matrix initialization in master thread.
MKL multithreaded vs. DASH MPI (Speedup)
DASH: High locality due to optimal data distribution,
massive communication overhead (MPI, no shared windows).
MKL: Low locality (first touch issues), no communication.
→ DASH beats MKL for bigger N and higher degrees of parallelism.
Speedup = DASH_GFLOPS / MKL_GFLOPS
Summary
• Optimal distribution of n-dim data depends on an unmanageable multitude
of factors (topology, access pattern, data flow, …).
• We defined a universal classification of distribution properties.
• Property system allows automatic deduction of optimal data distribution
and algorithm variants at compile time and run time.
Works with any C++11 compiler (tested: Intel 14.0+, gcc 4.7+, clang).
• Work in progress: optimal data distribution for data flows.
Tobias Fuchs
tobias.fuchs@nm.ifi.lmu.de
www.mnm-team.org/~fuchst
DASH Project
www.dash-project.org
Visit for upcoming release
