BENCHMARK
INSTRUMENTATION
Umit Cavus BUYUKSAHIN
Measurement Tools & Techniques, Spring '12
4/17/2012

OUTLINE
• NAS Benchmark Suite
• Experiments
• Paraver Visualization
  • Code View
  • Communication
  • Disk I/O
  • Load Balancing
  • L1D Cache Miss
  • Cycles per Instruction (CPI)
• Execution Time
• Benchmarking Time
• Conclusion

NAS Benchmark Suite

• NAS Parallel Benchmarks
   ... are a set of benchmarks.
   ... evaluate the performance of highly parallel supercomputers.
   ... are developed and maintained by the NASA Advanced Supercomputing (NAS) Division.

NAS Benchmark Suite
• NAS Kernel Applications
  • IS - Integer Sort
  • EP - Embarrassingly Parallel
  • CG - Conjugate Gradient
  • MG - Multi-Grid
  • FT - discrete 3D fast Fourier Transform

• Problem Sizes
  • S       : small size
  • W       : workstation size
  • A, B, C : standard test sizes; each ~4x larger than the previous
  • D, E, F : large test sizes; each ~16x larger than the previous

Experiments
• NAS Parallel Benchmarks version 3.2.1
• IS Kernel Application:
    ... sorts N keys in parallel.
    ... tests
       • integer computation speed
       • communication performance

• S Problem Size:
    ... small, for quick test purposes
    ... has 2^16 (65,536) keys

Experiments
• IS Benchmarking Procedure (in general)

 1. Generate a sequence of N keys
 2. Load the N keys into the memory system
 3. Timing begins
 4. Loop: sorting and partial verification
 5. Timing ends
 6. Full verification
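
The six steps can be sketched as a single-process toy in Python. This is only an illustration of where the timed region starts and stops, not the NAS implementation: the function name `run_is_sketch`, its parameters and default values, and the use of a counting sort in place of the parallel ranking are all assumptions.

```python
import random
import time


def run_is_sketch(num_keys=2**16, max_key=2**11, iterations=10, seed=0):
    """Toy walk-through of the six steps; iterations must be at least 1."""
    rng = random.Random(seed)

    # Steps 1-2: generate the key sequence and hold it in memory.
    keys = [rng.randrange(max_key) for _ in range(num_keys)]

    # Step 3: timing begins.
    start = time.perf_counter()
    for _ in range(iterations):
        # Step 4a: sort (a counting sort stands in for the parallel ranking).
        counts = [0] * max_key
        for k in keys:
            counts[k] += 1
        ranked = [k for k in range(max_key) for _ in range(counts[k])]
        # Step 4b: partial verification, spot-checking a few positions only.
        for i in (0, num_keys // 2, num_keys - 2):
            assert ranked[i] <= ranked[i + 1]
    # Step 5: timing ends.
    elapsed = time.perf_counter() - start

    # Step 6: full verification, outside the timed region.
    ok = all(ranked[i] <= ranked[i + 1] for i in range(num_keys - 1))
    return elapsed, ok
```

Key generation, loading, and full verification fall outside the interval measured by `elapsed`, which is why the total execution time of a run can be much larger than the time the benchmark itself reports.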

Experiments
Machines:

 • My Computer
    i686 GNU/Linux
    3 GB RAM
    2 CPUs at 800 MHz

 • Boada
     x86_64 GNU/Linux
     24 GB RAM
     24 CPUs at 1596 MHz

Experiments
Procedure:

  • The benchmarks are not instrumented manually.
  • Paraver traces are generated automatically
    • by exporting LD_PRELOAD so that a tracing library is loaded at run time.
  • Benchmarks are executed with 2, 4, 8, 16, 32, and 64 processors.
  • Benchmark results are analyzed.
  • Generated traces are examined in the Paraver tools.
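
A rough sketch of how such a series of runs could be prepared is given below. The slides do not show the actual commands, so everything specific here is an assumption: the tracing library path `TRACE_LIB`, the directory `BENCH_DIR`, the `mpirun` launcher, and the `kernel.size.nprocs` binary naming are hypothetical placeholders. The code only builds the environment and argument list for each run; it does not launch anything.

```python
import os

# Hypothetical locations; neither path comes from the slides.
TRACE_LIB = "/opt/tracing/lib/libtrace.so"
BENCH_DIR = "./bin"


def build_runs(proc_counts=(2, 4, 8, 16, 32, 64), kernel="is", size="S"):
    """Return one (environment, argv) pair per processor count."""
    runs = []
    for n in proc_counts:
        env = dict(os.environ)
        # Preloading the tracing library avoids manual instrumentation.
        env["LD_PRELOAD"] = TRACE_LIB
        # Assumed binary naming scheme: <kernel>.<size>.<nprocs>
        binary = f"{BENCH_DIR}/{kernel}.{size}.{n}"
        runs.append((env, ["mpirun", "-np", str(n), binary]))
    return runs


for env, argv in build_runs():
    print("LD_PRELOAD=" + env["LD_PRELOAD"], " ".join(argv))
```

Each pair could then be handed to `subprocess.run(argv, env=env)` once the placeholder paths are replaced with real ones.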

Paraver Visualization - Code View
• My Computer: [Figure: Paraver code view]
• Boada: [Figure: Paraver code view]

Paraver Visualization - Communication
• My Computer: [Figure: Paraver communication view]
• Boada: [Figure: Paraver communication view]

Paraver Visualization - Disk I/O
• My Computer: [Figure: Paraver disk I/O view]
• Boada: [Figure: Paraver disk I/O view]

Paraver Visualization - Load Balance
• My Computer: [Figure: Paraver load balance view]
....
• Boada: [Figure: Paraver load balance view]
....

Paraver Visualization - L1D Cache Miss
• My Computer: [Figure: Paraver L1D cache miss view]
• Boada: [Figure: Paraver L1D cache miss view]

Paraver Visualization - CPI
• My Computer: [Figure: Paraver CPI view]
• Boada: [Figure: Paraver CPI view]

Execution Time
[Figure: execution time in ms (axis 0 to 16000) versus number of processors (2, 4, 8, 16, 32, 64), one series each for MyComputer and Boada]

Execution Time
• Relative Speedup = $\frac{\text{ExecutionTime of MyComputer}}{\text{ExecutionTime of Boada}}$
[Figure: relative speedup (axis 0 to 60) versus number of processors (1, 2, 4, 8, 16, 32, 64)]
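
The ratio above is straightforward to compute per processor count. The following is a minimal sketch; the function name `relative_speedup` and the dictionary layout are assumptions, and the timings are placeholders chosen for easy arithmetic, not values read from the charts.

```python
def relative_speedup(times_mycomputer, times_boada):
    """Return T_MyComputer / T_Boada for each processor count in both inputs."""
    common = sorted(set(times_mycomputer) & set(times_boada))
    return {p: times_mycomputer[p] / times_boada[p] for p in common}


# Placeholder timings in ms, NOT measurements from the slides.
mycomputer_ms = {2: 1000.0, 4: 3000.0}
boada_ms = {2: 500.0, 4: 600.0}

print(relative_speedup(mycomputer_ms, boada_ms))  # {2: 2.0, 4: 5.0}
```

A value above 1 means Boada finished faster than MyComputer at that processor count.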

Benchmarking Time - reminder
• IS Benchmarking Procedure (in general)

  1. Generate a sequence of N keys
  2. Load the N keys into the memory system
  3. Timing begins
  4. Loop: sorting and partial verification
  5. Timing ends
  6. Full verification

• Benchmarking Time = execution time of the parallel algorithm, i.e. the timed region between steps 3 and 5

Benchmarking Time
[Figure: benchmarking time in seconds (axis 0 to 2.0) versus number of processors (1, 2, 4, 8, 16, 32, 64), one series each for MyComputer and Boada]

Benchmarking Time
• Relative Speedup = $\frac{\text{BenchmarkingTime of MyComputer}}{\text{BenchmarkingTime of Boada}}$
[Figure: relative speedup (axis 0 to 70) versus number of processors (1, 2, 4, 8, 16, 32, 64)]

Benchmarking Time
• SpeedUp of My Computer
[Figure: speedup (axis 0 to 1.2) versus number of processors (1, 2, 4, 8, 16, 32, 64)]

Benchmarking Time
• SpeedUp of Boada
[Figure: speedup (axis 0 to 7) versus number of processors (1, 2, 4, 8, 16, 32, 64)]
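
The slides do not state how the per-machine speedup is defined. A common convention, assumed here, is the time on one processor divided by the time on p processors, with efficiency being that speedup divided by p. The sketch below uses placeholder timings rather than the measured values, and the function name is hypothetical.

```python
def speedup_and_efficiency(times):
    """Map each processor count p to (T(1) / T(p), T(1) / T(p) / p).

    `times` must contain an entry for p = 1, which serves as the baseline.
    """
    base = times[1]
    return {p: (base / t, base / t / p) for p, t in sorted(times.items())}


# Placeholder timings in seconds, NOT measurements from the slides.
placeholder_s = {1: 8.0, 2: 4.0, 4: 2.5}

for p, (s, e) in speedup_and_efficiency(placeholder_s).items():
    print(f"p={p}: speedup={s:.2f}, efficiency={e:.2f}")
```

Under this definition a speedup below 1 means that adding processors made the run slower than the single-processor baseline.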

Conclusion
• IS application
   • ... involves relatively little communication.
   • ... is dominated by computation and memory loading.
   • ... shows low cache-miss and high CPI values in the computation phase.

• NAS is designed for highly parallel supercomputers.
  • MyComputer is inadequate to meet the requirements of NAS.
  • MyComputer cannot speed up in this application.
  • Boada speeds up until the process count reaches the number of processors it has (24).
  • MyComputer spends less time on disk I/O operations.
  • CPI values in Boada's computation phase are lower.