A review of power & energy
consumption optimization in HPC

                   Rishi Pathak
                 riship@cdac.in
 National PARAM Supercomputing Facility, C-DAC, Pune


    Symposium on HPC Applications – IIT Kanpur
                 March 12 - 14, 2012
Top 10 – Top500
Top 10 – Green 500
[Figure: GF/Watt (y-axis, 0 to 2.5) against rank 1 to 10 (x-axis) for two series, "Green 500, Rank 1-10 (GF per Watt)" and "Top 500, Rank 1-10 (GF per Watt)"; GPU-accelerated systems are labelled "GPU"]
Exascale system
• Likely to be feasible by 2017±2
• 10-100 Million processing elements (cores or mini-cores)
• Chips perhaps as dense as 1,000 cores per socket
• Clock rates will grow more slowly
• Large-scale optics based interconnects
• 10-100 PB of aggregate memory
• Performance per watt ~ 100 GF/watt sustained
  performance
• 10 – 100 MW Exascale system
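The efficiency and power figures above are tied together by simple arithmetic: required power is sustained performance divided by performance per watt. The sketch below only illustrates that relation; the function name and constant are assumptions made for this example, not part of the slides.

```python
def system_power_mw(perf_gflops: float, gflops_per_watt: float) -> float:
    """Power in megawatts needed to sustain perf_gflops at a given efficiency."""
    watts = perf_gflops / gflops_per_watt
    return watts / 1e6


EXAFLOP_IN_GFLOPS = 1e9  # 1 EF = 10^18 FLOPS = 10^9 GFLOPS

# At 100 GF/watt an exaflop system lands at the low end of the 10-100 MW range.
print(system_power_mw(EXAFLOP_IN_GFLOPS, 100.0))
# At 10 GF/watt the same system would sit at the high end.
print(system_power_mw(EXAFLOP_IN_GFLOPS, 10.0))
```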
Power & Energy
   E=P*T
   Energy(E) consumed in time(T) with average
    power(P)
   Minimizing time interval will limit energy
   A minimum value of T for an application
       Mapping of application to cluster system
       Scalability & system bottlenecks
   Beyond that – Power management approaches
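As a small worked example of E = P * T, the sketch below compares two runs of the same job at the same average power. The power and runtime values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def energy_kwh(avg_power_watts: float, runtime_seconds: float) -> float:
    """E = P * T, converted from joules to kilowatt-hours."""
    joules = avg_power_watts * runtime_seconds
    return joules / 3.6e6  # 1 kWh = 3.6e6 J


# Hypothetical job drawing 20 kW on average: 2 hours versus 1.5 hours.
slow = energy_kwh(avg_power_watts=20_000, runtime_seconds=7200)
fast = energy_kwh(avg_power_watts=20_000, runtime_seconds=5400)
print(slow, fast)
```

At constant average power, shortening the run from 2 hours to 1.5 hours cuts the energy by the same 25%.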
Power management techniques
   Static Power Management(SPM)
       Low power CPUs
       Local flash storage
       Suitable for data centric applications
   Dynamic Power Management(DPM)
       Software & power scalable components
       Dynamically adjust power consumption
       Frequency & Voltage scaling for CPU & memory
DVFS
   Dynamic Voltage & Frequency Scaling
   P = C * V^2 * f
   Throttle when the workload
       Is not CPU bound
       Is not very CPU intensive
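The relation P = C * V^2 * f can be evaluated directly to see why lowering voltage together with frequency pays off. This is a minimal sketch; the capacitance, voltage and frequency values are made-up operating points, not measurements from any real processor.

```python
def dynamic_power(capacitance: float, voltage: float, frequency_hz: float) -> float:
    """Dynamic CPU power, P = C * V^2 * f (watts for C in farads, V in volts)."""
    return capacitance * voltage ** 2 * frequency_hz


# Hypothetical operating points: (1.2 V, 2.4 GHz) scaled down to (1.0 V, 1.6 GHz).
base = dynamic_power(1e-9, 1.2, 2.4e9)
scaled = dynamic_power(1e-9, 1.0, 1.6e9)
ratio = scaled / base
print(f"scaled point draws {ratio:.0%} of the base dynamic power")
```

Because voltage enters squared, the frequency drop to two thirds of the base value is accompanied by a power drop to well under half.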
DVFS Scheduling
   Off-line, trace-based scheduling
       Source code instrumentation for performance profiling
       Execution with profiling
       Determination of appropriate processor frequencies for
        each phase
       Source code instrumentation for DVFS scheduling


S. Huang & W. Feng – Proc. Cluster computing[IEEE/ACM](2009)
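The "determination of appropriate processor frequencies for each phase" step can be sketched as follows. This is an illustrative heuristic, not the method of the cited paper: it assumes each phase was profiled at every available frequency, and picks the lowest frequency whose runtime stays within a tolerated slowdown. The phase names, frequencies, runtimes and the `pick_frequency` helper are all hypothetical.

```python
from typing import Dict


def pick_frequency(runtimes_by_freq: Dict[float, float], max_slowdown: float) -> float:
    """Return the lowest frequency (GHz) whose profiled runtime is within
    (1 + max_slowdown) of the runtime at the highest frequency."""
    top = max(runtimes_by_freq)
    budget = runtimes_by_freq[top] * (1.0 + max_slowdown)
    acceptable = [f for f, t in runtimes_by_freq.items() if t <= budget]
    return min(acceptable)


# Hypothetical profile: runtime in seconds per phase at each frequency (GHz).
profile = {
    "compute": {2.4: 10.0, 2.0: 12.0, 1.6: 15.0},
    "mpi_wait": {2.4: 8.0, 2.0: 8.1, 1.6: 8.3},
}
schedule = {phase: pick_frequency(times, 0.05) for phase, times in profile.items()}
print(schedule)
```

With a 5% slowdown budget, the CPU-bound phase stays at the top frequency while the communication-bound phase can run at the lowest one.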
DVFS Scheduling
   Run-time, profiling-based scheduling
       Time-window based performance prediction model
       No a priori information of application phases
       A wrong prediction costs either performance or energy
        efficiency
       Metrics
            MIPS & CPU utilization
            Interception of MPI communication calls
            File I/O calls
            MPI receive wait cycles
       Shown to reduce energy with pre-specified performance loss
        constraint
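A time-window based scheme can be sketched as below. This is a deliberately simple stand-in, not any of the published algorithms: it assumes CPU utilization is the only metric, predicts the next interval as the mean of the last few samples, and steps the frequency one level at a time. The class name, thresholds and frequency list are assumptions for illustration.

```python
from collections import deque
from typing import Iterable


class WindowGovernor:
    """Steps CPU frequency up or down based on mean utilization over a window."""

    def __init__(self, freqs: Iterable[float], window: int = 4,
                 low: float = 0.4, high: float = 0.8) -> None:
        self.freqs = sorted(freqs)
        self.samples = deque(maxlen=window)  # most recent utilization samples
        self.low, self.high = low, high
        self.index = len(self.freqs) - 1     # start at the highest frequency

    def update(self, cpu_utilization: float) -> float:
        """Record one sample (0.0 to 1.0) and return the frequency to use next."""
        self.samples.append(cpu_utilization)
        predicted = sum(self.samples) / len(self.samples)
        if predicted < self.low and self.index > 0:
            self.index -= 1                  # predicted idle: step down
        elif predicted > self.high and self.index < len(self.freqs) - 1:
            self.index += 1                  # predicted busy: step up
        return self.freqs[self.index]


governor = WindowGovernor([1.6, 2.0, 2.4])
trace = [0.2, 0.2, 0.95, 0.95, 0.95, 0.95]
print([governor.update(u) for u in trace])
```

In this trace the governor keeps the lowest frequency for several intervals after the load has become CPU intensive, which is the kind of misprediction cost noted above.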
DVFS Implementations
    Memory MISER (Management Infra-Structure for Energy
     Reduction)
    CPU MISER
    Linux CPUSPEED
    Ecod
    Beta-Algorithm
    M. E. Tolentino, J. Turner & K. W. Cameron – Proc. of the 4th international conference
    on Computing frontiers(2007)
    S. Huang & W. Feng – Proc. Cluster computing[IEEE/ACM](2009)
    C. Hsu & W. Feng - Proc. of the 2005 ACM/IEEE conference on Supercomputing
Enhancements in DVFS
   Dynamic Frequency Scaling per Core
       Each core runs at its own clock
       Power is linear with frequency
       Power savings are relatively small
   Separate power planes for the core and "uncore" part
    of the CPU
       Cores can go to sleep (C-state)
       Memory controller is still operational for external device
        (e.g. via DMA)
Enhancements in DVFS
   Clock gating
       Clock disabled sleep state (AMD - C1, E1; Intel - C[0,1,3,6])
       At the CPU block level
       At the core level
       Reduces dynamic power
   Power Gating
       Power to CPU/core cut off (~0V)
       Reduces both dynamic and static(leakage) power
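The difference between the two gating techniques can be written as a toy power model. It assumes core power splits cleanly into a dynamic and a static (leakage) part and that a power-gated core draws nothing; the wattages are hypothetical.

```python
def core_power(dynamic_w: float, static_w: float, state: str) -> float:
    """Core power in watts for a given state under a simple two-part model."""
    if state == "active":
        return dynamic_w + static_w
    if state == "clock_gated":
        return static_w   # clock stopped: no switching, leakage remains
    if state == "power_gated":
        return 0.0        # supply cut (~0 V): residual draw ignored here
    raise ValueError(f"unknown state: {state}")


# Hypothetical core: 12 W dynamic, 3 W leakage.
for state in ("active", "clock_gated", "power_gated"):
    print(state, core_power(12.0, 3.0, state))
```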
Nehalem core sleep states
AMD's and Intel's techniques
Power optimization at NPSF
   Scheduler capable of:
       Powering off a node after a pre-specified period of idleness
        (no job)
       Power optimization subject to QoS (turnaround time)
       Node power-on time (2-3 min) adds to turnaround time
   Targeted power policies
       Aggressive optimization w/o regard to QOS
       Power capping
       Power budget
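The idle power-off rule can be sketched as a simple filter over per-node idle times. This is not the NPSF scheduler code; the function name, node names and the way idle time is tracked are assumptions for illustration.

```python
from typing import Dict, List


def nodes_to_power_off(idle_minutes: Dict[str, float], threshold_min: float) -> List[str]:
    """Return the nodes that have had no job for at least threshold_min minutes."""
    return sorted(node for node, idle in idle_minutes.items() if idle >= threshold_min)


# Hypothetical snapshot: minutes since each node last ran a job.
idle = {"node01": 10.0, "node02": 3.0, "node03": 6.0}
print(nodes_to_power_off(idle, threshold_min=6.0))
```

A lower threshold powers nodes off sooner, but any job that then needs such a node also waits for its power-on time.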
Power optimization at NPSF
   Node packing via checkpointing, migration & restart
       MPI with BLCR – one approach
       Use of virtualization – another approach
       Considerations –
            Remaining walltime of job being migrated
            Remaining walltime of jobs on node in consideration
            Associated cost of migration against power savings expected to
             be achieved
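One way to weigh the considerations above is sketched here under strong assumptions: nodes are identical, a vacated node is powered off immediately, and the migration itself costs a fixed amount of energy. Under those assumptions, packing a job onto a target node saves min(job remaining, target remaining) node-seconds, since the target must stay on for the longer of the two anyway. All names and numbers are hypothetical.

```python
def migration_saving_joules(job_remaining_s: float, target_remaining_s: float,
                            node_base_power_w: float, migration_energy_j: float) -> float:
    """Estimated net energy saved by migrating a job so its node can power off."""
    # Without migration both nodes stay on; with it, only the target stays on
    # for max(job, target) seconds, so min(job, target) node-seconds are saved.
    node_seconds_saved = min(job_remaining_s, target_remaining_s)
    return node_base_power_w * node_seconds_saved - migration_energy_j


def should_migrate(job_remaining_s: float, target_remaining_s: float,
                   node_base_power_w: float, migration_energy_j: float) -> bool:
    """Migrate only when the expected saving outweighs the migration cost."""
    return migration_saving_joules(job_remaining_s, target_remaining_s,
                                   node_base_power_w, migration_energy_j) > 0


# Hypothetical cases: 300 W base node power, 60 kJ per checkpoint/migrate/restart.
print(should_migrate(3600, 1800, 300.0, 60_000.0))  # long-running job
print(should_migrate(120, 3600, 300.0, 60_000.0))   # job about to finish
```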
Saving Potential
Simulation Result - Plot
Simulation Results - Table
 Parameter                            Case I   Case II   Case III
 Power saving (%)                      4.05     4.22      9.29
 NODEIDLEPOWERTHRESHOLD (minutes)       8        6         4
Power optimization at NPSF
   Feedback driven policy engine
       Speculative power on/off of nodes at any given time
       Metrics/deciding factors
            Function of Jobs arrival time & resource requirements
            How many nodes at what time
            Current and probable cluster utilization at given time – another
             metric
       Expected starttime of jobs in queue
       Minimize impact on turnaround time of job
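The "how many nodes at what time" question can be approximated from past job arrivals. The sketch below is only an illustration of the idea, not the policy engine itself: it averages the nodes requested per hour of day over a history window and powers on the shortfall ahead of time. The record format and function names are assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def expected_nodes_by_hour(history: Iterable[Tuple[int, int]], days: int) -> Dict[int, float]:
    """Average nodes requested per hour of day.

    history holds (arrival_hour, nodes_requested) records collected over `days` days.
    """
    totals: Dict[int, int] = defaultdict(int)
    for hour, nodes in history:
        totals[hour] += nodes
    return {hour: total / days for hour, total in totals.items()}


def nodes_to_prewarm(expected_nodes: float, idle_powered_on: int) -> int:
    """Nodes to power on speculatively so expected arrivals do not wait for boot."""
    return max(0, math.ceil(expected_nodes) - idle_powered_on)


# Hypothetical two-day history: (hour of day, nodes requested).
history = [(9, 4), (9, 8), (14, 2)]
expected = expected_nodes_by_hour(history, days=2)
print(expected)
print(nodes_to_prewarm(expected[9], idle_powered_on=2))
```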
Job Arrival Time
PARAM Yuva – Access & Account
https://yuva.cdac.in/
Technical Affiliation Scheme
Thank You
npsfhelp@cdac.in

Scale your database traffic with Read & Write split using MySQL Router
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 

Symposium on HPC Applications – IIT Kanpur

  • 1. A review of power & energy consumption optimization in HPC Rishi Pathak riship@cdac.in National PARAM Supercomputing Facility, C-DAC, Pune Symposium on HPC Applications – IIT Kanpur March 12 - 14, 2012
  • 2. Top 10 – Top500
  • 3. Top 10 – Green 500
  • 4. [Chart: GF per Watt for ranks 1-10, Green 500 versus Top 500, with GPU-based systems labeled]
  • 5. Exascale system
      • Likely to be feasible by 2017±2
      • 10-100 million processing elements (cores or mini-cores)
      • Chips perhaps as dense as 1,000 cores per socket
      • Clock rates will grow more slowly
      • Large-scale optics-based interconnects
      • 10-100 PB of aggregate memory
      • Performance per watt: ~100 GF/watt sustained performance
      • 10-100 MW exascale system
  • 6. Power & Energy
      • E = P * T
      • Energy (E) consumed in time (T) with average power (P)
      • Minimizing the time interval will limit energy
      • A minimum value of T for an application
          • Mapping of application to cluster system
          • Scalability & system bottlenecks
      • Beyond that: power management approaches
  • 7. Power management techniques
      • Static Power Management (SPM)
          • Low-power CPUs
          • Local flash storage
          • Suitable for data-centric applications
      • Dynamic Power Management (DPM)
          • Software & power-scalable components
          • Dynamically adjust power consumption
          • Frequency & voltage scaling for CPU & memory
  • 8. DVFS
      • Dynamic Voltage & Frequency Scaling
      • P = C * V^2 * f
      • Throttle when the workload is not CPU bound or not very CPU intensive
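The relation P = C * V^2 * f can be checked with a few lines of arithmetic. The sketch below is illustrative only: the capacitance, voltage and frequency values are assumed numbers, not measurements, and it assumes a fully CPU-bound task whose runtime is cycles / f. Static (leakage) power is ignored.

```python
def dynamic_power(c_eff, voltage, freq_hz):
    """Dynamic CPU power in watts: P = C * V^2 * f."""
    return c_eff * voltage ** 2 * freq_hz


def dynamic_energy(c_eff, voltage, freq_hz, cycles):
    """Energy in joules for a CPU-bound task: E = P * T with T = cycles / f."""
    runtime_s = cycles / freq_hz
    return dynamic_power(c_eff, voltage, freq_hz) * runtime_s


# Hypothetical operating points (assumed values, not from any datasheet).
C_EFF = 1.0e-9        # effective switched capacitance, farads
CYCLES = 2.4e9        # amount of work, CPU cycles
HIGH = (1.2, 2.4e9)   # (volts, hertz)
LOW = (1.0, 1.2e9)    # (volts, hertz)

p_high = dynamic_power(C_EFF, *HIGH)
p_low = dynamic_power(C_EFF, *LOW)
e_high = dynamic_energy(C_EFF, *HIGH, CYCLES)
e_low = dynamic_energy(C_EFF, *LOW, CYCLES)

print(f"power:  {p_high:.3f} W -> {p_low:.3f} W")
print(f"energy: {e_high:.3f} J -> {e_low:.3f} J")
```

Under these assumptions the frequency cancels out of the energy term, so the dynamic-energy saving comes entirely from the lower voltage (the ratio is (V_low / V_high)^2). For a CPU-bound task the longer runtime also costs static energy, which this sketch leaves out; that is one reason throttling is aimed at phases that are not CPU bound.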
  • 9. DVFS Scheduling
      • Off-line, trace-based scheduling
          • Source code instrumentation for performance profiling
          • Execution with profiling
          • Determination of appropriate processor frequencies for each phase
          • Source code instrumentation for DVFS scheduling
      S. Huang & W. Feng, Proc. Cluster Computing [IEEE/ACM] (2009)
  • 10. DVFS Scheduling
      • Run-time, profiling-based scheduling
          • Time-window based performance prediction model
          • No a priori information about application phases
          • A false prediction will have dire consequences for performance or energy efficiency
      • Metrics
          • MIPS & CPU utilization
          • Interception of MPI communication calls
          • File I/O calls
          • MPI receive wait cycles
      • Shown to reduce energy under a pre-specified performance loss constraint
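A time-window based predictor can be sketched as follows. This is a minimal illustration, not the method of any cited work: the class name, the moving-average prediction, the 0.8 utilization target and the frequency list are all assumptions, and CPU utilization is the only metric used.

```python
from collections import deque


class WindowedGovernor:
    """Hypothetical run-time DVFS policy driven by recent CPU utilization."""

    def __init__(self, freqs_mhz, window=4, target_util=0.8):
        self.freqs = sorted(freqs_mhz)        # available frequencies, low to high
        self.history = deque(maxlen=window)   # demand seen in the last `window` intervals
        self.target = target_util             # keep predicted utilization below this

    def observe(self, busy_fraction, current_mhz):
        # Express the load as "MHz actually used" so that samples taken
        # at different frequencies are comparable.
        self.history.append(busy_fraction * current_mhz)

    def next_frequency(self):
        if not self.history:
            return self.freqs[-1]             # no data yet: stay at full speed
        predicted = sum(self.history) / len(self.history)
        for f in self.freqs:
            if predicted <= self.target * f:  # lowest frequency that still fits
                return f
        return self.freqs[-1]


gov = WindowedGovernor([1200, 1800, 2400])
gov.observe(0.3, 2400)                        # lightly loaded interval
low_choice = gov.next_frequency()
for _ in range(3):
    gov.observe(1.0, 2400)                    # CPU-bound phase begins
high_choice = gov.next_frequency()
```

The window length sets the trade-off: a short window reacts quickly to a new phase but mispredicts more often, and a misprediction costs either performance or energy.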
  • 11. DVFS Implementations
      • Memory MISER (Management Infra-Structure for Energy Reduction)
      • CPU MISER
      • Linux CPUSPEED
      • Ecod
      • Beta-Algorithm
      M. E. Tolentino, J. Turner & K. W. Cameron, Proc. of the 4th international conference on Computing frontiers (2007)
      S. Huang & W. Feng, Proc. Cluster Computing [IEEE/ACM] (2009)
      C. Hsu & W. Feng, Proc. of the 2005 ACM/IEEE conference on Supercomputing
  • 12. Enhancements in DVFS
      • Dynamic frequency scaling per core
          • Each core runs at its own clock
          • Power is linear with frequency
          • Power savings are relatively small
      • Separate power planes for the core and "uncore" parts of the CPU
          • Cores can go to sleep (C-state)
          • Memory controller is still operational for external devices (e.g. via DMA)
  • 13. Enhancements in DVFS
      • Clock gating
          • Clock-disabled sleep state (AMD: C1, E1; Intel: C[0,1,3,6])
          • At the CPU block level
          • At the core level
          • Reduces dynamic power
      • Power gating
          • Power to CPU/core cut off (~0 V)
          • Reduces both dynamic and static (leakage) power
  • 15. AMD's and Intel's techniques
  • 16. Power optimization at NPSF
      • Scheduler capable of:
          • Powering off a node after a pre-specified period of idleness (no job)
          • Power optimization with QoS (turnaround time)
          • Node power-on time (2-3 min) is additional
      • Targeted power policies
          • Aggressive optimization without regard to QoS
          • Power capping
          • Power budget
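The idle power-off rule can be sketched in a few lines. This is an illustration under assumptions, not the NPSF scheduler code: the function name, the node names and the dictionary layout are hypothetical, and the threshold plays the role that the simulation table presumably gives to NODEIDLEPOWERTHRESHOLD.

```python
def nodes_to_power_off(idle_since, now_min, threshold_min):
    """Return the nodes that have been idle for at least `threshold_min` minutes.

    idle_since maps node name -> minute at which the node became idle,
    or None while the node is running a job.
    """
    return sorted(
        node
        for node, since in idle_since.items()
        if since is not None and now_min - since >= threshold_min
    )


# Hypothetical cluster state at minute 8.
state = {"n1": 0, "n2": 5, "n3": None}
off_strict = nodes_to_power_off(state, now_min=8, threshold_min=8)
off_aggressive = nodes_to_power_off(state, now_min=8, threshold_min=3)
```

A lower threshold powers off more nodes sooner, but every node that is switched off adds its 2-3 minute power-on time to the turnaround of the next job that needs it.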
  • 17. Power optimization at NPSF
      • Node packing via checkpointing, migration & restart
          • MPI with BLCR: one approach
          • Use of virtualization: another approach
      • Considerations:
          • Remaining walltime of the job being migrated
          • Remaining walltime of jobs on the node under consideration
          • Cost of migration against the power savings expected to be achieved
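The cost-versus-savings consideration can be expressed as a simple energy comparison. The sketch below is a hypothetical model, not the NPSF implementation: it assumes that migrating the job frees one node, which can then stay powered off for the rest of the job's walltime, and all parameter names and numbers are illustrative.

```python
def migration_pays_off(remaining_walltime_s, migration_time_s,
                       freed_node_power_w, migration_energy_j):
    """Return True if powering off the freed node saves more energy
    than the checkpoint, migration and restart consume."""
    if remaining_walltime_s <= migration_time_s:
        return False  # the job would finish before the move completes
    saved_j = freed_node_power_w * (remaining_walltime_s - migration_time_s)
    return saved_j > migration_energy_j


# Assumed figures: 200 W node, 2 minute migration costing 50 kJ.
long_job = migration_pays_off(3600, 120, 200, 50_000)
short_job = migration_pays_off(300, 120, 200, 50_000)
```

With these assumed figures the one-hour job is worth migrating and the five-minute job is not, which is why the remaining walltime appears among the considerations.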
  • 20. Simulation Results - Table
      Parameter                            Case I    Case II   Case III
      Power saving (%)                     4.05      4.22      9.29
      NODEIDLEPOWERTHRESHOLD (minutes)     8         6         4
  • 21. Power optimization at NPSF
      • Feedback-driven policy engine
      • Speculative power on/off of nodes at any given time
      • Metrics/deciding factors
          • Function of job arrival times & resource requirements: how many nodes at what time
          • Current and probable cluster utilization at a given time
          • Expected start time of jobs in queue
      • Minimize impact on job turnaround time
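One way to picture speculative power-on is to start nodes early enough that they are up when queued jobs are expected to begin. This is only a sketch under assumptions: the function, its parameters and the queue layout are hypothetical, and the boot time is taken from the 2-3 minute power-on figure mentioned earlier in the deck.

```python
def nodes_to_power_on(queue, now_min, boot_min, powered_idle):
    """Return how many extra nodes to start booting now.

    queue is a list of (expected_start_min, nodes_needed) pairs.
    A job counts if it is expected to start within one boot time from now.
    """
    needed = sum(
        nodes for start, nodes in queue
        if start - now_min <= boot_min
    )
    return max(0, needed - powered_idle)


# Hypothetical queue: a 4-node job expected at minute 10, an 8-node job at minute 30.
queue = [(10, 4), (30, 8)]
boot_now = nodes_to_power_on(queue, now_min=8, boot_min=3, powered_idle=1)
```

Booting nodes only when the expected start time is within one boot interval keeps the impact on turnaround time small without leaving nodes powered on long before they are needed.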
  • 23. PARAM Yuva – Access & Account