SlideShare a Scribd company logo
1 of 41
Chapter 3: Principles of Scalable
         Performance

•   Performance measures
•   Speedup laws
•   Scalability principles
•   Scaling up vs. scaling down



                EENG-630 Chapter 3   1
Performance metrics and
               measures

•   Parallelism profiles
•   Asymptotic speedup factor
•   System efficiency, utilization and quality
•   Standard performance measures



                 EENG-630 Chapter 3   2
Degree of parallelism
• Reflects the matching of software and
  hardware parallelism
• Discrete time function – measure, for each
  time period, the # of processors used
• Parallelism profile is a plot of the DOP as a
  function of time
• Ideally have unlimited resources

               EENG-630 Chapter 3    3
Factors affecting parallelism
                profiles
•   Algorithm structure
•   Program optimization
•   Resource utilization
•   Run-time conditions
•   Realistically limited by # of available
    processors, memory, and other
    nonprocessor resources

                 EENG-630 Chapter 3    4
Average parallelism variables
• n – homogeneous processors
• m – maximum parallelism in a profile
∀ ∆ - computing capacity of a single
  processor (execution rate only, no
  overhead)
• DOP=i – # processors busy during an
  observation period

              EENG-630 Chapter 3   5
Average parallelism
• Total amount of work performed is
  proportional to the area under the profile
  curve
                     t2
          W = ∆ ∫ DOP(t )dt
                    t1
                    m
          W = ∆ ∑ i ⋅ ti
                    i =1
               EENG-630 Chapter 3    6
Average parallelism

        1 t2
A=
    t 2 − t1 ∫t1 DOP(t )dt

      m
                   m
                          
A =  ∑ i ⋅ ti  /  ∑ ti 
     i =1        i =1 
        EENG-630 Chapter 3   7
Example: parallelism profile and
     average parallelism




          EENG-630 Chapter 3   8
Asymptotic speedup
                                                     m
         m
T (1) = ∑ ti (1) = ∑
                     m
                        Wi
                                       T (1)       ∑W         i

        i =1       i =1 ∆         S∞ =        =    i =1
                                       T (∞ )     m
          m
T ( ∞ ) = ∑ ti ( ∞ ) = ∑
                          m
                            Wi                    ∑W /i
                                                  i =1
                                                          i


          i =1         i =1 i∆        = A in the ideal case
        (response time)



                     EENG-630 Chapter 3      9
Performance measures
• Consider n processors executing m
  programs in various modes
• Want to define the mean performance of
  these multimode computers:
  – Arithmetic mean performance
  – Geometric mean performance
  – Harmonic mean performance


              EENG-630 Chapter 3   10
Arithmetic mean performance
        m
Ra = ∑ Ri / m             Arithmetic mean execution rate
                          (assumes equal weighting)
       i =1
       m
R = ∑ ( f i Ri )
 *                        Weighted arithmetic mean
 a                        execution rate
       i =1

     -proportional to the sum of the inverses of
      execution times

               EENG-630 Chapter 3           11
Geometric mean performance
        m
Rg = ∏ R       1/ m
               i
                        Geometric mean execution rate

       i =1
          m
R = ∏ Ri
 *
 g
                 fi      Weighted geometric mean
                         execution rate
        i =1
  -does not summarize the real performance since it does
   not have the inverse relation with the total time

                EENG-630 Chapter 3         12
Harmonic mean performance

 Ti = 1 / Ri      Mean execution time per instruction
                  For program i


    1 m    1 m 1
Ta = ∑ Ti = ∑          Arithmetic mean execution time
    m i =1 m i =1 Ri   per instruction




                EENG-630 Chapter 3        13
Harmonic mean performance
                            m
Rh =1 / Ta =          m               Harmonic mean execution rate
                    ∑1 / R )
                     (
                     i=1
                                 i




             1
R =
 *
 h    m
                            Weighted harmonic mean execution rate

      ∑( f
      i =1
             i   / Ri )
                      -corresponds to total # of operations divided by
                       the total time (closest to the real performance)
                          EENG-630 Chapter 3        14
Harmonic Mean Speedup
• Ties the various modes of a program to the
  number of processors used
• Program is in mode i if i processors used
• Sequential execution time T1 = 1/R1 = 1
                                 1
          S = T1 / T =
                   *

                         ∑
                             n
                          i =1
                                 f i / Ri

              EENG-630 Chapter 3            15
Harmonic Mean Speedup
     Performance




     EENG-630 Chapter 3   16
Amdahl’s Law
• Assume Ri = i, w = (α, 0, 0, …, 1- α)
• System is either sequential, with probability
  α, or fully parallel with prob. 1- α
                     n
          Sn =
               1 + (n − 1)α
• Implies S → 1/ α as n → ∞
               EENG-630 Chapter 3   17
Speedup Performance




    EENG-630 Chapter 3   18
System Efficiency
• O(n) is the total # of unit operations
• T(n) is execution time in unit time steps
• T(n) < O(n) and T(1) = O(1)

           S ( N ) = T (1) / T (n)

                   S (n) T (1)
          E ( n) =      =
                     n    nT (n)
               EENG-630 Chapter 3    19
Redundancy and Utilization
• Redundancy signifies the extent of
  matching software and hardware parallelism

         R (n) = O(n) / O(1)
• Utilization indicates the percentage of
  resources kept busy during execution
                                  O ( n)
       U ( n) = R ( n) E ( n) =
                                nT (n20
               EENG-630 Chapter 3
                                        )
Quality of Parallelism
• Directly proportional to the speedup and
  efficiency and inversely related to the
  redundancy
• Upper-bounded by the speedup S(n)

                                   3
             S ( n) E ( n)     T (1)
    Q ( n) =               =   2
                 R ( n)      nT (n)O(n)
              EENG-630 Chapter 3       21
Example of Performance
• Given O(1) = T(1) = n3, O(n) = n3 + n2log n,
  and T(n) = 4 n3/(n+3)
  S(n)    =     (n+3)/4
  E(n)    =     (n+3)/(4n)
  R(n)    =     (n + log n)/n
  U(n)    =     (n+3)(n + log n)/(4n2)
  Q(n)    =     (n+3)2 / (16(n + log n))


               EENG-630 Chapter 3       22
Standard Performance Measures
• MIPS and Mflops
  – Depends on instruction set and program used
• Dhrystone results
  – Measure of integer performance
• Whestone results
  – Measure of floating-point performance
• TPS and KLIPS ratings
  – Transaction performance and reasoning power

               EENG-630 Chapter 3     23
Parallel Processing Applications
•   Drug design
•   High-speed civil transport
•   Ocean modeling
•   Ozone depletion research
•   Air pollution
•   Digital anatomy


                EENG-630 Chapter 3   24
Application Models for Parallel
           Computers
• Fixed-load model
  – Constant workload
• Fixed-time model
  – Demands constant program execution time
• Fixed-memory model
  – Limited by the memory bound



              EENG-630 Chapter 3    25
Algorithm Characteristics
• Deterministic vs. nondeterministic
• Computational granularity
• Parallelism profile
• Communication patterns and
  synchronization requirements
• Uniformity of operations
• Memory requirement and data structures
              EENG-630 Chapter 3   26
Isoefficiency Concept
• Relates workload to machine size n needed
  to maintain a fixed efficiency
                                   workload
                 w( s )
        E=                             overhead
           w( s ) + h( s, n)

• The smaller the power of n, the more
  scalable the system

              EENG-630 Chapter 3      27
Isoefficiency Function
• To maintain a constant E, w(s) should grow
  in proportion to h(s,n)

                   E
         w( s ) =      × h( s, n)
                  1− E
• C = E/(1-E) is constant for fixed E

        f E ( n) = C × h( s, n)
               EENG-630 Chapter 3   28
Speedup Performance Laws
• Amdahl’s law
  – for fixed workload or fixed problem size
• Gustafson’s law
  – for scaled problems (problem size increases
    with increased machine size)
• Speedup model
  – for scaled problems bounded by memory
    capacity

               EENG-630 Chapter 3      29
Amdahl’s Law
• As # of processors increase, the fixed load
  is distributed to more processors
• Minimal turnaround time is primary goal
• Speedup factor is upper-bounded by a
  sequential bottleneck
• Two cases:
   DOP < n
   DOP ≥ n

               EENG-630 Chapter 3   30
Fixed Load Speedup Factor
• Case 1: DOP > n                  • Case 2: DOP < n
            Wi   i 
  ti (i ) =      n                                        Wi
            i∆                     t i ( n ) = t i (∞ ) =
                                                            i∆
            m
               Wi   i
 T ( n) = ∑         n
          i =1 i∆                     m

                        T (1)           ∑W     i
                   Sn =        =        i =i

                        T ( n)      m
                                        Wi     i 
                                   ∑i
                                   i =1
                                               n
                                                
                        EENG-630 Chapter 3            31
Gustafson’s Law
• With Amdahl’s Law, the workload cannot
  scale to match the available computing
  power as n increases
• Gustafson’s Law fixes the time, allowing
  the problem size to increase with higher n
• Not saving time, but increasing accuracy


              EENG-630 Chapter 3   32
Fixed-time Speedup
• As the machine size increases, have
  increased workload and new profile
• In general, Wi’ > Wi for 2 ≤ i ≤ m’ and   W1’
  = W1
• Assume T(1) = T’(n)



               EENG-630 Chapter 3   33
Gustafson’s Scaled Speedup
                             '
 m
            Wi  m
                                 i 
 ∑Wi = ∑ i
 i =1  i =1
                                 n
                                  
                                      + Q ( n)
         m'

         ∑W         i
                        '
                              W1 + nWn
  S ='
     n
         i =1
                            =
           m
                              W1 + Wn
         ∑W
         i =1
                    i

          EENG-630 Chapter 3               34
Memory Bounded Speedup
             Model
• Idea is to solve largest problem, limited by
  memory space
• Results in a scaled workload and higher accuracy
• Each node can handle only a small subproblem for
  distributed memory
• Using a large # of nodes collectively increases the
  memory capacity proportionally


                 EENG-630 Chapter 3      35
Fixed-Memory Speedup
• Let M be the memory requirement and W
  the computational workload: W = g(M)
• g*(nM)=G(n)g(M)=G(n)Wn
                   m*

               ∑W         i
                              *
                                        W1 + G (n)Wn
 S =
  *
  n
                   i =1
                                     =
       m*
            Wi *    i                W1 + G (n)Wn / n
       ∑ i
       i =1
                     n  + Q ( n)
                     
                     EENG-630 Chapter 3       36
Relating Speedup Models
• G(n) reflects the increase in workload as
  memory increases n times
• G(n) = 1 : Fixed problem size (Amdahl)
• G(n) = n : Workload increases n times when
  memory increased n times (Gustafson)
• G(n) > n : workload increases faster than
  memory than the memory requirement

              EENG-630 Chapter 3   37
Scalability Metrics
• Machine size (n) : # of processors
• Clock rate (f) : determines basic m/c cycle
• Problem size (s) : amount of computational
  workload. Directly proportional to T(s,1).
• CPU time (T(s,n)) : actual CPU time for
  execution
• I/O demand (d) : demand in moving the
  program, data, and results for a given run
              EENG-630 Chapter 3   38
Scalability Metrics
• Memory capacity (m) : max # of memory words
  demanded
• Communication overhead (h(s,n)) : amount of
  time for interprocessor communication,
  synchronization, etc.
• Computer cost (c) : total cost of h/w and s/w
  resources required
• Programming overhead (p) : development
  overhead associated with an application program

                EENG-630 Chapter 3     39
Speedup and Efficiency
• The problem size is the independent
  parameter
                         T ( s,1)
      S ( s, n) =
                  T ( s, n) + h( s, n)
                       S ( s, n)
           E ( s, n) =
                           n
              EENG-630 Chapter 3    40
Scalable Systems
• Ideally, if E(s,n)=1 for all algorithms and
  any s and n, system is scalable
• Practically, consider the scalability of a m/c

                     S ( s, n) TI ( s, n)
         Φ ( s, n) =            =
                     S I ( s, n) T ( s, n)


                EENG-630 Chapter 3     41

More Related Content

What's hot

Interconnection Network
Interconnection NetworkInterconnection Network
Interconnection NetworkHeman Pathak
 
system interconnect architectures in ACA
system interconnect architectures in ACAsystem interconnect architectures in ACA
system interconnect architectures in ACAPankaj Kumar Jain
 
Dichotomy of parallel computing platforms
Dichotomy of parallel computing platformsDichotomy of parallel computing platforms
Dichotomy of parallel computing platformsSyed Zaid Irshad
 
Introduction to Distributed System
Introduction to Distributed SystemIntroduction to Distributed System
Introduction to Distributed SystemSunita Sahu
 
Inter Process Communication Presentation[1]
Inter Process Communication Presentation[1]Inter Process Communication Presentation[1]
Inter Process Communication Presentation[1]Ravindra Raju Kolahalam
 
Foult Tolerence In Distributed System
Foult Tolerence In Distributed SystemFoult Tolerence In Distributed System
Foult Tolerence In Distributed SystemRajan Kumar
 
System interconnect architecture
System interconnect architectureSystem interconnect architecture
System interconnect architectureGagan Kumar
 
Fault tolerance in distributed systems
Fault tolerance in distributed systemsFault tolerance in distributed systems
Fault tolerance in distributed systemssumitjain2013
 
ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING
ADVANCED COMPUTER ARCHITECTUREAND PARALLEL PROCESSINGADVANCED COMPUTER ARCHITECTUREAND PARALLEL PROCESSING
ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING Zena Abo-Altaheen
 
Distributed & parallel system
Distributed & parallel systemDistributed & parallel system
Distributed & parallel systemManish Singh
 
INTER PROCESS COMMUNICATION (IPC).pptx
INTER PROCESS COMMUNICATION (IPC).pptxINTER PROCESS COMMUNICATION (IPC).pptx
INTER PROCESS COMMUNICATION (IPC).pptxLECO9
 
Parallel computing
Parallel computingParallel computing
Parallel computingVinay Gupta
 
Synchronization in distributed computing
Synchronization in distributed computingSynchronization in distributed computing
Synchronization in distributed computingSVijaylakshmi
 
Processor allocation in Distributed Systems
Processor allocation in Distributed SystemsProcessor allocation in Distributed Systems
Processor allocation in Distributed SystemsRitu Ranjan Shrivastwa
 
Heterogeneous computing
Heterogeneous computingHeterogeneous computing
Heterogeneous computingRashid Ansari
 
CS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSCS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSKathirvel Ayyaswamy
 

What's hot (20)

Interconnection Network
Interconnection NetworkInterconnection Network
Interconnection Network
 
system interconnect architectures in ACA
system interconnect architectures in ACAsystem interconnect architectures in ACA
system interconnect architectures in ACA
 
Dichotomy of parallel computing platforms
Dichotomy of parallel computing platformsDichotomy of parallel computing platforms
Dichotomy of parallel computing platforms
 
Introduction to Distributed System
Introduction to Distributed SystemIntroduction to Distributed System
Introduction to Distributed System
 
Inter Process Communication Presentation[1]
Inter Process Communication Presentation[1]Inter Process Communication Presentation[1]
Inter Process Communication Presentation[1]
 
Distributed Operating System_4
Distributed Operating System_4Distributed Operating System_4
Distributed Operating System_4
 
Foult Tolerence In Distributed System
Foult Tolerence In Distributed SystemFoult Tolerence In Distributed System
Foult Tolerence In Distributed System
 
System interconnect architecture
System interconnect architectureSystem interconnect architecture
System interconnect architecture
 
Fault tolerance in distributed systems
Fault tolerance in distributed systemsFault tolerance in distributed systems
Fault tolerance in distributed systems
 
Parallel processing
Parallel processingParallel processing
Parallel processing
 
ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING
ADVANCED COMPUTER ARCHITECTUREAND PARALLEL PROCESSINGADVANCED COMPUTER ARCHITECTUREAND PARALLEL PROCESSING
ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING
 
Distributed & parallel system
Distributed & parallel systemDistributed & parallel system
Distributed & parallel system
 
INTER PROCESS COMMUNICATION (IPC).pptx
INTER PROCESS COMMUNICATION (IPC).pptxINTER PROCESS COMMUNICATION (IPC).pptx
INTER PROCESS COMMUNICATION (IPC).pptx
 
Parallel computing
Parallel computingParallel computing
Parallel computing
 
Synchronization in distributed computing
Synchronization in distributed computingSynchronization in distributed computing
Synchronization in distributed computing
 
Processor allocation in Distributed Systems
Processor allocation in Distributed SystemsProcessor allocation in Distributed Systems
Processor allocation in Distributed Systems
 
Heterogeneous computing
Heterogeneous computingHeterogeneous computing
Heterogeneous computing
 
Distributed Operating System_1
Distributed Operating System_1Distributed Operating System_1
Distributed Operating System_1
 
Chpt7
Chpt7Chpt7
Chpt7
 
CS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSCS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMS
 

Viewers also liked

Viewers also liked (20)

Parallel computing chapter 2
Parallel computing chapter 2Parallel computing chapter 2
Parallel computing chapter 2
 
Parallel computing(2)
Parallel computing(2)Parallel computing(2)
Parallel computing(2)
 
Computer architecture kai hwang
Computer architecture   kai hwangComputer architecture   kai hwang
Computer architecture kai hwang
 
Bengali optical character recognition system
Bengali optical character recognition systemBengali optical character recognition system
Bengali optical character recognition system
 
Observer pattern
Observer patternObserver pattern
Observer pattern
 
Mediator pattern
Mediator patternMediator pattern
Mediator pattern
 
Clustering manual
Clustering manualClustering manual
Clustering manual
 
Parallel searching
Parallel searchingParallel searching
Parallel searching
 
Advanced computer architecture
Advanced computer architectureAdvanced computer architecture
Advanced computer architecture
 
Program and Network Properties
Program and Network PropertiesProgram and Network Properties
Program and Network Properties
 
Statistical Machine Translation for Language Localisation
Statistical Machine Translation for Language LocalisationStatistical Machine Translation for Language Localisation
Statistical Machine Translation for Language Localisation
 
Brown Junior Public School - EQAO School Report
Brown Junior Public School - EQAO School ReportBrown Junior Public School - EQAO School Report
Brown Junior Public School - EQAO School Report
 
Aca2 08 new
Aca2 08 newAca2 08 new
Aca2 08 new
 
Aca2 01 new
Aca2 01 newAca2 01 new
Aca2 01 new
 
Map reduce
Map reduceMap reduce
Map reduce
 
R with excel
R with excelR with excel
R with excel
 
Apache hadoop & map reduce
Apache hadoop & map reduceApache hadoop & map reduce
Apache hadoop & map reduce
 
Strategy pattern.pdf
Strategy pattern.pdfStrategy pattern.pdf
Strategy pattern.pdf
 
Interviews 1
Interviews 1Interviews 1
Interviews 1
 
Twitter
TwitterTwitter
Twitter
 

Similar to Parallel computing chapter 3

CS8451 - Design and Analysis of Algorithms
CS8451 - Design and Analysis of AlgorithmsCS8451 - Design and Analysis of Algorithms
CS8451 - Design and Analysis of AlgorithmsKrishnan MuthuManickam
 
Algorithm analysis
Algorithm analysisAlgorithm analysis
Algorithm analysissumitbardhan
 
Algorithm analysis
Algorithm analysisAlgorithm analysis
Algorithm analysisAkshay Dagar
 
Data Structures- Part2 analysis tools
Data Structures- Part2 analysis toolsData Structures- Part2 analysis tools
Data Structures- Part2 analysis toolsAbdullah Al-hazmy
 
Asymptotics 140510003721-phpapp02
Asymptotics 140510003721-phpapp02Asymptotics 140510003721-phpapp02
Asymptotics 140510003721-phpapp02mansab MIRZA
 
algorithmanalysis and effciency.pptx
algorithmanalysis and effciency.pptxalgorithmanalysis and effciency.pptx
algorithmanalysis and effciency.pptxChSreenivasuluReddy
 
jn;lm;lkm';m';;lmppt of data structure.pdf
jn;lm;lkm';m';;lmppt of data structure.pdfjn;lm;lkm';m';;lmppt of data structure.pdf
jn;lm;lkm';m';;lmppt of data structure.pdfVinayNassa3
 
Chapter One.pdf
Chapter One.pdfChapter One.pdf
Chapter One.pdfabay golla
 
Data Structure & Algorithms - Mathematical
Data Structure & Algorithms - MathematicalData Structure & Algorithms - Mathematical
Data Structure & Algorithms - Mathematicalbabuk110
 
Data Structures Notes
Data Structures NotesData Structures Notes
Data Structures NotesRobinRohit2
 
Unit i basic concepts of algorithms
Unit i basic concepts of algorithmsUnit i basic concepts of algorithms
Unit i basic concepts of algorithmssangeetha s
 
Algorithm chapter 2
Algorithm chapter 2Algorithm chapter 2
Algorithm chapter 2chidabdu
 
Introduction to Data Structures Sorting and searching
Introduction to Data Structures Sorting and searchingIntroduction to Data Structures Sorting and searching
Introduction to Data Structures Sorting and searchingMvenkatarao
 

Similar to Parallel computing chapter 3 (20)

CS8451 - Design and Analysis of Algorithms
CS8451 - Design and Analysis of AlgorithmsCS8451 - Design and Analysis of Algorithms
CS8451 - Design and Analysis of Algorithms
 
Unit 1
Unit 1Unit 1
Unit 1
 
Algorithm analysis
Algorithm analysisAlgorithm analysis
Algorithm analysis
 
Algorithm analysis
Algorithm analysisAlgorithm analysis
Algorithm analysis
 
Big Oh.ppt
Big Oh.pptBig Oh.ppt
Big Oh.ppt
 
Data Structures- Part2 analysis tools
Data Structures- Part2 analysis toolsData Structures- Part2 analysis tools
Data Structures- Part2 analysis tools
 
Unit ii algorithm
Unit   ii algorithmUnit   ii algorithm
Unit ii algorithm
 
Asymptotics 140510003721-phpapp02
Asymptotics 140510003721-phpapp02Asymptotics 140510003721-phpapp02
Asymptotics 140510003721-phpapp02
 
algorithmanalysis and effciency.pptx
algorithmanalysis and effciency.pptxalgorithmanalysis and effciency.pptx
algorithmanalysis and effciency.pptx
 
Iare ds ppt_3
Iare ds ppt_3Iare ds ppt_3
Iare ds ppt_3
 
jn;lm;lkm';m';;lmppt of data structure.pdf
jn;lm;lkm';m';;lmppt of data structure.pdfjn;lm;lkm';m';;lmppt of data structure.pdf
jn;lm;lkm';m';;lmppt of data structure.pdf
 
Chapter One.pdf
Chapter One.pdfChapter One.pdf
Chapter One.pdf
 
Algorithms
Algorithms Algorithms
Algorithms
 
Chapter two
Chapter twoChapter two
Chapter two
 
Data Structure & Algorithms - Mathematical
Data Structure & Algorithms - MathematicalData Structure & Algorithms - Mathematical
Data Structure & Algorithms - Mathematical
 
L12 complexity
L12 complexityL12 complexity
L12 complexity
 
Data Structures Notes
Data Structures NotesData Structures Notes
Data Structures Notes
 
Unit i basic concepts of algorithms
Unit i basic concepts of algorithmsUnit i basic concepts of algorithms
Unit i basic concepts of algorithms
 
Algorithm chapter 2
Algorithm chapter 2Algorithm chapter 2
Algorithm chapter 2
 
Introduction to Data Structures Sorting and searching
Introduction to Data Structures Sorting and searchingIntroduction to Data Structures Sorting and searching
Introduction to Data Structures Sorting and searching
 

More from Md. Mahedi Mahfuj

More from Md. Mahedi Mahfuj (19)

Parallel computing(1)
Parallel computing(1)Parallel computing(1)
Parallel computing(1)
 
Message passing interface
Message passing interfaceMessage passing interface
Message passing interface
 
Matrix multiplication graph
Matrix multiplication graphMatrix multiplication graph
Matrix multiplication graph
 
Strategy pattern
Strategy patternStrategy pattern
Strategy pattern
 
Database management system chapter16
Database management system chapter16Database management system chapter16
Database management system chapter16
 
Database management system chapter15
Database management system chapter15Database management system chapter15
Database management system chapter15
 
Database management system chapter12
Database management system chapter12Database management system chapter12
Database management system chapter12
 
Strategies in job search process
Strategies in job search processStrategies in job search process
Strategies in job search process
 
Report writing(short)
Report writing(short)Report writing(short)
Report writing(short)
 
Report writing(long)
Report writing(long)Report writing(long)
Report writing(long)
 
Job search_resume
Job search_resumeJob search_resume
Job search_resume
 
Job search_interview
Job search_interviewJob search_interview
Job search_interview
 
Basic and logical implementation of r language
Basic and logical implementation of r language Basic and logical implementation of r language
Basic and logical implementation of r language
 
R language
R languageR language
R language
 
Big data
Big dataBig data
Big data
 
Chatbot Artificial Intelligence
Chatbot Artificial IntelligenceChatbot Artificial Intelligence
Chatbot Artificial Intelligence
 
Cloud testing v1
Cloud testing v1Cloud testing v1
Cloud testing v1
 
Distributed deadlock
Distributed deadlockDistributed deadlock
Distributed deadlock
 
Paper review
Paper review Paper review
Paper review
 

Recently uploaded

TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Fact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMsFact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMsZilliz
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 

Recently uploaded (20)

TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Fact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMsFact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMs
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 

Parallel computing chapter 3

  • 1. Chapter 3: Principles of Scalable Performance • Performance measures • Speedup laws • Scalability principles • Scaling up vs. scaling down EENG-630 Chapter 3 1
  • 2. Performance metrics and measures • Parallelism profiles • Asymptotic speedup factor • System efficiency, utilization and quality • Standard performance measures EENG-630 Chapter 3 2
  • 3. Degree of parallelism • Reflects the matching of software and hardware parallelism • Discrete time function – measure, for each time period, the # of processors used • Parallelism profile is a plot of the DOP as a function of time • Ideally have unlimited resources EENG-630 Chapter 3 3
  • 4. Factors affecting parallelism profiles • Algorithm structure • Program optimization • Resource utilization • Run-time conditions • Realistically limited by # of available processors, memory, and other nonprocessor resources EENG-630 Chapter 3 4
  • 5. Average parallelism variables • n – homogeneous processors • m – maximum parallelism in a profile ∀ ∆ - computing capacity of a single processor (execution rate only, no overhead) • DOP=i – # processors busy during an observation period EENG-630 Chapter 3 5
  • 6. Average parallelism • Total amount of work performed is proportional to the area under the profile curve t2 W = ∆ ∫ DOP(t )dt t1 m W = ∆ ∑ i ⋅ ti i =1 EENG-630 Chapter 3 6
  • 7. Average parallelism 1 t2 A= t 2 − t1 ∫t1 DOP(t )dt  m   m  A =  ∑ i ⋅ ti  /  ∑ ti   i =1   i =1  EENG-630 Chapter 3 7
  • 8. Example: parallelism profile and average parallelism EENG-630 Chapter 3 8
  • 9. Asymptotic speedup m m T (1) = ∑ ti (1) = ∑ m Wi T (1) ∑W i i =1 i =1 ∆ S∞ = = i =1 T (∞ ) m m T ( ∞ ) = ∑ ti ( ∞ ) = ∑ m Wi ∑W /i i =1 i i =1 i =1 i∆ = A in the ideal case (response time) EENG-630 Chapter 3 9
  • 10. Performance measures • Consider n processors executing m programs in various modes • Want to define the mean performance of these multimode computers: – Arithmetic mean performance – Geometric mean performance – Harmonic mean performance EENG-630 Chapter 3 10
  • 11. Arithmetic mean performance m Ra = ∑ Ri / m Arithmetic mean execution rate (assumes equal weighting) i =1 m R = ∑ ( f i Ri ) * Weighted arithmetic mean a execution rate i =1 -proportional to the sum of the inverses of execution times EENG-630 Chapter 3 11
  • 12. Geometric mean performance m Rg = ∏ R 1/ m i Geometric mean execution rate i =1 m R = ∏ Ri * g fi Weighted geometric mean execution rate i =1 -does not summarize the real performance since it does not have the inverse relation with the total time EENG-630 Chapter 3 12
  • 13. Harmonic mean performance Ti = 1 / Ri Mean execution time per instruction For program i 1 m 1 m 1 Ta = ∑ Ti = ∑ Arithmetic mean execution time m i =1 m i =1 Ri per instruction EENG-630 Chapter 3 13
  • 14. Harmonic mean performance m Rh =1 / Ta = m Harmonic mean execution rate ∑1 / R ) ( i=1 i 1 R = * h m Weighted harmonic mean execution rate ∑( f i =1 i / Ri ) -corresponds to total # of operations divided by the total time (closest to the real performance) EENG-630 Chapter 3 14
  • 15. Harmonic Mean Speedup • Ties the various modes of a program to the number of processors used • Program is in mode i if i processors used • Sequential execution time T1 = 1/R1 = 1 1 S = T1 / T = * ∑ n i =1 f i / Ri EENG-630 Chapter 3 15
  • 16. Harmonic Mean Speedup Performance EENG-630 Chapter 3 16
  • 17. Amdahl’s Law • Assume Ri = i, w = (α, 0, 0, …, 1- α) • System is either sequential, with probability α, or fully parallel with prob. 1- α n Sn = 1 + (n − 1)α • Implies S → 1/ α as n → ∞ EENG-630 Chapter 3 17
  • 18. Speedup Performance EENG-630 Chapter 3 18
  • 19. System Efficiency • O(n) is the total # of unit operations • T(n) is execution time in unit time steps • T(n) < O(n) and T(1) = O(1) S ( N ) = T (1) / T (n) S (n) T (1) E ( n) = = n nT (n) EENG-630 Chapter 3 19
  • 20. Redundancy and Utilization • Redundancy signifies the extent of matching software and hardware parallelism R (n) = O(n) / O(1) • Utilization indicates the percentage of resources kept busy during execution O ( n) U ( n) = R ( n) E ( n) = nT (n20 EENG-630 Chapter 3 )
  • 21. Quality of Parallelism • Directly proportional to the speedup and efficiency and inversely related to the redundancy • Upper-bounded by the speedup S(n) 3 S ( n) E ( n) T (1) Q ( n) = = 2 R ( n) nT (n)O(n) EENG-630 Chapter 3 21
  • 22. Example of Performance • Given O(1) = T(1) = n3, O(n) = n3 + n2log n, and T(n) = 4 n3/(n+3) S(n) = (n+3)/4 E(n) = (n+3)/(4n) R(n) = (n + log n)/n U(n) = (n+3)(n + log n)/(4n2) Q(n) = (n+3)2 / (16(n + log n)) EENG-630 Chapter 3 22
  • 23. Standard Performance Measures • MIPS and Mflops – Depends on instruction set and program used • Dhrystone results – Measure of integer performance • Whestone results – Measure of floating-point performance • TPS and KLIPS ratings – Transaction performance and reasoning power EENG-630 Chapter 3 23
  • 24. Parallel Processing Applications • Drug design • High-speed civil transport • Ocean modeling • Ozone depletion research • Air pollution • Digital anatomy EENG-630 Chapter 3 24
  • 25. Application Models for Parallel Computers • Fixed-load model – Constant workload • Fixed-time model – Demands constant program execution time • Fixed-memory model – Limited by the memory bound EENG-630 Chapter 3 25
  • 26. Algorithm Characteristics • Deterministic vs. nondeterministic • Computational granularity • Parallelism profile • Communication patterns and synchronization requirements • Uniformity of operations • Memory requirement and data structures EENG-630 Chapter 3 26
  • 27. Isoefficiency Concept • Relates workload to machine size n needed to maintain a fixed efficiency workload w( s ) E= overhead w( s ) + h( s, n) • The smaller the power of n, the more scalable the system EENG-630 Chapter 3 27
  • 28. Isoefficiency Function • To maintain a constant E, w(s) should grow in proportion to h(s,n) E w( s ) = × h( s, n) 1− E • C = E/(1-E) is constant for fixed E f E ( n) = C × h( s, n) EENG-630 Chapter 3 28
  • 29. Speedup Performance Laws • Amdahl’s law – for fixed workload or fixed problem size • Gustafson’s law – for scaled problems (problem size increases with increased machine size) • Speedup model – for scaled problems bounded by memory capacity EENG-630 Chapter 3 29
  • 30. Amdahl’s Law • As # of processors increase, the fixed load is distributed to more processors • Minimal turnaround time is primary goal • Speedup factor is upper-bounded by a sequential bottleneck • Two cases:  DOP < n  DOP ≥ n EENG-630 Chapter 3 30
  • 31. Fixed Load Speedup Factor • Case 1: DOP > n • Case 2: DOP < n Wi i  ti (i ) = n Wi i∆   t i ( n ) = t i (∞ ) = i∆ m Wi i T ( n) = ∑ n i =1 i∆   m T (1) ∑W i Sn = = i =i T ( n) m Wi i  ∑i i =1 n   EENG-630 Chapter 3 31
  • 32. Gustafson’s Law • With Amdahl’s Law, the workload cannot scale to match the available computing power as n increases • Gustafson’s Law fixes the time, allowing the problem size to increase with higher n • Not saving time, but increasing accuracy EENG-630 Chapter 3 32
  • 33. Fixed-time Speedup • As the machine size increases, have increased workload and new profile • In general, Wi’ > Wi for 2 ≤ i ≤ m’ and W1’ = W1 • Assume T(1) = T’(n) EENG-630 Chapter 3 33
  • 34. Gustafson’s Scaled Speedup ' m Wi m i  ∑Wi = ∑ i i =1 i =1 n   + Q ( n) m' ∑W i ' W1 + nWn S =' n i =1 = m W1 + Wn ∑W i =1 i EENG-630 Chapter 3 34
  • 35. Memory Bounded Speedup Model • Idea is to solve largest problem, limited by memory space • Results in a scaled workload and higher accuracy • Each node can handle only a small subproblem for distributed memory • Using a large # of nodes collectively increases the memory capacity proportionally EENG-630 Chapter 3 35
  • 36. Fixed-Memory Speedup • Let M be the memory requirement and W the computational workload: W = g(M) • g*(nM)=G(n)g(M)=G(n)Wn m* ∑W i * W1 + G (n)Wn S = * n i =1 = m* Wi * i  W1 + G (n)Wn / n ∑ i i =1  n  + Q ( n)   EENG-630 Chapter 3 36
  • 37. Relating Speedup Models • G(n) reflects the increase in workload as memory increases n times • G(n) = 1 : Fixed problem size (Amdahl) • G(n) = n : Workload increases n times when memory increased n times (Gustafson) • G(n) > n : workload increases faster than memory than the memory requirement EENG-630 Chapter 3 37
  • 38. Scalability Metrics • Machine size (n) : # of processors • Clock rate (f) : determines basic m/c cycle • Problem size (s) : amount of computational workload. Directly proportional to T(s,1). • CPU time (T(s,n)) : actual CPU time for execution • I/O demand (d) : demand in moving the program, data, and results for a given run EENG-630 Chapter 3 38
  • 39. Scalability Metrics • Memory capacity (m) : max # of memory words demanded • Communication overhead (h(s,n)) : amount of time for interprocessor communication, synchronization, etc. • Computer cost (c) : total cost of h/w and s/w resources required • Programming overhead (p) : development overhead associated with an application program EENG-630 Chapter 3 39
  • 40. Speedup and Efficiency • The problem size is the independent parameter T ( s,1) S ( s, n) = T ( s, n) + h( s, n) S ( s, n) E ( s, n) = n EENG-630 Chapter 3 40
  • 41. Scalable Systems • Ideally, if E(s,n)=1 for all algorithms and any s and n, system is scalable • Practically, consider the scalability of a m/c S ( s, n) TI ( s, n) Φ ( s, n) = = S I ( s, n) T ( s, n) EENG-630 Chapter 3 41