New operational instruments for statistical exploration (=NOISE)




                                New operational instruments
                                 for statistical exploration
                                         (=NOISE)

                                               Christian P. Robert

                                          Université Paris Dauphine
                                  http://www.ceremade.dauphine.fr/~xian


                                         Licence MI2E, 2008–2009




Outline


      1    Simulation of random variables

      2    Monte Carlo Method and EM algorithm

      3    Bootstrap Method

      4    Rudiments of Nonparametric Statistics




Chapter 1 :
Simulation of random variables


                Introduction
                Random generator
                Non-uniform distributions (1)
                Non-uniform distributions (2)
                Markovian methods
Introduction

       Necessity to "reproduce chance" on a computer
               Evaluation of the behaviour of a complex system (network,
               computer program, queue, particle system, atmosphere,
               epidemics, economic actions, etc.)
               Determination of the probabilistic properties of a new statistical
               procedure, or of a procedure under an unknown distribution [bootstrap]
               Validation of a probabilistic model
               Approximation of an expectation/integral for a non-standard
               distribution [Law of Large Numbers]
               Maximisation of a weakly regular function/likelihood




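       Example (CLT for the binomial distribution)
       If
               $$ X_n \sim \mathcal{B}(n, p), $$
       then the centred and rescaled frequency X_n/n converges in distribution to a normal distribution:
               $$ \sqrt{n}\,(X_n/n - p) \;\overset{n\to\infty}{\rightsquigarrow}\; \mathcal{N}\big(0,\, p(1 - p)\big) $$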
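The convergence stated in this example can be checked by simulation. The following is a minimal Python sketch using only the standard library. The function names `standardised_binomial` and `clt_check`, the value p = 0.5 and the number of replications are illustrative assumptions, not taken from the slides. It draws X_n as a sum of n Bernoulli(p) variables and compares the empirical variance of sqrt(n) (X_n/n - p) with p(1 - p).

```python
import math
import random
import statistics


def standardised_binomial(n, p, rng):
    """Draw X_n ~ B(n, p) as a sum of n Bernoulli(p) draws; return sqrt(n) (X_n/n - p)."""
    x = sum(1 for _ in range(n) if rng.random() < p)
    return math.sqrt(n) * (x / n - p)


def clt_check(n, p, replications=2000, seed=1):
    """Empirical mean and variance of sqrt(n) (X_n/n - p) over independent replications."""
    rng = random.Random(seed)
    draws = [standardised_binomial(n, p, rng) for _ in range(replications)]
    return statistics.fmean(draws), statistics.pvariance(draws)


if __name__ == "__main__":
    p = 0.5  # assumed value, chosen only for illustration
    for n in (4, 64, 1024):
        mean, var = clt_check(n, p)
        print(f"n={n:5d}  mean={mean:+.3f}  variance={var:.3f}  (limit variance: {p * (1 - p):.3f})")
```

The variance of sqrt(n) (X_n/n - p) equals p(1 - p) for every n, so what changes with n is the shape of the distribution, which a histogram of the draws would show getting closer to the normal curve.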




       [Figure: nine panels, for n = 4, 8, 16, 32, 64, 128, 256, 512 and 1024; the horizontal range narrows around 0.5 as n grows]
       Example (Stochastic minimisation)
       Consider the function

               $$ h(x, y) = (x \sin(20y) + y \sin(20x))^2 \cosh(\sin(10x)\,x)
                          + (x \cos(10y) - y \sin(10x))^2 \cosh(\cos(20y)\,y), $$

       to be minimised. (The global minimum is known to be 0, attained at
       (x, y) = (0, 0).)
       [Figure: surface plot of h(x, y), with X and Y ranging over [-1, 1] and Z from 0 to 6]
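       Example (Stochastic minimisation (2))
       Instead of solving the first-order equations

               $$ \frac{\partial h(x, y)}{\partial x} = 0, \qquad \frac{\partial h(x, y)}{\partial y} = 0 $$

       and checking that the second-order conditions are met, we can
       generate a random sequence in $\mathbb{R}^2$

               $$ \theta_{j+1} = \theta_j - \frac{\alpha_j}{2\beta_j}\, \Delta h(\theta_j, \beta_j \zeta_j)\, \zeta_j $$

       where
           ⋄ the $\zeta_j$'s are uniform on the unit circle $x^2 + y^2 = 1$;
           ⋄ $\Delta h(\theta, \zeta) = h(\theta + \zeta) - h(\theta - \zeta)$;
           ⋄ $(\alpha_j)$ and $(\beta_j)$ converge to 0;
           ⋄ the step carries a minus sign because h is minimised (a plus sign would maximise h)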
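The recursion can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code behind the slides: the step is taken in the descent direction (subtracting the finite-difference term, since h is minimised), the sequences are α_j = 1/(10 log(1 + j)) and β_j = 1/j, the starting point (0.65, 0.8) is arbitrary, and the iterate is clipped to [-1, 1]^2 as a safeguard that is not part of the recursion. The names `stochastic_descent` and `clip` are hypothetical.

```python
import math
import random


def h(x, y):
    """Function of the example; h >= 0 everywhere and h(0, 0) = 0."""
    first = (x * math.sin(20 * y) + y * math.sin(20 * x)) ** 2 * math.cosh(math.sin(10 * x) * x)
    second = (x * math.cos(10 * y) - y * math.sin(10 * x)) ** 2 * math.cosh(math.cos(20 * y) * y)
    return first + second


def clip(value, low=-1.0, high=1.0):
    """Keep a coordinate inside [low, high] (safeguard assumed here, not part of the recursion)."""
    return max(low, min(high, value))


def stochastic_descent(start, iterations=2000, seed=0):
    """Run the random finite-difference recursion; return the last point and the best (h, x, y) seen."""
    rng = random.Random(seed)
    x, y = start
    best = (h(x, y), x, y)
    for j in range(1, iterations + 1):
        alpha = 1.0 / (10.0 * math.log(1.0 + j))    # alpha_j, tends to 0
        beta = 1.0 / j                               # beta_j, tends to 0
        angle = rng.uniform(0.0, 2.0 * math.pi)
        zx, zy = math.cos(angle), math.sin(angle)    # zeta_j uniform on the unit circle
        delta = h(x + beta * zx, y + beta * zy) - h(x - beta * zx, y - beta * zy)
        step = alpha / (2.0 * beta) * delta
        x, y = clip(x - step * zx), clip(y - step * zy)   # minus sign: downhill move
        value = h(x, y)
        if value < best[0]:
            best = (value, x, y)
    return (x, y), best


if __name__ == "__main__":
    final, best = stochastic_descent((0.65, 0.8))
    print("last point:", final)
    print("best value and point visited:", best)
```

Because h has many local minima, the sequence may settle away from (0, 0); the outcome depends on the starting point, the random directions and the choice of (α_j) and (β_j).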




       [Figure: path of the sequence (θ_j) in the (x, y) plane]

                            Case when $\alpha_j = 1/(10 \log(1 + j))$ and $\beta_j = 1/j$
The traveling salesman problem


       A classical allocation problem:
           Salesman who needs to visit n cities
           Traveling costs between pairs of cities known [and different]
           Search for the optimal circuit
An NP-complete problem

       The traveling salesman problem is an example of mathematical
       problems whose resolution time grows explosively with their size
       Number of possible circuits n! and exact solutions available
       in $O(2^n)$ time
       Numerous practical consequences (networks, integrated circuit
       design, genomic sequencing, etc.)
       [Figure: Procter & Gamble competition, 1962]
An open problem

   Exact solution for 15,112 German cities found in 2001 in 22.6 CPU years.
   Exact solution for the 24,978 Swedish cities found in 2004 in 84.8 CPU years.
Resolution via simulation

       The simulated annealing algorithm:
       Repeat
               Random modification of parts of the current circuit, with cost $C_0$
               Evaluation of the cost $C$ of the new circuit
               Acceptance of the new circuit with probability

                       $$ \exp\left( \frac{C_0 - C}{T} \right) \wedge 1 $$

       The temperature $T$ is progressively reduced
                                                                           [Metropolis, 1953]
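The loop above can be sketched in Python as follows. Several choices are assumptions made for the illustration and are not specified in the slides: cities are points in the plane with Euclidean costs, the random modification reverses one segment of the circuit, and the temperature decreases geometrically from `t_start` to `t_end`. The function names are hypothetical.

```python
import math
import random


def tour_cost(tour, cities):
    """Total Euclidean length of the closed circuit visiting the cities in the given order."""
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))


def simulated_annealing(cities, sweeps=200, t_start=1.2, t_end=1e-3, seed=0):
    """Return the best circuit found and its cost."""
    rng = random.Random(seed)
    n = len(cities)
    tour = list(range(n))
    cost = tour_cost(tour, cities)
    best_tour, best_cost = tour[:], cost
    steps = sweeps * n
    for k in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temperature = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        i, j = sorted(rng.sample(range(n), 2))
        candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]   # reverse one segment
        new_cost = tour_cost(candidate, cities)
        # Accept with probability min(1, exp((C0 - C) / T)).
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temperature):
            tour, cost = candidate, new_cost
            if cost < best_cost:
                best_tour, best_cost = tour[:], cost
    return best_tour, best_cost


if __name__ == "__main__":
    rng = random.Random(42)
    cities = [(rng.random(), rng.random()) for _ in range(30)]
    print("initial cost:", round(tour_cost(list(range(30)), cities), 3))
    tour, cost = simulated_annealing(cities)
    print("cost after annealing:", round(cost, 3))
```

At high temperature most modifications are accepted, including costly ones; as T decreases the rule accepts almost only improvements, which is why the schedule matters in practice.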



Illustration

       Example (400 cities)
       [Figures: circuits obtained at temperatures T = 1.2, T = 0.8, T = 0.4 and T = 0.0]
Option pricing

       Complicated computation of expectations/average values of
       options, $E[C_T]$, necessary to evaluate the entry price
       $(1 + r)^{-T} E[C_T]$

       Example (European options)
       Case when
               $$ C_T = (S_T - K)^+ $$
       with
               $$ S_T = S_0 \times Y_1 \times \cdots \times Y_T, \qquad \Pr(Y_i = u) = 1 - \Pr(Y_i = d) = p. $$

       Resolution via the simulation of the binomial rv's $Y_i$
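A possible Python sketch of this computation is given below. The numerical values (S_0 = 100, K = 100, u = 1.1, d = 0.9, p = 0.5, r = 0.01, T = 10) and the function names are assumptions chosen for illustration. Since the number of up-moves is binomial, the expectation is also available as a finite sum in this simple case, which gives a reference value for the Monte Carlo estimate.

```python
import random
from math import comb


def european_call_mc(s0, strike, u, d, p, r, horizon, n_paths=20000, seed=0):
    """Monte Carlo estimate of (1 + r)^(-T) E[(S_T - K)^+] with S_T = S_0 Y_1 ... Y_T."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        s = s0
        for _ in range(horizon):
            s *= u if rng.random() < p else d    # Y_i = u with probability p, d otherwise
        total += max(s - strike, 0.0)
    return total / n_paths / (1.0 + r) ** horizon


def european_call_exact(s0, strike, u, d, p, r, horizon):
    """Same quantity, summing over the number k of up-moves, which follows B(T, p)."""
    expected = sum(comb(horizon, k) * p ** k * (1.0 - p) ** (horizon - k)
                   * max(s0 * u ** k * d ** (horizon - k) - strike, 0.0)
                   for k in range(horizon + 1))
    return expected / (1.0 + r) ** horizon


if __name__ == "__main__":
    args = (100.0, 100.0, 1.1, 0.9, 0.5, 0.01, 10)   # illustrative values only
    print("Monte Carlo:", round(european_call_mc(*args), 3))
    print("Exact sum:  ", round(european_call_exact(*args), 3))
```

The simulation approach becomes useful when the payoff depends on the whole path, where no such closed-form sum is available.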
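Option pricing (cont'd)


       Example (Asian options)
       Continuous time model where
               $$ C_T = \left( \frac{1}{T} \int_0^T S(t)\,dt - K \right)^+ \approx \left( \frac{1}{T} \sum_{n=1}^T S(n) - K \right)^+, $$
       with
               $$ S(n + 1) = S(n) \times \exp\{\Delta X(n + 1)\}, \qquad \Delta X(n) \overset{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2). $$

       Resolution via the simulation of the normal rv's $\Delta X(n)$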
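The discretised payoff can be simulated directly. The Python sketch below is an illustration under assumptions: the path starts from S(0) = s0, the values s0 = 100, K = 100, σ = 0.05 and T = 50 are arbitrary, and the function names are hypothetical. It returns the simulated payoffs, from which E[C_T] and a standard error are estimated.

```python
import math
import random
import statistics


def asian_call_payoffs(s0, strike, sigma, horizon, n_paths, seed=0):
    """Simulate ((1/T) sum_{n=1}^T S(n) - K)^+ for n_paths independent paths."""
    rng = random.Random(seed)
    payoffs = []
    for _ in range(n_paths):
        s, running_sum = s0, 0.0
        for _ in range(horizon):
            s *= math.exp(rng.gauss(0.0, sigma))    # S(n+1) = S(n) exp(Delta X(n+1))
            running_sum += s
        payoffs.append(max(running_sum / horizon - strike, 0.0))
    return payoffs


def estimate_expected_payoff(payoffs):
    """Monte Carlo estimate of E[C_T] with its standard error."""
    mean = statistics.fmean(payoffs)
    std_error = statistics.stdev(payoffs) / math.sqrt(len(payoffs))
    return mean, std_error


if __name__ == "__main__":
    payoffs = asian_call_payoffs(100.0, 100.0, 0.05, 50, n_paths=5000)
    mean, std_error = estimate_expected_payoff(payoffs)
    print(f"E[C_T] is about {mean:.3f} (standard error {std_error:.3f})")
```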
Pseudo-random generator
       Pivotal element of simulation techniques: they all rely on
       transformations of uniform $U(0, 1)$ random variables
       Definition (Pseudo-random generator)
       A pseudo-random generator is a deterministic map $\Psi$ from ]0, 1[ to
       ]0, 1[ such that, for any starting value $u_0$ and any $n$, the sequence

               $$ \{u_0, \Psi(u_0), \Psi(\Psi(u_0)), \ldots, \Psi^n(u_0)\} $$

       behaves (statistically) like an iid $U(0, 1)$ sequence

       Paradox!
       While avoiding randomness, the deterministic sequence
       $(u_0, u_1 = \Psi(u_0), \ldots, u_n = \Psi(u_{n-1}))$
       must resemble a random sequence!




       In R, use of the procedure
       runif( )
       Description:
       'runif' generates random deviates from the uniform distribution.
       Example:
       u = runif(20)
       '.Random.seed' is an integer vector, containing the random number
       generator (RNG) state for random number generation in R. It can
       be saved and restored, but should not be altered by the user.




       [Figure: uniform sample; top panel, values for indices 500 to 600; bottom panel, distribution of the sample over (0, 1)]




       In C, use of the procedure

       rand() / random()
       SYNOPSIS
       #include <stdlib.h>
       long int random(void);
       DESCRIPTION
       The random() function uses a non-linear additive feedback random
       number generator employing a default table of size 31 long
       integers to return successive pseudo-random numbers in the range
       from 0 to RAND_MAX. The period of this random generator is
       very large, approximately 16*((2**31)-1).
       RETURN VALUE
       random() returns a value between 0 and RAND_MAX.




       In Scilab, use of the procedure
       rand()
       rand() : with no arguments gives a scalar whose value changes
       each time it is referenced. By default, random numbers are
       uniformly distributed in the interval (0,1). rand('normal') switches
       to a normal distribution with mean 0 and variance 1.
       rand('uniform') switches back to the uniform distribution
       EXAMPLE
       x=rand(10,10,'uniform')



       Example (A standard uniform generator)
       The congruential generator

               $$ D(x) = (ax + b) \bmod (M + 1) $$

       has a period of $M$ for proper choices of $(a, b)$ and becomes a
       generator on ]0, 1[ when dividing by $M + 2$
       [Figure: output of the generator v = u*69069069 (1); panels show the sample over (0, 1) and the pairs at times (t, t+1), (t, t+5) and (t, t+10)]
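A congruential generator of this form takes only a few lines. The Python sketch below is a toy illustration, with assumptions stated here: the integer state is shifted by one before dividing by M + 2 so that the output stays strictly inside ]0, 1[ (one reading of the division above), and the parameters M = 15, (a, b) = (5, 3) and (4, 2) are tiny values chosen so that the cycles can be checked by hand. The function names are hypothetical.

```python
from itertools import islice


def congruential_generator(seed, a, b, M):
    """Iterate D(x) = (a x + b) mod (M + 1) and yield (x + 1) / (M + 2), a value in ]0, 1[."""
    x = seed % (M + 1)
    while True:
        x = (a * x + b) % (M + 1)
        yield (x + 1) / (M + 2)


def cycle_length(seed, a, b, M):
    """Steps needed for the integer state to return to its start, or None if it never does."""
    start = seed % (M + 1)
    x = start
    for steps in range(1, M + 2):
        x = (a * x + b) % (M + 1)
        if x == start:
            return steps
    return None


if __name__ == "__main__":
    print(list(islice(congruential_generator(0, 5, 3, 15), 5)))
    print("cycle with (a, b) = (5, 3):", cycle_length(0, 5, 3, 15))
    print("cycle with (a, b) = (4, 2):", cycle_length(0, 4, 2, 15))
```

With (a, b) = (5, 3) the state visits all 16 residues before repeating, whereas with (4, 2) it is absorbed at a fixed point after two steps, which shows how much the quality depends on the choice of (a, b).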




       Conclusion:
       Use the random generator provided by the computer or the
       software at hand instead of constructing a random generator of
       poor quality
Distributions different from the uniform distribution (1)

       A problem formally solved, since

       Theorem (Generic inversion)
       If $U$ is a uniform random variable on [0, 1) and if $F_X$ is the cdf of
       the random variable $X$, then $F_X^{-1}(U)$ is distributed like $X$

       Proof. Indeed,
               $$ P(F_X^{-1}(U) \le x) = P(U \le F_X(x)) = F_X(x) $$

       Note. When $F_X$ is not strictly increasing, we can take
               $$ F_X^{-1}(u) = \inf\{x;\ F_X(x) \ge u\} $$
Applications...
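    Binomial distribution, $\mathcal{B}(n, p)$,
    $$F_X(x) = \sum_{i \le x} \binom{n}{i} p^i (1-p)^{n-i}$$
    and $F_X^{-1}(u)$ can be obtained numerically

    Exponential distribution, $\mathcal{E}xp(\lambda)$,
    $$F_X(x) = 1 - \exp(-\lambda x) \quad\text{and}\quad F_X^{-1}(u) = -\log(1-u)/\lambda$$
    (in practice $-\log(U)/\lambda$, since $1-U$ is also uniform)

    Cauchy distribution, $\mathcal{C}(0, 1)$,
    $$F_X(x) = \frac{1}{\pi}\arctan(x) + \frac{1}{2} \quad\text{and}\quad F_X^{-1}(u) = \tan(\pi(u - 1/2))$$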
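
The three inversions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the use of the standard `random` module as the uniform generator are choices made here, not part of the course material. The binomial case finds the generalised inverse numerically by accumulating probabilities until the cdf reaches $u$.

```python
import math
import random

def exponential_by_inversion(lam, rng):
    """Exp(lam) draw via F^{-1}(u) = -log(1 - u) / lam."""
    u = rng.random()            # u in [0, 1), so 1 - u is in (0, 1]
    return -math.log(1.0 - u) / lam

def cauchy_by_inversion(rng):
    """C(0, 1) draw via F^{-1}(u) = tan(pi (u - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))

def binomial_by_inversion(n, p, rng):
    """B(n, p) draw via the generalised inverse inf{x; F(x) >= u},
    found numerically by accumulating the probability mass function."""
    u = rng.random()
    cdf = 0.0
    for i in range(n + 1):
        cdf += math.comb(n, i) * p**i * (1.0 - p) ** (n - i)
        if cdf >= u:
            return i
    return n  # guards against rounding when u is extremely close to 1

rng = random.Random(0)
expo = [exponential_by_inversion(2.0, rng) for _ in range(20000)]
binom = [binomial_by_inversion(10, 0.3, rng) for _ in range(20000)]
print(sum(expo) / len(expo))    # should be close to 1 / lam = 0.5
print(sum(binom) / len(binom))  # should be close to n p = 3
```

The linear search for the binomial costs up to $n + 1$ steps per draw, which is acceptable for small $n$ only.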

Other transformations...
[Hint]
Find transforms linking the distribution of interest with
simpler/known distributions

Example (Box-Müller transform)
For the normal distribution $\mathcal{N}(0, 1)$, if $X_1, X_2 \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, 1)$,
$$X_1^2 + X_2^2 \sim \chi_2^2, \qquad \arctan(X_1/X_2) \sim \mathcal{U}([0, 2\pi])$$
[Jacobian]
Since the $\chi_2^2$ distribution is the same as the $\mathcal{E}xp(1/2)$ distribution,
using a cdf inversion produces
$$X_1 = \sqrt{-2\log(U_1)}\,\sin(2\pi U_2), \qquad X_2 = \sqrt{-2\log(U_1)}\,\cos(2\pi U_2)$$
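
A minimal Python sketch of the Box-Müller transform follows, assuming the standard `random` module as the source of uniforms; the function name `box_muller` is hypothetical. It maps two uniforms to a radius and an angle, then returns the two normal coordinates.

```python
import math
import random

def box_muller(rng):
    """Return two independent N(0, 1) draws built from two uniforms."""
    u1 = 1.0 - rng.random()   # in (0, 1], keeps log(u1) finite
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))   # radius: r^2 ~ chi^2_2 = Exp(1/2)
    theta = 2.0 * math.pi * u2           # angle: uniform on [0, 2 pi)
    return r * math.sin(theta), r * math.cos(theta)

rng = random.Random(0)
draws = [x for _ in range(10000) for x in box_muller(rng)]
mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / len(draws)
print(mean, var)   # should be close to 0 and 1
```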

Example
Student's t and Fisher's F distributions are natural byproducts of
the generation of the normal and of the chi-square distributions.

Example
The Cauchy distribution can be derived from the normal
distribution as: if $X_1, X_2 \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, 1)$, then $X_1/X_2 \sim \mathcal{C}(0, 1)$

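Example
The Beta distribution $\mathcal{B}(\alpha, \beta)$, with density
$$f_X(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha-1} (1 - x)^{\beta-1},$$
can be derived from the Gamma distribution by: if $X_1 \sim \mathcal{G}a(\alpha, 1)$ and
$X_2 \sim \mathcal{G}a(\beta, 1)$ are independent, then
$$\frac{X_1}{X_1 + X_2} \sim \mathcal{B}(\alpha, \beta)$$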
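
The two transforms above (Cauchy as a ratio of normals, Beta as a ratio of Gammas) can be sketched as follows. This assumes that the normal and Gamma generators of Python's standard `random` module (`gauss` and `gammavariate`) are acceptable building blocks; the function names are hypothetical.

```python
import random

def beta_from_gammas(alpha, beta, rng):
    """B(alpha, beta) draw as X1 / (X1 + X2) with independent Gammas."""
    x1 = rng.gammavariate(alpha, 1.0)   # Ga(alpha, 1)
    x2 = rng.gammavariate(beta, 1.0)    # Ga(beta, 1), independent of x1
    return x1 / (x1 + x2)

def cauchy_from_normals(rng):
    """C(0, 1) draw as the ratio of two independent N(0, 1) draws."""
    x1 = rng.gauss(0.0, 1.0)
    x2 = rng.gauss(0.0, 1.0)
    while x2 == 0.0:                    # practically never happens
        x2 = rng.gauss(0.0, 1.0)
    return x1 / x2

rng = random.Random(0)
betas = [beta_from_gammas(2.0, 5.0, rng) for _ in range(20000)]
print(sum(betas) / len(betas))   # should be close to alpha / (alpha + beta)
```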

Multidimensional distributions


Consider the generation of
$$(X_1, \ldots, X_p) \sim f(x_1, \ldots, x_p)$$
in $\mathbb{R}^p$ with components that are not necessarily independent

Cascade rule
$$f(x_1, \ldots, x_p) = f_1(x_1) \times f_{2|1}(x_2|x_1) \times \cdots \times f_{p|-p}(x_p|x_1, \ldots, x_{p-1})$$

Implementation



Simulate for $t = 1, \ldots, T$
    1. $X_1 \sim f_1(x_1)$
    2. $X_2 \sim f_{2|1}(x_2|x_1)$
    ...
    p. $X_p \sim f_{p|-p}(x_p|x_1, \ldots, x_{p-1})$
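
As an illustration of the cascade rule, here is a sketch for $p = 2$ under an assumption made only for this example: the target is a bivariate normal with unit variances and correlation `rho`, for which $X_1 \sim \mathcal{N}(0, 1)$ and $X_2 | x_1 \sim \mathcal{N}(\rho x_1, 1 - \rho^2)$. The function name is hypothetical.

```python
import math
import random

def cascade_bivariate_normal(rho, rng):
    """Draw (X1, X2) one component at a time: first the marginal of X1,
    then the conditional of X2 given the simulated value x1."""
    x1 = rng.gauss(0.0, 1.0)                               # X1 ~ f_1
    x2 = rng.gauss(rho * x1, math.sqrt(1.0 - rho * rho))   # X2 | x1 ~ f_{2|1}
    return x1, x2

rng = random.Random(0)
T = 20000
sample = [cascade_bivariate_normal(0.8, rng) for _ in range(T)]
corr_estimate = sum(x1 * x2 for x1, x2 in sample) / T   # E[X1 X2] = rho here
print(corr_estimate)   # should be close to 0.8
```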

Distributions different from the uniform distribution (2)



    $F_X^{-1}$ is rarely available
    algorithms are implemented in resident software only for standard distributions
    the inversion lemma does not apply in larger dimensions
    new distributions may require fast resolution

The Accept-Reject Algorithm
Given a distribution with density $f$ to be simulated

Theorem (Fundamental theorem of simulation)
The uniform distribution on the sub-graph
$$S_f = \{(x, u);\ 0 \le u \le f(x)\}$$
produces a marginal in $x$ with density $f$.

[Figure: graph of $f(x)$ against $x$]

Proof:
Marginal density given by
$$\int_0^\infty \mathbb{I}_{0 \le u \le f(x)}\, du = f(x)$$
and independence from the normalisation constant

Example
For a normal distribution, we just need to simulate $(u, x)$ at
random in
$$\{(u, x);\ 0 \le u \le \exp(-x^2/2)\}$$

Accept-reject algorithm
   1. Find a density $g$ that can be simulated and such that
      $$\sup_x \frac{f(x)}{g(x)} = M < \infty$$
   2. Generate
      $$Y_1, Y_2, \ldots \overset{\text{i.i.d.}}{\sim} g, \qquad U_1, U_2, \ldots \overset{\text{i.i.d.}}{\sim} \mathcal{U}([0, 1])$$
   3. Take $X = Y_k$ where
      $$k = \inf\{n;\ U_n \le f(Y_n)/M g(Y_n)\}$$

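Theorem (Accept–reject)
The random variable produced by the above stopping rule is
distributed from $f_X$

Proof (1): We have
$$\begin{aligned}
P(X \le x) &= \sum_{k=1}^{\infty} P(X = Y_k,\ Y_k \le x) \\
&= \sum_{k=1}^{\infty} \left(1 - \frac{1}{M}\right)^{k-1} P(U_k \le f(Y_k)/M g(Y_k),\ Y_k \le x) \\
&= \sum_{k=1}^{\infty} \left(1 - \frac{1}{M}\right)^{k-1} \int_{-\infty}^{x} \int_0^{f(y)/M g(y)} du\ g(y)\,dy \\
&= \sum_{k=1}^{\infty} \left(1 - \frac{1}{M}\right)^{k-1} \frac{1}{M} \int_{-\infty}^{x} f(y)\,dy \\
&= \int_{-\infty}^{x} f(y)\,dy
\end{aligned}$$
since the geometric series $\sum_{k \ge 1} (1 - 1/M)^{k-1}/M$ sums to 1.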

Proof (2)
If $(X, U)$ is uniform on $A \supset B$, the distribution of $(X, U)$
restricted to $B$ is uniform on $B$.

Properties



    Does not require a normalisation constant
    Does not require an exact upper bound $M$
    Allows for the recycling of the $Y_k$'s for another density $f$ (note that rejected $Y_k$'s are no longer distributed from $g$)
    Requires on average $M$ $Y_k$'s for one simulated $X$ (efficiency measure)

Example
Take $f(x) = \exp(-x^2/2)$ and $g(x) = 1/(1 + x^2)$
$$\frac{f(x)}{g(x)} = (1 + x^2)\, e^{-x^2/2} \le 2/\sqrt{e}$$
Probability of acceptance $\sqrt{e/2\pi} = 0.66$
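
The example above can be checked numerically with the sketch below. It assumes the normalised versions of both densities, so that the bound is $M = \sqrt{2\pi/e}$ and the acceptance probability is $1/M$; the Cauchy proposals are produced by inversion, and all function names are hypothetical.

```python
import math
import random

M = math.sqrt(2.0 * math.pi / math.e)   # sup f/g for the normalised densities

def f(x):   # N(0, 1) density
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def g(x):   # C(0, 1) density
    return 1.0 / (math.pi * (1.0 + x * x))

def accept_reject(rng):
    """Return one draw from f and the number of proposals it took."""
    trials = 0
    while True:
        trials += 1
        y = math.tan(math.pi * (rng.random() - 0.5))   # Y ~ g by inversion
        u = rng.random()
        if u <= f(y) / (M * g(y)):
            return y, trials

rng = random.Random(0)
results = [accept_reject(rng) for _ in range(20000)]
xs = [x for x, _ in results]
acceptance_rate = len(results) / sum(t for _, t in results)
print(acceptance_rate)   # should be close to sqrt(e / (2 pi)), about 0.66
```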

Theorem (Envelope)
If there exist a density $g_m$, a function $g_l$ and a constant $M$ such
that
$$g_l(x) \le f(x) \le M g_m(x),$$
then the algorithm
    1. Generate $X \sim g_m(x)$, $U \sim \mathcal{U}_{[0,1]}$;
    2. Accept $X$ if $U \le g_l(X)/M g_m(X)$;
    3. else, accept $X$ if $U \le f(X)/M g_m(X)$
produces a random variable distributed from $f$.
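
A possible illustration of the envelope idea, under assumptions chosen only for this sketch: the target is $f(x) = \exp(-x^2/2)$ (unnormalised), the upper envelope uses the Cauchy density $g_m$ with $M = 2\pi/\sqrt{e}$, and the lower bound is $g_l(x) = 1 - x^2/2$, which follows from $e^{-t} \ge 1 - t$. The counter shows how often the evaluation of $f$ is actually needed.

```python
import math
import random

M = 2.0 * math.pi / math.sqrt(math.e)   # f <= M * g_m for the choices below

def f(x):     # target, known up to a constant: exp(-x^2 / 2)
    return math.exp(-0.5 * x * x)

def g_m(x):   # Cauchy density, easy to simulate by inversion
    return 1.0 / (math.pi * (1.0 + x * x))

def g_l(x):   # cheap lower bound: 1 - x^2 / 2 <= exp(-x^2 / 2)
    return 1.0 - 0.5 * x * x

def squeeze_sample(rng, counter):
    """One draw from f; counter['f_calls'] counts evaluations of f."""
    while True:
        x = math.tan(math.pi * (rng.random() - 0.5))   # X ~ g_m
        u = rng.random()
        bound = M * g_m(x)
        if u <= g_l(x) / bound:        # step 2: accepted without computing f
            return x
        counter["f_calls"] += 1
        if u <= f(x) / bound:          # step 3: usual accept-reject test
            return x

rng = random.Random(0)
counter = {"f_calls": 0}
xs = [squeeze_sample(rng, counter) for _ in range(20000)]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
print(mean, var, counter["f_calls"])
```

The gain only matters when $f$ is expensive to evaluate, which is not the case in this toy setting.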

Uniform ratio algorithms

Result:
Uniform simulation on
$$\{(u, v);\ 0 \le u \le \sqrt{2 f(v/u)}\}$$
produces
$$X = V/U \sim f$$

Proof:
Change of variable $(u, v) \to (x, u)$ with Jacobian $u$ and marginal
distribution of $x$ provided by
$$x \sim \int_0^{\sqrt{2 f(x)}} u\, du = \frac{\sqrt{2 f(x)}^{\,2}}{2} = f(x)$$

Example
For a normal distribution, simulate $(u, v)$ at random in
$$\{(u, v);\ 0 \le u \le \sqrt{2}\, e^{-v^2/4u^2}\} = \{(u, v);\ v^2 \le -4 u^2 \log(u/\sqrt{2})\}$$

[Figure: the simulation region in the $(u, v)$ plane]
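
One way to simulate uniformly on this region is to draw points in a bounding rectangle and keep those that fall inside. The sketch below assumes the rectangle $(0, \sqrt{2}] \times [-2/\sqrt{e}, 2/\sqrt{e}]$, where $2/\sqrt{e}$ is the maximum of $|v|$ on the region (obtained by maximising $-4u^2\log(u/\sqrt{2})$ in $u$); the function name is hypothetical.

```python
import math
import random

SQRT2 = math.sqrt(2.0)
V_MAX = 2.0 / math.sqrt(math.e)   # largest |v| in the region

def normal_by_ratio_of_uniforms(rng):
    """N(0, 1) draw as V / U with (U, V) uniform on the region above."""
    while True:
        u = SQRT2 * (1.0 - rng.random())         # u in (0, sqrt(2)]
        v = V_MAX * (2.0 * rng.random() - 1.0)   # v in [-V_MAX, V_MAX)
        if v * v <= -4.0 * u * u * math.log(u / SQRT2):
            return v / u

rng = random.Random(0)
xs = [normal_by_ratio_of_uniforms(rng) for _ in range(20000)]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
print(mean, var)   # should be close to 0 and 1
```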

Slice sampler



If a uniform simulation on
$$G = \{(u, x);\ 0 \le u \le f(x)\}$$
is too complex [because of the inversion of $x$ into $u \le f(x)$], we
can use instead a random walk on $G$:

Slice sampler
Simulate for $t = 1, \ldots, T$
    1. $\omega^{(t+1)} \sim \mathcal{U}_{[0, f(x^{(t)})]}$;
    2. $x^{(t+1)} \sim \mathcal{U}_{G^{(t+1)}}$, where
       $$G^{(t+1)} = \{y;\ f(y) \ge \omega^{(t+1)}\}.$$

[Figure: graph of $f(x)$ against $x$]

Justification



The random walk explores $G$ uniformly:
If
$$(U^{(t)}, X^{(t)}) \sim \mathcal{U}_G,$$
then
$$(U^{(t+1)}, X^{(t+1)}) \sim \mathcal{U}_G.$$

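Proof:
$$\begin{aligned}
&\Pr\big((U^{(t+1)}, X^{(t+1)}) \in A \times B\big) \\
&\quad= \iint \int_B \int_A \mathbb{I}_{0 \le u \le f(x)}\, \frac{\mathbb{I}_{0 \le u' \le f(x)}}{f(x)}\, \frac{\mathbb{I}_{f(x') \ge u'}(x')}{\int \mathbb{I}_{f(y) \ge u'}\, dy}\ d(x, u, x', u') \\
&\quad= \int \int_B \int_A f(x)\, \frac{\mathbb{I}_{0 \le u' \le f(x)}}{f(x)}\, \frac{\mathbb{I}_{f(x') \ge u'}(x')}{\int \mathbb{I}_{f(y) \ge u'}\, dy}\ d(x, x', u') \\
&\quad= \int_B \int_A \int \mathbb{I}_{f(x) \ge u'}\, dx\ \frac{\mathbb{I}_{f(x') \ge u'}(x')}{\int \mathbb{I}_{f(y) \ge u'}\, dy}\ d(x', u') \\
&\quad= \int_B \int_A \mathbb{I}_{f(x') \ge u' \ge 0}\ d(x', u')
\end{aligned}$$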

Example (Normal distribution)
For the standard normal distribution,
$$f(x) \propto \exp(-x^2/2),$$
a slice sampler is
$$\omega | x \sim \mathcal{U}_{[0,\ \exp(-x^2/2)]},$$
$$X | \omega \sim \mathcal{U}_{[-\sqrt{-2\log(\omega)},\ \sqrt{-2\log(\omega)}]}$$
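
The two conditional draws of this slice sampler translate directly into code. The sketch below is a minimal version, assuming the standard `random` module and a hypothetical starting value `x0`; successive values of the chain are dependent, so the sample is not i.i.d.

```python
import math
import random

def slice_sampler_normal(T, x0, rng):
    """Run T iterations of the slice sampler for f(x) ∝ exp(-x^2 / 2)."""
    x = x0
    chain = []
    for _ in range(T):
        # omega | x ~ U(0, exp(-x^2 / 2)); 1 - random() avoids omega = 0
        omega = (1.0 - rng.random()) * math.exp(-0.5 * x * x)
        # the slice {y; exp(-y^2 / 2) >= omega} is a symmetric interval
        half_width = math.sqrt(-2.0 * math.log(omega))
        x = rng.uniform(-half_width, half_width)
        chain.append(x)
    return chain

rng = random.Random(0)
chain = slice_sampler_normal(50000, 0.0, rng)
mean = sum(chain) / len(chain)
var = sum((x - mean) ** 2 for x in chain) / len(chain)
print(mean, var)   # should be close to 0 and 1
```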

Note
The technique also operates when $f$ is replaced with
$$\varphi(x) \propto f(x)$$

It can be generalised to the case when $f$ is decomposed in
$$f(x) = \prod_{i=1}^{p} f_i(x)$$

Example (Truncated normal distribution)
If we consider instead the truncated $\mathcal{N}(-3, 1)$ distribution
restricted to $[0, 1]$, with density
$$f(x) = \frac{\exp(-(x + 3)^2/2)}{\sqrt{2\pi}\,[\Phi(4) - \Phi(3)]} \propto \exp(-(x + 3)^2/2) = \varphi(x),$$
a slice sampler is
$$\omega | x \sim \mathcal{U}_{[0,\ \exp(-(x+3)^2/2)]},$$
$$X | \omega \sim \mathcal{U}_{[0,\ 1 \wedge \{-3 + \sqrt{-2\log(\omega)}\}]}$$
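
A sketch of this truncated-normal slice sampler, under the same assumptions as before (standard `random` module, hypothetical function name and starting value). Only the unnormalised $\varphi$ is used, so $\Phi(4) - \Phi(3)$ never has to be computed by the sampler itself.

```python
import math
import random

def slice_sampler_truncated_normal(T, rng, x0=0.5):
    """Slice sampler for the N(-3, 1) distribution restricted to [0, 1]."""
    x = x0
    chain = []
    for _ in range(T):
        # omega | x ~ U(0, phi(x)) with phi(x) = exp(-(x + 3)^2 / 2)
        omega = (1.0 - rng.random()) * math.exp(-0.5 * (x + 3.0) ** 2)
        # on [0, 1], phi(y) >= omega means y <= -3 + sqrt(-2 log omega)
        upper = min(1.0, -3.0 + math.sqrt(-2.0 * math.log(omega)))
        upper = max(upper, 0.0)   # guards against rounding just below 0
        x = rng.uniform(0.0, upper)
        chain.append(x)
    return chain

rng = random.Random(0)
chain = slice_sampler_truncated_normal(50000, rng)
print(sum(chain) / len(chain))   # empirical mean of the truncated normal
```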

The Metropolis–Hastings algorithm



Generalisation of the slice sampler to situations when the slice
sampler cannot be easily implemented

Idea
Create a sequence $(X_n)_n$ such that, for $n$ 'large enough', the
density of $X_n$ is close to $f$

The Metropolis–Hastings algorithm (2)



If $f$ is the density of interest, we pick a proposal conditional density
$$q(y|x)$$
such that
    it is easy to simulate
    it is positive everywhere $f$ is positive

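Metropolis–Hastings
For a current value $X^{(t)} = x^{(t)}$,
   1. Generate $Y_t \sim q(y|x^{(t)})$.
   2. Take
      $$X^{(t+1)} = \begin{cases} Y_t & \text{with probability } \rho(x^{(t)}, Y_t), \\ x^{(t)} & \text{with probability } 1 - \rho(x^{(t)}, Y_t), \end{cases}$$
      where
      $$\rho(x, y) = \min\left\{ \frac{f(y)}{f(x)}\, \frac{q(x|y)}{q(y|x)},\ 1 \right\}.$$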
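
The two steps above can be sketched generically as follows. The function signature and the illustration (a $\mathcal{N}(0, 1)$ target with a Gaussian random-walk proposal of scale 2) are assumptions made for this example only; both the target and the proposal density are supplied up to constants, since these cancel in $\rho$.

```python
import math
import random

def metropolis_hastings(f, q_sample, q_density, x0, T, rng):
    """Generic Metropolis-Hastings chain.

    f          -- target density, up to a constant, with f(x0) > 0
    q_sample   -- q_sample(x, rng) draws Y ~ q(.|x)
    q_density  -- q_density(y, x) evaluates q(y|x), up to a constant
    """
    x = x0
    chain = []
    for _ in range(T):
        y = q_sample(x, rng)
        ratio = (f(y) * q_density(x, y)) / (f(x) * q_density(y, x))
        if rng.random() < min(ratio, 1.0):   # accept with probability rho
            x = y
        chain.append(x)                      # a rejection repeats x
    return chain

def target(x):
    return math.exp(-0.5 * x * x)

def rw_sample(x, rng):
    return rng.gauss(x, 2.0)

def rw_density(y, x):
    return math.exp(-0.5 * ((y - x) / 2.0) ** 2)

rng = random.Random(0)
chain = metropolis_hastings(target, rw_sample, rw_density, 0.0, 50000, rng)
mean = sum(chain) / len(chain)
var = sum((x - mean) ** 2 for x in chain) / len(chain)
print(mean, var)   # should be close to 0 and 1
```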

Properties

    Always accept moves to $y_t$'s such that
    $$\frac{f(y_t)}{q(y_t|x_t)} \ge \frac{f(x_t)}{q(x_t|y_t)}$$
    Does not depend on normalising constants for both $f$ and $q(\cdot|x)$ (if the latter is independent from $x$)
    Never accept values of $y_t$ such that $f(y_t) = 0$
    The sequence $(x^{(t)})_t$ can take repeatedly the same value
    The $X^{(t)}$'s are dependent (Markovian) random variables

Justification


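Joint distribution of $(X^{(t)}, X^{(t+1)})$
If $X^{(t)} \sim f(x^{(t)})$,
$$(X^{(t)}, X^{(t+1)}) \sim f(x^{(t)}) \Big\{ \underbrace{\rho(x^{(t)}, x^{(t+1)}) \times q(x^{(t+1)}|x^{(t)})}_{[Y_t \text{ accepted}]} + \underbrace{\int \big[1 - \rho(x^{(t)}, y)\big]\, q(y|x^{(t)})\, dy\ \ \mathbb{I}_{x^{(t)}}(x^{(t+1)})}_{[Y_t \text{ rejected}]} \Big\}$$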

Balance condition




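$$\begin{aligned}
f(x) \times \rho(x, y) \times q(y|x) &= f(x)\, \min\left\{ \frac{f(y)}{f(x)}\, \frac{q(x|y)}{q(y|x)},\ 1 \right\} q(y|x) \\
&= \min\{ f(y)\, q(x|y),\ f(x)\, q(y|x) \} \\
&= f(y) \times \rho(y, x) \times q(x|y)
\end{aligned}$$

Thus the distribution of $(X^{(t)}, X^{(t+1)})$ is the same as the distribution of
$(X^{(t+1)}, X^{(t)})$: if $X^{(t)}$ has the density $f$, then so does $X^{(t+1)}$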
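
The balance condition can be verified numerically in a discrete analogue, which is an assumption of this sketch (the slides work with densities): on a finite state space, the Metropolis–Hastings transition matrix $K$ built from hypothetical weights `f` and a hypothetical proposal matrix `q` should satisfy $f(x) K(x, y) = f(y) K(y, x)$ and leave $f$ unchanged.

```python
def mh_kernel(f, q):
    """Transition matrix of Metropolis-Hastings on states 0..n-1,
    for target weights f[i] > 0 and proposal matrix q[i][j] = q(j|i)."""
    n = len(f)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and q[i][j] > 0.0:
                rho = min(f[j] * q[j][i] / (f[i] * q[i][j]), 1.0)
                K[i][j] = q[i][j] * rho
        K[i][i] = 1.0 - sum(K[i])   # rejections and self-proposals stay at i
    return K

f = [0.1, 0.2, 0.3, 0.4]
q = [[0.1, 0.5, 0.2, 0.2],
     [0.3, 0.1, 0.3, 0.3],
     [0.25, 0.25, 0.25, 0.25],
     [0.4, 0.3, 0.2, 0.1]]
K = mh_kernel(f, q)

# Balance condition: f(x) K(x, y) = f(y) K(y, x) for every pair of states.
balanced = all(abs(f[i] * K[i][j] - f[j] * K[j][i]) < 1e-12
               for i in range(4) for j in range(4))
# Consequence: one transition leaves the distribution f unchanged.
next_f = [sum(f[i] * K[i][j] for i in range(4)) for j in range(4)]
print(balanced, next_f)
```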

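Link with slice sampling
       The slice sampler is a very special case of the Metropolis-Hastings
       algorithm where the acceptance probability is always 1:
          1    for the generation of U,

               $$\underbrace{\frac{I_{0 \le u' \le f(x)}}{I_{0 \le u \le f(x)}}}_{\text{[joint density]}} \times \underbrace{\frac{f(x)^{-1}\, I_{0 \le u \le f(x)}}{f(x)^{-1}\, I_{0 \le u' \le f(x)}}}_{\text{[conditional density]}} = 1$$

          2    for the generation of X,

               $$\underbrace{\frac{I_{0 \le u \le f(y)}}{I_{0 \le u \le f(x)}}}_{\text{[joint density]}} \times \underbrace{\frac{I_{\{z;\, u \le f(z)\}}(x) \Big/ \int_{\{z;\, u \le f(z)\}} f(z)\, dz}{I_{\{z;\, u \le f(z)\}}(y) \Big/ \int_{\{z;\, u \le f(z)\}} f(z)\, dz}}_{\text{[conditional density]}} = 1$$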
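
       The two always-accepted moves can be sketched for a density whose slices
       are available in closed form. The target used here, f(x) proportional to
       exp(-x^2/2), is an illustrative assumption chosen because the slice
       {z; u <= f(z)} is then an interval; it is not an example from the slides.

```python
import math
import random

def slice_sampler(n_iter, x0=0.0, seed=0):
    """Slice sampler for the unnormalised density f(x) = exp(-x^2 / 2)."""
    rng = random.Random(seed)
    x, out = x0, []
    for _ in range(n_iter):
        # 1. U | X = x is uniform on [0, f(x)] (move always accepted)
        u = (1.0 - rng.random()) * math.exp(-x * x / 2.0)  # in (0, f(x)]
        # 2. X | U = u is uniform on the slice {z : u <= f(z)},
        #    here the interval [-sqrt(-2 log u), sqrt(-2 log u)]
        half_width = math.sqrt(-2.0 * math.log(u))
        x = rng.uniform(-half_width, half_width)
        out.append(x)
    return out

sample = slice_sampler(20000)
mean = sum(sample) / len(sample)
var = sum((v - mean) ** 2 for v in sample) / len(sample)
print(mean, var)   # should be close to 0 and 1 for a N(0, 1) target
```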

Independent proposals

       Proposal q independent of $X^{(t)}$, denoted g as in Accept-Reject
       algorithms.
       Independent Metropolis-Hastings
       For the current value $X^{(t)} = x^{(t)}$,
          1    Generate $Y_t \sim g(y)$
          2    Take

               $$X^{(t+1)} = \begin{cases} Y_t & \text{with probability } \min\left\{ \dfrac{f(Y_t)\, g(x^{(t)})}{f(x^{(t)})\, g(Y_t)},\, 1 \right\}, \\ x^{(t)} & \text{otherwise.} \end{cases}$$

Properties



               Alternative to Accept-Reject
               Avoids the computation of $\max_x f(x)/g(x)$
               Accepts more often than Accept-Reject
               If $x_t$ achieves $\max_x f(x)/g(x)$, this is almost identical to
               Accept-Reject
               Except that the sequence $(x_t)$ is not independent

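       Example (Gamma distribution)
       Generate from a Ga(α, β) distribution using the proposal
       Ga(⌊α⌋, b = ⌊α⌋/α), where ⌊α⌋ is the integer part of α (the proposal is
       the distribution of a sum of ⌊α⌋ exponentials)
            1   Generate $Y_t \sim Ga(\lfloor\alpha\rfloor, \lfloor\alpha\rfloor/\alpha)$
            2   Take

                $$X^{(t+1)} = \begin{cases} Y_t & \text{with prob. } \min\left\{ \left( \dfrac{Y_t}{x^{(t)}} \exp\left\{ \dfrac{x^{(t)} - Y_t}{\alpha} \right\} \right)^{\alpha - \lfloor\alpha\rfloor},\, 1 \right\}, \\ x^{(t)} & \text{else.} \end{cases}$$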
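
       A minimal sketch of this example follows. It assumes β = 1, which is the
       case the acceptance ratio above corresponds to, and α >= 1 so that the
       proposal has at least one exponential term. The starting value and the
       function name are illustrative choices.

```python
import math
import random

def gamma_indep_mh(alpha, n_iter, seed=0):
    """Independent Metropolis-Hastings for Ga(alpha, 1), alpha >= 1.

    Proposal: Ga(a, a/alpha) with a = floor(alpha), simulated as a sum of
    a exponential variables with rate a/alpha.
    """
    rng = random.Random(seed)
    a = math.floor(alpha)
    rate = a / alpha
    x, chain = alpha, []          # start at the target mean (arbitrary choice)
    for _ in range(n_iter):
        y = sum(rng.expovariate(rate) for _ in range(a))
        # log of ((y/x) * exp((x - y)/alpha)) ** (alpha - a)
        log_ratio = (alpha - a) * (math.log(y / x) + (x - y) / alpha)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = y
        chain.append(x)
    return chain

chain = gamma_indep_mh(2.43, 20000)
print(sum(chain) / len(chain))    # the Ga(2.43, 1) mean is 2.43
```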

Random walk Metropolis–Hastings


       Proposal
                                  $$Y_t = X^{(t)} + \varepsilon_t,$$
        where $\varepsilon_t \sim g$, independent of $X^{(t)}$, and g symmetric
       Instrumental distribution with density

                                  $$g(y - x)$$

       Motivation
       local perturbation of $X^{(t)}$ / exploration of its neighbourhood

       Random walk Metropolis–Hastings
       Starting from $X^{(t)} = x^{(t)}$
          1    Generate $Y_t \sim g(y - x^{(t)})$
          2    Take

               $$X^{(t+1)} = \begin{cases} Y_t & \text{with probability } \min\left\{ 1,\, \dfrac{f(Y_t)}{f(x^{(t)})} \right\}, \\ x^{(t)} & \text{otherwise} \end{cases}$$
                                                                          [symmetry of g]
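
       The two steps can be sketched generically as follows. The Gaussian
       increments, the function name `random_walk_mh` and the Laplace target used
       in the call are assumptions made for illustration only; any symmetric g
       and any target known up to a constant would do.

```python
import math
import random

def random_walk_mh(log_f, x0, scale, n_iter, seed=0):
    """Random walk Metropolis-Hastings with Gaussian increments N(0, scale^2).

    log_f: log of the (possibly unnormalised) target density.
    Returns the chain and the observed acceptance rate.
    """
    rng = random.Random(seed)
    x, chain, accepted = x0, [], 0
    for _ in range(n_iter):
        y = x + rng.gauss(0.0, scale)          # Y_t = X(t) + eps_t
        # accept with probability min{1, f(y)/f(x)}; g cancels by symmetry
        if rng.random() < math.exp(min(0.0, log_f(y) - log_f(x))):
            x, accepted = y, accepted + 1
        chain.append(x)
    return chain, accepted / n_iter

# Hypothetical target for illustration: f(x) proportional to exp(-|x|).
chain, rate = random_walk_mh(lambda x: -abs(x), x0=0.0, scale=2.0, n_iter=20000)
print(sum(chain) / len(chain), rate)
```

       Working with log f avoids numerical underflow when the target takes very
       small values.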

Properties


               Always accepts higher points and sometimes lower points (see
               gradient algorithm)
               Depends on the dispersion of g
               Average probability of acceptance

                   $$\varrho = \iint \min\{f(x), f(y)\}\, g(y - x)\, dx\, dy$$

                       close to 1 if g has a small variance                     [Danger!]
                       far from 1 if g has a large variance                  [Re-Danger!]

       Example (Normal distribution)
       Generate N(0, 1) based on a uniform perturbation on [−δ, δ]

                   $$Y_t = X^{(t)} + \delta\omega_t, \qquad \omega_t \sim U([-1, 1])$$

       Acceptance probability

                   $$\rho(x^{(t)}, y_t) = \exp\left\{ \left( (x^{(t)})^2 - y_t^2 \right)/2 \right\} \wedge 1.$$

       Example (Normal distribution (2))
       Statistics based on 15000 simulations

                  δ           0.1       0.5       1.0
                  mean        0.399    −0.111     0.10
                  variance    0.698     1.11      1.06

       When δ ↑, faster exploration of the support of f.

       [Figure: three panels (a), (b), (c)]

       3 samples with δ = 0.1, 0.5 and 1.0, with convergence of
       empirical averages (over 15000 simulations).
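
       The experiment can be reproduced along the following lines. The starting
       value x0 = 0 and the seed are assumptions, so the printed figures will not
       match the table exactly; the acceptance rate is reported as an extra
       diagnostic.

```python
import math
import random

def rw_normal(delta, n_iter=15000, x0=0.0, seed=0):
    """Random walk MH for N(0, 1) with uniform increments on [-delta, delta]."""
    rng = random.Random(seed)
    x, chain, accepted = x0, [], 0
    for _ in range(n_iter):
        y = x + delta * rng.uniform(-1.0, 1.0)
        # acceptance probability: min(exp{(x^2 - y^2)/2}, 1)
        if rng.random() < math.exp(min(0.0, (x * x - y * y) / 2.0)):
            x, accepted = y, accepted + 1
        chain.append(x)
    return chain, accepted / n_iter

for delta in (0.1, 0.5, 1.0):
    chain, rate = rw_normal(delta)
    mean = sum(chain) / len(chain)
    var = sum((v - mean) ** 2 for v in chain) / len(chain)
    print(f"delta={delta}: mean={mean:.3f} variance={var:.3f} acceptance={rate:.2f}")
```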

Missing variable models




       Special case when the density to simulate can be written as

               $$f(x) = \int_{\mathcal{Z}} \tilde f(x, z)\, dz$$

       The random variable Z is then called missing data

Completion principle



       Idea
        Simulating from $\tilde f$ produces simulations from f

       If
               $$(X, Z) \sim \tilde f(x, z),$$
       then, marginally,
               $$X \sim f(x)$$

       Data Augmentation
       Starting from $x^{(t)}$,
          1.   Simulate $Z^{(t+1)} \sim \tilde f_{Z|X}(z|x^{(t)})$;
          2.   Simulate $X^{(t+1)} \sim \tilde f_{X|Z}(x|z^{(t+1)})$.

       Example (Mixture of distributions)
       Consider the simulation (in $\mathbb{R}^2$) of the density $f(\mu_1, \mu_2)$
       proportional to

           $$e^{-\mu_1^2 - \mu_2^2} \times \prod_{i=1}^{100} \left\{ 0.3\, e^{-(x_i - \mu_1)^2/2} + 0.7\, e^{-(x_i - \mu_2)^2/2} \right\}$$

       when the $x_i$'s are given/observed.

       [Figure: left panel, histogram titled "Sample from 0.3 N(2.5,1) + 0.7 N(0,1)", horizontal axis x;
       right panel, level sets in the (µ1, µ2) plane]

       Histogram of the xi's and level set of f(µ1, µ2)

       Completion (1)
       Replace every sum in the density with an integral:

           $$0.3\, e^{-(x_i - \mu_1)^2/2} + 0.7\, e^{-(x_i - \mu_2)^2/2} = \int \left\{ I_{[0,\; 0.3\, e^{-(x_i - \mu_1)^2/2}]}(u_i) + I_{[0.3\, e^{-(x_i - \mu_1)^2/2},\; 0.3\, e^{-(x_i - \mu_1)^2/2} + 0.7\, e^{-(x_i - \mu_2)^2/2}]}(u_i) \right\} du_i$$

       and simulate $((\mu_1, \mu_2), (U_1, \ldots, U_n)) = (X, Z)$ via Data
       Augmentation

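       Completion (2)
       Replace the $U_i$'s by the $\xi_i$'s, where

           $$\xi_i = \begin{cases} 1 & \text{if } U_i \le 0.3\, e^{-(x_i - \mu_1)^2/2}, \\ 2 & \text{otherwise} \end{cases}$$

       Then

           $$\Pr(\xi_i = 1 | \mu_1, \mu_2) = \frac{0.3\, e^{-(x_i - \mu_1)^2/2}}{0.3\, e^{-(x_i - \mu_1)^2/2} + 0.7\, e^{-(x_i - \mu_2)^2/2}} = 1 - \Pr(\xi_i = 2 | \mu_1, \mu_2)$$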

       Conditioning (1)
       The conditional distribution of $Z = (\xi_1, \ldots, \xi_n)$ given
       $X = (\mu_1, \mu_2)$ is given by

           $$\Pr(\xi_i = 1 | \mu_1, \mu_2) = \frac{0.3\, e^{-(x_i - \mu_1)^2/2}}{0.3\, e^{-(x_i - \mu_1)^2/2} + 0.7\, e^{-(x_i - \mu_2)^2/2}} = 1 - \Pr(\xi_i = 2 | \mu_1, \mu_2)$$

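       Conditioning (2)
       The conditional distribution of $X = (\mu_1, \mu_2)$ given
       $Z = (\xi_1, \ldots, \xi_n)$ is given by

           $$\begin{aligned}
           (\mu_1, \mu_2) | Z &\sim e^{-\mu_1^2 - \mu_2^2} \times \prod_{\{i;\, \xi_i = 1\}} e^{-(x_i - \mu_1)^2/2} \times \prod_{\{i;\, \xi_i = 2\}} e^{-(x_i - \mu_2)^2/2} \\
           &\propto \exp\left\{ -(n_1 + 2) \left( \mu_1 - \frac{n_1 \hat\mu_1}{n_1 + 2} \right)^2 \Big/ 2 \right\} \times \exp\left\{ -(n_2 + 2) \left( \mu_2 - \frac{n_2 \hat\mu_2}{n_2 + 2} \right)^2 \Big/ 2 \right\}
           \end{aligned}$$

       where $n_j$ is the number of $\xi_i$'s equal to j and $n_j \hat\mu_j$ is the sum of
       the $x_i$'s associated with those $\xi_i$ equal to j
                                                                     [Easy!]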
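
       Both conditionals are explicit, so the Data Augmentation scheme can be
       sketched as below: given the allocations, each µ_j is normal with mean
       n_j µ̂_j/(n_j + 2) and variance 1/(n_j + 2). The data set, the starting
       values and the warm-up length are hypothetical choices made only to have
       something to run; the observed x_i's of the slides are not available.

```python
import math
import random

def gibbs_mixture(xs, n_iter, mu=(0.0, 2.0), seed=0):
    """Data augmentation for f(mu1, mu2), with allocations xi_i as missing data."""
    rng = random.Random(seed)
    mu1, mu2 = mu
    chain = []
    for _ in range(n_iter):
        # 1. Z | X: allocate each x_i to component 1 or 2
        n = [0, 0]
        s = [0.0, 0.0]
        for x in xs:
            w1 = 0.3 * math.exp(-(x - mu1) ** 2 / 2.0)
            w2 = 0.7 * math.exp(-(x - mu2) ** 2 / 2.0)
            j = 0 if rng.random() * (w1 + w2) < w1 else 1
            n[j] += 1
            s[j] += x
        # 2. X | Z: mu_j ~ N(n_j * muhat_j / (n_j + 2), 1 / (n_j + 2))
        mu1 = rng.gauss(s[0] / (n[0] + 2), 1.0 / math.sqrt(n[0] + 2))
        mu2 = rng.gauss(s[1] / (n[1] + 2), 1.0 / math.sqrt(n[1] + 2))
        chain.append((mu1, mu2))
    return chain

# Hypothetical data: 100 points from 0.3 N(0, 1) + 0.7 N(2.5, 1).
data_rng = random.Random(1)
xs = [data_rng.gauss(0.0, 1.0) if data_rng.random() < 0.3
      else data_rng.gauss(2.5, 1.0) for _ in range(100)]
chain = gibbs_mixture(xs, n_iter=2000)
kept = chain[500:]                       # discard an arbitrary warm-up period
mu1_mean = sum(m[0] for m in kept) / len(kept)
mu2_mean = sum(m[1] for m in kept) / len(kept)
print(mu1_mean, mu2_mean)
```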
New operational instruments for statistical exploration (=NOISE)
   Monte Carlo Method and EM algorithm




Chapter 2 :
Monte Carlo Methods &
EM algorithm


                Introduction
                Integration by Monte Carlo method
                Importance functions
                Acceleration methods
New operational instruments for statistical exploration (=NOISE)
   Monte Carlo Method and EM algorithm
     Introduction




Uses of simulation

          1    integration

                   $$I = E_f[h(X)] = \int h(x) f(x)\, dx$$

          2    limiting behaviour/stationarity of complex systems
          3    optimisation

                   $$\arg\min_x h(x) = \arg\max_x \exp\{-\beta h(x)\}, \qquad \beta > 0$$

       Example (Propagation of an epidemic)
       On a grid representing a region, a point is given by its coordinates
       x, y
       The probability to catch a disease is

           $$P_{x,y} = \frac{\exp(\alpha + \beta \cdot n_{x,y})}{1 + \exp(\alpha + \beta \cdot n_{x,y})}\, I_{n_{x,y} > 0}$$

       if $n_{x,y}$ denotes the number of neighbours of (x, y) who already have
       this disease
       The probability to get healed is

           $$Q_{x,y} = \frac{\exp(\delta + \gamma \cdot n_{x,y})}{1 + \exp(\delta + \gamma \cdot n_{x,y})}$$

       Example (Propagation of an epidemic (2))

       Question
       Given (α, β, γ, δ), what is the speed of propagation of this
       epidemic? the average duration? the number of infected persons?
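
       Such questions can be explored by simulating the grid directly. The
       sketch below relies on several assumptions that the slides do not
       specify: four nearest neighbours, synchronous updates, a single initial
       case at the centre of the grid, and healed points becoming susceptible
       again. The parameter values in the call are purely hypothetical.

```python
import math
import random

def logistic(v):
    """exp(v) / (1 + exp(v))."""
    return 1.0 / (1.0 + math.exp(-v))

def simulate_epidemic(alpha, beta, gamma, delta, size=20, max_steps=200, seed=0):
    """Simulate the epidemic on a size x size grid; returns infected counts per step."""
    rng = random.Random(seed)
    sick = [[False] * size for _ in range(size)]
    sick[size // 2][size // 2] = True           # assumed single initial case
    counts = [1]
    for _ in range(max_steps):
        new = [row[:] for row in sick]
        for x in range(size):
            for y in range(size):
                # number of infected neighbours n_{x,y} (assumed 4-neighbourhood)
                n = sum(sick[i][j]
                        for i, j in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
                        if 0 <= i < size and 0 <= j < size)
                if sick[x][y]:
                    # healing with probability Q_{x,y}
                    if rng.random() < logistic(delta + gamma * n):
                        new[x][y] = False
                elif n > 0 and rng.random() < logistic(alpha + beta * n):
                    # infection with probability P_{x,y} (zero when n = 0)
                    new[x][y] = True
        sick = new
        counts.append(sum(map(sum, sick)))
        if counts[-1] == 0:                     # epidemic is over
            break
    return counts

counts = simulate_epidemic(alpha=-2.0, beta=1.0, gamma=-0.5, delta=-1.0)
print(len(counts) - 1, max(counts))             # steps simulated, peak infected
```

       Repeating the simulation over many seeds gives Monte Carlo estimates of
       the average duration and of the number of infected points.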
New operational instruments for statistical exploration (=NOISE)
   Monte Carlo Method and EM algorithm
     Integration by Monte Carlo method



Monte Carlo integration




       Law of large numbers
       If $X_1, \ldots, X_n$ are simulated from f,

           $$\hat I_n = \frac{1}{n} \sum_{i=1}^{n} h(X_i) \longrightarrow I$$

Central Limit Theorem



       Evaluation of the error by

           $$\hat\sigma_n^2 = \frac{1}{n^2} \sum_{i=1}^{n} (h(X_i) - \hat I_n)^2$$

       and

           $$\hat I_n \approx N(I, \hat\sigma_n^2)$$

       Example (Normal)
       For a standard Gaussian distribution, $E[X^4] = 3$. Via Monte Carlo
       integration,

              n         5      50     500    5000   50,000   500,000
              $\hat I_n$     1.65   5.69   3.24   3.13   3.038    3.029

       [Figure: $\hat I_n$ plotted against n, on a logarithmic scale for n]
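
       The estimator and its error evaluation can be sketched as follows for
       h(x) = x^4 under N(0, 1). The seed and the sample sizes are arbitrary,
       so the printed values will differ from those in the table.

```python
import math
import random

def mc_estimate(h, n, seed=0):
    """Monte Carlo estimate of E[h(X)], X ~ N(0, 1), with its standard error."""
    rng = random.Random(seed)
    values = [h(rng.gauss(0.0, 1.0)) for _ in range(n)]
    i_hat = sum(values) / n
    # sigma_n^2 = (1 / n^2) * sum (h(X_i) - i_hat)^2
    var_hat = sum((v - i_hat) ** 2 for v in values) / n ** 2
    return i_hat, math.sqrt(var_hat)

for n in (5, 50, 500, 5000, 50000):
    i_hat, se = mc_estimate(lambda x: x ** 4, n)
    print(n, round(i_hat, 3), "+/-", round(2 * se, 3))
```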

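       Example (Cauchy / Normal)
       Consider the joint model

                                      X|θ ∼ N (θ, 1),              θ ∼ C(0, 1)

       Once X is observed, θ is estimated by

           $$\delta^\pi(x) = \frac{\displaystyle\int_{-\infty}^{\infty} \frac{\theta}{1 + \theta^2}\, e^{-(x - \theta)^2/2}\, d\theta}{\displaystyle\int_{-\infty}^{\infty} \frac{1}{1 + \theta^2}\, e^{-(x - \theta)^2/2}\, d\theta}$$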

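       Example (Cauchy / Normal (2))
       This representation of $\delta^\pi$ suggests using iid variables

           $$\theta_1, \ldots, \theta_m \sim N(x, 1)$$

       and computing

           $$\hat\delta^\pi_m(x) = \frac{\sum_{i=1}^{m} \dfrac{\theta_i}{1 + \theta_i^2}}{\sum_{i=1}^{m} \dfrac{1}{1 + \theta_i^2}}.$$

       By virtue of the Law of Large Numbers,

           $$\hat\delta^\pi_m(x) \longrightarrow \delta^\pi(x) \qquad \text{when } m \longrightarrow \infty.$$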
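
       A minimal sketch of this estimator, where the function name, the seed and
       the values of x and m in the call are illustrative choices:

```python
import random

def posterior_mean(x, m, seed=0):
    """Estimate delta^pi(x) from theta_1, ..., theta_m iid N(x, 1)."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(m):
        theta = rng.gauss(x, 1.0)
        w = 1.0 / (1.0 + theta * theta)   # Cauchy prior density, up to 1/pi
        num += theta * w
        den += w
    return num / den

print(posterior_mean(3.0, 50000))   # shrunk towards 0 relative to x = 3
```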

       Example (Normal cdf)
       Approximation of the normal cdf

           $$\Phi(t) = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy$$

       by

           $$\hat\Phi(t) = \frac{1}{n} \sum_{i=1}^{n} I_{X_i \le t},$$

       based on a sample $(X_1, \ldots, X_n)$ of size n, generated by the
       Box-Muller algorithm.

       Example (Normal cdf (2))
           • Variance
                 $$\Phi(t)(1 - \Phi(t))/n,$$
               since the variables $I_{X_i \le t}$ are iid Bernoulli(Φ(t)).
           • For t close to t = 0 the variance is about 1/(4n):
             a precision of four decimals requires on average
                 $$\sqrt{n} = \sqrt{2}\, 10^4,$$
               that is, n = 2 × 10^8 or 200 million simulations.
           • Larger [absolute] precision in the tails
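
       The estimator and its Bernoulli-based error can be sketched with a
       hand-written Box-Muller generator. The function names and the sample size
       in the call are illustrative assumptions.

```python
import math
import random

def box_muller(rng):
    """Return two independent N(0, 1) variables built from two uniforms."""
    u1 = 1.0 - rng.random()            # in (0, 1], avoids log(0)
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

def normal_cdf_mc(t, n, seed=0):
    """Estimate Phi(t) by the proportion of X_i <= t, with its standard error."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n // 2):
        for x in box_muller(rng):
            hits += x <= t
    n_used = 2 * (n // 2)
    phi_hat = hits / n_used
    # Bernoulli variance Phi(t)(1 - Phi(t))/n, with Phi replaced by its estimate
    return phi_hat, math.sqrt(phi_hat * (1.0 - phi_hat) / n_used)

for t in (0.0, 1.65, 2.58):
    print(t, normal_cdf_mc(t, 100000))
```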

       Example (Normal cdf (3))

           n \ t    0.0      0.67     0.84     1.28     1.65     2.32     2.58     3.09     3.72

           10^2     0.485    0.74     0.77     0.9      0.945    0.985    0.995    1        1
           10^3     0.4925   0.7455   0.801    0.902    0.9425   0.9885   0.9955   0.9985   1
           10^4     0.4962   0.7425   0.7941   0.9      0.9498   0.9896   0.995    0.999    0.9999
           10^5     0.4995   0.7489   0.7993   0.9003   0.9498   0.9898   0.995    0.9989   0.9999
           10^6     0.5001   0.7497   0.8      0.9002   0.9502   0.99     0.995    0.999    0.9999
           10^7     0.5002   0.7499   0.8      0.9001   0.9501   0.99     0.995    0.999    0.9999
           10^8     0.5      0.75     0.8      0.9      0.95     0.99     0.995    0.999    0.9999

       Evaluation of normal quantiles by Monte Carlo based on n
       normal generations
New operational instruments for statistical exploration (=NOISE)
   Monte Carlo Method and EM algorithm
     Importance functions



Importance functions



       Alternative representation:

           $$I = \int h(x) f(x)\, dx = \int h(x)\, \frac{f(x)}{g(x)}\, g(x)\, dx$$
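
       This representation means that I can be estimated from draws of g by
       averaging h(X_i) f(X_i)/g(X_i). The sketch below does so for f = N(0, 1)
       and h(x) = x^4; the importance function g = N(0, 2^2) is an assumption
       chosen for illustration, not a recommendation from the slides.

```python
import math
import random

def normal_pdf(x, sd=1.0):
    """Density of N(0, sd^2) at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def importance_estimate(h, n, sd_g=2.0, seed=0):
    """Estimate I = E_f[h(X)], f = N(0, 1), from draws of g = N(0, sd_g^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, sd_g)                              # X_i ~ g
        total += h(x) * normal_pdf(x) / normal_pdf(x, sd_g)   # h(x) f(x) / g(x)
    return total / n

print(importance_estimate(lambda x: x ** 4, 100000))   # E[X^4] = 3 under N(0, 1)
```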

Exploratory Statistics with R

  • 1. New operational instruments for statistical exploration (=NOISE) New operational instruments for statistical exploration (=NOISE) Christian P. Robert Universit´ Paris Dauphine e http://www.ceremade.dauphine.fr/~xian Licence MI2E, 2008–2009
  • 2. New operational instruments for statistical exploration (=NOISE) Outline 1 Simulation of random variables 2 Monte Carlo Method and EM algorithm 3 Bootstrap Method 4 Rudiments of Nonparametric Statistics
  • 3. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Chapter 1 : Simulation of random variables Introduction Random generator Non-uniform distributions (1) Non-uniform distributions (2) Markovian methods
  • 4. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction Introduction Necessity to ”reproduce chance” on a computer Evaluation of the behaviour of a complex system (network, computer program, queue, particle system, atmosphere, epidemics, economic actions, &tc)
  • 5. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction Introduction Necessity to ”reproduce chance” on a computer Evaluation of the behaviour of a complex system (network, computer program, queue, particle system, atmosphere, epidemics, economic actions, &tc) Determine probabilistic properties of a new statistical procedure or under an unknown distribution [bootstrap]
  • 6. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction Introduction Necessity to ”reproduce chance” on a computer Evaluation of the behaviour of a complex system (network, computer program, queue, particle system, atmosphere, epidemics, economic actions, &tc) Determine probabilistic properties of a new statistical procedure or under an unknown distribution [bootstrap] Validation of a probabilistic model
  • 7. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction Introduction Necessity to ”reproduce chance” on a computer Evaluation of the behaviour of a complex system (network, computer program, queue, particle system, atmosphere, epidemics, economic actions, &tc) Determine probabilistic properties of a new statistical procedure or under an unknown distribution [bootstrap] Validation of a probabilistic model Approximation of an expectation/integral for a non-standard distribution [Law of Large Numbers]
  • 8. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction Introduction Necessity to ”reproduce chance” on a computer Evaluation of the behaviour of a complex system (network, computer program, queue, particle system, atmosphere, epidemics, economic actions, &tc) Determine probabilistic properties of a new statistical procedure or under an unknown distribution [bootstrap] Validation of a probabilistic model Approximation of an expectation/integral for a non-standard distribution [Law of Large Numbers] Maximisation of a weakly regular function/likelihood
  • 9. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction Example (TCL for the binomial distribution) If Xn ∼ B(n, p) , Xn converges in distribution to the normal distribution: √ n→∞ p(1 − p) n (Xn /n − p) N 0, n
  • 10. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction n= 4 n= 8 n= 16 10 15 20 25 20 30 15 20 10 10 5 5 0 0 0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 n= 32 n= 64 n= 128 14 10 15 20 25 15 0 2 4 6 8 10 10 5 5 0 0 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.3 0.4 0.5 0.6 0.35 0.40 0.45 0.50 0.55 0.60 0.65 n= 256 n= 512 n= 1024 30 10 20 30 40 50 5 10 15 20 25 20 5 10 0 0 0 0.40 0.45 0.50 0.55 0.60 0.44 0.46 0.48 0.50 0.52 0.54 0.56 0.58 0.46 0.48 0.50 0.52 0.54
  • 11. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction Example (Stochastic minimisation) Consider the function h(x, y) = (x sin(20y) + y sin(20x))2 cosh(sin(10x)x) + (x cos(10y) − y sin(10x))2 cosh(cos(20y)y) , to be minimised.
  • 12. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction Example (Stochastic minimisation) Consider the function h(x, y) = (x sin(20y) + y sin(20x))2 cosh(sin(10x)x) + (x cos(10y) − y sin(10x))2 cosh(cos(20y)y) , to be minimised. (I know that the global minimum is 0 for (x, y) = (0, 0).)
  • 13. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction 6 5 Z 34 21 0 1 0.5 1 0.5 0 Y 0 X -0.5 -0.5 -1 -1
  • 14. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction Example (Stochastic minimisation (2)) Instead of solving the first order equations ∂h(x, y) ∂h(x, y) = 0, =0 ∂x ∂y and of checking that the second order conditions are met, we can generate a random sequence in R2 αj θj+1 = θj + ∆h(θj , βj ζj ) ζj 2βj
  • 15. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction Example (Stochastic minimisation (2)) Instead of solving the first order equations ∂h(x, y) ∂h(x, y) = 0, =0 ∂x ∂y and of checking that the second order conditions are met, we can generate a random sequence in R2 αj θj+1 = θj + ∆h(θj , βj ζj ) ζj 2βj where ⋄ the ζj ’s are uniform on the unit circle x2 + y 2 = 1; ⋄ ∆h(θ, ζ) = h(θ + ζ) − h(θ − ζ); ⋄ (αj ) and (βj ) converge to 0
  • 16. New operational instruments for statistical exploration (=NOISE) Simulation of random variables Introduction 0.8 0.6 0.4 0.2 -0.2 0.0 0.2 0.4 0.6 Case when αj = 1/10 log(1 + j) et βj = 1/j
  • 20. The traveling salesman problem. A classical allocation problem: a salesman needs to visit n cities; traveling costs between pairs of cities are known [and different]; search for the optimal circuit.
  • 23. An NP-complete problem. The traveling salesman problem is an example of mathematical problems that require explosive resolution times. Number of possible circuits n! and exact solutions available in $O(2^n)$ time. Numerous practical consequences (networks, integrated circuit design, genomic sequencing, etc.). Procter & Gamble competition, 1962.
  • 25. An open problem. Exact solution for 15,112 German cities found in 2001 in 22.6 CPU years. Exact solution for the 24,978 Swedish cities found in 2004 in 84.8 CPU years.
  • 29. Resolution via simulation. The simulated annealing algorithm: Repeat (1) random modifications of parts of the original circuit with cost $C_0$; (2) evaluation of the cost $C$ of the new circuit; (3) acceptance of the new circuit with probability $\exp\{(C_0 - C)/T\} \wedge 1$. $T$, the temperature, is progressively reduced. [Metropolis, 1953]
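The three steps above can be sketched as follows for cities placed in the plane. The sketch makes several assumptions that are not in the slides: costs are Euclidean distances, the random modification is the reversal of a segment of the circuit, and the temperature decreases geometrically (`cooling` is a hypothetical parameter).

```python
import math
import random

def tour_cost(tour, pts):
    """Total length of the closed circuit visiting pts in the order given by tour."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))

def anneal(pts, t0=1.2, cooling=0.999, n_iter=20000, rng=None):
    """Simulated annealing sketch; returns the best circuit found and its cost."""
    rng = rng or random.Random()
    tour = list(range(len(pts)))
    cost = tour_cost(tour, pts)
    best_tour, best_cost = tour[:], cost
    temp = t0
    for _ in range(n_iter):
        # Random modification: reverse the segment between two random positions.
        i, j = sorted(rng.sample(range(len(pts)), 2))
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        c = tour_cost(cand, pts)
        # Accept with probability exp((C0 - C) / T) ^ 1.
        if c <= cost or rng.random() < math.exp((cost - c) / temp):
            tour, cost = cand, c
            if cost < best_cost:
                best_tour, best_cost = tour[:], cost
        temp *= cooling  # assumption: geometric cooling schedule
    return best_tour, best_cost
```

Recomputing the full cost at each step keeps the sketch short; an incremental update of the two modified edges would be the natural refinement for large n.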
  • 30. Illustration. Example (400 cities) T = 1.2
  • 31. Illustration. Example (400 cities) T = 0.8
  • 32. Illustration. Example (400 cities) T = 0.4
  • 33. Illustration. Example (400 cities) T = 0.0
  • 36. Option pricing. Complicated computation of expectations/average values of options, $E[C_T]$, necessary to evaluate the entry price $(1 + r)^{-T} E[C_T]$. Example (European options) Case when $C_T = (S_T - K)^+$ with $S_T = S_0 \times Y_1 \times \cdots \times Y_T$, $\Pr(Y_i = u) = 1 - \Pr(Y_i = d) = p$. Resolution via the simulation of the binomial rv's $Y_i$.
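A minimal Monte Carlo sketch of the European option example, assuming illustrative parameter values chosen by the caller; the function names are hypothetical. Since the number of up-moves is binomial, the exact expectation is also available here as a finite sum, which gives a way to check the simulation.

```python
import math
import random

def european_call_mc(s0, k, r, u, d, p, T, n_sim=20000, rng=None):
    """Monte Carlo estimate of the entry price (1 + r)^(-T) E[(S_T - K)^+]."""
    rng = rng or random.Random()
    total = 0.0
    for _ in range(n_sim):
        s = s0
        for _ in range(T):
            # Y_i = u with probability p, d otherwise.
            s *= u if rng.random() < p else d
        total += max(s - k, 0.0)
    return (1 + r) ** (-T) * total / n_sim

def european_call_exact(s0, k, r, u, d, p, T):
    """Same price computed by summing over the number i of up-moves."""
    expectation = sum(
        math.comb(T, i) * p ** i * (1 - p) ** (T - i) * max(s0 * u ** i * d ** (T - i) - k, 0.0)
        for i in range(T + 1)
    )
    return (1 + r) ** (-T) * expectation
```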
  • 38. Option pricing (cont'd). Example (Asian options) Continuous time model where $C_T = \left( \frac{1}{T} \int_0^T S(t)\,dt - K \right)^+ \approx \left( \frac{1}{T} \sum_{n=1}^T S(n) - K \right)^+$, with $S(n + 1) = S(n) \times \exp\{\Delta X(n + 1)\}$, $\Delta X(n) \overset{\text{iid}}{\sim} N(0, \sigma^2)$. Resolution via the simulation of the normal rv's $\Delta X_i$.
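The discretised Asian payoff can be simulated along the same lines. In this sketch, S(0) = s0 is taken as the starting value and the function returns an estimate of $E[C_T]$ without discounting; both choices, and the name `asian_call_mc`, are assumptions made for illustration.

```python
import math
import random

def asian_call_mc(s0, k, sigma, T, n_sim=20000, rng=None):
    """Monte Carlo estimate of E[((1/T) sum_{n=1}^T S(n) - K)^+]."""
    rng = rng or random.Random()
    total = 0.0
    for _ in range(n_sim):
        s, acc = s0, 0.0
        for _ in range(T):
            # S(n+1) = S(n) * exp(Delta X), Delta X ~ N(0, sigma^2)
            s *= math.exp(rng.gauss(0.0, sigma))
            acc += s
        total += max(acc / T - k, 0.0)
    return total / n_sim
```

With K = 0 the payoff is the plain average of the S(n)'s, whose expectation is $\frac{S_0}{T} \sum_{n=1}^T e^{n\sigma^2/2}$, which offers a simple sanity check of the simulation.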
  • 41. Pseudo-random generator. Pivotal element of simulation techniques: they all require the availability of uniform U(0, 1) random variables via transformations. Definition (Pseudo-random generator) A pseudo-random generator is a deterministic map $\Psi$ from ]0, 1[ to ]0, 1[ such that, for any starting value $u_0$ and any n, the sequence $\{u_0, \Psi(u_0), \Psi(\Psi(u_0)), \ldots, \Psi^n(u_0)\}$ behaves (statistically) like an iid U(0, 1) sequence. Paradox! While avoiding randomness, the deterministic sequence $(u_0, u_1 = \Psi(u_0), \ldots, u_n = \Psi(u_{n-1}))$ must resemble a random sequence!
  • 42. In R, use of the procedure runif(). Description: 'runif' generates random deviates. Example: u = runif(20). '.Random.seed' is an integer vector, containing the random number generator (RNG) state for random number generation in R. It can be saved and restored, but should not be altered by the user.
  • 43. [Figure: uniform sample (two panels)]
  • 44. In C, use of the procedure rand() / random(). SYNOPSIS #include <stdlib.h> long int random(void); DESCRIPTION The random() function uses a non-linear additive feedback random number generator employing a default table of size 31 long integers to return successive pseudo-random numbers in the range from 0 to RAND_MAX. The period of this random generator is very large, approximately 16*((2**31)-1). RETURN VALUE random() returns a value between 0 and RAND_MAX.
  • 45. In Scilab, use of the procedure rand(). rand(): with no arguments gives a scalar whose value changes each time it is referenced. By default, random numbers are uniformly distributed in the interval (0,1). rand('normal') switches to a normal distribution with mean 0 and variance 1. rand('uniform') switches back to the uniform distribution. EXAMPLE x=rand(10,10,'uniform')
  • 46. Example (A standard uniform generator) The congruential generator $D(x) = (ax + b) \bmod (M + 1)$ has a period of M for proper choices of (a, b) and becomes a generator on ]0, 1[ when dividing by M + 2. [Figure: v = u*69069069 (1); histogram of the sample and plots of the values at lags t+1, t+5 and t+10 against t]
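A congruential generator of this form can be sketched as below. The constants a = 69069, b = 1 and modulus $2^{32}$ are an assumption chosen for illustration (they are not stated on the slide), and the integer state is mapped into ]0, 1[ by the shift `(x + 0.5) / m`, which is one possible convention among several.

```python
def congruential(seed, n, a=69069, b=1, m=2 ** 32):
    """Return n pseudo-uniform values from x -> (a * x + b) mod m."""
    x = seed % m
    out = []
    for _ in range(n):
        x = (a * x + b) % m
        # Map the integer state in {0, ..., m - 1} strictly inside ]0, 1[.
        out.append((x + 0.5) / m)
    return out
```

The same seed always reproduces the same sequence, which is the deterministic side of the paradox mentioned earlier; in practice the generator shipped with the software at hand remains the safer choice.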
  • 47. Conclusion: Use the appropriate random generator on the computer or the software at hand instead of constructing a random generator of poor quality
  • 51. Distributions different from the uniform distribution (1). A problem formally solved since Theorem (Generic inversion) If U is a uniform random variable on [0, 1) and if $F_X$ is the cdf of the random variable X, then $F_X^{-1}(U)$ is distributed like X. Proof. Indeed, $P(F_X^{-1}(U) \le x) = P(U \le F_X(x)) = F_X(x)$. Note. When $F_X$ is not strictly increasing, we can take $F_X^{-1}(u) = \inf\{x;\ F_X(x) \ge u\}$.
  • 54. Applications... Binomial distribution, B(n, p): $F_X(x) = \sum_{i \le x} \binom{n}{i} p^i (1 - p)^{n-i}$ and $F_X^{-1}(u)$ can be obtained numerically. Exponential distribution, Exp(λ): $F_X(x) = 1 - \exp(-\lambda x)$ and $F_X^{-1}(u) = -\log(u)/\lambda$ [using that $1 - U$ is also uniform]. Cauchy distribution, C(0, 1): $F_X(x) = \frac{1}{\pi} \arctan(x) + \frac{1}{2}$ and $F_X^{-1}(u) = \tan(\pi(u - 1/2))$.
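The three inversions can be written directly as functions of a uniform value u in (0, 1). This is a sketch with hypothetical function names; the binomial case uses the generalised inverse $\inf\{x;\ F_X(x) \ge u\}$ computed by accumulating the probabilities.

```python
import math
import random

def exponential_inv(u, lam):
    """Exp(lam) variate from u in (0, 1], using -log(u) / lam."""
    return -math.log(u) / lam

def cauchy_inv(u):
    """C(0, 1) variate from u in (0, 1)."""
    return math.tan(math.pi * (u - 0.5))

def binomial_inv(u, n, p):
    """Smallest x such that the B(n, p) cdf at x is at least u."""
    cdf = 0.0
    for x in range(n + 1):
        cdf += math.comb(n, x) * p ** x * (1 - p) ** (n - x)
        if cdf >= u:
            return x
    return n  # guards against rounding when u is very close to 1

def sample_exponential(lam, rng):
    # 1 - rng.random() lies in (0, 1], which avoids log(0).
    return exponential_inv(1.0 - rng.random(), lam)
```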
  • 56. Other transformations... [Hint] Find transforms linking the distribution of interest with simpler/known distributions. Example (Box-Müller transform) For the normal distribution N(0, 1), if $X_1, X_2 \overset{\text{i.i.d.}}{\sim} N(0, 1)$, then $X_1^2 + X_2^2 \sim \chi^2_2$ and $\arctan(X_1/X_2) \sim U([0, 2\pi])$ [Jacobian]. Since the $\chi^2_2$ distribution is the same as the Exp(1/2) distribution, using a cdf inversion produces $X_1 = \sqrt{-2 \log(U_1)}\, \sin(2\pi U_2)$, $X_2 = \sqrt{-2 \log(U_1)}\, \cos(2\pi U_2)$.
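The Box-Müller transform translates almost literally into code. The following is a minimal sketch; the function name `box_muller` is hypothetical and the uniform $U_1$ is drawn in (0, 1] to avoid taking the logarithm of zero.

```python
import math
import random

def box_muller(rng):
    """Return a pair of independent N(0, 1) variates from two uniforms."""
    u1 = 1.0 - rng.random()  # in (0, 1], so log(u1) is finite
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))  # square root of an Exp(1/2) variate
    angle = 2.0 * math.pi * u2               # uniform angle on [0, 2 pi)
    return radius * math.sin(angle), radius * math.cos(angle)
```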
  • 58. Example Student's t and Fisher's F distributions are natural byproducts of the generation of the normal and of the chi-square distributions. Example The Cauchy distribution can be derived from the normal distribution as: if $X_1, X_2 \overset{\text{i.i.d.}}{\sim} N(0, 1)$, then $X_1/X_2 \sim C(0, 1)$.
  • 59. Example The Beta distribution B(α, β), with density $f_X(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha-1} (1 - x)^{\beta-1}$, can be derived from the Gamma distribution by: if $X_1 \sim Ga(\alpha, 1)$, $X_2 \sim Ga(\beta, 1)$, then $\frac{X_1}{X_1 + X_2} \sim B(\alpha, \beta)$.
  • 60. Multidimensional distributions. Consider the generation of $(X_1, \ldots, X_p) \sim f(x_1, \ldots, x_p)$ in $\mathbb{R}^p$ with components that are not necessarily independent. Cascade rule: $f(x_1, \ldots, x_p) = f_1(x_1) \times f_{2|1}(x_2|x_1) \times \cdots \times f_{p|-p}(x_p|x_1, \ldots, x_{p-1})$
  • 61. Implementation. Simulate for t = 1, ..., T: 1. $X_1 \sim f_1(x_1)$; 2. $X_2 \sim f_{2|1}(x_2|x_1)$; ...; p. $X_p \sim f_{p|-p}(x_p|x_1, \ldots, x_{p-1})$
  • 65. Distributions different from the uniform distribution (2). $F_X^{-1}$ rarely available; implemented algorithm in a resident software only for standard distributions; inversion lemma does not apply in larger dimensions; new distributions may require fast resolution.
  • 67. The Accept-Reject Algorithm. Given a distribution with density f to be simulated. Theorem (Fundamental theorem of simulation) The uniform distribution on the sub-graph $S_f = \{(x, u);\ 0 \le u \le f(x)\}$ produces a marginal in x with density f. [Figure: density f(x) plotted against x]
  • 69. Proof: Marginal density given by $\int_0^\infty I_{0 \le u \le f(x)}\, du = f(x)$ and independence from the normalisation constant. Example For a normal distribution, we just need to simulate (u, x) at random in $\{(u, x);\ 0 \le u \le \exp(-x^2/2)\}$
  • 72. Accept-reject algorithm. 1. Find a density g that can be simulated and such that $\sup_x f(x)/g(x) = M < \infty$. 2. Generate $Y_1, Y_2, \ldots \overset{\text{i.i.d.}}{\sim} g$, $U_1, U_2, \ldots \overset{\text{i.i.d.}}{\sim} U([0, 1])$. 3. Take $X = Y_k$ where $k = \inf\{n;\ U_n \le f(Y_n)/M g(Y_n)\}$
  • 74. Theorem (Accept–reject) The random variable produced by the above stopping rule is distributed from $f_X$. Proof (1): We have $P(X \le x) = \sum_{k=1}^\infty P(X = Y_k, Y_k \le x) = \sum_{k=1}^\infty \left(1 - \frac{1}{M}\right)^{k-1} P(U_k \le f(Y_k)/M g(Y_k),\ Y_k \le x) = \sum_{k=1}^\infty \left(1 - \frac{1}{M}\right)^{k-1} \int_{-\infty}^x \int_0^{f(y)/M g(y)} du\ g(y)\, dy = \sum_{k=1}^\infty \left(1 - \frac{1}{M}\right)^{k-1} \frac{1}{M} \int_{-\infty}^x f(y)\, dy = \int_{-\infty}^x f(y)\, dy$, since the geometric series sums to M.
  • 75. Proof (2) If (X, U) is uniform on A ⊃ B, the distribution of (X, U) restricted to B is uniform on B. [Figure]
  • 79. Properties. Does not require a normalisation constant. Does not require an exact upper bound M. Allows for the recycling of the $Y_k$'s for another density f (note that rejected $Y_k$'s are no longer distributed from g). Requires on average M $Y_k$'s for one simulated X (efficiency measure).
  • 80. Example Take $f(x) = \exp(-x^2/2)$ and $g(x) = 1/(1 + x^2)$: $\frac{f(x)}{g(x)} = (1 + x^2)\, e^{-x^2/2} \le 2/\sqrt{e}$. Probability of acceptance $\sqrt{e/2\pi} = 0.66$
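This normal-from-Cauchy example can be sketched as follows, working with the unnormalised f and g and the bound $2/\sqrt{e}$ on their ratio. The Cauchy proposal is simulated by inversion; the function name `normal_accept_reject` and the returned trial count are illustrative choices, not part of the slides.

```python
import math
import random

M_BOUND = 2.0 / math.sqrt(math.e)  # bound on (1 + x^2) exp(-x^2 / 2)

def normal_accept_reject(rng):
    """Return (x, number of proposals used), with x distributed as N(0, 1)."""
    trials = 0
    while True:
        trials += 1
        y = math.tan(math.pi * (rng.random() - 0.5))  # Cauchy proposal by inversion
        u = rng.random()
        # Accept when u <= f(y) / (M g(y)), with f and g unnormalised.
        if u <= (1.0 + y * y) * math.exp(-y * y / 2.0) / M_BOUND:
            return y, trials
```

Averaging the trial counts over many calls gives an empirical check of the acceptance probability quoted above.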
  • 81. Theorem (Envelope) If there exists a density $g_m$, a function $g_l$ and a constant M such that $g_l(x) \le f(x) \le M g_m(x)$, then 1. Generate $X \sim g_m(x)$, $U \sim U_{[0,1]}$; 2. Accept X if $U \le g_l(X)/M g_m(X)$; 3. else, accept X if $U \le f(X)/M g_m(X)$ produces random variables distributed from f.
  • 83. Uniform ratio algorithms. Result: Uniform simulation on $\{(u, v);\ 0 \le u \le \sqrt{2 f(v/u)}\}$ produces $X = V/U \sim f$. Proof: Change of variable $(u, v) \to (x, u)$ with Jacobian u and marginal distribution of x provided by $x \sim \int_0^{\sqrt{2 f(x)}} u\, du = \frac{2 f(x)}{2} = f(x)$
  • 84. Example For a normal distribution, simulate (u, v) at random in $\{(u, v);\ 0 \le u \le \sqrt{2}\, e^{-v^2/4u^2}\} = \{(u, v);\ v^2 \le -4 u^2 \log(u/\sqrt{2})\}$ [Figure: this region in the (u, v) plane]
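One way to simulate uniformly on this region is to draw (u, v) uniformly in an enclosing rectangle and keep the points that fall inside. The rectangle used in this sketch, $0 < u \le \sqrt{2}$ and $|v| \le 2/\sqrt{e}$, follows from maximising $-4u^2 \log(u/\sqrt{2})$ over u; it is a derived bound, not one given on the slide, and the function name is hypothetical.

```python
import math
import random

def normal_ratio_of_uniforms(rng):
    """Return one N(0, 1) variate as V / U, with (U, V) uniform on the region."""
    u_max = math.sqrt(2.0)
    v_max = 2.0 / math.sqrt(math.e)  # largest |v| in the region (derived bound)
    while True:
        u = u_max * (1.0 - rng.random())  # in (0, sqrt(2)], avoids u = 0
        v = rng.uniform(-v_max, v_max)
        # Keep the point if v^2 <= -4 u^2 log(u / sqrt(2)).
        if v * v <= -4.0 * u * u * math.log(u / u_max):
            return v / u
```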
  • 85. Slice sampler. If a uniform simulation on G = {(u, x); 0 ≤ u ≤ f (x)} is too complex [because of the inversion of x into u ≤ f (x)], we can use instead a random walk on G:
  • 86. Slice sampler. Simulate for t = 1, ..., T: 1. $\omega^{(t+1)} \sim U_{[0, f(x^{(t)})]}$; 2. $x^{(t+1)} \sim U_{G^{(t+1)}}$, where $G^{(t+1)} = \{y;\ f(y) \ge \omega^{(t+1)}\}$. [Figure: f(x) plotted against x]
  • 87. Justification. The random walk is exploring G uniformly: if $(U^{(t)}, X^{(t)}) \sim U_G$, then $(U^{(t+1)}, X^{(t+1)}) \sim U_G$.
  • 88. Proof: $\Pr((U^{(t+1)}, X^{(t+1)}) \in A \times B) = \int_B \int_A \int\!\!\int I_{0 \le u \le f(x)}\, \frac{I_{0 \le u' \le f(x)}}{f(x)}\, \frac{I_{f(x') \ge u'}}{\int I_{f(y) \ge u'}\, dy}\, d(x, u, x', u') = \int_B \int_A \int f(x)\, \frac{I_{0 \le u' \le f(x)}}{f(x)}\, \frac{I_{f(x') \ge u'}}{\int I_{f(y) \ge u'}\, dy}\, d(x, x', u') = \int_B \int_A \left[ \int I_{f(x) \ge u'}\, dx \right] \frac{I_{f(x') \ge u'}}{\int I_{f(y) \ge u'}\, dy}\, d(x', u') = \int_B \int_A I_{f(x') \ge u' \ge 0}\, d(x', u')$
  • 89. Example (Normal distribution) For the standard normal distribution, $f(x) \propto \exp(-x^2/2)$, a slice sampler is $\omega | x \sim U_{[0, \exp(-x^2/2)]}$, $X | \omega \sim U_{[-\sqrt{-2 \log(\omega)},\ \sqrt{-2 \log(\omega)}]}$
  • 91. Note. The technique also operates when f is replaced with $\varphi(x) \propto f(x)$. It can be generalised to the case when f is decomposed in $f(x) = \prod_{i=1}^p f_i(x)$
  • 93. Example (Truncated normal distribution) If we consider instead the truncated N(−3, 1) distribution restricted to [0, 1], with density $f(x) = \frac{\exp(-(x + 3)^2/2)}{\sqrt{2\pi}\,[\Phi(4) - \Phi(3)]} \propto \exp(-(x + 3)^2/2) = \varphi(x)$, a slice sampler is $\omega | x \sim U_{[0, \exp(-(x+3)^2/2)]}$, $X | \omega \sim U_{[0,\ 1 \wedge \{-3 + \sqrt{-2 \log(\omega)}\}]}$
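The two conditional uniform draws of this truncated normal example can be sketched as below. The starting value `x0`, the function name and the small `max(0.0, ...)` guard against floating-point rounding are assumptions added for the illustration.

```python
import math
import random

def slice_truncated_normal(n, x0=0.5, rng=None):
    """Slice sampler sketch for N(-3, 1) restricted to [0, 1]; returns the chain."""
    rng = rng or random.Random()
    x = x0
    chain = []
    for _ in range(n):
        # omega | x ~ U(0, phi(x)], with phi(x) = exp(-(x + 3)^2 / 2)
        omega = (1.0 - rng.random()) * math.exp(-(x + 3.0) ** 2 / 2.0)
        # X | omega ~ U[0, 1 ^ (-3 + sqrt(-2 log omega))]
        upper = min(1.0, -3.0 + math.sqrt(-2.0 * math.log(omega)))
        upper = max(0.0, upper)  # rounding guard, not part of the algorithm
        x = rng.uniform(0.0, upper)
        chain.append(x)
    return chain
```

The successive values are dependent, so averages over the chain converge more slowly than with independent draws, but every value stays in [0, 1] by construction.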
  • 94. The Metropolis–Hastings algorithm. Generalisation of the slice sampler to situations when the slice sampler cannot be easily implemented. Idea: Create a sequence $(X_n)_n$ such that, for n 'large enough', the density of $X_n$ is close to f
  • 96. The Metropolis–Hastings algorithm (2). If f is the density of interest, we pick a proposal conditional density q(y|x) such that: it is easy to simulate; it is positive everywhere f is positive.
  • 98. Metropolis–Hastings. For a current value $X^{(t)} = x^{(t)}$: 1. Generate $Y_t \sim q(y|x^{(t)})$. 2. Take $X^{(t+1)} = Y_t$ with proba. $\rho(x^{(t)}, Y_t)$, and $X^{(t+1)} = x^{(t)}$ with proba. $1 - \rho(x^{(t)}, Y_t)$, where $\rho(x, y) = \min\left\{ \frac{f(y)}{f(x)}\, \frac{q(x|y)}{q(y|x)},\ 1 \right\}$.
  • 103. Properties. Always accept moves to $y_t$'s such that $\frac{f(y_t)}{q(y_t|x_t)} \ge \frac{f(x_t)}{q(x_t|y_t)}$. Does not depend on normalising constants for both f and q(·|x) (if the latter is independent from x). Never accept values of $y_t$ such that $f(y_t) = 0$. The sequence $(x^{(t)})_t$ can take repeatedly the same value. The $X^{(t)}$'s are dependent (Markovian) random variables.
  • 104. Justification. Joint distribution of $(X^{(t)}, X^{(t+1)})$: if $X^{(t)} \sim f(x^{(t)})$, then $(X^{(t)}, X^{(t+1)}) \sim f(x^{(t)}) \left\{ \rho(x^{(t)}, x^{(t+1)})\, q(x^{(t+1)}|x^{(t)}) + \int [1 - \rho(x^{(t)}, y)]\, q(y|x^{(t)})\, dy\ I_{x^{(t)}}(x^{(t+1)}) \right\}$, where the first term corresponds to [$Y_t$ accepted] and the second to [$Y_t$ rejected].
  • 106. Balance condition. $f(x) \times \rho(x, y) \times q(y|x) = f(x) \min\left\{ \frac{f(y)\, q(x|y)}{f(x)\, q(y|x)},\ 1 \right\} q(y|x) = \min\{f(y) q(x|y),\ f(x) q(y|x)\} = f(y) \times \rho(y, x) \times q(x|y)$. Thus the distribution of $(X^{(t)}, X^{(t+1)})$ is the same as the distribution of $(X^{(t+1)}, X^{(t)})$: if $X^{(t)}$ has the density f, then so does $X^{(t+1)}$.
  • 109. Link with slice sampling. The slice sampler is a very special case of Metropolis-Hastings algorithm where the acceptance probability is always 1. 1. For the generation of U, $\frac{I_{0 \le u' \le f(x)}}{I_{0 \le u \le f(x)}} \times \frac{f(x)^{-1} I_{0 \le u' \le f(x)}}{f(x)^{-1} I_{0 \le u \le f(x)}} = 1$ [joint density] [conditional density]. 2. For the generation of X, $\frac{I_{0 \le u \le f(y)}}{I_{0 \le u \le f(x)}} \times \frac{I_{\{z;\, u \le f(z)\}}(x) \big/ \int_{\{z;\, u \le f(z)\}} f(z)\, dz}{I_{\{z;\, u \le f(z)\}}(y) \big/ \int_{\{z;\, u \le f(z)\}} f(z)\, dz} = 1$ [joint density] [conditional density].
  • 111. Independent proposals. Proposal q independent from $X^{(t)}$, denoted g as in Accept-Reject algorithms. Independent Metropolis-Hastings: for the current value $X^{(t)} = x^{(t)}$, 1. Generate $Y_t \sim g(y)$. 2. Take $X^{(t+1)} = Y_t$ with proba. $\min\left\{ \frac{f(Y_t)\, g(x^{(t)})}{f(x^{(t)})\, g(Y_t)},\ 1 \right\}$, and $X^{(t+1)} = x^{(t)}$ otherwise.
  • 116. Properties. Alternative to Accept-Reject. Avoids the computation of $\max f(x)/g(x)$. Accepts more often than Accept-Reject. If $x_t$ achieves $\max f(x)/g(x)$, this is almost identical to Accept-Reject, except that the sequence $(x_t)$ is not independent.
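  • 117. Example (Gamma distribution)
    Generate a $\mathcal{G}a(\alpha, \beta)$ distribution from a proposal $\mathcal{G}a(\lfloor\alpha\rfloor, b = \lfloor\alpha\rfloor/\alpha)$, where $\lfloor\alpha\rfloor$ is the integer part of $\alpha$ (the proposal is then a sum of exponentials). The acceptance probability below corresponds to $\beta = 1$.
    1. Generate $Y_t \sim \mathcal{G}a(\lfloor\alpha\rfloor, \lfloor\alpha\rfloor/\alpha)$
    2. Take
       $$X^{(t+1)} = \begin{cases} Y_t & \text{with prob. } \left(\dfrac{Y_t}{x^{(t)}} \exp\left\{\dfrac{x^{(t)} - Y_t}{\alpha}\right\}\right)^{\alpha-\lfloor\alpha\rfloor} \wedge 1, \\ x^{(t)} & \text{otherwise.} \end{cases}$$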
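A minimal Python sketch of this Gamma example follows, assuming a target $\mathcal{G}a(\alpha, 1)$ with $\alpha \ge 1$. The function name and the value $\alpha = 2.43$ are illustrative assumptions. Note that `random.gammavariate` is parameterised by a scale, so the rate $\lfloor\alpha\rfloor/\alpha$ becomes the scale $\alpha/\lfloor\alpha\rfloor$.

```python
import math
import random

def gamma_independent_mh(alpha, n_iter, rng):
    """Chain targeting Ga(alpha, 1) with proposal Ga(floor(alpha), floor(alpha)/alpha)."""
    a = math.floor(alpha)
    if a < 1:
        raise ValueError("this sketch assumes alpha >= 1")
    scale = alpha / a                 # gammavariate takes a scale, i.e. 1 / rate
    x = alpha                         # arbitrary starting value (the target mean)
    chain = []
    accepted = 0
    for _ in range(n_iter):
        y = rng.gammavariate(a, scale)
        # acceptance probability ((y/x) * exp((x - y)/alpha))^(alpha - floor(alpha)), capped at 1
        prob = min(1.0, ((y / x) * math.exp((x - y) / alpha)) ** (alpha - a))
        if rng.random() < prob:
            x = y
            accepted += 1
        chain.append(x)
    return chain, accepted / n_iter

gamma_chain, gamma_rate = gamma_independent_mh(2.43, 20000, random.Random(0))
gamma_mean = sum(gamma_chain) / len(gamma_chain)   # should be close to alpha
```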
  • 119. Random walk Metropolis–Hastings
    Proposal $Y_t = X^{(t)} + \varepsilon_t$, where $\varepsilon_t \sim g$, independent from $X^{(t)}$, and $g$ symmetric.
    Instrumental distribution with density $g(y - x)$.
    Motivation: local perturbation of $X^{(t)}$ / exploration of its neighbourhood.
  • 120. Random walk Metropolis–Hastings
    Starting from $X^{(t)} = x^{(t)}$,
    1. Generate $Y_t \sim g(y - x^{(t)})$
    2. Take
       $$X^{(t+1)} = \begin{cases} Y_t & \text{with prob. } \min\left\{1, \dfrac{f(Y_t)}{f(x^{(t)})}\right\} \quad \text{[symmetry of } g\text{]} \\ x^{(t)} & \text{otherwise} \end{cases}$$
  • 123. Properties
    - Always accepts moves to higher values of $f$ and sometimes moves to lower values (compare with gradient algorithms)
    - Depends on the dispersion of $g$
    - Average probability of acceptance
      $$\varrho = \iint \min\{f(x), f(y)\}\, g(y - x)\, dx\, dy$$
      close to 1 if $g$ has a small variance [Danger!]
      far from 1 if $g$ has a large variance [Re-Danger!]
  • 125. Example (Normal distribution)
    Generate $\mathcal{N}(0, 1)$ based on a uniform perturbation on $[-\delta, \delta]$:
    $$Y_t = X^{(t)} + \delta\,\omega_t, \qquad \omega_t \sim \mathcal{U}[-1, 1]$$
    Acceptance probability
    $$\rho(x^{(t)}, y_t) = \exp\{(x^{(t)2} - y_t^2)/2\} \wedge 1.$$
  • 127. Example (Normal distribution (2))
    Statistics based on 15000 simulations

    δ          0.1      0.5      1.0
    mean       0.399    −0.111   0.10
    variance   0.698    1.11     1.06

    When δ ↑, faster exploration of the support of $f$.
    [Figure, panels (a), (b), (c): 3 samples with δ = 0.1, 0.5 and 1.0, with convergence of empirical averages (over 15000 simulations).]
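The experiment on the scale δ can be reproduced along the following lines. This is a sketch, not the code behind the table: the function names, the seed and the starting point $x^{(0)} = 0$ are assumptions, so the resulting numbers will differ from those reported on the slide.

```python
import math
import random

def rw_mh_normal(delta, n_iter, rng, x0=0.0):
    """Random walk Metropolis-Hastings for N(0,1) with U[-delta, delta] increments."""
    x = x0
    chain = []
    accepted = 0
    for _ in range(n_iter):
        y = x + delta * rng.uniform(-1.0, 1.0)
        log_ratio = (x * x - y * y) / 2.0          # log of f(y) / f(x)
        # accept with probability exp(log_ratio) ∧ 1
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = y
            accepted += 1
        chain.append(x)
    return chain, accepted / n_iter

def summarise(chain):
    """Empirical mean and variance of a chain."""
    mean = sum(chain) / len(chain)
    var = sum((v - mean) ** 2 for v in chain) / len(chain)
    return mean, var

results = {}
for delta in (0.1, 0.5, 1.0):
    chain, rate = rw_mh_normal(delta, 15000, random.Random(1))
    mean, var = summarise(chain)
    results[delta] = (mean, var, rate)             # (mean, variance, acceptance rate)
```

The acceptance rate is tracked as well because it illustrates the previous remark: a small δ gives a rate close to 1 while the chain moves slowly, which is why a high acceptance rate alone is not a sign of good exploration.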
  • 128. Missing variable models
    Special case when the density to simulate can be written as
    $$f(x) = \int_{\mathcal{Z}} \tilde f(x, z)\, dz$$
    The random variable $Z$ is then called missing data.
  • 130. Completion principle
    Idea: simulating from $\tilde f$ produces simulations from $f$.
    If $(X, Z) \sim \tilde f(x, z)$, then marginally $X \sim f(x)$.
  • 131. Data Augmentation
    Starting from $x^{(t)}$,
    1. Simulate $Z^{(t+1)} \sim \tilde f_{Z|X}(z \mid x^{(t)})$;
    2. Simulate $X^{(t+1)} \sim \tilde f_{X|Z}(x \mid z^{(t+1)})$.
  • 132. Example (Mixture of distributions)
    Consider the simulation (in $\mathbb{R}^2$) of the density $f(\mu_1, \mu_2)$ proportional to
    $$e^{-\mu_1^2 - \mu_2^2} \times \prod_{i=1}^{100} \left\{ 0.3\, e^{-(x_i - \mu_1)^2/2} + 0.7\, e^{-(x_i - \mu_2)^2/2} \right\}$$
    when the $x_i$'s are given/observed.
  • 133. [Figure: sample from 0.3 N(2.5, 1) + 0.7 N(0, 1). Left panel: histogram of the $x_i$'s (axis $x$). Right panel: level set of $f(\mu_1, \mu_2)$ (axes $\mu_1$, $\mu_2$).]
  • 134. Completion (1)
    Replace every sum in the density with an integral:
    $$0.3\, e^{-(x_i-\mu_1)^2/2} + 0.7\, e^{-(x_i-\mu_2)^2/2} = \int \left\{ \mathbb{I}_{[0,\; 0.3\, e^{-(x_i-\mu_1)^2/2}]}(u_i) + \mathbb{I}_{[0.3\, e^{-(x_i-\mu_1)^2/2},\; 0.3\, e^{-(x_i-\mu_1)^2/2} + 0.7\, e^{-(x_i-\mu_2)^2/2}]}(u_i) \right\} du_i$$
    and simulate $((\mu_1, \mu_2), (U_1, \ldots, U_n)) = (X, Z)$ via Data Augmentation.
  • 135. Completion (2)
    Replace the $U_i$'s by the $\xi_i$'s, where
    $$\xi_i = \begin{cases} 1 & \text{if } U_i \le 0.3\, e^{-(x_i-\mu_1)^2/2}, \\ 2 & \text{otherwise.} \end{cases}$$
    Then
    $$\Pr(\xi_i = 1 \mid \mu_1, \mu_2) = \frac{0.3\, e^{-(x_i-\mu_1)^2/2}}{0.3\, e^{-(x_i-\mu_1)^2/2} + 0.7\, e^{-(x_i-\mu_2)^2/2}} = 1 - \Pr(\xi_i = 2 \mid \mu_1, \mu_2)$$
  • 136. Conditioning (1)
    The conditional distribution of $Z = (\xi_1, \ldots, \xi_n)$ given $X = (\mu_1, \mu_2)$ is given by
    $$\Pr(\xi_i = 1 \mid \mu_1, \mu_2) = \frac{0.3\, e^{-(x_i-\mu_1)^2/2}}{0.3\, e^{-(x_i-\mu_1)^2/2} + 0.7\, e^{-(x_i-\mu_2)^2/2}} = 1 - \Pr(\xi_i = 2 \mid \mu_1, \mu_2)$$
  • 137. Conditioning (2)
    The conditional distribution of $X = (\mu_1, \mu_2)$ given $Z = (\xi_1, \ldots, \xi_n)$ is given by
    $$(\mu_1, \mu_2) \mid Z \sim e^{-\mu_1^2 - \mu_2^2} \times \prod_{\{i;\,\xi_i = 1\}} e^{-(x_i-\mu_1)^2/2} \times \prod_{\{i;\,\xi_i = 2\}} e^{-(x_i-\mu_2)^2/2}$$
    $$\propto \exp\left\{ -(n_1 + 2) \left( \mu_1 - \frac{n_1 \hat\mu_1}{n_1 + 2} \right)^2 \Big/ 2 \right\} \times \exp\left\{ -(n_2 + 2) \left( \mu_2 - \frac{n_2 \hat\mu_2}{n_2 + 2} \right)^2 \Big/ 2 \right\}$$
    where $n_j$ is the number of $\xi_i$'s equal to $j$ and $n_j \hat\mu_j$ is the sum of the $x_i$'s associated with those $\xi_i$ equal to $j$ [Easy!]
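The two conditional distributions give a complete Data Augmentation sampler for the mixture example. The Python sketch below is illustrative only: the data are simulated here from 0.3 N(2.5, 1) + 0.7 N(0, 1) rather than taken from the slides, and the function names, seed, starting point and chain length are assumptions. Each $\mu_j$ is drawn from the normal distribution $\mathcal{N}(n_j\hat\mu_j/(n_j+2),\, 1/(n_j+2))$ read off the conditional density.

```python
import math
import random

def simulate_mixture(n, rng):
    """Draw n points from 0.3 N(2.5, 1) + 0.7 N(0, 1) (assumed data-generating model)."""
    return [rng.gauss(2.5, 1.0) if rng.random() < 0.3 else rng.gauss(0.0, 1.0)
            for _ in range(n)]

def data_augmentation(xs, n_iter, rng, mu_init=(2.5, 0.0)):
    """Alternate between xi | (mu1, mu2) and (mu1, mu2) | xi."""
    mu1, mu2 = mu_init
    draws = []
    for _ in range(n_iter):
        # Z | X: allocate each observation to component 1 or 2
        n1, s1, s2 = 0, 0.0, 0.0
        for x in xs:
            w1 = 0.3 * math.exp(-0.5 * (x - mu1) ** 2)
            w2 = 0.7 * math.exp(-0.5 * (x - mu2) ** 2)
            if rng.random() * (w1 + w2) < w1:      # Pr(xi_i = 1) = w1 / (w1 + w2)
                n1 += 1
                s1 += x
            else:
                s2 += x
        n2 = len(xs) - n1
        # X | Z: independent normals N(s_j / (n_j + 2), 1 / (n_j + 2))
        mu1 = rng.gauss(s1 / (n1 + 2), 1.0 / math.sqrt(n1 + 2))
        mu2 = rng.gauss(s2 / (n2 + 2), 1.0 / math.sqrt(n2 + 2))
        draws.append((mu1, mu2))
    return draws

rng = random.Random(7)
xs = simulate_mixture(100, rng)
draws = data_augmentation(xs, 2000, rng)
kept = draws[500:]                                  # discard an (arbitrary) burn-in
mu1_mean = sum(d[0] for d in kept) / len(kept)
mu2_mean = sum(d[1] for d in kept) / len(kept)
```

The chain is started near the values used to simulate the data; from other starting points this kind of sampler may remain for a long time around a secondary mode of $f(\mu_1, \mu_2)$.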
  • 138. Chapter 2: Monte Carlo Methods & EM algorithm
    - Introduction
    - Integration by Monte Carlo method
    - Importance functions
    - Acceleration methods
  • 141. Uses of simulation
    1. integration
       $$I = \mathbb{E}_f[h(X)] = \int h(x) f(x)\, dx$$
    2. limiting behaviour/stationarity of complex systems
    3. optimisation
       $$\arg\min_x h(x) = \arg\max_x \exp\{-\beta h(x)\}, \qquad \beta > 0$$
  • 144. Example (Propagation of an epidemic)
    On a grid representing a region, a point is given by its coordinates $x, y$.
    The probability of catching the disease is
    $$P_{x,y} = \frac{\exp(\alpha + \beta \cdot n_{x,y})}{1 + \exp(\alpha + \beta \cdot n_{x,y})}\, \mathbb{I}_{n_{x,y} > 0}$$
    if $n_{x,y}$ denotes the number of neighbours of $(x, y)$ who already have this disease.
    The probability of getting healed is
    $$Q_{x,y} = \frac{\exp(\delta + \gamma \cdot n_{x,y})}{1 + \exp(\delta + \gamma \cdot n_{x,y})}$$
  • 145. Example (Propagation of an epidemic (2))
    Question: given $(\alpha, \beta, \gamma, \delta)$,
    - what is the speed of propagation of this epidemic?
    - what is its average duration?
    - what is the number of infected persons?
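Such questions can be approached by simulating the epidemic directly. The sketch below is one possible reading of the model and rests on several assumptions that the slides do not specify: a square grid without wrap-around, the 8 surrounding cells as neighbours, synchronous updates, a single infected point at the centre at time 0, and healed points becoming susceptible again. All function names and parameter values are hypothetical.

```python
import math
import random

def logistic(v):
    """exp(v) / (1 + exp(v))."""
    return 1.0 / (1.0 + math.exp(-v))

def infected_neighbours(grid, i, j):
    """Number of infected cells among the (up to) 8 neighbours of (i, j)."""
    size = len(grid)
    total = 0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if (di, dj) != (0, 0) and 0 <= i + di < size and 0 <= j + dj < size:
                total += grid[i + di][j + dj]
    return total

def step(grid, alpha, beta, gamma, delta, rng):
    """One synchronous update: infection with prob. P, healing with prob. Q."""
    size = len(grid)
    new = [row[:] for row in grid]
    for i in range(size):
        for j in range(size):
            n = infected_neighbours(grid, i, j)
            if grid[i][j] == 0:
                if n > 0 and rng.random() < logistic(alpha + beta * n):
                    new[i][j] = 1
            elif rng.random() < logistic(delta + gamma * n):
                new[i][j] = 0
    return new

def simulate(size, n_steps, alpha, beta, gamma, delta, rng):
    """Return the number of infected points at times 0, 1, ..., n_steps."""
    grid = [[0] * size for _ in range(size)]
    grid[size // 2][size // 2] = 1
    counts = [1]
    for _ in range(n_steps):
        grid = step(grid, alpha, beta, gamma, delta, rng)
        counts.append(sum(map(sum, grid)))
    return counts

counts = simulate(21, 30, alpha=-2.0, beta=0.5, gamma=-1.0, delta=-0.2,
                  rng=random.Random(1))
```

Repeating `simulate` over many seeds and averaging quantities such as the final count or the first time the count reaches zero gives Monte Carlo answers to the questions above.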
  • 146. Monte Carlo integration
    Law of large numbers: if $X_1, \ldots, X_n$ are simulated from $f$,
    $$\hat I_n = \frac{1}{n} \sum_{i=1}^n h(X_i) \longrightarrow I$$
  • 147. Central Limit Theorem
    Evaluation of the error:
    $$\hat\sigma_n^2 = \frac{1}{n^2} \sum_{i=1}^n (h(X_i) - \hat I_n)^2$$
    and
    $$\hat I_n \approx \mathcal{N}(I, \hat\sigma_n^2)$$
  • 148. Example (Normal)
    For a standard Gaussian distribution, $\mathbb{E}[X^4] = 3$. Via Monte Carlo integration,

    n           5      50     500    5000   50,000   500,000
    $\hat I_n$  1.65   5.69   3.24   3.13   3.038    3.029

    [Figure: $\hat I_n$ against $n$.]
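A short Python sketch of this example, combining the Monte Carlo average with the error evaluation $\hat\sigma_n^2$ of the Central Limit Theorem slide. The helper name `monte_carlo`, the seed and the sample size are assumptions made for illustration; the estimate will differ from the values in the table.

```python
import math
import random

def monte_carlo(h, sampler, n, rng):
    """Return the Monte Carlo estimate of E[h(X)] and its estimated standard error."""
    values = [h(sampler(rng)) for _ in range(n)]
    estimate = sum(values) / n
    # sigma_n^2 = (1 / n^2) * sum (h(X_i) - I_n)^2
    variance = sum((v - estimate) ** 2 for v in values) / n ** 2
    return estimate, math.sqrt(variance)

rng = random.Random(42)
estimate, std_err = monte_carlo(lambda x: x ** 4,
                                lambda r: r.gauss(0.0, 1.0),
                                50000, rng)
# Approximate 95% confidence interval from the normal approximation
lower, upper = estimate - 1.96 * std_err, estimate + 1.96 * std_err
```

Since $\mathrm{Var}(X^4) = \mathbb{E}[X^8] - 9 = 96$ for a standard normal, the standard error with $n = 50{,}000$ should be near $\sqrt{96/50000} \approx 0.044$, which is consistent with the slow stabilisation of the second decimal in the table.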
  • 149. Example (Cauchy / Normal)
    Consider the joint model
    $$X \mid \theta \sim \mathcal{N}(\theta, 1), \qquad \theta \sim \mathcal{C}(0, 1)$$
    Once $X$ is observed, $\theta$ is estimated by
    $$\delta^\pi(x) = \frac{\displaystyle\int_{-\infty}^{\infty} \frac{\theta}{1 + \theta^2}\, e^{-(x-\theta)^2/2}\, d\theta}{\displaystyle\int_{-\infty}^{\infty} \frac{1}{1 + \theta^2}\, e^{-(x-\theta)^2/2}\, d\theta}$$
  • 151. Example (Cauchy / Normal (2))
    This representation of $\delta^\pi$ suggests using iid variables $\theta_1, \cdots, \theta_m \sim \mathcal{N}(x, 1)$ and computing
    $$\hat\delta^\pi_m(x) = \frac{\displaystyle\sum_{i=1}^m \frac{\theta_i}{1 + \theta_i^2}}{\displaystyle\sum_{i=1}^m \frac{1}{1 + \theta_i^2}}.$$
    By virtue of the Law of Large Numbers,
    $$\hat\delta^\pi_m(x) \longrightarrow \delta^\pi(x) \quad \text{when } m \longrightarrow \infty.$$
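The estimator $\hat\delta^\pi_m(x)$ is a ratio of two Monte Carlo averages computed from the same normal sample, which takes only a few lines. The function name, seeds and sample size below are illustrative assumptions.

```python
import random

def posterior_mean_estimate(x, m, rng):
    """Estimate delta^pi(x) from m iid draws theta_i ~ N(x, 1)."""
    num = 0.0
    den = 0.0
    for _ in range(m):
        theta = rng.gauss(x, 1.0)
        w = 1.0 / (1.0 + theta * theta)    # Cauchy prior density, up to a constant
        num += theta * w
        den += w
    return num / den

delta_at_0 = posterior_mean_estimate(0.0, 100000, random.Random(0))
delta_at_3 = posterior_mean_estimate(3.0, 100000, random.Random(1))
```

By symmetry the estimate at $x = 0$ should be close to 0, and at $x = 3$ it should lie below 3, since the Cauchy prior centred at 0 shrinks the estimate towards the origin.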
  • 152. Example (Normal cdf)
    Approximation of the normal cdf
    $$\Phi(t) = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy$$
    by
    $$\hat\Phi(t) = \frac{1}{n} \sum_{i=1}^n \mathbb{I}_{X_i \le t},$$
    based on a sample $(X_1, \ldots, X_n)$ of size $n$, generated by the Box-Muller algorithm.
  • 155. Example (Normal cdf (2))
    - Variance $\Phi(t)(1 - \Phi(t))/n$, since the variables $\mathbb{I}_{X_i \le t}$ are iid Bernoulli($\Phi(t)$).
    - For $t$ close to $t = 0$ the variance is about $1/4n$: a precision of four decimals requires on average $\sqrt{n} = \sqrt{2} \times 10^4$, that is, $n = 2 \times 10^8$ (200 million) simulations.
    - Larger [absolute] precision in the tails
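The empirical cdf estimator and its Bernoulli standard error can be sketched as follows, with the normal variates produced by the Box-Muller transform. The function names, seeds and sample size are assumptions made for this illustration.

```python
import math
import random

def box_muller(rng):
    """Return two independent N(0,1) draws built from two uniforms."""
    u1 = 1.0 - rng.random()            # in (0, 1], keeps log(u1) finite
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

def normal_cdf_mc(t, n, rng):
    """Estimate Phi(t) by the proportion of n normal draws below t."""
    hits = 0
    generated = 0
    while generated < n:
        for z in box_muller(rng):
            if generated < n:
                hits += z <= t
                generated += 1
    p_hat = hits / n
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n)   # Bernoulli standard error
    return p_hat, std_err

p0, se0 = normal_cdf_mc(0.0, 20000, random.Random(3))
p165, se165 = normal_cdf_mc(1.65, 20000, random.Random(4))
```

With $n = 20{,}000$ the standard error at $t = 0$ is about $1/(2\sqrt{n}) \approx 0.0035$, so only the first two decimals are reliable. The standard error is smaller in absolute terms for values of $t$ in the tails.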
  • 156. Example (Normal cdf (3))

    n \ t    0.0      0.67     0.84     1.28     1.65     2.32     2.58     3.09     3.72
    10^2     0.485    0.74     0.77     0.9      0.945    0.985    0.995    1        1
    10^3     0.4925   0.7455   0.801    0.902    0.9425   0.9885   0.9955   0.9985   1
    10^4     0.4962   0.7425   0.7941   0.9      0.9498   0.9896   0.995    0.999    0.9999
    10^5     0.4995   0.7489   0.7993   0.9003   0.9498   0.9898   0.995    0.9989   0.9999
    10^6     0.5001   0.7497   0.8      0.9002   0.9502   0.99     0.995    0.999    0.9999
    10^7     0.5002   0.7499   0.8      0.9001   0.9501   0.99     0.995    0.999    0.9999
    10^8     0.5      0.75     0.8      0.9      0.95     0.99     0.995    0.999    0.9999

    Evaluation of normal quantiles by Monte Carlo based on $n$ normal generations
  • 157. Importance functions
    Alternative representation:
    $$I = \int h(x) f(x)\, dx = \int h(x)\, \frac{f(x)}{g(x)}\, g(x)\, dx$$