SlideShare a Scribd company logo
1 of 20
Download to read offline
Statistics in NumPy and SciPy



        February 12, 2009
Enthought Python Distribution (EPD)

 MORE THAN SIXTY INTEGRATED PACKAGES

  • Python 2.6                     • Repository access
  • Science (NumPy, SciPy, etc.)   • Data Storage (HDF, NetCDF, etc.)
  • Plotting (Chaco, Matplotlib)   • Networking (twisted)
  • Visualization (VTK, Mayavi)    • User Interface (wxPython, Traits UI)
  • Multi-language Integration     • Enthought Tool Suite
    (SWIG,Pyrex, f2py, weave)        (Application Development Tools)
Enthought Training Courses




                 Python Basics, NumPy,
                 SciPy, Matplotlib, Chaco,
                 Traits, TraitsUI, …
PyCon


    http://us.pycon.org/2010/tutorials/

        Introduction to Traits




                                 Corran Webster
Upcoming Training Classes
  March 1 – 5, 2009
       Python for Scientists and Engineers
       Austin, Texas, USA

  March 8 – 12, 2009
       Python for Quants
       London, UK




                       http://www.enthought.com/training/
NumPy / SciPy
  Statistics
Statistics overview

 • NumPy methods and functions
   – .mean, .std, .var, .min, .max, .argmax, .argmin
   – median, nanargmax, nanargmin, nanmax,
    nanmin, nansum
 • NumPy random number generators
 • Distribution objects in SciPy (scipy.stats)
 • Many functions in SciPy
   – f_oneway, bayes_mvs
   – nanmedian, nanstd, nanmean
NumPy methods
 • All array objects have some “statistical”
   methods
  – .mean(), .std(), .var(), .max(), .min(), .argmax(),
   .argmin()
  – Take an axis keyword that allows them to work on
   N-d arrays (shown with .sum).


   axis=0               axis=1
NumPy functions

 • median
 • nan-functions (ignore nans)
   –   nanmax
   –   nanmin
   –   nanargmin
   –   nanargmax
   –   nansum
 • Can also use masks and regular functions
NumPy Random Number Generators

 • Based on Mersenne twister algorithm
 • Written using PyRex / Cython
 • Univariate (over 40)
 • Multivariate (only 3)
  – multinomial
  – dirichlet
  – multivariate_normal
 • Convenience functions
  – rand, randn, randint, ranf
Statistics
scipy.stats — CONTINUOUS DISTRIBUTIONS

over 80
continuous
distributions!

METHODS

pdf     entropy
cdf     nnlf
rvs     moment
ppf     freeze
stats
fit
sf
isf
Using stats objects
DISTRIBUTIONS


>>> from scipy.stats import norm
# Sample normal dist. 100 times.
>>> samp = norm.rvs(size=100)

>>> x = linspace(-5, 5, 100)
# Calculate probability dist.
>>> pdf = norm.pdf(x)
# Calculate cummulative Dist.
>>> cdf = norm.cdf(x)
# Calculate Percent Point Function
>>> ppf = norm.ppf(x)
Distribution objects
Every distribution can be modified by loc and scale keywords
(many distributions also have required shape arguments to select from a family)

LOCATION (loc) --- shift left (<0) or right (>0) the distribution




SCALE (scale) --- stretch (>1) or compress (<1) the distribution
Example distributions
NORM (norm) – N(µ,σ)


  Only location and scale       location   mean                    µ
  arguments:
                                scale      standard deviation      σ


LOG NORMAL (lognorm)

log(S) is N(µ, σ)
                                location   offset from zero (rarely used)
        S is lognormal
                                scale      eµ

         one shape parameter!   shape      σ
Setting location and Scale

NORMAL DISTRIBUTION


>>> from scipy.stats import norm
# Normal dist with mean=10 and std=2
>>> dist = norm(loc=10, scale=2)

>>> x = linspace(-5, 15, 100)
# Calculate probability dist.
>>> pdf = dist.pdf(x)
# Calculate cummulative dist.
>>> cdf = dist.cdf(x)

# Get 100 random samples from dist.
>>> samp = dist.rvs(size=100)

# Estimate parameters from data
>>> mu, sigma = norm.fit(samp)           .fit returns best
>>> print “%4.2f, %4.2f” % (mu, sigma)   shape + (loc, scale)
10.07, 1.95                              that explains the data
Statistics
scipy.stats — Discrete Distributions

 10 standard
 discrete
 distributions
 (plus any
 finite RV)

METHODS
pmf     moment
cdf     entropy
rvs     freeze
ppf
stats
sf
isf
Using stats objects
 CREATING NEW DISCRETE DISTRIBUTIONS


# Create loaded dice.
>>> from scipy.stats import rv_discrete
>>> xk = [1,2,3,4,5,6]
>>> pk = [0.3,0.35,0.25,0.05,
          0.025,0.025]
>>> new = rv_discrete(name='loaded',
                   values=(xk,pk))

# Calculate histogram
>>> samples = new.rvs(size=1000)
>>> bins=linspace(0.5,5.5,6)
>>> subplot(211)
>>> hist(samples,bins=bins,normed=True)

# Calculate pmf
>>> x = range(0,8)
>>> subplot(212)
>>> stem(x,new.pmf(x))
Statistics
CONTINUOUS DISTRIBUTION ESTIMATION USING GAUSSIAN KERNELS

# Sample two normal distributions
# and create a bi-modal distribution
>>> rv1 = stats.norm()
>>> rv2 = stats.norm(2.0,0.8)
>>> samples = hstack([rv1.rvs(size=100),
                        rv2.rvs(size=100)])


# Use a Gaussian kernel density to
# estimate the PDF for the samples.
>>> from scipy.stats.kde import gaussian_kde
>>> approximate_pdf = gaussian_kde(samples)
>>> x = linspace(-3,6,200)

# Compare the histogram of the samples to
# the PDF approximation.
>>> hist(samples, bins=25, normed=True)
>>> plot(x, approximate_pdf(x),'r')
Other functions in scipy.stats

 • Statistical Tests (Anderson, Wilcox, etc.)
 • Other calculations (hmean, nanmedian)
 • Work in progress
 • A great place to jump in and help
Other statistical Resources

 • scikits.statsmodels
 • RPy2
 • PyMC

More Related Content

What's hot

Introduction to Recursion (Python)
Introduction to Recursion (Python)Introduction to Recursion (Python)
Introduction to Recursion (Python)Thai Pangsakulyanont
 
Introduction to pandas
Introduction to pandasIntroduction to pandas
Introduction to pandasPiyush rai
 
Python Programming - VI. Classes and Objects
Python Programming - VI. Classes and ObjectsPython Programming - VI. Classes and Objects
Python Programming - VI. Classes and ObjectsRanel Padon
 
pandas - Python Data Analysis
pandas - Python Data Analysispandas - Python Data Analysis
pandas - Python Data AnalysisAndrew Henshaw
 
Data visualization in Python
Data visualization in PythonData visualization in Python
Data visualization in PythonMarc Garcia
 
Machine learning with scikitlearn
Machine learning with scikitlearnMachine learning with scikitlearn
Machine learning with scikitlearnPratap Dangeti
 
Data Analysis in Python-NumPy
Data Analysis in Python-NumPyData Analysis in Python-NumPy
Data Analysis in Python-NumPyDevashish Kumar
 
Learn 90% of Python in 90 Minutes
Learn 90% of Python in 90 MinutesLearn 90% of Python in 90 Minutes
Learn 90% of Python in 90 MinutesMatt Harrison
 
logistic regression with python and R
logistic regression with python and Rlogistic regression with python and R
logistic regression with python and RAkhilesh Joshi
 
Introduction to Python Pandas for Data Analytics
Introduction to Python Pandas for Data AnalyticsIntroduction to Python Pandas for Data Analytics
Introduction to Python Pandas for Data AnalyticsPhoenix
 
Multidimensional arrays in C++
Multidimensional arrays in C++Multidimensional arrays in C++
Multidimensional arrays in C++Ilio Catallo
 
Data file handling in python introduction,opening & closing files
Data file handling in python introduction,opening & closing filesData file handling in python introduction,opening & closing files
Data file handling in python introduction,opening & closing fileskeeeerty
 
Python Class | Python Programming | Python Tutorial | Edureka
Python Class | Python Programming | Python Tutorial | EdurekaPython Class | Python Programming | Python Tutorial | Edureka
Python Class | Python Programming | Python Tutorial | EdurekaEdureka!
 
Backpropagation And Gradient Descent In Neural Networks | Neural Network Tuto...
Backpropagation And Gradient Descent In Neural Networks | Neural Network Tuto...Backpropagation And Gradient Descent In Neural Networks | Neural Network Tuto...
Backpropagation And Gradient Descent In Neural Networks | Neural Network Tuto...Simplilearn
 

What's hot (20)

Numpy
NumpyNumpy
Numpy
 
NumPy
NumPyNumPy
NumPy
 
Introduction to Recursion (Python)
Introduction to Recursion (Python)Introduction to Recursion (Python)
Introduction to Recursion (Python)
 
Introduction to pandas
Introduction to pandasIntroduction to pandas
Introduction to pandas
 
Python Programming - VI. Classes and Objects
Python Programming - VI. Classes and ObjectsPython Programming - VI. Classes and Objects
Python Programming - VI. Classes and Objects
 
pandas - Python Data Analysis
pandas - Python Data Analysispandas - Python Data Analysis
pandas - Python Data Analysis
 
Pandas
PandasPandas
Pandas
 
Data visualization in Python
Data visualization in PythonData visualization in Python
Data visualization in Python
 
Machine learning with scikitlearn
Machine learning with scikitlearnMachine learning with scikitlearn
Machine learning with scikitlearn
 
6-Python-Recursion PPT.pptx
6-Python-Recursion PPT.pptx6-Python-Recursion PPT.pptx
6-Python-Recursion PPT.pptx
 
Data Analysis in Python-NumPy
Data Analysis in Python-NumPyData Analysis in Python-NumPy
Data Analysis in Python-NumPy
 
Learn 90% of Python in 90 Minutes
Learn 90% of Python in 90 MinutesLearn 90% of Python in 90 Minutes
Learn 90% of Python in 90 Minutes
 
logistic regression with python and R
logistic regression with python and Rlogistic regression with python and R
logistic regression with python and R
 
Introduction to Python Pandas for Data Analytics
Introduction to Python Pandas for Data AnalyticsIntroduction to Python Pandas for Data Analytics
Introduction to Python Pandas for Data Analytics
 
Multidimensional arrays in C++
Multidimensional arrays in C++Multidimensional arrays in C++
Multidimensional arrays in C++
 
Data file handling in python introduction,opening & closing files
Data file handling in python introduction,opening & closing filesData file handling in python introduction,opening & closing files
Data file handling in python introduction,opening & closing files
 
Python Class | Python Programming | Python Tutorial | Edureka
Python Class | Python Programming | Python Tutorial | EdurekaPython Class | Python Programming | Python Tutorial | Edureka
Python Class | Python Programming | Python Tutorial | Edureka
 
Chapter 03 python libraries
Chapter 03 python librariesChapter 03 python libraries
Chapter 03 python libraries
 
Backpropagation And Gradient Descent In Neural Networks | Neural Network Tuto...
Backpropagation And Gradient Descent In Neural Networks | Neural Network Tuto...Backpropagation And Gradient Descent In Neural Networks | Neural Network Tuto...
Backpropagation And Gradient Descent In Neural Networks | Neural Network Tuto...
 
Python Functions
Python   FunctionsPython   Functions
Python Functions
 

Viewers also liked

Scientific Computing with Python Webinar 9/18/2009:Curve Fitting
Scientific Computing with Python Webinar 9/18/2009:Curve FittingScientific Computing with Python Webinar 9/18/2009:Curve Fitting
Scientific Computing with Python Webinar 9/18/2009:Curve FittingEnthought, Inc.
 
The Joy of SciPy
The Joy of SciPyThe Joy of SciPy
The Joy of SciPykammeyer
 
Доклад АКТО-2012 Душкин, Смирнова
Доклад АКТО-2012 Душкин, СмирноваДоклад АКТО-2012 Душкин, Смирнова
Доклад АКТО-2012 Душкин, СмирноваDmitry Dushkin
 
Statistical inference for (Python) Data Analysis. An introduction.
Statistical inference for (Python) Data Analysis. An introduction.Statistical inference for (Python) Data Analysis. An introduction.
Statistical inference for (Python) Data Analysis. An introduction.Piotr Milanowski
 
Raspberry Pi and Scientific Computing [SciPy 2012]
Raspberry Pi and Scientific Computing [SciPy 2012]Raspberry Pi and Scientific Computing [SciPy 2012]
Raspberry Pi and Scientific Computing [SciPy 2012]Samarth Shah
 
Python для анализа данных
Python для анализа данныхPython для анализа данных
Python для анализа данныхPython Meetup
 
A look inside pandas design and development
A look inside pandas design and developmentA look inside pandas design and development
A look inside pandas design and developmentWes McKinney
 
The Heisenberg Uncertainty Principle[1]
The Heisenberg Uncertainty Principle[1]The Heisenberg Uncertainty Principle[1]
The Heisenberg Uncertainty Principle[1]guestea12c43
 
Data Analytics with Pandas and Numpy - Python
Data Analytics with Pandas and Numpy - PythonData Analytics with Pandas and Numpy - Python
Data Analytics with Pandas and Numpy - PythonChetan Khatri
 
09 jus 20101123_optimisation_salomeaster
09 jus 20101123_optimisation_salomeaster09 jus 20101123_optimisation_salomeaster
09 jus 20101123_optimisation_salomeasterOpenCascade
 
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...Ryan Rosario
 

Viewers also liked (14)

Scientific Computing with Python Webinar 9/18/2009:Curve Fitting
Scientific Computing with Python Webinar 9/18/2009:Curve FittingScientific Computing with Python Webinar 9/18/2009:Curve Fitting
Scientific Computing with Python Webinar 9/18/2009:Curve Fitting
 
The Joy of SciPy
The Joy of SciPyThe Joy of SciPy
The Joy of SciPy
 
Доклад АКТО-2012 Душкин, Смирнова
Доклад АКТО-2012 Душкин, СмирноваДоклад АКТО-2012 Душкин, Смирнова
Доклад АКТО-2012 Душкин, Смирнова
 
Statistical inference for (Python) Data Analysis. An introduction.
Statistical inference for (Python) Data Analysis. An introduction.Statistical inference for (Python) Data Analysis. An introduction.
Statistical inference for (Python) Data Analysis. An introduction.
 
MongoDB Replication Cluster
MongoDB Replication ClusterMongoDB Replication Cluster
MongoDB Replication Cluster
 
Raspberry Pi and Scientific Computing [SciPy 2012]
Raspberry Pi and Scientific Computing [SciPy 2012]Raspberry Pi and Scientific Computing [SciPy 2012]
Raspberry Pi and Scientific Computing [SciPy 2012]
 
Python для анализа данных
Python для анализа данныхPython для анализа данных
Python для анализа данных
 
Introduction to mongoDB
Introduction to mongoDBIntroduction to mongoDB
Introduction to mongoDB
 
A look inside pandas design and development
A look inside pandas design and developmentA look inside pandas design and development
A look inside pandas design and development
 
The Heisenberg Uncertainty Principle[1]
The Heisenberg Uncertainty Principle[1]The Heisenberg Uncertainty Principle[1]
The Heisenberg Uncertainty Principle[1]
 
Data Analytics with Pandas and Numpy - Python
Data Analytics with Pandas and Numpy - PythonData Analytics with Pandas and Numpy - Python
Data Analytics with Pandas and Numpy - Python
 
09 jus 20101123_optimisation_salomeaster
09 jus 20101123_optimisation_salomeaster09 jus 20101123_optimisation_salomeaster
09 jus 20101123_optimisation_salomeaster
 
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
NumPy and SciPy for Data Mining and Data Analysis Including iPython, SciKits,...
 
Bayesian model averaging
Bayesian model averagingBayesian model averaging
Bayesian model averaging
 

Similar to NumPy/SciPy Statistics

Anomaly Detection with Apache Spark
Anomaly Detection with Apache SparkAnomaly Detection with Apache Spark
Anomaly Detection with Apache SparkCloudera, Inc.
 
Numpy Meetup 07/02/2013
Numpy Meetup 07/02/2013Numpy Meetup 07/02/2013
Numpy Meetup 07/02/2013Francesco
 
Feature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsFeature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsGabriel Moreira
 
Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017Gabriel Moreira
 
getting started with numpy and pandas.pptx
getting started with numpy and pandas.pptxgetting started with numpy and pandas.pptx
getting started with numpy and pandas.pptxworkvishalkumarmahat
 
Scikit-learn Cheatsheet-Python
Scikit-learn Cheatsheet-PythonScikit-learn Cheatsheet-Python
Scikit-learn Cheatsheet-PythonDr. Volkan OBAN
 
Cheat Sheet for Machine Learning in Python: Scikit-learn
Cheat Sheet for Machine Learning in Python: Scikit-learnCheat Sheet for Machine Learning in Python: Scikit-learn
Cheat Sheet for Machine Learning in Python: Scikit-learnKarlijn Willems
 
Scikit learn cheat_sheet_python
Scikit learn cheat_sheet_pythonScikit learn cheat_sheet_python
Scikit learn cheat_sheet_pythonZahid Hasan
 
Migrating from matlab to python
Migrating from matlab to pythonMigrating from matlab to python
Migrating from matlab to pythonActiveState
 
R programming & Machine Learning
R programming & Machine LearningR programming & Machine Learning
R programming & Machine LearningAmanBhalla14
 
Python_Cheat_Sheets.pdf
Python_Cheat_Sheets.pdfPython_Cheat_Sheets.pdf
Python_Cheat_Sheets.pdfMateoQuiguiri
 
Matplotlib adalah pustaka plotting 2D Python yang menghasilkan gambar berkual...
Matplotlib adalah pustaka plotting 2D Python yang menghasilkan gambar berkual...Matplotlib adalah pustaka plotting 2D Python yang menghasilkan gambar berkual...
Matplotlib adalah pustaka plotting 2D Python yang menghasilkan gambar berkual...HendraPurnama31
 
Introduction to R.pptx
Introduction to R.pptxIntroduction to R.pptx
Introduction to R.pptxkarthikks82
 
Python for Data Science with Anaconda
Python for Data Science with AnacondaPython for Data Science with Anaconda
Python for Data Science with AnacondaTravis Oliphant
 
Machine Learning in R
Machine Learning in RMachine Learning in R
Machine Learning in RSujaAldrin
 
Effective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyEffective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyKimikazu Kato
 
Standardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonRalf Gommers
 
Simple, fast, and scalable torch7 tutorial
Simple, fast, and scalable torch7 tutorialSimple, fast, and scalable torch7 tutorial
Simple, fast, and scalable torch7 tutorialJin-Hwa Kim
 

Similar to NumPy/SciPy Statistics (20)

Anomaly Detection with Apache Spark
Anomaly Detection with Apache SparkAnomaly Detection with Apache Spark
Anomaly Detection with Apache Spark
 
Numpy Meetup 07/02/2013
Numpy Meetup 07/02/2013Numpy Meetup 07/02/2013
Numpy Meetup 07/02/2013
 
Feature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsFeature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive models
 
Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017
 
getting started with numpy and pandas.pptx
getting started with numpy and pandas.pptxgetting started with numpy and pandas.pptx
getting started with numpy and pandas.pptx
 
Scikit-learn Cheatsheet-Python
Scikit-learn Cheatsheet-PythonScikit-learn Cheatsheet-Python
Scikit-learn Cheatsheet-Python
 
Cheat Sheet for Machine Learning in Python: Scikit-learn
Cheat Sheet for Machine Learning in Python: Scikit-learnCheat Sheet for Machine Learning in Python: Scikit-learn
Cheat Sheet for Machine Learning in Python: Scikit-learn
 
Scikit learn cheat_sheet_python
Scikit learn cheat_sheet_pythonScikit learn cheat_sheet_python
Scikit learn cheat_sheet_python
 
Migrating from matlab to python
Migrating from matlab to pythonMigrating from matlab to python
Migrating from matlab to python
 
R programming & Machine Learning
R programming & Machine LearningR programming & Machine Learning
R programming & Machine Learning
 
Python_Cheat_Sheets.pdf
Python_Cheat_Sheets.pdfPython_Cheat_Sheets.pdf
Python_Cheat_Sheets.pdf
 
Matplotlib adalah pustaka plotting 2D Python yang menghasilkan gambar berkual...
Matplotlib adalah pustaka plotting 2D Python yang menghasilkan gambar berkual...Matplotlib adalah pustaka plotting 2D Python yang menghasilkan gambar berkual...
Matplotlib adalah pustaka plotting 2D Python yang menghasilkan gambar berkual...
 
Introduction to R.pptx
Introduction to R.pptxIntroduction to R.pptx
Introduction to R.pptx
 
Aggregate.pptx
Aggregate.pptxAggregate.pptx
Aggregate.pptx
 
Decision Tree.pptx
Decision Tree.pptxDecision Tree.pptx
Decision Tree.pptx
 
Python for Data Science with Anaconda
Python for Data Science with AnacondaPython for Data Science with Anaconda
Python for Data Science with Anaconda
 
Machine Learning in R
Machine Learning in RMachine Learning in R
Machine Learning in R
 
Effective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyEffective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPy
 
Standardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for Python
 
Simple, fast, and scalable torch7 tutorial
Simple, fast, and scalable torch7 tutorialSimple, fast, and scalable torch7 tutorial
Simple, fast, and scalable torch7 tutorial
 

More from Enthought, Inc.

Talk at NYC Python Meetup Group
Talk at NYC Python Meetup GroupTalk at NYC Python Meetup Group
Talk at NYC Python Meetup GroupEnthought, Inc.
 
Scientific Applications with Python
Scientific Applications with PythonScientific Applications with Python
Scientific Applications with PythonEnthought, Inc.
 
Scientific Computing with Python Webinar March 19: 3D Visualization with Mayavi
Scientific Computing with Python Webinar March 19: 3D Visualization with MayaviScientific Computing with Python Webinar March 19: 3D Visualization with Mayavi
Scientific Computing with Python Webinar March 19: 3D Visualization with MayaviEnthought, Inc.
 
February EPD Webinar: How do I...use PiCloud for cloud computing?
February EPD Webinar: How do I...use PiCloud for cloud computing?February EPD Webinar: How do I...use PiCloud for cloud computing?
February EPD Webinar: How do I...use PiCloud for cloud computing?Enthought, Inc.
 
Parallel Processing with IPython
Parallel Processing with IPythonParallel Processing with IPython
Parallel Processing with IPythonEnthought, Inc.
 
Scientific Computing with Python Webinar: Traits
Scientific Computing with Python Webinar: TraitsScientific Computing with Python Webinar: Traits
Scientific Computing with Python Webinar: TraitsEnthought, Inc.
 
Scientific Computing with Python Webinar --- August 28, 2009
Scientific Computing with Python Webinar --- August 28, 2009Scientific Computing with Python Webinar --- August 28, 2009
Scientific Computing with Python Webinar --- August 28, 2009Enthought, Inc.
 
Scientific Computing with Python Webinar --- June 19, 2009
Scientific Computing with Python Webinar --- June 19, 2009Scientific Computing with Python Webinar --- June 19, 2009
Scientific Computing with Python Webinar --- June 19, 2009Enthought, Inc.
 
Scientific Computing with Python Webinar --- May 22, 2009
Scientific Computing with Python Webinar --- May 22, 2009Scientific Computing with Python Webinar --- May 22, 2009
Scientific Computing with Python Webinar --- May 22, 2009Enthought, Inc.
 

More from Enthought, Inc. (13)

Numpy Talk at SIAM
Numpy Talk at SIAMNumpy Talk at SIAM
Numpy Talk at SIAM
 
Talk at NYC Python Meetup Group
Talk at NYC Python Meetup GroupTalk at NYC Python Meetup Group
Talk at NYC Python Meetup Group
 
Scientific Applications with Python
Scientific Applications with PythonScientific Applications with Python
Scientific Applications with Python
 
SciPy 2010 Review
SciPy 2010 ReviewSciPy 2010 Review
SciPy 2010 Review
 
Scientific Computing with Python Webinar March 19: 3D Visualization with Mayavi
Scientific Computing with Python Webinar March 19: 3D Visualization with MayaviScientific Computing with Python Webinar March 19: 3D Visualization with Mayavi
Scientific Computing with Python Webinar March 19: 3D Visualization with Mayavi
 
Chaco Step-by-Step
Chaco Step-by-StepChaco Step-by-Step
Chaco Step-by-Step
 
February EPD Webinar: How do I...use PiCloud for cloud computing?
February EPD Webinar: How do I...use PiCloud for cloud computing?February EPD Webinar: How do I...use PiCloud for cloud computing?
February EPD Webinar: How do I...use PiCloud for cloud computing?
 
SciPy India 2009
SciPy India 2009SciPy India 2009
SciPy India 2009
 
Parallel Processing with IPython
Parallel Processing with IPythonParallel Processing with IPython
Parallel Processing with IPython
 
Scientific Computing with Python Webinar: Traits
Scientific Computing with Python Webinar: TraitsScientific Computing with Python Webinar: Traits
Scientific Computing with Python Webinar: Traits
 
Scientific Computing with Python Webinar --- August 28, 2009
Scientific Computing with Python Webinar --- August 28, 2009Scientific Computing with Python Webinar --- August 28, 2009
Scientific Computing with Python Webinar --- August 28, 2009
 
Scientific Computing with Python Webinar --- June 19, 2009
Scientific Computing with Python Webinar --- June 19, 2009Scientific Computing with Python Webinar --- June 19, 2009
Scientific Computing with Python Webinar --- June 19, 2009
 
Scientific Computing with Python Webinar --- May 22, 2009
Scientific Computing with Python Webinar --- May 22, 2009Scientific Computing with Python Webinar --- May 22, 2009
Scientific Computing with Python Webinar --- May 22, 2009
 

Recently uploaded

Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataSafe Software
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.francesco barbera
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxGDSC PJATK
 

Recently uploaded (20)

Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 

NumPy/SciPy Statistics

  • 1. Statistics in NumPy and SciPy February 12, 2009
  • 2. Enthought Python Distribution (EPD) MORE THAN SIXTY INTEGRATED PACKAGES • Python 2.6 • Repository access • Science (NumPy, SciPy, etc.) • Data Storage (HDF, NetCDF, etc.) • Plotting (Chaco, Matplotlib) • Networking (twisted) • Visualization (VTK, Mayavi) • User Interface (wxPython, Traits UI) • Multi-language Integration • Enthought Tool Suite (SWIG,Pyrex, f2py, weave) (Application Development Tools)
  • 3. Enthought Training Courses Python Basics, NumPy, SciPy, Matplotlib, Chaco, Traits, TraitsUI, …
  • 4. PyCon http://us.pycon.org/2010/tutorials/ Introduction to Traits Corran Webster
  • 5. Upcoming Training Classes March 1 – 5, 2009 Python for Scientists and Engineers Austin, Texas, USA March 8 – 12, 2009 Python for Quants London, UK http://www.enthought.com/training/
  • 6. NumPy / SciPy Statistics
  • 7. Statistics overview • NumPy methods and functions – .mean, .std, .var, .min, .max, .argmax, .argmin – median, nanargmax, nanargmin, nanmax, nanmin, nansum • NumPy random number generators • Distribution objects in SciPy (scipy.stats) • Many functions in SciPy – f_oneway, bayes_mvs – nanmedian, nanstd, nanmean
  • 8. NumPy methods • All array objects have some “statistical” methods – .mean(), .std(), .var(), .max(), .min(), .argmax(), .argmin() – Take an axis keyword that allows them to work on N-d arrays (shown with .sum). axis=0 axis=1
  • 9. NumPy functions • median • nan-functions (ignore nans) – nanmax – nanmin – nanargmin – nanargmax – nansum • Can also use masks and regular functions
  • 10. NumPy Random Number Generators • Based on Mersenne twister algorithm • Written using PyRex / Cython • Univariate (over 40) • Multivariate (only 3) – multinomial – dirichlet – multivariate_normal • Convenience functions – rand, randn, randint, ranf
  • 11. Statistics scipy.stats — CONTINUOUS DISTRIBUTIONS over 80 continuous distributions! METHODS pdf entropy cdf nnlf rvs moment ppf freeze stats fit sf isf
  • 12. Using stats objects DISTRIBUTIONS >>> from scipy.stats import norm # Sample normal dist. 100 times. >>> samp = norm.rvs(size=100) >>> x = linspace(-5, 5, 100) # Calculate probability dist. >>> pdf = norm.pdf(x) # Calculate cummulative Dist. >>> cdf = norm.cdf(x) # Calculate Percent Point Function >>> ppf = norm.ppf(x)
  • 13. Distribution objects Every distribution can be modified by loc and scale keywords (many distributions also have required shape arguments to select from a family) LOCATION (loc) --- shift left (<0) or right (>0) the distribution SCALE (scale) --- stretch (>1) or compress (<1) the distribution
  • 14. Example distributions NORM (norm) – N(µ,σ) Only location and scale location mean µ arguments: scale standard deviation σ LOG NORMAL (lognorm) log(S) is N(µ, σ) location offset from zero (rarely used) S is lognormal scale eµ one shape parameter! shape σ
  • 15. Setting location and Scale NORMAL DISTRIBUTION >>> from scipy.stats import norm # Normal dist with mean=10 and std=2 >>> dist = norm(loc=10, scale=2) >>> x = linspace(-5, 15, 100) # Calculate probability dist. >>> pdf = dist.pdf(x) # Calculate cummulative dist. >>> cdf = dist.cdf(x) # Get 100 random samples from dist. >>> samp = dist.rvs(size=100) # Estimate parameters from data >>> mu, sigma = norm.fit(samp) .fit returns best >>> print “%4.2f, %4.2f” % (mu, sigma) shape + (loc, scale) 10.07, 1.95 that explains the data
  • 16. Statistics scipy.stats — Discrete Distributions 10 standard discrete distributions (plus any finite RV) METHODS pmf moment cdf entropy rvs freeze ppf stats sf isf
  • 17. Using stats objects CREATING NEW DISCRETE DISTRIBUTIONS # Create loaded dice. >>> from scipy.stats import rv_discrete >>> xk = [1,2,3,4,5,6] >>> pk = [0.3,0.35,0.25,0.05, 0.025,0.025] >>> new = rv_discrete(name='loaded', values=(xk,pk)) # Calculate histogram >>> samples = new.rvs(size=1000) >>> bins=linspace(0.5,5.5,6) >>> subplot(211) >>> hist(samples,bins=bins,normed=True) # Calculate pmf >>> x = range(0,8) >>> subplot(212) >>> stem(x,new.pmf(x))
  • 18. Statistics CONTINUOUS DISTRIBUTION ESTIMATION USING GAUSSIAN KERNELS # Sample two normal distributions # and create a bi-modal distribution >>> rv1 = stats.norm() >>> rv2 = stats.norm(2.0,0.8) >>> samples = hstack([rv1.rvs(size=100), rv2.rvs(size=100)]) # Use a Gaussian kernel density to # estimate the PDF for the samples. >>> from scipy.stats.kde import gaussian_kde >>> approximate_pdf = gaussian_kde(samples) >>> x = linspace(-3,6,200) # Compare the histogram of the samples to # the PDF approximation. >>> hist(samples, bins=25, normed=True) >>> plot(x, approximate_pdf(x),'r')
  • 19. Other functions in scipy.stats • Statistical Tests (Anderson, Wilcox, etc.) • Other calculations (hmean, nanmedian) • Work in progress • A great place to jump in and help
  • 20. Other statistical Resources • scikits.statsmodels • RPy2 • PyMC

Editor's Notes

  1. &lt;&lt;1,Parallel Processing with IPython&gt;&gt;
  2. [toc] level = 2 title = Statistics # end config