SlideShare a Scribd company logo
1 of 14
Download to read offline
scikit-learn
2018 9 25
BPStudy#133
Python
■
■ IT
■
■ / 

■
■ twitter: @sfchaos
1
Python
■ 2 

4.4 scikit-learn

5.2
2
scikit-learn
■ 

3
http://scikit-learn.org/stable/index.html
4
#pydatatext p.224
scikit-learn
5
or
■
OK
6
■ 



scikit-learn 3 API
■ (estimator)
7
■ (predictor)
able to export a fitted model, using the information from its public attributes,
to an agnostic model description such as PMML (Guazzelli et al., 2009).
To illustrate the initialize-fit sequence, let us consider a supervised learning
task using logistic regression. Given the API defined above, solving this problem
is as simple as the following example.
from sklear n . linear model import Lo g isticReg r essio n
c l f = Lo g isticReg r essio n ( penalty=” l1 ” )
c l f . f i t ( X train , y tr ain )
In this snippet, a LogisticRegression estimator is first initialized by setting the
penalty hyper-parameter to "l1" for ℓ1 regularization. Other hyper-parameters
(such as C, the strength of the regularization) are not explicitly given and thus
set to the default values. Upon calling fit, a model is learned from the training
arrays X train and y train, and stored within the object for later use. Since
all estimators share the same interface, using a different learning algorithm is as
simple as replacing the constructor (the class name); to build a random forest on
the same data, one would simply replace LogisticRegression(penalty="l1")
in the snippet above by RandomForestClassifier().
not require the fit method to perform useful work, implement the estimator
interface. As we will illustrate in the next sections, this design pattern is indeed
of prime importance for consistency, composition and model selection reasons.
2.4 Predictors
The predictor interface extends the notion of an estimator by adding a predict
method that takes an array X test and produces predictions for X test, based on
the learned parameters of the estimator (we call the input to predict “X test”
in order to emphasize that predict generalizes to new data). In the case of
supervised learning estimators, this method typically returns the predicted la-
bels or values computed by the model. Continuing with the previous example,
predicted labels for X test can be obtained using the following snippet:
y pred = c l f . pr edict ( X test )
Some unsupervised learning estimators may also implement the predict in-
terface. The code in the snippet below fits a k-means model with k = 10 on
training data X train, and then uses the predict method to obtain cluster
labels (integer indices) for unseen data X test.
from sklea r n . c l u s t e r import KMeans
km = KMeans( n c l u s t e r s =10)
km. f i t ( X train )
clust pr ed = km. pr edict ( X test )
fit
predict
Lars Buitinck, et al.,
API design for machine learning software: experiences from the scikit-learn project,
https://arxiv.org/pdf/1309.0238.pdf
scikit-learn 3 API
■ (transformer)
8
as output a transformed version of X test. Preprocessing, feature selection,
feature extraction and dimensionality reduction algorithms are all provided as
transformers within the library. In our example, to standardize the input X train
to zero mean and unit variance before fitting the logistic regression estimator,
one would write:
from sklear n . pr epr ocessing import StandardScaler
s c a l e r = StandardScaler ()
s c a l e r . f i t ( X train )
X train = s c a l e r . transform ( X train )
Of course, in practice, it is important to apply the same preprocessing to the test
data X test. Since a StandardScaler estimator stores the mean and standard
deviation that it computed for the training set, transforming an unseen test set
X test maps it into the appropriate region of feature space:
X test = s c a l e r . transform ( X test )
Transformers also include a variety of learning algorithms, such as dimension
reduction (PCA, manifold learning), kernel approximation, and other mappings
from one feature space to another.
Additionally, by leveraging the fact that fit always returns the estimator it
was called on, the StandardScaler example above can be rewritten in a single
line using method chaining:
X train = StandardScaler () . f i t ( X train ) . transform ( X train
X train = s c a l e r . transform ( X train )
Of course, in practice, it is important to apply the same preprocessing to the test
data X test. Since a StandardScaler estimator stores the mean and standard
deviation that it computed for the training set, transforming an unseen test set
X test maps it into the appropriate region of feature space:
X test = s c a l e r . transform ( X test )
Transformers also include a variety of learning algorithms, such as dimension
reduction (PCA, manifold learning), kernel approximation, and other mappings
from one feature space to another.
Additionally, by leveraging the fact that fit always returns the estimator it
was called on, the StandardScaler example above can be rewritten in a single
line using method chaining:
X train = StandardScaler () . f i t ( X train ) . transform ( X train
)
Furthermore, every transformer allows fit(X train).transform(X train)
to be written as fit transform(X train). The combined fit transform method
prevents repeated computations. Depending on the transformer, it may skip only
an input validation step, or in fact use a more efficient algorithm for the transfor-
mation. In the same spirit, clustering algorithms provide a fit predict method
that is equivalent to fit followed by predict, returning cluster labels assigned
to the training samples.
fit
transform
fit_transform
■
9
http://scikit-learn.org/stable/modules/generated/
sklearn.svm.SVC.html#sklearn.svm.SVC
■ User Guide
10
http://scikit-learn.org/stable/modules/svm.html#
classification
■User Guide 

11
C=1e6 C=0.1
)
•
•
#pydatatext p.231, 232
■
■
■ 3
12
■scikit-learn API
■
1. 

2.
■ 2
1
13

More Related Content

What's hot

Matlab m files and scripts
Matlab m files and scriptsMatlab m files and scripts
Matlab m files and scriptsAmeen San
 
PCA and LDA in machine learning
PCA and LDA in machine learningPCA and LDA in machine learning
PCA and LDA in machine learningAkhilesh Joshi
 
Introduction to Control systems in scilab
Introduction to Control systems in scilabIntroduction to Control systems in scilab
Introduction to Control systems in scilabScilab
 
Scilab for real dummies j.heikell - part 2
Scilab for real dummies j.heikell - part 2Scilab for real dummies j.heikell - part 2
Scilab for real dummies j.heikell - part 2Scilab
 
Arima model (time series)
Arima model (time series)Arima model (time series)
Arima model (time series)Kumar P
 
Numerical analysis using Scilab: Error analysis and propagation
Numerical analysis using Scilab: Error analysis and propagationNumerical analysis using Scilab: Error analysis and propagation
Numerical analysis using Scilab: Error analysis and propagationScilab
 
Numerical analysis using Scilab: Numerical stability and conditioning
Numerical analysis using Scilab: Numerical stability and conditioningNumerical analysis using Scilab: Numerical stability and conditioning
Numerical analysis using Scilab: Numerical stability and conditioningScilab
 
Dynamic Pricing Of stocks
Dynamic Pricing Of stocksDynamic Pricing Of stocks
Dynamic Pricing Of stocksparc18
 
Numerical analysis using Scilab: Solving nonlinear equations
Numerical analysis using Scilab: Solving nonlinear equationsNumerical analysis using Scilab: Solving nonlinear equations
Numerical analysis using Scilab: Solving nonlinear equationsScilab
 
04 programming 2
04 programming 204 programming 2
04 programming 2Tareq Qazi
 
Arima Forecasting - Presentation by Sera Cresta, Nora Alosaimi and Puneet Mahana
Arima Forecasting - Presentation by Sera Cresta, Nora Alosaimi and Puneet MahanaArima Forecasting - Presentation by Sera Cresta, Nora Alosaimi and Puneet Mahana
Arima Forecasting - Presentation by Sera Cresta, Nora Alosaimi and Puneet MahanaAmrinder Arora
 
Data fitting in Scilab - Tutorial
Data fitting in Scilab - TutorialData fitting in Scilab - Tutorial
Data fitting in Scilab - TutorialScilab
 
Arima model
Arima modelArima model
Arima modelJassika
 

What's hot (20)

Matlab m files and scripts
Matlab m files and scriptsMatlab m files and scripts
Matlab m files and scripts
 
PCA and LDA in machine learning
PCA and LDA in machine learningPCA and LDA in machine learning
PCA and LDA in machine learning
 
Introduction to Control systems in scilab
Introduction to Control systems in scilabIntroduction to Control systems in scilab
Introduction to Control systems in scilab
 
Scilab for real dummies j.heikell - part 2
Scilab for real dummies j.heikell - part 2Scilab for real dummies j.heikell - part 2
Scilab for real dummies j.heikell - part 2
 
Arima model (time series)
Arima model (time series)Arima model (time series)
Arima model (time series)
 
Numerical analysis using Scilab: Error analysis and propagation
Numerical analysis using Scilab: Error analysis and propagationNumerical analysis using Scilab: Error analysis and propagation
Numerical analysis using Scilab: Error analysis and propagation
 
Numerical analysis using Scilab: Numerical stability and conditioning
Numerical analysis using Scilab: Numerical stability and conditioningNumerical analysis using Scilab: Numerical stability and conditioning
Numerical analysis using Scilab: Numerical stability and conditioning
 
Lab5
Lab5Lab5
Lab5
 
Matlab variables
Matlab variablesMatlab variables
Matlab variables
 
Dynamic Pricing Of stocks
Dynamic Pricing Of stocksDynamic Pricing Of stocks
Dynamic Pricing Of stocks
 
Numerical analysis using Scilab: Solving nonlinear equations
Numerical analysis using Scilab: Solving nonlinear equationsNumerical analysis using Scilab: Solving nonlinear equations
Numerical analysis using Scilab: Solving nonlinear equations
 
Matlab booklet
Matlab bookletMatlab booklet
Matlab booklet
 
04 programming 2
04 programming 204 programming 2
04 programming 2
 
Arima Forecasting - Presentation by Sera Cresta, Nora Alosaimi and Puneet Mahana
Arima Forecasting - Presentation by Sera Cresta, Nora Alosaimi and Puneet MahanaArima Forecasting - Presentation by Sera Cresta, Nora Alosaimi and Puneet Mahana
Arima Forecasting - Presentation by Sera Cresta, Nora Alosaimi and Puneet Mahana
 
Ann a Algorithms notes
Ann a Algorithms notesAnn a Algorithms notes
Ann a Algorithms notes
 
Ch06 Stack
Ch06  StackCh06  Stack
Ch06 Stack
 
Matlab tutorial 1
Matlab tutorial 1Matlab tutorial 1
Matlab tutorial 1
 
Data fitting in Scilab - Tutorial
Data fitting in Scilab - TutorialData fitting in Scilab - Tutorial
Data fitting in Scilab - Tutorial
 
Arima model
Arima modelArima model
Arima model
 
Fourier Transform Assignment Help
Fourier Transform Assignment HelpFourier Transform Assignment Help
Fourier Transform Assignment Help
 

Similar to BPstudy sklearn 20180925

maxbox starter60 machine learning
maxbox starter60 machine learningmaxbox starter60 machine learning
maxbox starter60 machine learningMax Kleiner
 
maXbox starter65 machinelearning3
maXbox starter65 machinelearning3maXbox starter65 machinelearning3
maXbox starter65 machinelearning3Max Kleiner
 
Grid search.pptx
Grid search.pptxGrid search.pptx
Grid search.pptxAbithaSam
 
Machine learning (5)
Machine learning (5)Machine learning (5)
Machine learning (5)NYversity
 
safe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningsafe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningRyo Iwaki
 
maXbox starter69 Machine Learning VII
maXbox starter69 Machine Learning VIImaXbox starter69 Machine Learning VII
maXbox starter69 Machine Learning VIIMax Kleiner
 
Lab 2: Classification and Regression Prediction Models, training and testing ...
Lab 2: Classification and Regression Prediction Models, training and testing ...Lab 2: Classification and Regression Prediction Models, training and testing ...
Lab 2: Classification and Regression Prediction Models, training and testing ...Yao Yao
 
EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171Yaxin Liu
 
TMPA-2015: Implementing the MetaVCG Approach in the C-light System
TMPA-2015: Implementing the MetaVCG Approach in the C-light SystemTMPA-2015: Implementing the MetaVCG Approach in the C-light System
TMPA-2015: Implementing the MetaVCG Approach in the C-light SystemIosif Itkin
 
An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S...
An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S...An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S...
An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S...Pooyan Jamshidi
 
Machine Learning Guide maXbox Starter62
Machine Learning Guide maXbox Starter62Machine Learning Guide maXbox Starter62
Machine Learning Guide maXbox Starter62Max Kleiner
 
Logistic Regression using Mahout
Logistic Regression using MahoutLogistic Regression using Mahout
Logistic Regression using Mahouttanuvir
 
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHM
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHMADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHM
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHMWireilla
 
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHM
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHMADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHM
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHMijfls
 
Machinelearning Spark Hadoop User Group Munich Meetup 2016
Machinelearning Spark Hadoop User Group Munich Meetup 2016Machinelearning Spark Hadoop User Group Munich Meetup 2016
Machinelearning Spark Hadoop User Group Munich Meetup 2016Comsysto Reply GmbH
 
On Selection of Periodic Kernels Parameters in Time Series Prediction
On Selection of Periodic Kernels Parameters in Time Series Prediction On Selection of Periodic Kernels Parameters in Time Series Prediction
On Selection of Periodic Kernels Parameters in Time Series Prediction cscpconf
 
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTION
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTIONON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTION
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTIONcscpconf
 

Similar to BPstudy sklearn 20180925 (20)

maxbox starter60 machine learning
maxbox starter60 machine learningmaxbox starter60 machine learning
maxbox starter60 machine learning
 
maXbox starter65 machinelearning3
maXbox starter65 machinelearning3maXbox starter65 machinelearning3
maXbox starter65 machinelearning3
 
Grid search.pptx
Grid search.pptxGrid search.pptx
Grid search.pptx
 
Machine learning (5)
Machine learning (5)Machine learning (5)
Machine learning (5)
 
safe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningsafe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learning
 
maXbox starter69 Machine Learning VII
maXbox starter69 Machine Learning VIImaXbox starter69 Machine Learning VII
maXbox starter69 Machine Learning VII
 
Lab 2: Classification and Regression Prediction Models, training and testing ...
Lab 2: Classification and Regression Prediction Models, training and testing ...Lab 2: Classification and Regression Prediction Models, training and testing ...
Lab 2: Classification and Regression Prediction Models, training and testing ...
 
working with python
working with pythonworking with python
working with python
 
EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171
 
TMPA-2015: Implementing the MetaVCG Approach in the C-light System
TMPA-2015: Implementing the MetaVCG Approach in the C-light SystemTMPA-2015: Implementing the MetaVCG Approach in the C-light System
TMPA-2015: Implementing the MetaVCG Approach in the C-light System
 
An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S...
An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S...An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S...
An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S...
 
Machine Learning Guide maXbox Starter62
Machine Learning Guide maXbox Starter62Machine Learning Guide maXbox Starter62
Machine Learning Guide maXbox Starter62
 
Logistic Regression using Mahout
Logistic Regression using MahoutLogistic Regression using Mahout
Logistic Regression using Mahout
 
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHM
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHMADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHM
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHM
 
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHM
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHMADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHM
ADAPTIVE FUZZY KERNEL CLUSTERING ALGORITHM
 
Multiple Regression
Multiple RegressionMultiple Regression
Multiple Regression
 
Machinelearning Spark Hadoop User Group Munich Meetup 2016
Machinelearning Spark Hadoop User Group Munich Meetup 2016Machinelearning Spark Hadoop User Group Munich Meetup 2016
Machinelearning Spark Hadoop User Group Munich Meetup 2016
 
On Selection of Periodic Kernels Parameters in Time Series Prediction
On Selection of Periodic Kernels Parameters in Time Series Prediction On Selection of Periodic Kernels Parameters in Time Series Prediction
On Selection of Periodic Kernels Parameters in Time Series Prediction
 
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTION
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTIONON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTION
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTION
 
Naïve Bayes.pptx
Naïve Bayes.pptxNaïve Bayes.pptx
Naïve Bayes.pptx
 

More from Shintaro Fukushima

20230216_Python機械学習プログラミング.pdf
20230216_Python機械学習プログラミング.pdf20230216_Python機械学習プログラミング.pdf
20230216_Python機械学習プログラミング.pdfShintaro Fukushima
 
機械学習品質管理・保証の動向と取り組み
機械学習品質管理・保証の動向と取り組み機械学習品質管理・保証の動向と取り組み
機械学習品質管理・保証の動向と取り組みShintaro Fukushima
 
Materials Informatics and Python
Materials Informatics and PythonMaterials Informatics and Python
Materials Informatics and PythonShintaro Fukushima
 
最近のRのランダムフォレストパッケージ -ranger/Rborist-
最近のRのランダムフォレストパッケージ -ranger/Rborist-最近のRのランダムフォレストパッケージ -ranger/Rborist-
最近のRのランダムフォレストパッケージ -ranger/Rborist-Shintaro Fukushima
 
Why dont you_create_new_spark_jl
Why dont you_create_new_spark_jlWhy dont you_create_new_spark_jl
Why dont you_create_new_spark_jlShintaro Fukushima
 
Rユーザのためのspark入門
Rユーザのためのspark入門Rユーザのためのspark入門
Rユーザのためのspark入門Shintaro Fukushima
 
Juliaによる予測モデル構築・評価
Juliaによる予測モデル構築・評価Juliaによる予測モデル構築・評価
Juliaによる予測モデル構築・評価Shintaro Fukushima
 
機械学習を用いた予測モデル構築・評価
機械学習を用いた予測モデル構築・評価機械学習を用いた予測モデル構築・評価
機械学習を用いた予測モデル構築・評価Shintaro Fukushima
 
データサイエンスワールドからC++を眺めてみる
データサイエンスワールドからC++を眺めてみるデータサイエンスワールドからC++を眺めてみる
データサイエンスワールドからC++を眺めてみるShintaro Fukushima
 
data.tableパッケージで大規模データをサクッと処理する
data.tableパッケージで大規模データをサクッと処理するdata.tableパッケージで大規模データをサクッと処理する
data.tableパッケージで大規模データをサクッと処理するShintaro Fukushima
 
アクションマイニングを用いた最適なアクションの導出
アクションマイニングを用いた最適なアクションの導出アクションマイニングを用いた最適なアクションの導出
アクションマイニングを用いた最適なアクションの導出Shintaro Fukushima
 
統計解析言語Rにおける大規模データ管理のためのboost.interprocessの活用
統計解析言語Rにおける大規模データ管理のためのboost.interprocessの活用統計解析言語Rにおける大規模データ管理のためのboost.interprocessの活用
統計解析言語Rにおける大規模データ管理のためのboost.interprocessの活用Shintaro Fukushima
 
不均衡データのクラス分類
不均衡データのクラス分類不均衡データのクラス分類
不均衡データのクラス分類Shintaro Fukushima
 
mmapパッケージを使ってお手軽オブジェクト管理
mmapパッケージを使ってお手軽オブジェクト管理mmapパッケージを使ってお手軽オブジェクト管理
mmapパッケージを使ってお手軽オブジェクト管理Shintaro Fukushima
 
Numpy scipyで独立成分分析
Numpy scipyで独立成分分析Numpy scipyで独立成分分析
Numpy scipyで独立成分分析Shintaro Fukushima
 

More from Shintaro Fukushima (20)

20230216_Python機械学習プログラミング.pdf
20230216_Python機械学習プログラミング.pdf20230216_Python機械学習プログラミング.pdf
20230216_Python機械学習プログラミング.pdf
 
機械学習品質管理・保証の動向と取り組み
機械学習品質管理・保証の動向と取り組み機械学習品質管理・保証の動向と取り組み
機械学習品質管理・保証の動向と取り組み
 
Materials Informatics and Python
Materials Informatics and PythonMaterials Informatics and Python
Materials Informatics and Python
 
最近のRのランダムフォレストパッケージ -ranger/Rborist-
最近のRのランダムフォレストパッケージ -ranger/Rborist-最近のRのランダムフォレストパッケージ -ranger/Rborist-
最近のRのランダムフォレストパッケージ -ranger/Rborist-
 
Why dont you_create_new_spark_jl
Why dont you_create_new_spark_jlWhy dont you_create_new_spark_jl
Why dont you_create_new_spark_jl
 
Rユーザのためのspark入門
Rユーザのためのspark入門Rユーザのためのspark入門
Rユーザのためのspark入門
 
Juliaによる予測モデル構築・評価
Juliaによる予測モデル構築・評価Juliaによる予測モデル構築・評価
Juliaによる予測モデル構築・評価
 
Juliaで並列計算
Juliaで並列計算Juliaで並列計算
Juliaで並列計算
 
機械学習を用いた予測モデル構築・評価
機械学習を用いた予測モデル構築・評価機械学習を用いた予測モデル構築・評価
機械学習を用いた予測モデル構築・評価
 
データサイエンスワールドからC++を眺めてみる
データサイエンスワールドからC++を眺めてみるデータサイエンスワールドからC++を眺めてみる
データサイエンスワールドからC++を眺めてみる
 
data.tableパッケージで大規模データをサクッと処理する
data.tableパッケージで大規模データをサクッと処理するdata.tableパッケージで大規模データをサクッと処理する
data.tableパッケージで大規模データをサクッと処理する
 
アクションマイニングを用いた最適なアクションの導出
アクションマイニングを用いた最適なアクションの導出アクションマイニングを用いた最適なアクションの導出
アクションマイニングを用いた最適なアクションの導出
 
R3.0.0 is relased
R3.0.0 is relasedR3.0.0 is relased
R3.0.0 is relased
 
外れ値
外れ値外れ値
外れ値
 
Rでreproducible research
Rでreproducible researchRでreproducible research
Rでreproducible research
 
統計解析言語Rにおける大規模データ管理のためのboost.interprocessの活用
統計解析言語Rにおける大規模データ管理のためのboost.interprocessの活用統計解析言語Rにおける大規模データ管理のためのboost.interprocessの活用
統計解析言語Rにおける大規模データ管理のためのboost.interprocessの活用
 
不均衡データのクラス分類
不均衡データのクラス分類不均衡データのクラス分類
不均衡データのクラス分類
 
mmapパッケージを使ってお手軽オブジェクト管理
mmapパッケージを使ってお手軽オブジェクト管理mmapパッケージを使ってお手軽オブジェクト管理
mmapパッケージを使ってお手軽オブジェクト管理
 
Numpy scipyで独立成分分析
Numpy scipyで独立成分分析Numpy scipyで独立成分分析
Numpy scipyで独立成分分析
 
Rで学ぶロバスト推定
Rで学ぶロバスト推定Rで学ぶロバスト推定
Rで学ぶロバスト推定
 

Recently uploaded

Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 

Recently uploaded (20)

Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 

BPstudy sklearn 20180925

  • 2. ■ ■ IT ■ ■ / 
 ■ ■ twitter: @sfchaos 1
  • 3. Python ■ 2 
 4.4 scikit-learn
 5.2 2
  • 8. scikit-learn 3 API ■ (estimator) 7 ■ (predictor) able to export a fitted model, using the information from its public attributes, to an agnostic model description such as PMML (Guazzelli et al., 2009). To illustrate the initialize-fit sequence, let us consider a supervised learning task using logistic regression. Given the API defined above, solving this problem is as simple as the following example. from sklear n . linear model import Lo g isticReg r essio n c l f = Lo g isticReg r essio n ( penalty=” l1 ” ) c l f . f i t ( X train , y tr ain ) In this snippet, a LogisticRegression estimator is first initialized by setting the penalty hyper-parameter to "l1" for ℓ1 regularization. Other hyper-parameters (such as C, the strength of the regularization) are not explicitly given and thus set to the default values. Upon calling fit, a model is learned from the training arrays X train and y train, and stored within the object for later use. Since all estimators share the same interface, using a different learning algorithm is as simple as replacing the constructor (the class name); to build a random forest on the same data, one would simply replace LogisticRegression(penalty="l1") in the snippet above by RandomForestClassifier(). not require the fit method to perform useful work, implement the estimator interface. As we will illustrate in the next sections, this design pattern is indeed of prime importance for consistency, composition and model selection reasons. 2.4 Predictors The predictor interface extends the notion of an estimator by adding a predict method that takes an array X test and produces predictions for X test, based on the learned parameters of the estimator (we call the input to predict “X test” in order to emphasize that predict generalizes to new data). In the case of supervised learning estimators, this method typically returns the predicted la- bels or values computed by the model. Continuing with the previous example, predicted labels for X test can be obtained using the following snippet: y pred = c l f . pr edict ( X test ) Some unsupervised learning estimators may also implement the predict in- terface. The code in the snippet below fits a k-means model with k = 10 on training data X train, and then uses the predict method to obtain cluster labels (integer indices) for unseen data X test. from sklea r n . c l u s t e r import KMeans km = KMeans( n c l u s t e r s =10) km. f i t ( X train ) clust pr ed = km. pr edict ( X test ) fit predict Lars Buitinck, et al., API design for machine learning software: experiences from the scikit-learn project, https://arxiv.org/pdf/1309.0238.pdf
  • 9. scikit-learn 3 API ■ (transformer) 8 as output a transformed version of X test. Preprocessing, feature selection, feature extraction and dimensionality reduction algorithms are all provided as transformers within the library. In our example, to standardize the input X train to zero mean and unit variance before fitting the logistic regression estimator, one would write: from sklear n . pr epr ocessing import StandardScaler s c a l e r = StandardScaler () s c a l e r . f i t ( X train ) X train = s c a l e r . transform ( X train ) Of course, in practice, it is important to apply the same preprocessing to the test data X test. Since a StandardScaler estimator stores the mean and standard deviation that it computed for the training set, transforming an unseen test set X test maps it into the appropriate region of feature space: X test = s c a l e r . transform ( X test ) Transformers also include a variety of learning algorithms, such as dimension reduction (PCA, manifold learning), kernel approximation, and other mappings from one feature space to another. Additionally, by leveraging the fact that fit always returns the estimator it was called on, the StandardScaler example above can be rewritten in a single line using method chaining: X train = StandardScaler () . f i t ( X train ) . transform ( X train X train = s c a l e r . transform ( X train ) Of course, in practice, it is important to apply the same preprocessing to the test data X test. Since a StandardScaler estimator stores the mean and standard deviation that it computed for the training set, transforming an unseen test set X test maps it into the appropriate region of feature space: X test = s c a l e r . transform ( X test ) Transformers also include a variety of learning algorithms, such as dimension reduction (PCA, manifold learning), kernel approximation, and other mappings from one feature space to another. Additionally, by leveraging the fact that fit always returns the estimator it was called on, the StandardScaler example above can be rewritten in a single line using method chaining: X train = StandardScaler () . f i t ( X train ) . transform ( X train ) Furthermore, every transformer allows fit(X train).transform(X train) to be written as fit transform(X train). The combined fit transform method prevents repeated computations. Depending on the transformer, it may skip only an input validation step, or in fact use a more efficient algorithm for the transfor- mation. In the same spirit, clustering algorithms provide a fit predict method that is equivalent to fit followed by predict, returning cluster labels assigned to the training samples. fit transform fit_transform
  • 12. ■User Guide 
 11 C=1e6 C=0.1 ) • • #pydatatext p.231, 232