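Statistics for High-Dimensional Data: Approximation Error and Estimation Error of Sparse Regularization
Presenter: Quasi_quant2010
87th Statistical Science Study Group (第87回統計科学研究会)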
What I would like you to understand today
- Understand what it means to choose λ by cross-validation when solving the Lasso problem -
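• What is "sparse"?
  • A state in which the true model is roughly approximated
• What is the "oracle"?
  • The state in which, once the model has been made sparse, the estimated parameters are the most desirable ones
• What we derive
  • With what probability the bound on the approximation error holds
  • With what probability the bound on the estimation error holds
Note: All proofs will be written on the whiteboard, so reading these slides alone will probably not make clear what is being derived.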
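To make the opening question concrete, the following is a minimal sketch of choosing λ by K-fold cross-validation. It assumes the objective ||y - Xβ||²/n + λ||β||_1 (the normalization used in the book listed in the references). Everything else is a hypothetical choice made for illustration and is not taken from the slides: the coordinate-descent solver `lasso_cd`, the helper names, the simulated data, and the λ grid.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/n)||y - X b||^2 + lam * ||b||_1 (illustrative)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X beta
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam / 2.0) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def cv_error(X, y, lam, k=5):
    """Mean squared prediction error of the Lasso over k interleaved folds."""
    n = len(X)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        beta = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
        for i in test:
            pred = sum(b * x for b, x in zip(beta, X[i]))
            total += (y[i] - pred) ** 2
    return total / n


def choose_lambda(X, y, grid, k=5):
    """Return the grid value with the smallest CV error, plus all CV errors."""
    errors = [cv_error(X, y, lam, k) for lam in grid]
    best = min(range(len(grid)), key=lambda i: errors[i])
    return grid[best], errors


# Simulated sparse problem (hypothetical data): 2 active coefficients out of 6.
rng = random.Random(0)
n, p = 40, 6
beta_true = [2.0, -1.5, 0.0, 0.0, 0.0, 0.0]
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [sum(b * x for b, x in zip(beta_true, row)) + rng.gauss(0.0, 0.5) for row in X]

grid = [0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0]
best_lam, cv_errors = choose_lambda(X, y, grid)
print("lambda chosen by CV:", best_lam)
```

Cross-validation picks λ by estimated prediction error alone. The bounds derived in the rest of the talk describe how large λ must be for such an error to be controlled with a stated probability.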
Formulation of the Lasso problem
• What we want to derive
Bound on the approximation error
Bound on the estimation error
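The formulas on this slide did not survive extraction. Assuming the deck follows Chapter 6 of the book by Bühlmann and van de Geer listed in the references, the setup would be the following. Reading "approximation error" as the prediction error is an assumption, as is all of the notation.

```latex
% Linear model and Lasso estimator (normalization of the referenced book, assumed)
\[
  Y = X\beta^0 + \varepsilon, \qquad
  \hat\beta := \arg\min_{\beta \in \mathbb{R}^p}
    \left\{ \frac{\|Y - X\beta\|_2^2}{n} + \lambda \|\beta\|_1 \right\}
\]
% The two quantities to be bounded
\[
  \underbrace{\frac{\|X(\hat\beta - \beta^0)\|_2^2}{n}}_{\text{prediction (approximation) error}}
  \qquad\text{and}\qquad
  \underbrace{\|\hat\beta - \beta^0\|_1}_{\text{estimation error}}
\]
```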
Lemma 6.1
- Basic Inequality -
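The statement itself is missing from the extracted slide. As a sketch, assuming the model Y = Xβ⁰ + ε and the Lasso objective ||Y - Xβ||²/n + λ||β||_1 of the referenced book, the Basic Inequality and its short derivation read:

```latex
% Step 1: \hat\beta minimizes the objective, so it does at least as well as \beta^0
\[
  \frac{\|Y - X\hat\beta\|_2^2}{n} + \lambda\|\hat\beta\|_1
  \;\le\;
  \frac{\|Y - X\beta^0\|_2^2}{n} + \lambda\|\beta^0\|_1
\]
% Step 2: substitute Y = X\beta^0 + \varepsilon
\[
  \|Y - X\hat\beta\|_2^2
  = \|\varepsilon\|_2^2
  - 2\,\varepsilon^{T} X(\hat\beta - \beta^0)
  + \|X(\hat\beta - \beta^0)\|_2^2,
  \qquad
  \|Y - X\beta^0\|_2^2 = \|\varepsilon\|_2^2
\]
% Step 3: cancel \|\varepsilon\|_2^2 / n on both sides (Basic Inequality)
\[
  \frac{\|X(\hat\beta - \beta^0)\|_2^2}{n} + \lambda\|\hat\beta\|_1
  \;\le\;
  \frac{2\,\varepsilon^{T} X(\hat\beta - \beta^0)}{n} + \lambda\|\beta^0\|_1
\]
```

The random term 2εᵀX(β̂ - β⁰)/n is the only part that depends on the noise, which is why the following lemmas are probability inequalities for it.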
Lemma 6.1.1
- Probability inequality (1) -
Lemma 6.2
- Probability inequality (2) -
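The slide body is not available, so the following rests on an assumption: that it states Lemma 6.2 of the referenced book. That lemma says that for Gaussian errors with variance σ² and columns normalized so that ||X^(j)||²/n = 1, the event T = { max_j 2|εᵀX^(j)|/n ≤ λ0 } with λ0 = 2σ·sqrt((t² + 2 log p)/n) has probability at least 1 - 2·exp(-t²/2). The Monte Carlo check below is a hypothetical illustration (the function names, sizes, and seed are made up) that compares the empirical frequency of T with that lower bound.

```python
import math
import random


def make_design(n, p, rng):
    """Return p columns of length n, each rescaled so that ||X_j||^2 / n = 1."""
    cols = []
    for _ in range(p):
        col = [rng.gauss(0.0, 1.0) for _ in range(n)]
        scale = math.sqrt(n / sum(v * v for v in col))
        cols.append([v * scale for v in col])
    return cols


def coverage(n, p, sigma, t, trials, seed=0):
    """Empirical frequency of the event T and the lower bound 1 - 2 exp(-t^2/2)."""
    rng = random.Random(seed)
    cols = make_design(n, p, rng)  # fixed design, reused across trials
    lam0 = 2.0 * sigma * math.sqrt((t * t + 2.0 * math.log(p)) / n)
    hits = 0
    for _ in range(trials):
        eps = [rng.gauss(0.0, sigma) for _ in range(n)]
        worst = max(
            2.0 * abs(sum(e * x for e, x in zip(eps, col))) / n for col in cols
        )
        if worst <= lam0:
            hits += 1
    return hits / trials, 1.0 - 2.0 * math.exp(-t * t / 2.0)


freq, bound = coverage(n=50, p=20, sigma=1.0, t=2.0, trials=1000)
print("empirical P(T):", freq, " lower bound:", bound)
```

The bound comes from a union bound over the p columns, so the empirical frequency is typically well above it. The lemma guarantees only the inequality, not tightness.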
Corollary 6.1
- Consistency of the Lasso -
Lemma 6.3
Theorem 6.1
- Oracle Inequality -
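The theorem text is missing from the extraction. Assuming it matches Theorem 6.1 of the referenced book, the statement would be as follows, where S₀ is the active set of β⁰, s₀ = |S₀|, φ₀ is the compatibility constant for S₀, and T is the event on which the noise term max_j 2|εᵀX^(j)|/n is at most λ₀.

```latex
% Oracle inequality (as in the referenced book; notation assumed)
% Holds on the event T, under the compatibility condition for S_0,
% whenever \lambda \ge 2\lambda_0:
\[
  \frac{\|X(\hat\beta - \beta^0)\|_2^2}{n}
  + \lambda \|\hat\beta - \beta^0\|_1
  \;\le\;
  \frac{4\lambda^2 s_0}{\phi_0^2}
\]
% Consequences, since both terms on the left are non-negative:
\[
  \frac{\|X(\hat\beta - \beta^0)\|_2^2}{n} \le \frac{4\lambda^2 s_0}{\phi_0^2},
  \qquad
  \|\hat\beta - \beta^0\|_1 \le \frac{4\lambda s_0}{\phi_0^2}
\]
```

A single inequality therefore yields both bounds named at the start of the talk: one on the prediction error and one on the ℓ1 estimation error.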
Corollary 6.2
- The Oracle Inequality holds with probability 1-α -
Why Approximation
Sparse Solution!!
Theorem 6.2
- Linear approximation of the truth -
Theorem 6.3
- handling smallish coefficients -
What is the compatibility constant?
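The definition on this slide was lost in extraction. As a sketch, assuming the definitions of the referenced book with Σ̂ = XᵀX/n, the compatibility condition for a set S₀ of size s₀ and the more general (L, S)-compatibility constant can be written as below. The ratio form of the constant is one of several equivalent normalizations.

```latex
% Gram matrix
\[
  \hat\Sigma := \frac{X^{T} X}{n}
\]
% Compatibility condition for S_0 with constant \phi_0 > 0:
% for every \beta with \|\beta_{S_0^c}\|_1 \le 3\,\|\beta_{S_0}\|_1,
\[
  \|\beta_{S_0}\|_1^2 \;\le\; \frac{\left(\beta^{T} \hat\Sigma\, \beta\right) s_0}{\phi_0^2}
\]
% (L, S)-compatibility constant (ratio form, assumed normalization)
\[
  \phi_{\mathrm{comp}}^2(L, S) :=
  \min\left\{
    \frac{|S|\; \beta^{T} \hat\Sigma\, \beta}{\|\beta_S\|_1^2}
    \;:\; \beta_S \neq 0,\;
    \|\beta_{S^c}\|_1 \le L\,\|\beta_S\|_1
  \right\}
\]
```

The condition only has to hold on the cone where the off-support part of β is small relative to the on-support part. This makes it weaker than requiring the smallest eigenvalue of Σ̂ to be positive, which is impossible when p > n.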
Lemma 6.19
- the larger S, the smaller the compatibility constant -
Formulation of the Lasso problem
Lemma 6.21
- The (L;S)-compatibility constant is the solution of a Lasso -
Variable Screening with the Lasso
• Irrepresentable conditions show that the Lasso, or any weighted variant, typically selects too many variables
• We shall therefore aim at estimators with oracle prediction error, yet having not too many false positives
• Continued in Chapter 7
References
• P. Bühlmann and S. A. van de Geer, Statistics for High-Dimensional Data: Methods, Theory and Applications, Springer, 2011.
• S. A. van de Geer and P. Bühlmann, On the conditions used to prove oracle results for the Lasso, Electronic Journal of Statistics 3:1360-1392, 2009.
• S. A. van de Geer, P. Bühlmann and S. Zhou, The adaptive and the thresholded Lasso for potentially misspecified models, Electronic Journal of Statistics 5:688-749, 2011.