SlideShare a Scribd company logo
1 of 8
Download to read offline
Fast and Provably Good Seedings for k-Means
O. Bachem, M. Lucic, S. Hassani, A. Krause
Presented by Kimikazu Kato,
Silver Egg Technology Co., Ltd.
Algorithm of k-Means clustering
Determine initial
centroids
Update centroids and
membership of clusters
gradually
Improvement of this part
Existing results:
k-means++:
sampling according to some metric
Bachem et al. 2016:
Performance improvement using
MCMC, but has some assumption about
the distribution of the data
Proposed:
Another MCMC based algorithm
without assumption of the distribution
Outline
Related researches
kmeans++
Draw
accoding to
Intuition:
Choose initial centroids from the
input data so that they scatter as
widely as possible
Bachem et al. 2016
Intended to overcome the
shortcoming of kmeans++: the
marginalization cost
Metropolitan Hastings algorithm,
which utilizes rejection sampling
to emulate the distribution.
But have some assumption on the
input data.
as a centroid
C: set of centroids which are
already chosen
Proposed Algorithm
Update from the preceding result: rejection criterion
The convergence is mathematically proved.
Experimental Results 1/3
Experimental Results 2/3
Experimental Results 3/3
Conclusion
• Novel algorithm for the initialization of
centroids in kmeans
• Theoretical guarantee on the convergence
and the trade-off of accuracy and speed
• Experimentally good result

More Related Content

What's hot

Fraud Detection for Insurance Claims
Fraud Detection for Insurance ClaimsFraud Detection for Insurance Claims
Fraud Detection for Insurance ClaimsYit Wei (Jason) Chia
 
New three dimensional space vector based switching signal generation techniqu...
New three dimensional space vector based switching signal generation techniqu...New three dimensional space vector based switching signal generation techniqu...
New three dimensional space vector based switching signal generation techniqu...I3E Technologies
 
Application of cgpann in solar irradiance
Application of cgpann in solar irradianceApplication of cgpann in solar irradiance
Application of cgpann in solar irradianceJawad Khan
 
Uncertainty aware multidimensional ensemble data visualization and exploration
Uncertainty aware multidimensional ensemble data visualization and explorationUncertainty aware multidimensional ensemble data visualization and exploration
Uncertainty aware multidimensional ensemble data visualization and explorationSubhashis Hazarika
 
Kalman filter(nanheekim)
Kalman filter(nanheekim)Kalman filter(nanheekim)
Kalman filter(nanheekim)Nanhee Kim
 
9A_1_On automatic mapping of environmental data using adaptive general regres...
9A_1_On automatic mapping of environmental data using adaptive general regres...9A_1_On automatic mapping of environmental data using adaptive general regres...
9A_1_On automatic mapping of environmental data using adaptive general regres...GISRUK conference
 
Multisensor data fusion in object tracking applications
Multisensor data fusion in object tracking applicationsMultisensor data fusion in object tracking applications
Multisensor data fusion in object tracking applicationsSayed Abulhasan Quadri
 
OptimalForecast_10162015
OptimalForecast_10162015OptimalForecast_10162015
OptimalForecast_10162015Alejandro Komai
 

What's hot (11)

Fraud Detection for Insurance Claims
Fraud Detection for Insurance ClaimsFraud Detection for Insurance Claims
Fraud Detection for Insurance Claims
 
New three dimensional space vector based switching signal generation techniqu...
New three dimensional space vector based switching signal generation techniqu...New three dimensional space vector based switching signal generation techniqu...
New three dimensional space vector based switching signal generation techniqu...
 
Improved k-means
Improved k-meansImproved k-means
Improved k-means
 
Application of cgpann in solar irradiance
Application of cgpann in solar irradianceApplication of cgpann in solar irradiance
Application of cgpann in solar irradiance
 
Uncertainty aware multidimensional ensemble data visualization and exploration
Uncertainty aware multidimensional ensemble data visualization and explorationUncertainty aware multidimensional ensemble data visualization and exploration
Uncertainty aware multidimensional ensemble data visualization and exploration
 
Kalman filter(nanheekim)
Kalman filter(nanheekim)Kalman filter(nanheekim)
Kalman filter(nanheekim)
 
Data fusion with kalman filtering
Data fusion with kalman filteringData fusion with kalman filtering
Data fusion with kalman filtering
 
9A_1_On automatic mapping of environmental data using adaptive general regres...
9A_1_On automatic mapping of environmental data using adaptive general regres...9A_1_On automatic mapping of environmental data using adaptive general regres...
9A_1_On automatic mapping of environmental data using adaptive general regres...
 
Multisensor data fusion in object tracking applications
Multisensor data fusion in object tracking applicationsMultisensor data fusion in object tracking applications
Multisensor data fusion in object tracking applications
 
OptimalForecast_10162015
OptimalForecast_10162015OptimalForecast_10162015
OptimalForecast_10162015
 
Fall detection
Fall detectionFall detection
Fall detection
 

Viewers also liked

Conditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN DecodersConditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN Decoderssuga93
 
Learning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descentLearning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descentHiroyuki Fukuda
 
Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”Kazuto Fukuchi
 
Introduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithmIntroduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithmKatsuki Ohto
 
Interaction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsInteraction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsKen Kuroki
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Toru Fujino
 
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...Shuhei Yoshida
 
時系列データ3
時系列データ3時系列データ3
時系列データ3graySpace999
 
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement LearningSafe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learningmooopan
 
Improving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowImproving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowTatsuya Shirakawa
 
[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence Learning[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence LearningDeep Learning JP
 
NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics  NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics Koichi Hamada
 
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...Kusano Hitoshi
 
Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]Kentaro Minami
 
Matching networks for one shot learning
Matching networks for one shot learningMatching networks for one shot learning
Matching networks for one shot learningKazuki Fujikawa
 
ICML2016読み会 概要紹介
ICML2016読み会 概要紹介ICML2016読み会 概要紹介
ICML2016読み会 概要紹介Kohei Hayashi
 
論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural NetworksSeiya Tokui
 

Viewers also liked (18)

Conditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN DecodersConditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN Decoders
 
Learning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descentLearning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descent
 
Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”
 
Introduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithmIntroduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithm
 
Interaction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsInteraction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and Physics
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)
 
Value iteration networks
Value iteration networksValue iteration networks
Value iteration networks
 
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
 
時系列データ3
時系列データ3時系列データ3
時系列データ3
 
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement LearningSafe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learning
 
Improving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowImproving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive Flow
 
[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence Learning[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence Learning
 
NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics  NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics
 
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
 
Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]
 
Matching networks for one shot learning
Matching networks for one shot learningMatching networks for one shot learning
Matching networks for one shot learning
 
ICML2016読み会 概要紹介
ICML2016読み会 概要紹介ICML2016読み会 概要紹介
ICML2016読み会 概要紹介
 
論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks
 

Similar to Fast and Probvably Seedings for k-Means

Off-Policy Deep Reinforcement Learning without Exploration.pdf
Off-Policy Deep Reinforcement Learning without Exploration.pdfOff-Policy Deep Reinforcement Learning without Exploration.pdf
Off-Policy Deep Reinforcement Learning without Exploration.pdfPo-Chuan Chen
 
A HYBRID CLUSTERING ALGORITHM FOR DATA MINING
A HYBRID CLUSTERING ALGORITHM FOR DATA MININGA HYBRID CLUSTERING ALGORITHM FOR DATA MINING
A HYBRID CLUSTERING ALGORITHM FOR DATA MININGcscpconf
 
IDA 2015: Efficient model selection for regularized classification by exploit...
IDA 2015: Efficient model selection for regularized classification by exploit...IDA 2015: Efficient model selection for regularized classification by exploit...
IDA 2015: Efficient model selection for regularized classification by exploit...George Balikas
 
Cost optimized reliability test planning rev 7
Cost optimized reliability test planning rev 7Cost optimized reliability test planning rev 7
Cost optimized reliability test planning rev 7ASQ Reliability Division
 
A Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means AlgorithmA Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means AlgorithmIRJET Journal
 
Lawry-Daniel.doc
Lawry-Daniel.docLawry-Daniel.doc
Lawry-Daniel.docbutest
 
Sequential estimation of_discrete_choice_models__copy_-4
Sequential estimation of_discrete_choice_models__copy_-4Sequential estimation of_discrete_choice_models__copy_-4
Sequential estimation of_discrete_choice_models__copy_-4YoussefKitane
 
Clustering techniques
Clustering techniquesClustering techniques
Clustering techniquestalktoharry
 
Amy Stidworthy - Optimising local air quality models with sensor data - DMUG17
Amy Stidworthy - Optimising local air quality models with sensor data - DMUG17Amy Stidworthy - Optimising local air quality models with sensor data - DMUG17
Amy Stidworthy - Optimising local air quality models with sensor data - DMUG17IES / IAQM
 
Pillar k means
Pillar k meansPillar k means
Pillar k meansswathi b
 
Pca and kpca of ecg signal
Pca and kpca of ecg signalPca and kpca of ecg signal
Pca and kpca of ecg signales712
 
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...IRJET Journal
 
Slides TSALBP ACO 2008
Slides TSALBP ACO 2008Slides TSALBP ACO 2008
Slides TSALBP ACO 2008Manuel ChiSe
 
Reliable ABC model choice via random forests
Reliable ABC model choice via random forestsReliable ABC model choice via random forests
Reliable ABC model choice via random forestsChristian Robert
 
ENHANCING COMPUTATIONAL EFFORTS WITH CONSIDERATION OF PROBABILISTIC AVAILABL...
ENHANCING COMPUTATIONAL EFFORTS WITH CONSIDERATION OF  PROBABILISTIC AVAILABL...ENHANCING COMPUTATIONAL EFFORTS WITH CONSIDERATION OF  PROBABILISTIC AVAILABL...
ENHANCING COMPUTATIONAL EFFORTS WITH CONSIDERATION OF PROBABILISTIC AVAILABL...Raja Larik
 
Premeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means ClusteringPremeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means ClusteringIJCSIS Research Publications
 
Research Summary: Scalable Algorithms for Nearest-Neighbor Joins on Big Traje...
Research Summary: Scalable Algorithms for Nearest-Neighbor Joins on Big Traje...Research Summary: Scalable Algorithms for Nearest-Neighbor Joins on Big Traje...
Research Summary: Scalable Algorithms for Nearest-Neighbor Joins on Big Traje...Alex Klibisz
 

Similar to Fast and Probvably Seedings for k-Means (20)

Off-Policy Deep Reinforcement Learning without Exploration.pdf
Off-Policy Deep Reinforcement Learning without Exploration.pdfOff-Policy Deep Reinforcement Learning without Exploration.pdf
Off-Policy Deep Reinforcement Learning without Exploration.pdf
 
A HYBRID CLUSTERING ALGORITHM FOR DATA MINING
A HYBRID CLUSTERING ALGORITHM FOR DATA MININGA HYBRID CLUSTERING ALGORITHM FOR DATA MINING
A HYBRID CLUSTERING ALGORITHM FOR DATA MINING
 
Master's Thesis Presentation
Master's Thesis PresentationMaster's Thesis Presentation
Master's Thesis Presentation
 
IDA 2015: Efficient model selection for regularized classification by exploit...
IDA 2015: Efficient model selection for regularized classification by exploit...IDA 2015: Efficient model selection for regularized classification by exploit...
IDA 2015: Efficient model selection for regularized classification by exploit...
 
Cost optimized reliability test planning rev 7
Cost optimized reliability test planning rev 7Cost optimized reliability test planning rev 7
Cost optimized reliability test planning rev 7
 
Data clustering
Data clustering Data clustering
Data clustering
 
A Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means AlgorithmA Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means Algorithm
 
Lawry-Daniel.doc
Lawry-Daniel.docLawry-Daniel.doc
Lawry-Daniel.doc
 
Sequential estimation of_discrete_choice_models__copy_-4
Sequential estimation of_discrete_choice_models__copy_-4Sequential estimation of_discrete_choice_models__copy_-4
Sequential estimation of_discrete_choice_models__copy_-4
 
Clustering techniques
Clustering techniquesClustering techniques
Clustering techniques
 
Amy Stidworthy - Optimising local air quality models with sensor data - DMUG17
Amy Stidworthy - Optimising local air quality models with sensor data - DMUG17Amy Stidworthy - Optimising local air quality models with sensor data - DMUG17
Amy Stidworthy - Optimising local air quality models with sensor data - DMUG17
 
Pillar k means
Pillar k meansPillar k means
Pillar k means
 
Pca and kpca of ecg signal
Pca and kpca of ecg signalPca and kpca of ecg signal
Pca and kpca of ecg signal
 
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
 
Slides TSALBP ACO 2008
Slides TSALBP ACO 2008Slides TSALBP ACO 2008
Slides TSALBP ACO 2008
 
Reliable ABC model choice via random forests
Reliable ABC model choice via random forestsReliable ABC model choice via random forests
Reliable ABC model choice via random forests
 
ENHANCING COMPUTATIONAL EFFORTS WITH CONSIDERATION OF PROBABILISTIC AVAILABL...
ENHANCING COMPUTATIONAL EFFORTS WITH CONSIDERATION OF  PROBABILISTIC AVAILABL...ENHANCING COMPUTATIONAL EFFORTS WITH CONSIDERATION OF  PROBABILISTIC AVAILABL...
ENHANCING COMPUTATIONAL EFFORTS WITH CONSIDERATION OF PROBABILISTIC AVAILABL...
 
Premeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means ClusteringPremeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means Clustering
 
Research Summary: Scalable Algorithms for Nearest-Neighbor Joins on Big Traje...
Research Summary: Scalable Algorithms for Nearest-Neighbor Joins on Big Traje...Research Summary: Scalable Algorithms for Nearest-Neighbor Joins on Big Traje...
Research Summary: Scalable Algorithms for Nearest-Neighbor Joins on Big Traje...
 
Model selection
Model selectionModel selection
Model selection
 

More from Kimikazu Kato

Tokyo webmining 2017-10-28
Tokyo webmining 2017-10-28Tokyo webmining 2017-10-28
Tokyo webmining 2017-10-28Kimikazu Kato
 
機械学習ゴリゴリ派のための数学とPython
機械学習ゴリゴリ派のための数学とPython機械学習ゴリゴリ派のための数学とPython
機械学習ゴリゴリ派のための数学とPythonKimikazu Kato
 
Pythonを使った機械学習の学習
Pythonを使った機械学習の学習Pythonを使った機械学習の学習
Pythonを使った機械学習の学習Kimikazu Kato
 
Pythonで機械学習入門以前
Pythonで機械学習入門以前Pythonで機械学習入門以前
Pythonで機械学習入門以前Kimikazu Kato
 
Pythonによる機械学習
Pythonによる機械学習Pythonによる機械学習
Pythonによる機械学習Kimikazu Kato
 
Introduction to behavior based recommendation system
Introduction to behavior based recommendation systemIntroduction to behavior based recommendation system
Introduction to behavior based recommendation systemKimikazu Kato
 
Pythonによる機械学習の最前線
Pythonによる機械学習の最前線Pythonによる機械学習の最前線
Pythonによる機械学習の最前線Kimikazu Kato
 
Sparse pca via bipartite matching
Sparse pca via bipartite matchingSparse pca via bipartite matching
Sparse pca via bipartite matchingKimikazu Kato
 
正しいプログラミング言語の覚え方
正しいプログラミング言語の覚え方正しいプログラミング言語の覚え方
正しいプログラミング言語の覚え方Kimikazu Kato
 
Introduction to NumPy for Machine Learning Programmers
Introduction to NumPy for Machine Learning ProgrammersIntroduction to NumPy for Machine Learning Programmers
Introduction to NumPy for Machine Learning ProgrammersKimikazu Kato
 
Recommendation System --Theory and Practice
Recommendation System --Theory and PracticeRecommendation System --Theory and Practice
Recommendation System --Theory and PracticeKimikazu Kato
 
A Safe Rule for Sparse Logistic Regression
A Safe Rule for Sparse Logistic RegressionA Safe Rule for Sparse Logistic Regression
A Safe Rule for Sparse Logistic RegressionKimikazu Kato
 
特定の不快感を与えるツイートの分類と自動生成について
特定の不快感を与えるツイートの分類と自動生成について特定の不快感を与えるツイートの分類と自動生成について
特定の不快感を与えるツイートの分類と自動生成についてKimikazu Kato
 
Effective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyEffective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyKimikazu Kato
 
【論文紹介】Approximate Bayesian Image Interpretation Using Generative Probabilisti...
【論文紹介】Approximate Bayesian Image Interpretation Using Generative Probabilisti...【論文紹介】Approximate Bayesian Image Interpretation Using Generative Probabilisti...
【論文紹介】Approximate Bayesian Image Interpretation Using Generative Probabilisti...Kimikazu Kato
 
About Our Recommender System
About Our Recommender SystemAbout Our Recommender System
About Our Recommender SystemKimikazu Kato
 
ネット通販向けレコメンドシステム提供サービスについて
ネット通販向けレコメンドシステム提供サービスについてネット通販向けレコメンドシステム提供サービスについて
ネット通販向けレコメンドシステム提供サービスについてKimikazu Kato
 

More from Kimikazu Kato (20)

Tokyo webmining 2017-10-28
Tokyo webmining 2017-10-28Tokyo webmining 2017-10-28
Tokyo webmining 2017-10-28
 
機械学習ゴリゴリ派のための数学とPython
機械学習ゴリゴリ派のための数学とPython機械学習ゴリゴリ派のための数学とPython
機械学習ゴリゴリ派のための数学とPython
 
Pythonを使った機械学習の学習
Pythonを使った機械学習の学習Pythonを使った機械学習の学習
Pythonを使った機械学習の学習
 
Pythonで機械学習入門以前
Pythonで機械学習入門以前Pythonで機械学習入門以前
Pythonで機械学習入門以前
 
Pythonによる機械学習
Pythonによる機械学習Pythonによる機械学習
Pythonによる機械学習
 
Introduction to behavior based recommendation system
Introduction to behavior based recommendation systemIntroduction to behavior based recommendation system
Introduction to behavior based recommendation system
 
Pythonによる機械学習の最前線
Pythonによる機械学習の最前線Pythonによる機械学習の最前線
Pythonによる機械学習の最前線
 
Sparse pca via bipartite matching
Sparse pca via bipartite matchingSparse pca via bipartite matching
Sparse pca via bipartite matching
 
正しいプログラミング言語の覚え方
正しいプログラミング言語の覚え方正しいプログラミング言語の覚え方
正しいプログラミング言語の覚え方
 
養成読本と私
養成読本と私養成読本と私
養成読本と私
 
Introduction to NumPy for Machine Learning Programmers
Introduction to NumPy for Machine Learning ProgrammersIntroduction to NumPy for Machine Learning Programmers
Introduction to NumPy for Machine Learning Programmers
 
Recommendation System --Theory and Practice
Recommendation System --Theory and PracticeRecommendation System --Theory and Practice
Recommendation System --Theory and Practice
 
A Safe Rule for Sparse Logistic Regression
A Safe Rule for Sparse Logistic RegressionA Safe Rule for Sparse Logistic Regression
A Safe Rule for Sparse Logistic Regression
 
特定の不快感を与えるツイートの分類と自動生成について
特定の不快感を与えるツイートの分類と自動生成について特定の不快感を与えるツイートの分類と自動生成について
特定の不快感を与えるツイートの分類と自動生成について
 
Effective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyEffective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPy
 
Sapporo20140709
Sapporo20140709Sapporo20140709
Sapporo20140709
 
【論文紹介】Approximate Bayesian Image Interpretation Using Generative Probabilisti...
【論文紹介】Approximate Bayesian Image Interpretation Using Generative Probabilisti...【論文紹介】Approximate Bayesian Image Interpretation Using Generative Probabilisti...
【論文紹介】Approximate Bayesian Image Interpretation Using Generative Probabilisti...
 
Zuang-FPSGD
Zuang-FPSGDZuang-FPSGD
Zuang-FPSGD
 
About Our Recommender System
About Our Recommender SystemAbout Our Recommender System
About Our Recommender System
 
ネット通販向けレコメンドシステム提供サービスについて
ネット通販向けレコメンドシステム提供サービスについてネット通販向けレコメンドシステム提供サービスについて
ネット通販向けレコメンドシステム提供サービスについて
 

Recently uploaded

HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 

Fast and Probvably Seedings for k-Means

  • 1. Fast and Provably Good Seedings for k-Means O. Bachem, M. Lucic, S. Hassani, A. Krause Presented by Kimikazu Kato, Silver Egg Technology Co., Ltd.
  • 2. Algorithm of k-Means clustering Determine initial centroids Update centroids and membership of clusters gradually Improvement of this part Existing results: k-means++: sampling according to some metric Bachem et al. 2016: Performance improvement using MCMC, but has some assumption about the distribution of the data Proposed: Another MCMC based algorithm without assumption of the distribution Outline
  • 3. Related researches kmeans++ Draw accoding to Intuition: Choose initial centroids from the input data so that they scatter as widely as possible Bachem et al. 2016 Intended to overcome the shortcoming of kmeans++: the marginalization cost Metropolitan Hastings algorithm, which utilizes rejection sampling to emulate the distribution. But have some assumption on the input data. as a centroid C: set of centroids which are already chosen
  • 4. Proposed Algorithm Update from the preceding result: rejection criterion The convergence is mathematically proved.
  • 8. Conclusion • Novel algorithm for the initialization of centroids in kmeans • Theoretical guarantee on the convergence and the trade-off of accuracy and speed • Experimentally good result