MANIFOLDS IN SEMI-SUPERVISED LEARNING
Monojit Basu
Director, TechYugadi IT Solutions & Consulting, Bangalore
NODES 2020 Extended
Outline
● Semi-supervised Learning and Graph-based Algorithms
● Data Distribution on Manifold and Multi-manifold
● Classification Algorithms with Manifold Regularization
● Implementation Hints
● Closing Remarks
Semi-supervised Learning: Overview
● Training samples consist of data with and without class labels
● Images with and without captions
● Text with and without tags, etc.
● Model is built with both labeled and unlabeled data
(labeled data informs Prob(y|x); unlabeled data informs Prob(x))
● Smoothness Property: If two data points are close, their labels
should be similar
[Figure: classification based on labeled samples only vs. based on both labeled and unlabeled samples]
Graph-based Algorithms For SSL
● There are many ways of exploiting the smoothness property
● A simplistic baseline approach is self-training (not graph-based)
● Graph-based Algorithms are particularly effective
● Label Propagation
● Random-Walk
● Min-Cut
● Density-based Distances
● Local and Global Consistency
● Using Graph Kernels, etc.
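The self-training baseline mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the base classifier (nearest centroid), the confidence score, the `threshold` parameter and the convention of marking unlabeled points with -1 are all choices made here, not taken from the slides.

```python
import numpy as np

def self_train(X, y, threshold=0.8, max_rounds=10):
    """Self-training with a nearest-centroid base classifier (illustrative choice).

    y holds class labels for labeled points and -1 for unlabeled ones.
    """
    y = y.copy()
    classes = np.unique(y[y != -1])
    for _ in range(max_rounds):
        unlabeled = np.where(y == -1)[0]
        if unlabeled.size == 0:
            break
        # Fit: one centroid per class, using the currently labeled points.
        centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
        # Predict: turn distances to the centroids into soft scores.
        d = np.linalg.norm(X[unlabeled, None, :] - centroids[None, :, :], axis=2)
        scores = np.exp(-d)
        probs = scores / scores.sum(axis=1, keepdims=True)
        confident = probs.max(axis=1) >= threshold
        if not confident.any():
            break
        # Adopt confident predictions as pseudo-labels and repeat.
        y[unlabeled[confident]] = classes[probs.argmax(axis=1)[confident]]
    return y

X = np.array([[0.0], [0.5], [1.0], [9.0], [9.5], [10.0]])
y = np.array([0, -1, -1, -1, -1, 1])
print(self_train(X, y))
```

Unlike the graph-based methods listed above, this loop never looks at how the unlabeled points relate to one another; it only trusts its own confident predictions.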
Label Propagation
● Generates a weighted graph where edges between similar
neighbours have higher weights (Zhu and Ghahramani, 2002)
● Defines a transition matrix:
● Tij = probability of node i ‘jumping’ into node j, that is, taking up j’s label
● Repeatedly multiplies the current label matrix by the transition
matrix (the label matrix is updated at each step, with the known labels reset)
● Stops when the labels on all nodes stabilize (convergence)
● In effect labels propagate from labeled to unlabeled nodes
[Figure: graph with nodes labeled 1, nodes labeled 0, and unlabeled nodes]
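The steps on this slide can be sketched as follows. This is a minimal sketch, assuming a fully connected graph with Gaussian weights and a row-normalized transition matrix, so that row i of T is a probability distribution over the nodes whose labels node i may take up. The parameter names (`sigma`, `tol`, `max_iter`) and the use of -1 for unlabeled points are assumptions made for illustration.

```python
import numpy as np

def label_propagation(X, y, sigma=1.0, max_iter=1000, tol=1e-6):
    """y holds class indices 0..C-1 for labeled points and -1 for unlabeled ones."""
    n = X.shape[0]
    labeled = y != -1
    n_classes = int(y[labeled].max()) + 1

    # Weighted graph: closer (more similar) points get larger edge weights.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dist / sigma ** 2)

    # Transition matrix: T[i, j] is the probability that node i takes up j's label.
    T = W / W.sum(axis=1, keepdims=True)

    # Label matrix: one-hot rows for labeled nodes, uniform rows for the rest.
    Y_known = np.eye(n_classes)[y[labeled]]
    Y = np.full((n, n_classes), 1.0 / n_classes)
    Y[labeled] = Y_known

    for _ in range(max_iter):
        Y_new = T @ Y                                  # propagate one step
        Y_new /= Y_new.sum(axis=1, keepdims=True)      # keep rows as distributions
        Y_new[labeled] = Y_known                       # reset the known labels
        converged = np.abs(Y_new - Y).max() < tol
        Y = Y_new
        if converged:
            break
    return Y.argmax(axis=1)

X = np.array([[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]])
y = np.array([0, -1, -1, -1, -1, 1])
print(label_propagation(X, y))
```

Resetting the known labels after every step is what keeps the labeled nodes acting as sources; without it, all rows would drift toward the same distribution.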
Manifold Structures
● Data (nodes) are distributed over low and high density regions
● Two nodes that are geometrically close may not be similar
● Or equivalently, the geometry / distance measure should be redefined
● Euclidean distances and weights based on them may not work
● Such data is said to lie on a manifold
● Manifold structures are often, though not necessarily,
observed in high-dimensional data
● More complex scenario: data may not lie on a single manifold
● This is called a multi-manifold structure
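To make the point about Euclidean distances concrete, the sketch below compares the straight-line distance between the two ends of an arc with the distance measured along a nearest-neighbor graph that follows the arc. The arc data, the choice of k and the helper name `graph_geodesics` are illustrative assumptions; shortest paths are computed with a plain Floyd-Warshall loop.

```python
import numpy as np

def graph_geodesics(X, k=2):
    """Return (Euclidean distances, shortest-path distances on a symmetric kNN graph)."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    # Keep only edges to the k nearest neighbors (column 0 of argsort is the point itself).
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]
    rows = np.repeat(np.arange(n), k)
    G[rows, nbrs.ravel()] = D[rows, nbrs.ravel()]
    G = np.minimum(G, G.T)  # make the graph undirected

    # Floyd-Warshall: allow paths through each intermediate node m in turn.
    for m in range(n):
        G = np.minimum(G, G[:, [m]] + G[[m], :])
    return D, G

# Points along three quarters of a unit circle: a one-dimensional manifold in 2-D.
t = np.linspace(0.0, 1.5 * np.pi, 40)
X = np.column_stack([np.cos(t), np.sin(t)])
D, G = graph_geodesics(X, k=2)
print("Euclidean:", D[0, -1], "along the graph:", G[0, -1])
```

The two end points are about 1.41 apart in a straight line but about 4.7 apart along the arc, which is why weights based on raw Euclidean distance can connect points that are far apart on the manifold.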
Single Manifold Structures
[Figures: Swiss Roll; Two Moons]
Multi-manifold Structures
[Figures: Dollar Symbol ($); Surface Sphere]
Manifold Regularization
● This is the technical term for semi-supervised classification of
data distributed on a (single) manifold (Belkin et al., 2006)
● Key is to establish connectivity between similar nodes by
staying along a high-density region
● Mathematically it involves
● Computing a matrix L (the graph Laplacian) derived from the ordinary weight matrix W
● Taking n eigenvectors of L (typically those with the smallest eigenvalues)
● Computing an indicator function using the dot product of a data point
with the eigenvectors
● It is based on the theory of Reproducing Kernel Hilbert Spaces (RKHS)
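For reference, the objective minimized in Belkin et al. (2006) can be written as below. The notation is theirs rather than the slide's: l and u are the numbers of labeled and unlabeled points, V is a loss function, K is the kernel of the RKHS, gamma_A and gamma_I are regularization weights, and the bold f is the vector of values of f on all l + u points. L is the matrix derived from W.

```latex
f^{*} = \arg\min_{f \in \mathcal{H}_K}
  \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f)
  + \gamma_A \, \lVert f \rVert_K^{2}
  + \frac{\gamma_I}{(u+l)^{2}} \, \mathbf{f}^{\top} L \, \mathbf{f}
```

The last term is small when f changes little between strongly connected nodes, which is how the unlabeled data enters the problem.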
Manifold Regularization (Schematic)
DATA -> W -> L = D - W -> Eigen(L) -> dot x Data Point > 0 -> +ve / -ve -> CLASS LABELS
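The pipeline in the schematic can be sketched roughly as follows. This is a simplified illustration in the spirit of Laplacian-eigenvector methods, not the LapRLS or LapSVM algorithms of Belkin et al. (2006): the Gaussian weights, the least-squares fit of coefficients on the labeled points, the parameter names and the convention of marking unlabeled points with 0 are all assumptions made here.

```python
import numpy as np

def laplacian_eigen_classifier(X, y, n_eig=2, sigma=1.0):
    """y holds +1 / -1 for labeled points and 0 for unlabeled ones."""
    # DATA -> W: Gaussian weights between all pairs of points.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # W -> L = D - W, where D is the diagonal matrix of row sums of W.
    L = np.diag(W.sum(axis=1)) - W

    # Eigen(L): eigh returns eigenvalues in ascending order, so the first
    # columns are the eigenvectors that vary most smoothly over the graph.
    _, vecs = np.linalg.eigh(L)
    E = vecs[:, :n_eig]

    # Fit coefficients on the labeled rows (least squares), then score every
    # node by the dot product of its eigenvector coordinates with them.
    labeled = y != 0
    coef, *_ = np.linalg.lstsq(E[labeled], y[labeled].astype(float), rcond=None)
    scores = E @ coef

    # Sign of the score -> class label.
    return np.where(scores > 0, 1, -1)

X = np.array([[0.0], [0.5], [1.0], [5.0], [5.5], [6.0]])
y = np.array([1, 0, 0, 0, 0, -1])
print(laplacian_eigen_classifier(X, y))
```

Only two labels are given in the usage example; the remaining four points receive theirs from the shape of the eigenvectors, which is determined by all the data.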
Multi-manifold Regularization
● This is the technical term for semi-supervised classification of
data distributed on a multi-manifold (Goldberg et al., 2009)
● The single-manifold algorithm still starts with Euclidean distances,
but reformulates its steps in terms of the derived matrix L
● The multi-manifold algorithm changes the distance metric
from the outset
● It is based on Hellinger distances H, and
● A Mahalanobis k-nearest-neighbor graph computed from H
● The complete algorithm is much longer, involving spectral
clustering and self-training on each cluster
Multi-manifold Regularization (Schematic)
DATA -> Σs (sample covariance matrices) -> H -> kNN graph -> Spectral Clustering -> Self-trained Clusters
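The first two boxes of the schematic can be illustrated as below: a local sample covariance matrix around each point, and the Hellinger distance between the zero-mean Gaussians those matrices define. This is a sketch only. The neighborhood size `k`, the small `ridge` term and the function names are assumptions made here, and the later steps (graph construction, spectral clustering, self-training) are not shown.

```python
import numpy as np

def local_covariances(X, k=5, ridge=1e-6):
    """Sample covariance of each point's k nearest points (itself included).

    Assumes X has at least two columns. The ridge keeps determinants positive.
    """
    n, d = X.shape
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    nbrs = np.argsort(D, axis=1)[:, :k]
    return np.stack([np.cov(X[idx].T) + ridge * np.eye(d) for idx in nbrs])

def hellinger(S1, S2):
    """Hellinger distance between the Gaussians N(0, S1) and N(0, S2)."""
    _, logdet1 = np.linalg.slogdet(S1)
    _, logdet2 = np.linalg.slogdet(S2)
    _, logdet_mean = np.linalg.slogdet((S1 + S2) / 2.0)
    # Bhattacharyya coefficient: |S1|^(1/4) |S2|^(1/4) / |(S1 + S2) / 2|^(1/2)
    bc = np.exp(0.25 * logdet1 + 0.25 * logdet2 - 0.5 * logdet_mean)
    return np.sqrt(max(0.0, 1.0 - bc))

# Two local neighborhoods stretched along perpendicular directions.
along_x = np.diag([1.0, 0.01])
along_y = np.diag([0.01, 1.0])
print(hellinger(along_x, along_x), hellinger(along_x, along_y))
```

Two nearby points whose neighborhoods are stretched in different directions get a large H even if their Euclidean distance is small, which is useful where manifolds intersect.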
Multi-view Semi-supervised Learning
● Multi-view learning involves two or more independent
projections for each data point
● Classic Example: web-page classification using
● Bag of words
● Links to other web-pages
● Instead of representing data as (X, y) where y is class label, it
may be represented as (X1, X2, y), where Xi are views
● Somewhat related to multimodal learning (like video and
audio)
Multi-view Manifold Regularization
● Can manifold regularization be extended to multi-view data?
● Yes, algorithms with strong mathematical foundations exist,
such as Sindhwani and Rosenberg, 2008
● There is a generic pattern for multi-view semi-supervised
learning, called co-training
● Sindhwani et al. extend co-training with an algorithm called
co-regularization
● It reduces the problem to a convex optimization that minimizes a
loss function
● The total loss function depends on the individual class predictors
for each view, and a couple of regularization hyperparameters
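The structure of such a loss can be illustrated with a small co-regularized least-squares sketch. It assumes linear predictors for each view and a squared loss, which is a simplification of the RKHS formulation in the cited paper; the function names and the hyperparameters `gamma1`, `gamma2` and `mu` are hypothetical. The final prediction averages the two view predictors, and `mu` penalizes their disagreement on unlabeled points.

```python
import numpy as np

def co_regularized_ls(X1_l, X2_l, y, X1_u, X2_u, gamma1=0.1, gamma2=0.1, mu=1.0):
    """Minimize, over linear predictors w1 and w2 (one per view):

        sum_labeled   (y - (X1 w1 + X2 w2) / 2)^2
      + gamma1 |w1|^2 + gamma2 |w2|^2
      + mu * sum_unlabeled (X1 w1 - X2 w2)^2

    The problem is convex; it is solved here as one stacked least-squares system.
    """
    d1, d2 = X1_l.shape[1], X2_l.shape[1]
    A = np.vstack([
        np.hstack([0.5 * X1_l, 0.5 * X2_l]),                        # fit on labeled data
        np.sqrt(mu) * np.hstack([X1_u, -X2_u]),                     # agreement on unlabeled data
        np.hstack([np.sqrt(gamma1) * np.eye(d1), np.zeros((d1, d2))]),
        np.hstack([np.zeros((d2, d1)), np.sqrt(gamma2) * np.eye(d2)]),
    ])
    b = np.concatenate([y, np.zeros(X1_u.shape[0] + d1 + d2)])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w[:d1], w[d1:]

def predict(w1, w2, X1, X2):
    """Classify by the sign of the averaged view predictors."""
    return np.sign(0.5 * (X1 @ w1 + X2 @ w2))

# Toy data: the second view is a rescaled copy of the first.
X1_l = np.array([[1.0], [2.0], [-1.0], [-2.0]])
X2_l = 2.0 * X1_l
y = np.array([1.0, 1.0, -1.0, -1.0])
X1_u = np.array([[3.0], [-3.0]])
X2_u = 2.0 * X1_u
w1, w2 = co_regularized_ls(X1_l, X2_l, y, X1_u, X2_u)
print(predict(w1, w2, np.array([[5.0], [-5.0]]), np.array([[10.0], [-10.0]])))
```

Setting `mu` to zero decouples the agreement term, so the unlabeled points no longer influence the solution.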
Python Implementation
● An implementation of some of these algorithms in Python 3.x is
published on GitHub:
https://github.com/techyugadi/manifold_ssl
● These implementations offer an interface similar to scikit-learn
● There are also programs to generate synthetic data and to
use the MNIST handwritten digits data
● Note: at the time of this talk, scikit-learn supports only the label
propagation algorithm for semi-supervised learning
● R has a package with more algorithms, but not manifold regularization
● This is an early-access release; more algorithms are to be published!
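For readers unfamiliar with the scikit-learn convention referred to above, the snippet below shows its built-in `LabelPropagation` estimator on toy data: unlabeled samples are marked with -1 in `y`, the model is fitted with `fit(X, y)`, and the inferred labels for the training points are read from `transduction_`. The class names and parameters of the manifold_ssl repository are not reproduced here; the data and the `gamma` value are illustrative assumptions.

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

# Two well-separated groups of points, with one labeled sample in each.
X = np.array([[0.0, 0.0], [0.5, 0.2], [0.2, 0.6],
              [5.0, 5.0], [5.4, 5.1], [5.1, 5.6]])
y = np.array([0, -1, -1, 1, -1, -1])   # -1 marks an unlabeled sample

model = LabelPropagation(kernel="rbf", gamma=0.5)
model.fit(X, y)

labels = model.transduction_           # labels inferred for every training point
print(labels)
print(model.predict(np.array([[0.3, 0.3], [5.2, 5.2]])))
```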
Summary
● Manifold regularization is an improvement over the standard
label propagation algorithm for semi-supervised learning
● It may lead to better results when data is distributed over a
manifold or multi-manifold
● This class of algorithms covers a wide range of scenarios,
including multi-view datasets
● These algorithms can be implemented in Python using
common numpy and linear algebra packages (see GitHub)
References
● Zhu and Ghahramani, 2002: Learning from Labeled and
Unlabeled Data with Label Propagation
● Belkin, Niyogi and Sindhwani, 2006: Manifold Regularization:
A Geometric Framework for Learning from Labeled and
Unlabeled Examples
● Sindhwani and Rosenberg, 2008: An RKHS for Multi-View
Learning and Manifold Co-Regularization
● Goldberg, Zhu, Singh, Xu and Nowak, 2009: Multi-Manifold
Semi-Supervised Learning
THANK YOU
monojit@techyugadi.com

WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
WSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - Keynote
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 

NODES 2020 extended - Manifolds in semi-supervised learning

  • 1. MANIFOLDS IN SEMI-SUPERVISED LEARNING Monojit Basu Director, TechYugadi IT Solutions & Consulting, Bangalore EXTENDED
  • 2. Outline
    ● Semi-supervised Learning and Graph-based Algorithms
    ● Data Distribution on Manifold and Multi-manifold
    ● Classification Algorithms with Manifold Regularization
    ● Implementation Hints
    ● Closing Remarks
  • 4. Semi-supervised Learning: Overview
    ● Training samples consist of data with and without class labels
      ● Images with and without captions
      ● Text with and without tags, ...
    ● The model is built with both labeled and unlabeled data
      ● Prob(y|x) (label given data): based on labeled samples
      ● Prob(x) (data): based on both labeled and unlabeled samples
    ● Smoothness property: if two data points are close, their labels should be similar
  • 5. Graph-based Algorithms For SSL
    ● There are many ways of exploiting the smoothness property
    ● A simplistic baseline approach is self-training (not graph-based)
    ● Graph-based algorithms are particularly effective
      ● Label Propagation
      ● Random-Walk
      ● Min-Cut
      ● Density-based Distances
      ● Local and Global Consistency
      ● Using Graph Kernels, ...
  • 6. Label Propagation
    ● Generates a weighted graph where edges between similar neighbours have higher weights (Zhu and Ghahramani, 2002)
    ● Defines a transition matrix:
      ● Tij = probability of node i ‘jumping’ into node j, that is, taking up j’s label
    ● Repeatedly multiplies the current label matrix with the transition matrix (which itself gets updated)
      ● Until labels on all nodes stabilize (convergence)
    ● In effect, labels propagate from labeled to unlabeled nodes
    (Figure: a graph whose nodes are labeled 1, labeled 0, or unlabeled)
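The steps on the Label Propagation slide can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation from the talk: it uses a fully connected graph with Gaussian weights, a row-normalised transition matrix that stays fixed, and clamps the known labels after every multiplication. The function name `label_propagation` and the parameters `sigma`, `max_iter` and `tol` are hypothetical.

```python
import numpy as np

def label_propagation(X, y, sigma=1.0, max_iter=1000, tol=1e-6):
    """Propagate labels over a fully connected Gaussian-weighted graph.

    y holds a class index for labeled points and -1 for unlabeled ones.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_classes = int(y.max()) + 1
    labeled = y >= 0

    # Edge weights: closer points get larger weights.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dist / sigma ** 2)

    # Row-normalised transition matrix: T[i, j] is the probability
    # that node i takes up node j's label.
    T = W / W.sum(axis=1, keepdims=True)

    # Label matrix: one-hot rows for labeled nodes, uniform rows otherwise.
    Y = np.full((len(X), n_classes), 1.0 / n_classes)
    Y[labeled] = np.eye(n_classes)[y[labeled]]
    Y_clamped = Y[labeled].copy()

    for _ in range(max_iter):
        Y_next = T @ Y
        Y_next[labeled] = Y_clamped          # keep the known labels fixed
        converged = np.abs(Y_next - Y).max() < tol
        Y = Y_next
        if converged:
            break
    return Y.argmax(axis=1)

# Two well-separated groups on a line, one labeled point in each.
X = np.array([[0.0], [0.2], [0.4], [3.0], [3.2], [3.4]])
y = np.array([0, -1, -1, 1, -1, -1])
pred = label_propagation(X, y, sigma=0.5)
```

Here each unlabeled point ends up with the label of the labeled point in its own group, because the weights between the two groups are negligible.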
  • 8. Manifold Structures
    ● Data (nodes) are distributed over low- and high-density regions
    ● Two nodes that are geometrically close may not be similar
      ● Or equivalently, the geometry / distance measure should be redefined
      ● Euclidean distances and weights based on them may not work
    ● Such data is said to lie on a manifold
    ● Although not necessary, manifold structures are often observed with high-dimensional data
    ● More complex scenario: data may not lie on a single manifold
      ● This is called a multi-manifold structure
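A small numerical illustration of why Euclidean distance can mislead on a manifold: for points on an open arc, the two ends are close in the plane but far apart when one has to travel along the arc. The sketch below is an assumption-laden toy example (a k-nearest-neighbour graph with k = 2 and Floyd-Warshall shortest paths), not something taken from the talk.

```python
import numpy as np

# Points along three quarters of a unit circle (a 1-D manifold in 2-D).
theta = np.linspace(0.0, 1.5 * np.pi, 40)
X = np.column_stack([np.cos(theta), np.sin(theta)])

# Pairwise Euclidean distances.
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

# k-nearest-neighbour graph: keep only edges to the k closest points.
k = 2
G = np.full_like(D, np.inf)
np.fill_diagonal(G, 0.0)
nearest = np.argsort(D, axis=1)[:, 1:k + 1]   # column 0 is the point itself
for i, nbrs in enumerate(nearest):
    G[i, nbrs] = D[i, nbrs]
    G[nbrs, i] = D[i, nbrs]                   # make the graph symmetric

# Floyd-Warshall: shortest paths in the graph approximate
# distances measured along the manifold.
for m in range(len(X)):
    G = np.minimum(G, G[:, [m]] + G[[m], :])

euclidean = D[0, -1]        # straight line between the two ends of the arc
along_manifold = G[0, -1]   # path that stays on the arc
```

The straight-line distance between the ends is the chord of a quarter circle, while the graph distance is close to the arc length of three quarters of the circle, so a weight based on the former would wrongly treat the two ends as similar.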
  • 12. Manifold Regularization
    ● This is the technical term for semi-supervised classification of data distributed on a (single) manifold (Belkin et al., 2006)
    ● The key is to establish connectivity between similar nodes by staying along a high-density region
    ● Mathematically it involves
      ● Computing a matrix L (the graph Laplacian) derived from the ordinary weight matrix W
      ● Taking the top n eigenvectors of L
      ● Computing an indicator function using the dot product of a data point with the eigenvectors
    ● It is based on the theory of Reproducing Kernel Hilbert Spaces (RKHS)
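The three mathematical steps listed on this slide can be sketched as below. This is a simplified eigenvector-based classifier written for illustration; it is not the full kernel method of Belkin et al. (2006) and not necessarily what the talk's code does. It assumes that L is the unnormalised graph Laplacian D - W, that the eigenvectors used are those with the smallest eigenvalues (the ones that vary most smoothly over the graph), and that labels are +1 / -1. The function name and its parameters are hypothetical.

```python
import numpy as np

def laplacian_eigen_classifier(X, y, n_vectors=2, sigma=1.0):
    """y holds +1 / -1 for labeled points and 0 for unlabeled ones."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    labeled = y != 0

    # Ordinary weight matrix W from Euclidean distances.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Graph Laplacian L = D - W, with D the diagonal degree matrix.
    L = np.diag(W.sum(axis=1)) - W

    # eigh returns eigenvalues in ascending order, so the first columns
    # are the eigenvectors with the smallest eigenvalues.
    _, vecs = np.linalg.eigh(L)
    E = vecs[:, :n_vectors]

    # Least-squares fit of the labeled targets in the eigenvector basis.
    coef, *_ = np.linalg.lstsq(E[labeled], y[labeled], rcond=None)

    # Indicator function: sign of the dot product between each node's
    # eigenvector coordinates and the fitted coefficients.
    return np.sign(E @ coef)

X = np.array([[0.0], [0.2], [0.4], [3.0], [3.2], [3.4]])
y = np.array([1, 0, 0, -1, 0, 0])
pred = laplacian_eigen_classifier(X, y, n_vectors=2, sigma=0.5)
```

With two well-separated groups, the two smoothest eigenvectors are nearly constant on each group, so a single labeled point per group is enough to label the rest.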
  • 14. Multi-manifold Regularization
    ● This is the technical term for semi-supervised classification of data distributed on a multi-manifold (Goldberg et al., 2009)
    ● The single-manifold algorithm still starts with Euclidean distances, but reformulates steps based on the derived matrix L
    ● The multi-manifold algorithm changes the distance metric straight away
      ● It is based on Hellinger distances H, and
      ● A Mahalanobis k-nearest neighbor graph computed from H
    ● The complete algorithm is much longer, involving spectral clustering and self-training on each cluster
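To give a feel for the Hellinger distance H mentioned on this slide, the sketch below compares local sample covariance matrices of points lying on two differently oriented line segments. It covers only this one ingredient, not the full algorithm of Goldberg et al. (2009). It assumes H is computed between zero-mean Gaussians with the local covariances, and it adds a small ridge term so that covariances of exactly collinear neighbours stay invertible; the function names, `k` and `ridge` are hypothetical.

```python
import numpy as np

def hellinger_gaussian(cov1, cov2):
    """Hellinger distance between N(0, cov1) and N(0, cov2)."""
    mean_cov = 0.5 * (cov1 + cov2)
    # Bhattacharyya coefficient of two Gaussians with equal means.
    bc = (np.linalg.det(cov1) ** 0.25 * np.linalg.det(cov2) ** 0.25
          / np.sqrt(np.linalg.det(mean_cov)))
    return np.sqrt(max(0.0, 1.0 - bc))

def local_covariance(X, i, k=5, ridge=1e-3):
    """Sample covariance of X[i] and its k nearest neighbours.

    The ridge term is an assumption of this sketch; it keeps the
    matrix invertible when the neighbours are exactly collinear.
    """
    dist = np.linalg.norm(X - X[i], axis=1)
    nbrs = X[np.argsort(dist)[:k + 1]]        # includes X[i] itself
    return np.cov(nbrs, rowvar=False) + ridge * np.eye(X.shape[1])

# Two 1-D manifolds in the plane: a horizontal and a vertical segment.
t = np.linspace(1.0, 2.0, 11)
horizontal = np.column_stack([t, np.zeros_like(t)])
vertical = np.column_stack([np.zeros_like(t), t])
X = np.vstack([horizontal, vertical])

cov_h1 = local_covariance(X, 2)     # a point on the horizontal segment
cov_h2 = local_covariance(X, 8)     # another point on the same segment
cov_v = local_covariance(X, 16)     # a point on the vertical segment

same_manifold = hellinger_gaussian(cov_h1, cov_h2)
different_manifold = hellinger_gaussian(cov_h1, cov_v)
```

Points on the same segment have nearly identical local covariances and a distance close to zero, while points on segments with different orientation are far apart in this metric, which is what makes it useful for separating manifolds.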
  • 15. Multi-manifold Regularization (Schematic)
    DATA → Σs (sample covariance matrices) → H → kNN graph → Spectral Clustering → Self-trained Clusters
  • 16. Multi-view Semi-supervised Learning
    ● Multi-view learning involves two or more independent projections for each data point
    ● Classic example: web-page classification using
      ● Bag of words
      ● Links to other web pages
    ● Instead of representing data as (X, y), where y is the class label, it may be represented as (X1, X2, y), where the Xi are views
    ● Somewhat related to multimodal learning (like video and audio)
  • 17. Multi-view Manifold Regularization
    ● Can manifold regularization be extended to multi-view data?
    ● Yes, algorithms exist, based on strong mathematical foundations, such as Sindhwani and Rosenberg, 2008
    ● There is actually a generic pattern for multi-view semi-supervised learning, called co-training
    ● Sindhwani and Rosenberg extend co-training with an algorithm called co-regularization
    ● It reduces the problem to a convex optimization that minimizes a loss function
    ● The total loss function depends on individual class predictors for each view, and a couple of regularization hyperparameters
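A rough sketch of what such a loss can look like, under strong simplifying assumptions: linear predictors w1 and w2 (one per view), squared loss on the labeled points for the averaged prediction, a ridge penalty per view (gamma1, gamma2) and a penalty mu on disagreement between the two views over all points. This is a plain least-squares variant written for illustration, not the RKHS-based method of Sindhwani and Rosenberg (2008); the function name and all hyperparameter names are hypothetical. Because the loss is a convex quadratic, its minimiser is the solution of one linear system.

```python
import numpy as np

def co_regularized_least_squares(X1, X2, y, labeled,
                                 gamma1=0.1, gamma2=0.1, mu=1.0):
    """Minimise, over linear predictors w1 (view 1) and w2 (view 2):

        sum over labeled points of (y - 0.5 * (x1.w1 + x2.w2))^2
        + gamma1 * |w1|^2 + gamma2 * |w2|^2
        + mu * sum over all points of (x1.w1 - x2.w2)^2
    """
    d1, d2 = X1.shape[1], X2.shape[1]
    # Averaged prediction on the labeled points, as a function of [w1; w2].
    A = 0.5 * np.hstack([X1[labeled], X2[labeled]])
    # Disagreement between the two views on every point.
    B = np.hstack([X1, -X2])
    Gamma = np.diag(np.concatenate([np.full(d1, gamma1),
                                    np.full(d2, gamma2)]))
    # Setting the gradient of the quadratic loss to zero gives this system.
    w = np.linalg.solve(A.T @ A + Gamma + mu * B.T @ B, A.T @ y[labeled])
    return w[:d1], w[d1:]

# Toy data: one feature per view, two labeled points out of six.
X1 = np.array([[1.0], [1.2], [0.8], [-1.0], [-1.1], [-0.9]])
X2 = np.array([[0.9], [1.1], [1.0], [-1.0], [-0.8], [-1.2]])
y = np.array([1.0, 0.0, 0.0, -1.0, 0.0, 0.0])      # 0 marks unlabeled
labeled = np.array([True, False, False, True, False, False])

w1, w2 = co_regularized_least_squares(X1, X2, y, labeled)
pred = np.sign(0.5 * (X1 @ w1 + X2 @ w2))
```

The unlabeled points enter only through the disagreement term, which is how they influence the two predictors without having labels of their own.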
  • 19. Python Implementation
    ● An implementation of some of these algorithms in Python 3.x is published on GitHub: https://github.com/techyugadi/manifold_ssl
    ● These algorithms offer an interface similar to scikit-learn
    ● There are some programs to generate synthetic data and also to use the MNIST handwritten digits data
    ● Note: scikit-learn as of now supports only the label propagation algorithm for semi-supervised learning
    ● An R package has more algorithms, but not manifold regularization
    ● This is an early-access release; more algorithms are to be published
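For reference, this is roughly what the scikit-learn interface for label propagation looks like. The example uses scikit-learn's own `LabelPropagation` class, not the code from the repository above, whose exact API is not described in these slides. The toy data and the `gamma` value are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

# Two groups on a line; -1 marks an unlabeled sample.
X = np.array([[0.0], [0.2], [0.4], [3.0], [3.2], [3.4]])
y = np.array([0, -1, -1, 1, -1, -1])

model = LabelPropagation(kernel="rbf", gamma=5.0)
model.fit(X, y)

# Labels inferred for every training point, labeled or not.
labels = model.transduction_

# Labels for points that were not part of the training set.
new_labels = model.predict(np.array([[0.1], [3.3]]))
```

An estimator with a scikit-learn-like interface would typically follow the same pattern: unlabeled samples are passed to `fit` with a placeholder label, and `predict` is used for new data.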
  • 21. Summary
    ● Manifold regularization is an improvement over the standard label propagation algorithm for semi-supervised learning
    ● It may lead to better results when data is distributed over a manifold or multi-manifold
    ● This class of algorithms covers a wide range of scenarios, including multi-view datasets
    ● These algorithms can be implemented in Python using common numpy and linear algebra packages (see GitHub)
  • 22. References
    ● Zhu and Ghahramani, 2002: Learning from Labeled and Unlabeled Data with Label Propagation
    ● Belkin, Niyogi and Sindhwani, 2006: Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples
    ● Sindhwani and Rosenberg, 2008: An RKHS for Multi-View Learning and Manifold Co-Regularization
    ● Goldberg, Zhu, Singh, Xu and Nowak, 2009: Multi-Manifold Semi-Supervised Learning