Machine Learning as a Daily Work for a Programmer- Volodymyr Vorobiov

RUBYGARAGE2017
TECHNOLOGYMATTERS
Machine Learning
What It Is And How It Works
Volodymyr Vorobiov
Software Development Consultant at RubyGarage

Machine learning is a subset of artificial
intelligence whose goal is to give computers
the ability to teach themselves, whereas
artificial intelligence is the general concept
of smart machines.
In other words, artificial intelligence is
implemented through machine learning or,
to be more precise, through machine learning algorithms.
artificial
intelligence
TEACH YOUR COMPUTER

EXAMPLES OF HOW MACHINE LEARNING IS USED IN THE REAL WORLD
- Facial recognition
- Voice recognition
- Text recognition
- Diagnostics in medicine
- Self-driving cars
- Robot behavior adjustment
- Ad targeting
- Predictions in financial trading
- Virtual and augmented reality
- Astronomy and space ???

The 21st century is the age of data.
It’s literally everywhere. In fact,
there has been an exponential
growth in the volume of data over
the past decade; the total amount
of data doubles every two years.
Most of it, however, isn’t used.
Huge volumes of data can be tagged,
structured, and analyzed,
revealing a lot of valuable information.
Machine learning algorithms are well suited
to this task at scale.
WHY THE FUTURE BELONGS TO MACHINE LEARNING

HOW MACHINE LEARNING WORKS
- Preprocessing (putting data into the necessary shape): raw data and labels are split into a training dataset and a test dataset
- Learning (creating a model with the help of training data): a learning algorithm is fitted to the training dataset
- Evaluation (model assessment using test data): the final model is checked against the test dataset and its labels
- Prediction (application of the model): the final model predicts labels for new data
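
The four stages above can be sketched with scikit-learn. This is only an illustration, not the speaker's code: the synthetic dataset from make_classification, the 25% test split, the StandardScaler step and the choice of LogisticRegression are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Raw data with labels (a synthetic stand-in for a real dataset)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Preprocessing: split into training and test sets, then scale the features
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
scaler = StandardScaler().fit(X_train)  # fit the scaler on training data only
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Learning: create a model with the help of the training data
model = LogisticRegression().fit(X_train_scaled, y_train)

# Evaluation: assess the model on the held-out test data
accuracy = model.score(X_test_scaled, y_test)

# Prediction: apply the final model to new data
new_data = scaler.transform(X[:3])  # placeholder for genuinely new samples
predictions = model.predict(new_data)
print(accuracy, predictions)
```

Fitting the scaler on the training set alone keeps information from the test set out of the model, so the evaluation step stays honest.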

TOOLS
- Python
- Pandas - Powerful data analysis library for Python
Pandas is a powerful data analysis Python library that provides flexible and fast data structures
for processing “relational” or “labeled” data. This is a fundamental data analysis toolkit
in Python.
- Scikit-learn - Machine Learning in Python
These are simple and effective open-source tools for data mining and analysis.
- Statsmodels
This is a Python module providing functions and classes to estimate different statistical models
as well as to conduct tests and explore statistical data. The Statsmodels module offers a
comprehensive list of result statistics.
- Matplotlib
Matplotlib is a Python 2D plotting library that produces publication-quality figures in multiple
formats and in interactive environments across platforms.

The quality of the data and the amount of useful information that it
contains are key factors that determine how well a machine learning
algorithm can learn. Therefore, it is absolutely critical that we make
sure to examine and preprocess a dataset before we feed it to a learning algorithm.
- Removing and imputing missing values from the dataset
- Getting categorical data into shape for machine learning algorithms
- Selecting relevant features for the model construction
DATA PREPROCESSING

DATA PREPROCESSING DATASET PRESENTATION
Independent variables Dependent variables

DEALING WITH MISSING DATA
Most computational tools are unable to handle missing values
or would produce unpredictable results if we simply ignored them.
Therefore, it is crucial that we take care of missing values
before we proceed with further analysis.

DEALING WITH MISSING DATA
- Eliminating samples or features with missing values
The easiest solution is simply to remove samples with missing
values from a dataset.
However, this seemingly handy approach has drawbacks.
For example, removing too many samples is likely to compromise
the quality of the analysis.
- Imputing missing values
An alternative is to use interpolation techniques that “guess”
the missing values from other samples in a dataset.
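
Both options can be sketched with pandas. The tiny dataset is hypothetical, and filling with the column mean is just one simple imputation choice picked for illustration; the slides do not say which technique was used.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with two missing values (NaN)
df = pd.DataFrame({
    "age": [25.0, np.nan, 35.0, 40.0],
    "salary": [50000.0, 60000.0, np.nan, 80000.0],
})

# Option 1: eliminate samples (rows) that contain missing values
dropped = df.dropna()

# Option 2: impute each missing value with the mean of its column
imputed = df.fillna(df.mean())

print(dropped)
print(imputed)
```

Dropping rows halves this dataset, which shows why elimination can hurt the analysis when many samples are incomplete.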

IMPUTING MISSING VALUES RESULTS

HANDLING CATEGORICAL DATA

DUMMY VARIABLES

DUMMY VARIABLE TRAP
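
A minimal sketch of dummy variables and the trap, using pandas. The "country" column is hypothetical. The trap arises because a full set of dummy columns always sums to 1, which makes them perfectly collinear with the constant term of a linear model; dropping one column removes the redundancy.

```python
import pandas as pd

# Hypothetical categorical feature
df = pd.DataFrame({"country": ["France", "Spain", "Germany", "Spain"]})

# One dummy (0/1) column per category
dummies = pd.get_dummies(df["country"], prefix="country")

# Dropping one column avoids the dummy variable trap: the dropped
# category is implied when all remaining dummies are 0
safe_dummies = pd.get_dummies(df["country"], prefix="country", drop_first=True)

print(list(dummies.columns))
print(list(safe_dummies.columns))
```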

PARTITIONING A DATASET INTO TRAINING AND TEST SETS

TRAINING AND TEST SETS RESULTS

BRINGING FEATURES ONTO THE SAME SCALE

TRAINING AND SELECTING A PREDICTIVE MODEL
- Supervised learning
  - Regression
  - Classification
- Unsupervised learning
  - Clustering
  - Dimensionality Reduction
- Reinforcement Learning
- Association Rule Learning
- Natural Language Processing
- Deep Learning
- Model Selection

SUPERVISED LEARNING
For making predictions about the future
- Regression: for predicting continuous outcomes
- Classification: for predicting class labels

REGRESSION
Regression models (both linear and non-linear) are used for predicting a real value,
such as a salary. If your independent variable is time,
then you are forecasting future values; otherwise your model is predicting
present but unknown values.

SIMPLE LINEAR REGRESSION
\( y = b_0 + b_1 x_1 \)
y - dependent variable (DV)
x_1 - independent variable (IV)
b_0 - constant
b_1 - coefficient
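
The constant and the coefficient of the equation above can be estimated by ordinary least squares. The sketch below uses the closed-form solution in plain Python; the experience and salary figures are hypothetical and are not the dataset shown in the talk.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) minimising the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # b1 = covariance(x, y) / variance(x)
    b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data: years of experience and salary
experience = [1, 2, 3, 4, 5]
salary = [35000, 40000, 45000, 50000, 55000]

b0, b1 = fit_simple_linear_regression(experience, salary)
predicted = b0 + b1 * 6  # predicted salary for 6 years of experience
print(b0, b1, predicted)
```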

DATASET PRESENTATION. EXPERIENCE AND SALARY.

SIMPLE LINEAR REGRESSION TRAINING

SIMPLE LINEAR REGRESSION TRAINING

MULTIPLE LINEAR REGRESSION
\( y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n \)
y - dependent variable (DV)
x_1 ... x_n - independent variables (IVs)
b_0 - constant
b_1 ... b_n - coefficients

DATASET PRESENTATION. INVESTMENT FUND STATISTIC.

MULTIPLE LINEAR REGRESSION TRAINING

EVALUATING REGRESSION MODELS PERFORMANCE
1. All-in
2. Backward Elimination (stepwise regression)
3. Forward Selection (stepwise regression)
4. Bidirectional Elimination (stepwise regression)
5. Score Comparison

BACKWARD ELIMINATION
STEP 1: Select a significance level to stay in the model (e.g. SL = 0.05)
STEP 2: Fit the full model with all possible predictors
STEP 3: Consider the predictor with the highest P-value. If P > SL, go to STEP 4, otherwise go to FIN
STEP 4: Remove the predictor
STEP 5: Fit the model without this variable, then return to STEP 3
FIN: The model is ready
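
The loop described by these steps can be sketched as follows. As an assumption for this example, the P-values are computed by hand with NumPy and SciPy (a two-sided t-test on each OLS coefficient) rather than taken from a statistics package, and the data are synthetic. The function names are hypothetical.

```python
import numpy as np
from scipy import stats


def ols_p_values(X, y):
    """Two-sided p-values of the OLS coefficients of y on X."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    sigma2 = residuals @ residuals / (n - k)  # residual variance
    std_err = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    t_stats = beta / std_err
    return 2 * stats.t.sf(np.abs(t_stats), df=n - k)


def backward_elimination(X, y, sl=0.05):
    """Return the indices of the columns of X that stay in the model."""
    kept = list(range(X.shape[1]))
    while kept:
        p_values = ols_p_values(X[:, kept], y)  # STEP 2 / STEP 5: fit the model
        worst = int(np.argmax(p_values))        # STEP 3: highest P-value
        if p_values[worst] <= sl:
            break                               # FIN
        kept.pop(worst)                         # STEP 4: remove the predictor
    return kept


# Synthetic data: column 0 is the constant, column 1 drives y, column 2 is noise
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = 2.0 + 3.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

kept = backward_elimination(X, y)
print(kept)
```

The constant and the informative column should survive; whether the noise column is removed depends on the P-value it happens to receive for this sample.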

BACKWARD ELIMINATION TRAINING STEP 1

BACKWARD ELIMINATION TRAINING STEP 4

EVALUATING PERFORMANCE R-SQUARED
\( \sum_i (y_i - \hat{y}_i)^2 \to \min \)
[Figure: simple linear regression of Salary ($) against Experience, marking an observed value \( y_i \) and the fitted value \( \hat{y}_i \)]

EVALUATING PERFORMANCE R-SQUARED
\( SS_{res} = \sum_i (y_i - \hat{y}_i)^2 \)
\( SS_{tot} = \sum_i (y_i - y_{avg})^2 \)
\( R^2 = 1 - \frac{SS_{res}}{SS_{tot}} \)
[Figure: simple linear regression of Salary ($) against Experience, with the average line \( y_{avg} \)]

EVALUATING PERFORMANCE ADJUSTED R-SQUARED
\( \text{Adj } R^2 = 1 - (1 - R^2) \frac{n - 1}{n - p - 1} \)
p - number of regressors
n - sample size
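
Both scores can be computed in a few lines using the standard definitions: R-squared is 1 minus the ratio of the residual sum of squares to the total sum of squares, and adjusted R-squared rescales it by (n - 1) / (n - p - 1). The observed and predicted values below are hypothetical.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    y_avg = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - y_avg) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Penalise R^2 for the number of regressors p given sample size n."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical observed and predicted values
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.0, 2.0, 3.0, 4.0, 6.0]

r2 = r_squared(y_true, y_pred)
adj = adjusted_r_squared(r2, n=len(y_true), p=1)
print(r2, adj)
```

Adding a regressor never lowers plain R-squared, which is why the adjusted version is the safer score when comparing models with different numbers of predictors.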

POLYNOMIAL REGRESSION
\( y = b_0 + b_1 x_1 + b_2 x_1^2 \)
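
A quadratic model of this form can be fitted with NumPy's polyfit. The data here are hypothetical and noise-free, generated from known coefficients so that the fit is easy to check.

```python
import numpy as np

# Hypothetical data generated from y = 1 + 2*x + 3*x^2 (no noise)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1 + 2 * x + 3 * x**2

# np.polyfit returns coefficients from the highest degree down: [b2, b1, b0]
b2, b1, b0 = np.polyfit(x, y, deg=2)

# The model is non-linear in x but still linear in the coefficients
y_new = b0 + b1 * 5.0 + b2 * 5.0**2
print(b0, b1, b2, y_new)
```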

POLYNOMIAL REGRESSION. DATASET PRESENTATION.
BLUFFING DETECTOR

POLYNOMIAL REGRESSION. FITTING THE DATASET

POLYNOMIAL REGRESSION. TRAINING THE MODEL

POLYNOMIAL REGRESSION RESULTS

SUPPORT VECTOR REGRESSION BASED ON SUPPORT VECTOR MACHINE

SUPPORT VECTOR REGRESSION. RESULTS.

WHAT IF
[Figure: plot with axes X1 and X2]

DECISION TREE REGRESSION
[Figure: the X1-X2 plane partitioned by Split 1 to Split 4 at X1 = 20, X1 = 40, X2 = 170 and X2 = 200; the resulting regions carry the Y values 300.5, 65.7, 1023, -64.1 and 0.7]

DECISION TREE REGRESSION
[Figure: decision tree with yes/no branches on the conditions X1 < 20, X2 < 200, X2 < 170 and X1 < 40, ending in the leaf values 300.5, 65.7, 1023, -64.1 and 0.7]

DECISION TREE REGRESSION TRAINING

DECISION TREE REGRESSION RESULT

ENSEMBLE LEARNING. RANDOM FOREST REGRESSION.
STEP 1: Pick at random K data points from the Training set.
STEP 2: Build the Decision Tree associated to these K data points.
STEP 3: Choose the number Ntree of trees you want to build and repeat STEPS 1 & 2
STEP 4: For a new data point, make each one of your Ntree trees predict the value of Y
for the data point in question, and assign the new data point the average across
all of the predicted Y values.
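
These steps can be sketched on top of scikit-learn's DecisionTreeRegressor. The helper names fit_forest and predict_forest are hypothetical, the data are synthetic, and sampling the K points with replacement is an assumption borrowed from the usual bootstrap procedure. Library random forests typically also randomise the features considered at each split, which this sketch leaves out.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_forest(X, y, n_trees=10, k=None, seed=0):
    """Fit n_trees decision trees, each on k randomly picked data points."""
    rng = np.random.default_rng(seed)
    k = k or len(X)
    trees = []
    for _ in range(n_trees):                            # STEP 3: repeat Ntree times
        idx = rng.choice(len(X), size=k, replace=True)  # STEP 1: pick K points
        tree = DecisionTreeRegressor(random_state=0)
        trees.append(tree.fit(X[idx], y[idx]))          # STEP 2: build the tree
    return trees


def predict_forest(trees, X_new):
    """STEP 4: average the predictions of all trees."""
    return np.mean([tree.predict(X_new) for tree in trees], axis=0)


# Hypothetical data: y = x^2 on a small grid
X = np.arange(10, dtype=float).reshape(-1, 1)
y = X.ravel() ** 2

trees = fit_forest(X, y, n_trees=20)
prediction = predict_forest(trees, np.array([[4.5]]))
print(prediction)
```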

RANDOM FOREST REGRESSION TRAINING

RANDOM FOREST REGRESSION RESULT

REGRESSION MODELS. PROS AND CONS.

CLASSIFICATION
Unlike regression where you predict a continuous number,
you use classification to predict a category.
There is a wide variety of classification applications from medicine to marketing.

LOGISTIC REGRESSION
We know this: Salary ($) against Experience, fitted by \( y = b_0 + b_1 x \)
This is new: Action (Y/N) against Age

LOGISTIC REGRESSION
[Figure: two plots of Action (Y/N) against Age]

LOGISTIC REGRESSION

LOGISTIC REGRESSION PREDICTION

DATASET PRESENTATION. SOCIAL NETWORK ADS.

LOGISTIC REGRESSION. PREPROCESSING

LOGISTIC REGRESSION. TRAINING SET RESULTS

LOGISTIC REGRESSION. TEST SET RESULTS.

K-NEAREST NEIGHBORS

K-NEAREST NEIGHBORS
STEP 1: Choose the number K of neighbors
STEP 2: Take the K nearest neighbors of the new data point, according to the Euclidean distance
STEP 3: Among these K neighbors, count the number of data points in each category
STEP 4: Assign the new data point to the category where you counted the most neighbors
Your Model is Ready
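
The four steps fit in a short plain-Python function. The training points and the new point are hypothetical, chosen so that K = 5 yields a 3-to-2 vote.

```python
import math
from collections import Counter


def knn_predict(points, labels, new_point, k=5):
    """Classify new_point by majority vote among its k nearest neighbours."""
    # STEP 2: order the training points by Euclidean distance to the new point
    order = sorted(
        range(len(points)), key=lambda i: math.dist(points[i], new_point)
    )
    # STEP 3: count the categories among the k nearest neighbours
    votes = Counter(labels[i] for i in order[:k])
    # STEP 4: pick the category with the most neighbours
    return votes.most_common(1)[0][0], votes


# Hypothetical 2-D training data
points = [(1, 1), (1, 2), (2, 1), (5, 5), (6, 5), (5, 6), (6, 6)]
labels = ["Category 1"] * 3 + ["Category 2"] * 4

# STEP 1: choose K = 5
category, votes = knn_predict(points, labels, new_point=(3, 3), k=5)
print(category, dict(votes))
```

An odd K avoids tied votes when there are two categories.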

K-NEAREST NEIGHBORS
Category 1: 3 neighbors
Category 2: 2 neighbors

K-NEAREST NEIGHBORS. TRAINING SET RESULTS

K-NEAREST NEIGHBORS. TEST SET RESULTS

SUPPORT VECTOR MACHINES

SUPPORT VECTOR MACHINES TRAINING

SUPPORT VECTOR MACHINES. TRAINING SET RESULTS.

SUPPORT VECTOR MACHINES. TEST SET RESULTS.

KERNEL SVM

NAIVE BAYES. TRAINING SET RESULTS.

NAIVE BAYES. TEST SET RESULTS.

NAIVE BAYES
Bayes' theorem: \( P(A|B) = \frac{P(B|A) \, P(A)}{P(B)} \)

DRIVER OR WALKER.

NAIVE BAYES. BAYES THEOREM. WALKS.

NAIVE BAYES. BAYES THEOREM. DRIVES.

NAIVE BAYES. P(WALKS).

NAIVE BAYES. P(X).

NAIVE BAYES. P(X|WALKS).

NAIVE BAYES. P(WALKS|X).

NAIVE BAYES. P(DRIVES|X).
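
The quantities named on the preceding slides combine through Bayes' theorem, P(Walks|X) = P(X|Walks) * P(Walks) / P(X). The counts in this worked example are hypothetical and are not taken from the figures in the talk; exact fractions keep the arithmetic easy to follow.

```python
from fractions import Fraction

# Hypothetical counts, not taken from the slides
total = 30           # all observed people
walkers = 10         # people who walk to work
similar = 4          # people whose features are similar to the new point X
similar_walkers = 3  # of those, people who walk

p_walks = Fraction(walkers, total)                    # prior P(Walks)
p_x = Fraction(similar, total)                        # marginal likelihood P(X)
p_x_given_walks = Fraction(similar_walkers, walkers)  # likelihood P(X|Walks)

# Bayes' theorem: P(Walks|X) = P(X|Walks) * P(Walks) / P(X)
p_walks_given_x = p_x_given_walks * p_walks / p_x
p_drives_given_x = 1 - p_walks_given_x  # only two classes in this example

print(p_walks_given_x, p_drives_given_x)
```

Since P(Walks|X) is larger than P(Drives|X) here, the new point would be classified as a walker.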

NAIVE BAYES

NAIVE BAYES. NEW WALKER.

DECISION TREE CLASSIFICATION

DECISION TREE CLASSIFICATION. TRAINING.

DECISION TREE CLASSIFICATION. TRAINING SET RESULTS.

DECISION TREE CLASSIFICATION. TEST SET RESULTS.

RANDOM FOREST CLASSIFICATION
STEP 1: Pick at random K data points from the Training set.
STEP 2: Build the Decision Tree associated to these K data points.
STEP 3: Choose the number Ntree of trees you want to build and repeat STEPS 1 & 2
STEP 4: For a new data point, make each one of your Ntree trees predict the category to
which the data point belongs, and assign the new data point to the category that wins
the majority vote.

RANDOM FOREST CLASSIFICATION. TRAINING

RANDOM FOREST CLASSIFICATION. TRAINING SET RESULTS

RANDOM FOREST CLASSIFICATION. TEST SET RESULTS.

EVALUATING CLASSIFICATION MODELS PERFORMANCE.
FALSE POSITIVES & FALSE NEGATIVES.

EVALUATING CLASSIFICATION MODELS PERFORMANCE.
CONFUSION MATRIX.
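
A confusion matrix for a binary classifier simply counts the four combinations of actual and predicted label. The sketch below uses hypothetical labels, with 1 as the positive class.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tn = pairs.count((0, 0))
    fp = pairs.count((0, 1))  # false positive: predicted 1, actually 0
    fn = pairs.count((1, 0))  # false negative: predicted 0, actually 1
    tp = pairs.count((1, 1))
    return tn, fp, fn, tp


# Hypothetical actual and predicted labels
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
print(tn, fp, fn, tp, accuracy)
```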

CLASSIFICATION MODELS. PROS AND CONS.

CLUSTERING
Clustering is similar to classification, but the basis is different:
in clustering you don’t know in advance what you are looking for,
and you are trying to identify segments or clusters in your data.
When you use clustering algorithms on your dataset,
unexpected things can pop up, such as structures,
clusters and groupings you would never have thought of otherwise.

K-MEANS CLUSTERING

K-MEANS CLUSTERING
STEP 1: Choose the number K of clusters
STEP 2: Select at random K points, the centroids (not necessarily from your dataset)
STEP 3: Assign each data point to the closest centroid -> That forms K clusters
STEP 4: Compute and place the new centroid of each cluster
STEP 5: Reassign each data point to the new closest centroid.
If any reassignment took place, go to STEP 4, otherwise go to FIN.
Your Model is Ready
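
The loop described by these steps can be written in plain Python. The data points are hypothetical, and the initial centroids are fixed by hand instead of being drawn at random so that the run is reproducible.

```python
import math


def k_means(points, centroids, max_iter=100):
    """Run K-means from the given initial centroids; return (centroids, labels)."""
    labels = None
    for _ in range(max_iter):
        # STEP 3 / STEP 5: assign each point to the closest centroid
        new_labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no reassignment took place: FIN
            break
        labels = new_labels
        # STEP 4: move each centroid to the mean of its cluster
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels


# Hypothetical data with two obvious groups; K = 2 (STEP 1)
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
initial = [(0.0, 0.0), (10.0, 10.0)]  # STEP 2, fixed here instead of random

centroids, labels = k_means(points, list(initial))
print(centroids, labels)
```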

K-MEANS CLUSTERING

K-MEANS CLUSTERING RANDOM INITIALIZATION PROBLEM

K-MEANS CLUSTERING RANDOM INITIALIZATION PROBLEM

K-MEANS SELECTING THE NUMBER OF CLUSTERS

DATASET PRESENTATION. MALL CUSTOMERS.

K-MEANS. TRAINING. OPTIMAL NUMBER OF CLUSTERS.

K-MEANS. OPTIMAL NUMBER OF CLUSTERS RESULTS.

K-MEANS. RESULT

HIERARCHICAL CLUSTERING

HIERARCHICAL CLUSTERING AGGLOMERATIVE
STEP 1: Make each data point a single-point cluster -> That forms N clusters
STEP 2: Take the two closest data points and make them one cluster -> That forms N-1 clusters
STEP 3: Take the two closest clusters and make them one cluster -> That forms N-2 clusters
STEP 4: Repeat STEP 3 until there is only one cluster
FIN
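
A small sketch of the agglomerative procedure on one-dimensional data. The slides do not say how the distance between two clusters is measured, so single linkage (the distance between their closest members) is assumed here; the data are hypothetical.

```python
def agglomerative(points):
    """Merge 1-D points into one cluster; return the list of merges."""
    clusters = [[p] for p in points]  # STEP 1: N single-point clusters
    merges = []
    while len(clusters) > 1:          # STEP 4: repeat until one cluster is left
        # STEPS 2-3: find the two closest clusters (single linkage:
        # distance between clusters = distance between their closest members)
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical 1-D data
merges = agglomerative([1.0, 2.0, 6.0, 7.0, 15.0])
for left, right, distance in merges:
    print(left, right, distance)
```

The recorded merge distances are what a dendrogram draws on its vertical axis.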

HIERARCHICAL CLUSTERING AGGLOMERATIVE

HIERARCHICAL CLUSTERING DENDROGRAMS

HIERARCHICAL CLUSTERING DENDROGRAMS
4 clusters

DENDROGRAMS OPTIMAL NUMBER OF CLUSTERS

DENDROGRAM. FINDING THE OPTIMAL NUMBER OF CLUSTERS.

DENDROGRAM. RESULTS

HIERARCHICAL CLUSTERING. TRAINING.

HIERARCHICAL CLUSTERING RESULT

CLUSTERING MODELS. PROS AND CONS

REINFORCEMENT LEARNING
Reinforcement Learning is a branch of Machine Learning,
also called Online Learning. It is used to solve interacting
problems where the data observed up to time t is considered
to decide which action to take at time t + 1.
It is also used for Artificial Intelligence when training machines to perform
tasks such as walking. Desired outcomes provide the AI with reward,
undesired with punishment. Machines learn through trial and error.

THE MULTI-ARMED BANDIT PROBLEM
How to bet to maximize your return

THE MULTI-ARMED BANDIT PROBLEM

UPPER CONFIDENCE BOUND ALGORITHM

RANDOM SELECTION. RESULTS.

UPPER CONFIDENCE BOUND. TRAINING.
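
The code shown on this slide did not survive extraction. As a stand-in, here is a minimal sketch of an upper confidence bound strategy for the multi-armed bandit problem. It uses the common UCB1 bonus sqrt(2 ln t / N_i), which may differ from the exact bound used in the talk, and the reward probabilities of the three machines are hypothetical.

```python
import math
import random


def ucb(reward_probs, rounds=2000, seed=0):
    """Play a Bernoulli bandit with UCB1; return how often each arm was chosen."""
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms  # times each arm was selected
    sums = [0.0] * n_arms  # total reward collected per arm
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1    # play every arm once to initialise
        else:
            # average reward plus an exploration bonus that shrinks
            # as an arm is selected more often
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


# Hypothetical success probabilities of three slot machines
counts = ucb([0.1, 0.2, 0.8])
print(counts)
```

Over time the selections concentrate on the machine with the highest payout, while the weaker machines are still tried occasionally.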

UPPER CONFIDENCE BOUND. RESULTS.

THOMPSON SAMPLING ALGORITHM

BAYESIAN INFERENCE

BAYESIAN INFERENCE. EXPLANATION.

CREATING DISTRIBUTIONS BASED ON INITIAL DATA

PULLING RANDOM VALUES FROM DISTRIBUTIONS

ADJUSTING THE PERCEPTION OF THE WORLD

THE FINAL MODEL

UCB VS THOMPSON SAMPLING

THOMPSON SAMPLING ALGORITHM. TRAINING.
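
The original code for this slide is not available in the text. The sketch below shows one common form of Thompson sampling for rewards that are either 0 or 1: each machine gets a Beta distribution built from its observed wins and losses, one random value is pulled from every distribution, and the machine with the highest value is played. The reward probabilities are hypothetical.

```python
import random


def thompson_sampling(reward_probs, rounds=2000, seed=0):
    """Play a Bernoulli bandit with Thompson sampling; return selections per arm."""
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    wins = [0] * n_arms
    losses = [0] * n_arms
    for _ in range(rounds):
        # pull one random value from each arm's Beta distribution
        samples = [
            rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(n_arms)
        ]
        arm = samples.index(max(samples))
        # adjust the perception of the world with the observed reward
        if rng.random() < reward_probs[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return [w + l for w, l in zip(wins, losses)]


# Hypothetical success probabilities of three slot machines
counts = thompson_sampling([0.1, 0.2, 0.8])
print(counts)
```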

THOMPSON SAMPLING ALGORITHM. RESULTS.

NATURAL LANGUAGE PROCESSING
Natural Language Processing (or NLP) is the application of Machine Learning
models to text and language.
Teaching machines to understand what is said in spoken
and written word is the focus of Natural Language Processing.

Whenever you dictate something into your iPhone / Android device
that is then converted to text, that’s an NLP algorithm in action.

You can use NLP on articles to predict the
categories of the articles you are trying to segment.
You can use NLP on a book to predict the genre of the book.

A very well-known model in NLP is the Bag of Words model.
It is used to preprocess the texts to be classified before
the classification algorithms are fitted to the observations
containing the texts.
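
A Bag of Words representation turns each text into a row of word counts over a shared vocabulary. The sketch below is a bare-bones version in plain Python with hypothetical reviews; real pipelines usually add cleaning steps such as stop-word removal and stemming before counting.

```python
import re
from collections import Counter


def bag_of_words(texts):
    """Return (vocabulary, matrix) where matrix[i][j] counts word j in text i."""
    tokenized = [re.findall(r"[a-z]+", text.lower()) for text in texts]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


# Hypothetical restaurant reviews
reviews = ["Loved the food", "The food was bad", "Bad bad service"]
vocabulary, matrix = bag_of_words(reviews)
print(vocabulary)
for row in matrix:
    print(row)
```

Each row of the matrix can then be passed to a classifier as the feature vector of one review.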

DATASET PRESENTATION. RESTAURANT REVIEWS.

NLP. TRAINING. IMPORTING THE DATASET
AND CLEANING THE TEXTS.

NLP. TRAINING. CLEANING THE TEXTS. RESULTS.

NLP. TRAINING. CREATING THE BAG OF WORDS MODEL.

NLP. CREATING THE BAG OF WORDS MODEL.

NLP. TRAINING. SPLITTING THE DATASET
INTO THE TRAINING SET AND TEST SET.

NLP. TRAINING. FITTING NAIVE BAYES TO THE TRAINING SET.

NLP. TRAINING. PREDICTING AND MAKING
THE CONFUSION MATRIX.

NLP. CONFUSION MATRIX. RESULTS.

THE NEURON

HOW DO NEURAL NETWORKS LEARN?

NEURAL NETWORKS

TO BE CONTINUED
