Neural networks are composed of interconnected neurons arranged in layers that can learn patterns from data. They consist of an input layer, hidden layers, and an output layer. Each neuron receives weighted inputs, passes them through an activation function, and outputs the result. Backpropagation allows neural networks to learn by calculating error derivatives to update weights between layers. Deeper networks can model more complex patterns using techniques like convolutional neural networks for images and recurrent neural networks for sequential data. While powerful, neural networks require large datasets and computational resources to train effectively.
Introduction to Deep Learning
1. Intro to Deep Learning
Partners in Business IT Conference, USU, 3-24-16
by Adam Rogers, Data Scientist at Jane.com
2. What is a Neural Network?
http://promentumspeakers.com/what-happens-when-a-mental-illness-expert-has-an-extremely-rare-stroke/
http://neuromastersoftware.com/neural-network-theory-introduction/
3. What is a Neural Network?
• Composed of Neurons
• Neurons have Activation Functions
• Neurons make up Layers
• Input Layer -> Hidden Layer(s) -> Output Layer
4. Neuron
Basic processing unit of the network
Receives the outputs of all neurons in the previous layer
Computes a weighted sum of those outputs and passes it through an activation function to produce its own output
http://wwwold.ece.utep.edu/research/webfuzzy/docs/kk-thesis/kk-thesis-html/node14.html
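As a rough sketch of that arithmetic (the function name, toy weights, and choice of sigmoid here are illustrative, not from the slides):

import math

def neuron_output(inputs, weights, bias):
    # Weighted sum of the previous layer's outputs, plus a bias term
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Pass the sum through a sigmoid activation function
    return 1.0 / (1.0 + math.exp(-z))

# Example: a neuron with three inputs
print(neuron_output([0.5, -1.0, 2.0], [0.4, 0.3, -0.1], bias=0.1))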
5. Layer
• Where the magic happens
• Array of neurons that process the outputs of the previous layer (or the network's inputs)
• Layers can have specific properties and activation functions
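In code, a whole layer's forward pass is just a matrix-vector product followed by an activation. A minimal sketch, with illustrative sizes and tanh assumed as the activation:

import numpy as np

def layer_forward(x, W, b):
    # Each row of W holds one neuron's weights; every neuron
    # sees the full output vector x of the previous layer.
    return np.tanh(W @ x + b)

x = np.array([0.2, -0.5, 1.0])    # outputs of the previous layer
W = np.random.randn(4, 3) * 0.1   # 4 neurons, 3 inputs each
b = np.zeros(4)
print(layer_forward(x, W, b))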
6. Activation Function
Applied to the output of each neuron
Determines what (or whether) each neuron passes on to the next layer
Introduces non-linearity into your network
This is the key to a neural network's learning power!
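Without a non-linearity, stacking layers buys nothing: a composition of linear maps is still a single linear map. A few common choices, sketched in NumPy:

import numpy as np

def step(z):      # hard threshold used by the classic perceptron
    return (z > 0).astype(float)

def sigmoid(z):   # smooth, differentiable squashing to (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):      # rectified linear unit, common in deep networks
    return np.maximum(0.0, z)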
9. Perceptron - Learning
• Gradient Descent (kind of)
• Run training example through network
• Calculate error of output and update weights
• Specifically:
• If output = 0 when it should be 1
• Add input vector to weight vector
• If output = 1 when it should be 0
• Subtract input vector from weight vector
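That update rule fits in a few lines. A minimal sketch (the toy OR-gate data and training loop are illustrative):

import numpy as np

def train_perceptron(X, y, epochs=10):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, y):
            out = 1.0 if w @ x > 0 else 0.0
            if out == 0.0 and target == 1.0:
                w += x   # add input vector to weight vector
            elif out == 1.0 and target == 0.0:
                w -= x   # subtract input vector from weight vector
    return w

# Toy linearly separable problem: OR gate (first column is a bias input of 1)
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
y = np.array([0, 1, 1, 1], dtype=float)
print(train_perceptron(X, y))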
10. Multilayer (Deep) Networks
• Perceptrons can only learn linearly separable functions (they famously fail on XOR), which rules out basically all real-world problems
• Need better activation functions and more layers
• How to train (update weights of) multiple layers simultaneously?
11. Backpropagation!
• The most math-heavy part of NNs
• Run a training example through the network and calculate the error derivative at the output
• From there, calculate the error derivatives of the hidden unit outputs
• Then calculate the error derivatives on the weights and update accordingly
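A minimal sketch of one backprop step for a tiny one-hidden-layer network; the sigmoid activations and squared-error loss are assumptions for the example, not from the slides:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, target, W1, W2, lr=0.1):
    # Forward pass
    h = sigmoid(W1 @ x)      # hidden layer outputs
    out = sigmoid(W2 @ h)    # network output

    # Error derivative at the output (squared-error loss)
    d_out = (out - target) * out * (1 - out)
    # Error derivatives of the hidden unit outputs
    d_h = (W2.T @ d_out) * h * (1 - h)

    # Error derivatives on the weights, then the update
    W2 -= lr * np.outer(d_out, h)
    W1 -= lr * np.outer(d_h, x)
    return W1, W2

W1 = np.random.randn(3, 2) * 0.5
W2 = np.random.randn(1, 3) * 0.5
W1, W2 = backprop_step(np.array([1.0, 0.0]), np.array([1.0]), W1, W2)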
13. Feel the Power
With hidden layers and better activation functions, NNs work on far more complicated problems
The non-linearity of the activations lets the network untangle complicated data
http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/
17. Recurrent Neural Networks
• What if we have time series or sequential data where “memory” of prior data points is relevant?
• Answering questions from an article
• What if we have variable input sizes?
• Text inputs
• What if we need variable output sizes?
• Captioning images
• Recurrent Neural Networks!
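The core idea is a hidden state carried from one time step to the next. A minimal sketch (all names and sizes here are illustrative):

import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h):
    # h is the network's "memory": it is fed back in at every time step
    h = np.zeros(W_hh.shape[0])
    for x in xs:   # the same weights handle any sequence length
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
    return h       # final hidden state summarizes the sequence

# Example: a sequence of four 3-dimensional inputs, hidden size 5
xs = [np.random.randn(3) for _ in range(4)]
h = rnn_forward(xs, np.random.randn(5, 3), np.random.randn(5, 5), np.zeros(5))
print(h)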
25. Deep Learning in NLP
• Useful in many NLP techniques
• Sentiment Analysis
• Document Classification
• Question Analysis
• Language Translation
• Speech Recognition
• etc.
26. Tools
• Theano (Python tensor library) and Keras (a neural network library built on top of it)
• http://deeplearning.net/software/theano/
• http://keras.io/
• Caffe (Berkeley C++ deep learning library)
• http://caffe.berkeleyvision.org/
• Torch (Lua machine learning library)
• http://torch.ch/
• TensorFlow from Google
• http://www.tensorflow.org/
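For a feel of what these libraries look like, here is a minimal Keras model definition; the layer sizes and settings are illustrative, and exact API details vary by version:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(64, input_dim=100, activation='relu'))  # hidden layer
model.add(Dense(1, activation='sigmoid'))               # output layer
model.compile(optimizer='sgd', loss='binary_crossentropy')
# model.fit(X_train, y_train) would then train the weights via backpropagation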
29. A Word of Warning
• While powerful, NNs are very complicated
• Millions to billions of parameters can lead to overfitting
• Training can take months (or days on a GPU)
• Not the answer to every problem
• Often a simpler algorithm will do almost as well or even better
30. Summary
• Neural Nets are a powerful and versatile tool
• Useful in many domains
• Broad and deep field of machine learning on its own
• Weigh the complexity of NNs against the performance increase
• Many times simpler algorithms work almost as well