This document provides an introduction to computer vision. It begins with an outline of topics covered, including binary and grayscale image processing, edge and line detection, descriptors, and further topics such as classification and visual odometry. Thresholding techniques are described for converting grayscale to binary images. The Canny edge detector and Hough transform for line detection are explained. Color histograms and SIFT descriptors are introduced for comparing image similarity. AdaBoost and Viola–Jones object detection are overviewed. Popular Python libraries like OpenCV are listed for computer vision tasks.
2. About me
• Long-time Python enthusiast
• Graduated from RWTH Aachen, Germany
• Defended Computer Vision Bachelor's and Master's theses
• Work at Ecoisme as Backend and Algorithms Engineer
Visual Odometry (later)
12. Niblack Threshold
Local Binarization [Niblack'86]
• Solution: use a local window W with independent thresholds
• Estimate a local threshold within a small neighborhood window W:
T(x, y) = m(x, y) + k · s(x, y)
where m and s are the mean and standard deviation of the intensities inside W, and k ∈ [-1, 0] is a user-defined parameter.
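As a sketch, the local rule above can be computed with a sliding mean and standard deviation. This is a minimal illustration, not a library function; the window size and k = -0.2 are arbitrary example values.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def niblack_binarize(img, w=15, k=-0.2):
    """Binarize with Niblack's local threshold T = m + k*s,
    where m and s are the mean/std over a w x w window W."""
    img = img.astype(np.float64)
    m = uniform_filter(img, size=w)             # local mean
    m2 = uniform_filter(img * img, size=w)      # local mean of squares
    s = np.sqrt(np.maximum(m2 - m * m, 0.0))    # local standard deviation
    return img > m + k * s
```

Because every pixel gets its own threshold, uneven illumination that would break a single global threshold is handled per-window.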
16. Canny edge detector
• Filter with Gaussian derivative
• Use gradient magnitude
• Threshold
17. Canny edge detector
• Lines have different thickness
• Parts may not survive thresholding
• Use two thresholds: a stronger one to start edges, a weaker one to continue them
(Figure: edge segments of varying thickness: thin, thin, thick)
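The two-threshold (hysteresis) idea can be sketched as follows. This is a simplified illustration, not the full Canny algorithm: non-maximum suppression is omitted, and the function name and threshold values are made up for the example.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, label

def canny_like(img, sigma=1.0, weak=0.1, strong=0.3):
    """Simplified Canny sketch: Gaussian-derivative gradients,
    gradient magnitude, then two-threshold hysteresis."""
    img = img.astype(np.float64)
    gx = gaussian_filter(img, sigma, order=(0, 1))  # filter with d/dx of Gaussian
    gy = gaussian_filter(img, sigma, order=(1, 0))  # filter with d/dy of Gaussian
    mag = np.hypot(gx, gy)                          # gradient magnitude
    if mag.max() > 0:
        mag /= mag.max()
    # hysteresis: keep weak-edge pixels only if their connected
    # component also contains at least one strong-edge pixel
    weak_mask = mag >= weak
    labels, _ = label(weak_mask)
    strong_labels = np.unique(labels[mag >= strong])
    return np.isin(labels, strong_labels) & weak_mask
```

The strong threshold starts edges; the weak threshold lets them continue through low-contrast stretches instead of being cut off.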
18.
19.
20. Line fitting
Perceptual and Sensory Augmented Computing, Computer Vision WS14/15
Example: Line Fitting
• Why fit lines? Many objects are characterised by the presence of straight lines
• Wait, why aren't we done just by running edge detection?
21. Why not just use edges?
• Noise
• Extra points
• Some parts can be missing
22. Hough Transform
• Voting technique
• Each point votes for all the lines that could pass through it
• Extract the lines that receive the most votes
23. Hough space
To each line in image space corresponds a point in Hough space:
the line y = a0 x + b0 in image space (x, y) maps to the point (a0, b0) in Hough space (a, b), and y = a1 x + b1 maps to (a1, b1).
24.
25. Hough space
To each point in image space corresponds a line in Hough space:
the point (x0, y0) maps to the line b = -x0 a + y0, and (x1, y1) maps to b = -x1 a + y1. The intersection of these Hough-space lines gives the parameters (a, b) of the image-space line through both points.
26. Hough transform
• For numerical reasons it is better to use the polar representation: the slope a and intercept b can take infinite values and are undefined for vertical lines.
x cos θ + y sin θ = d
d: perpendicular distance from the line to the origin [0, 0]
θ: angle the perpendicular makes with the x-axis
• A point in image space maps to a sinusoid segment in Hough space (θ, d).
Slide adapted from Steve Seitz
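The voting procedure in the polar (θ, d) parameterization can be sketched directly. Function name, resolution, and image size are illustrative choices for this example.

```python
import numpy as np

def hough_vote(points, shape=(100, 100), n_theta=180, d_res=1.0):
    """Accumulate votes in (d, theta) space: each point (x, y) votes
    for every line x*cos(theta) + y*sin(theta) = d through it."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    d_max = np.hypot(*shape)                       # largest possible |d|
    acc = np.zeros((int(2 * d_max / d_res) + 1, n_theta), dtype=int)
    for x, y in points:
        d = x * np.cos(thetas) + y * np.sin(thetas)   # sinusoid in Hough space
        rows = np.round((d + d_max) / d_res).astype(int)
        acc[rows, np.arange(n_theta)] += 1            # one vote per theta bin
    return acc, thetas

# collinear points on y = x all vote for theta = 3*pi/4, d = 0
acc, thetas = hough_vote([(0, 0), (1, 1), (2, 2), (3, 3)])
```

The accumulator cell with the most votes identifies the dominant line; real detectors then extract all cells above a vote threshold.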
29. How to tell if images are similar?
• Compare pixels directly?
• Not robust to noise, lighting, or position changes
• Idea: capture color statistics via 3D color histograms
Color Histograms
• Color statistics. Here: RGB as an example
Given: tristimulus values R, G, B for each pixel
Compute a 3D histogram
– H(R,G,B) = #(pixels with color (R,G,B))
[Swain & Ballard, 1991]
B. Leibe
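A minimal sketch of H(R,G,B) with NumPy, plus histogram intersection as a similarity measure in the spirit of Swain & Ballard. The bin count of 8 per channel and the function names are illustrative.

```python
import numpy as np

def color_histogram(img, bins=8):
    """3D color histogram H(R, G, B): count pixels per color bin,
    normalized so the histogram sums to 1."""
    pixels = img.reshape(-1, 3).astype(np.float64)
    hist, _ = np.histogramdd(pixels, bins=(bins,) * 3,
                             range=((0, 256),) * 3)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1 means identical color statistics."""
    return np.minimum(h1, h2).sum()
```

Because the histogram discards pixel positions, it is robust to translation and rotation of the object, at the cost of ignoring spatial layout entirely.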
35. Do you see the match?
Harder Still?
NASA Mars Rover images
36. Do you see the match?
Answer Below (Look for tiny colored squares)
B. Leibe
NASA Mars Rover images with SIFT feature matches
(Figure by Noah Snavely)
Slide credit: Steve Seitz
39. 2. Distinctive
Should be easy to find correspondence!
• Problem 1: Detect the same point independently in both images
• Problem 2: For each point, correctly recognize the corresponding one
We need a reliable and distinctive descriptor!
Slide credit: Darya Frolova, Denis Simakov
40. Corners as keypoints
• Repeatable
• Distinctive
• Easy to find via image gradient filtering
Corners as Distinctive Interest Points
• Design criteria
We should easily recognize the point by looking through a small window (locality)
Shifting the window in any direction should give a large change in intensity (good localization)
"Flat" region: no change in any direction
"Edge": no change along the edge direction
"Corner": significant change in all directions
41. Hessian detector
Hessian Detector [Beaudet78]
• Use 2nd derivatives of the image
• Hessian matrix:
Hessian(I) = [ Ixx  Ixy
               Ixy  Iyy ]
• Hessian determinant:
det(Hessian(I)) = Ixx Iyy - (Ixy)^2
In Matlab: Ixx .* Iyy - Ixy .^ 2
Hessian Detector – Responses [Beaudet78]
Slide credit: Krystian Mikolajczyk
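The determinant response above can be sketched with Gaussian-derivative filters for the second derivatives. The function name and the choice of sigma are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hessian_response(img, sigma=2.0):
    """Hessian detector response det(Hessian(I)) = Ixx*Iyy - Ixy^2,
    with second derivatives taken via Gaussian-derivative filters."""
    img = img.astype(np.float64)
    Ixx = gaussian_filter(img, sigma, order=(0, 2))  # d2/dx2
    Iyy = gaussian_filter(img, sigma, order=(2, 0))  # d2/dy2
    Ixy = gaussian_filter(img, sigma, order=(1, 1))  # d2/dxdy
    return Ixx * Iyy - Ixy ** 2
```

Keypoints are then taken at local maxima of this response, where intensity curves strongly in both directions at once.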
42. Descriptors
• OK, now we know how to localise keypoints
• But how do we actually compare them?
Local Descriptors
• We know how to detect points
• Next question: how to describe them for matching?
A point descriptor should be:
1. Invariant
2. Distinctive
43. SIFT
• Scale Invariant Feature Transform
• A 16x16 neighbourhood around the keypoint is taken and divided into 16 sub-blocks of 4x4 size. For each sub-block, an 8-bin orientation histogram is created, giving a total of 16 x 8 = 128 bin values. These are concatenated into a vector to form the keypoint descriptor.
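A toy version of the 16x16 / 4x4 / 8-bin layout described above. This illustrates only the histogram structure: real SIFT additionally applies Gaussian weighting, rotation normalization, and trilinear interpolation, all omitted here.

```python
import numpy as np

def sift_like_descriptor(patch):
    """128-D descriptor from a 16x16 patch: 16 sub-blocks of 4x4,
    each contributing an 8-bin gradient-orientation histogram."""
    assert patch.shape == (16, 16)
    gy, gx = np.gradient(patch.astype(np.float64))
    mag = np.hypot(gx, gy)                       # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)  # orientation in [0, 2*pi)
    desc = []
    for by in range(4):
        for bx in range(4):
            block = (slice(4 * by, 4 * by + 4), slice(4 * bx, 4 * bx + 4))
            hist, _ = np.histogram(ang[block], bins=8,
                                   range=(0, 2 * np.pi),
                                   weights=mag[block])
            desc.append(hist)
    desc = np.concatenate(desc)          # 16 sub-blocks * 8 bins = 128
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```

Two keypoints are then matched by comparing their 128-D vectors, e.g. with Euclidean distance.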
44. SIFT
• Very robust to illumination changes
• Fast: real-time capable
• Became very popular and widespread
45. SIFT
Can handle significant changes in illumination
– Sometimes even day vs. night (below)
Fast and efficient: can run in real time
Lots of code available
– http://people.csail.mit.edu/albert/ladypack/wiki/index.php/Known_implementations_of_SIFT
B. Leibe
Slide credit: Steve Seitz
Image credit: Steve Seitz
46. AdaBoost
• Machine-learning classification method
• Have lots (hundreds) of weak classifiers
• Every weak classifier is correct in > 50% of cases
• Calculate weights for all of them
• Combine them into one strong classifier
Two-class classification problem
Given: training set X = {x1, ..., xN} with target values T = {t1, ..., tN}, tn ∈ {-1, 1}.
Associated weights W = {w1, ..., wN} for each training point.
Basic steps
In each iteration, AdaBoost trains a new weak classifier hm(x) based on the current weighting coefficients W(m).
We then adapt the weighting coefficients for each point
– Increase wn if xn was misclassified by hm(x).
– Decrease wn if xn was classified correctly by hm(x).
Make predictions using the final combined model:
H(x) = sign( Σ_{m=1}^{M} αm hm(x) )
B. Leibe
Image credit: Kristen Grauman
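The steps above can be sketched with simple threshold stumps as the weak classifiers hm(x). This is a minimal illustration of the reweight-and-combine loop, not an optimized implementation; all function names are made up for the example.

```python
import numpy as np

def fit_stump(X, t, w):
    """Weak learner: best (feature, threshold, sign) under weights w."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for s in (1, -1):
                pred = s * np.where(X[:, j] >= thr, 1, -1)
                err = w[pred != t].sum()       # weighted error
                if err < best[0]:
                    best = (err, j, thr, s)
    return best

def adaboost_train(X, t, n_rounds=10):
    N = len(X)
    w = np.ones(N) / N                         # uniform initial weights
    model = []
    for _ in range(n_rounds):
        err, j, thr, s = fit_stump(X, t, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)  # classifier weight
        pred = s * np.where(X[:, j] >= thr, 1, -1)
        w = w * np.exp(-alpha * t * pred)      # up-weight misclassified points
        w = w / w.sum()
        model.append((alpha, j, thr, s))
    return model

def adaboost_predict(model, X):
    """Strong classifier H(x) = sign(sum_m alpha_m * h_m(x))."""
    score = sum(a * s * np.where(X[:, j] >= thr, 1, -1)
                for a, j, thr, s in model)
    return np.sign(score)
```

Each round focuses the next weak classifier on the points the current ensemble still gets wrong, which is exactly the reweighting step described above.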
47. Viola–Jones
• Use AdaBoost to detect objects (faces)
• Every weak classifier is just a sum of binary filters
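The "binary filters" are Haar-like rectangle features, which Viola–Jones evaluates in constant time using an integral image. A minimal sketch of that mechanism (function names are illustrative; a real detector combines many such features in a cascade):

```python
import numpy as np

def integral_image(img):
    """ii[r, c] = sum of img[:r+1, :c+1]."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in O(1) from the integral image."""
    total = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        total -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        total -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

def haar_two_rect(ii, r, c, h, w):
    """Two-rectangle Haar feature: left half minus right half."""
    left = rect_sum(ii, r, c, r + h, c + w // 2)
    right = rect_sum(ii, r, c + w // 2, r + h, c + w)
    return left - right
```

Because each feature costs only a handful of array lookups regardless of its size, thousands of features can be evaluated per window fast enough for real-time face detection.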