This document provides an introduction to computer vision. It begins with an outline of topics covered, including binary and grayscale image processing, edge and line detection, descriptors, and further topics such as classification and visual odometry. Thresholding techniques are described for converting grayscale to binary images. The Canny edge detector and Hough transform for line detection are explained. Color histograms and SIFT descriptors are introduced for comparing image similarity. AdaBoost and Viola–Jones object detection are overviewed. Popular Python libraries like OpenCV are listed for computer vision tasks.
2. About me
• Long-time Python enthusiast
• Graduated from RWTH Aachen, Germany
• Defended Computer Vision Bachelor's and Master's theses
• Work at Ecoisme as Backend and Algorithms Engineer
Visual Odometry (later)
12. Niblack Threshold
Local Binarization [Niblack'86]
• Solution: use a local window W with independent thresholds
• Estimate a local threshold within a small neighborhood window W:
T(x, y) = m(x, y) + k · s(x, y)
where m and s are the mean and standard deviation of the intensities inside W, and k ∈ [-1, 0] is a user-defined parameter.
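As a sketch, the local rule above can be computed with a sliding mean and standard deviation. This is a minimal illustration, not a library function; the window size and k = -0.2 are arbitrary example values.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def niblack_binarize(img, w=15, k=-0.2):
    """Binarize with Niblack's local threshold T = m + k*s,
    where m and s are the mean/std over a w x w window W."""
    img = img.astype(np.float64)
    m = uniform_filter(img, size=w)             # local mean
    m2 = uniform_filter(img * img, size=w)      # local mean of squares
    s = np.sqrt(np.maximum(m2 - m * m, 0.0))    # local standard deviation
    return img > m + k * s
```

Because every pixel gets its own threshold, uneven illumination that would break a single global threshold is handled per-window.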
16. Canny edge detector
• Filter with Gaussian derivative
• Use gradient magnitude
• Threshold
17. Canny edge detector
• Lines have different thickness
• Parts may not survive thresholding
• Use two thresholds: a stronger one to start edges, a weaker one to continue them
(Figure: edge segments of varying thickness: thin, thin, thick)
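The two-threshold (hysteresis) idea can be sketched as follows. This is a simplified illustration, not the full Canny algorithm: non-maximum suppression is omitted, and the function name and threshold values are made up for the example.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, label

def canny_like(img, sigma=1.0, weak=0.1, strong=0.3):
    """Simplified Canny sketch: Gaussian-derivative gradients,
    gradient magnitude, then two-threshold hysteresis."""
    img = img.astype(np.float64)
    gx = gaussian_filter(img, sigma, order=(0, 1))  # filter with d/dx of Gaussian
    gy = gaussian_filter(img, sigma, order=(1, 0))  # filter with d/dy of Gaussian
    mag = np.hypot(gx, gy)                          # gradient magnitude
    if mag.max() > 0:
        mag /= mag.max()
    # hysteresis: keep weak-edge pixels only if their connected
    # component also contains at least one strong-edge pixel
    weak_mask = mag >= weak
    labels, _ = label(weak_mask)
    strong_labels = np.unique(labels[mag >= strong])
    return np.isin(labels, strong_labels) & weak_mask
```

The strong threshold starts edges; the weak threshold lets them continue through low-contrast stretches instead of being cut off.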
18.
19.
20. Line fitting
Perceptual and Sensory Augmented Computing, Computer Vision WS14/15
Example: Line Fitting
• Why fit lines? Many objects are characterised by the presence of straight lines
• Wait, why aren't we done just by running edge detection?
21. Why not just use edges?
• Noise
• Extra points
• Some parts can be missing
22. Hough Transform
• Voting technique
• Each point votes for all the lines that could pass through it
• Extract the lines that receive the most votes
23. Hough space
To each line in image space corresponds a point in Hough space:
the line y = a0 x + b0 in image space (x, y) maps to the point (a0, b0) in Hough space (a, b), and y = a1 x + b1 maps to (a1, b1).
24.
25. Hough space
To each point in image space corresponds a line in Hough space:
the point (x0, y0) maps to the line b = -x0 a + y0, and (x1, y1) maps to b = -x1 a + y1. The intersection of these Hough-space lines gives the parameters (a, b) of the image-space line through both points.
26. Hough transform
• For numerical reasons it is better to use the polar representation: the slope a and intercept b can take infinite values and are undefined for vertical lines.
x cos θ + y sin θ = d
d: perpendicular distance from the line to the origin [0, 0]
θ: angle the perpendicular makes with the x-axis
• A point in image space maps to a sinusoid segment in Hough space (θ, d).
Slide adapted from Steve Seitz
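The voting procedure in the polar (θ, d) parameterization can be sketched directly. Function name, resolution, and image size are illustrative choices for this example.

```python
import numpy as np

def hough_vote(points, shape=(100, 100), n_theta=180, d_res=1.0):
    """Accumulate votes in (d, theta) space: each point (x, y) votes
    for every line x*cos(theta) + y*sin(theta) = d through it."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    d_max = np.hypot(*shape)                       # largest possible |d|
    acc = np.zeros((int(2 * d_max / d_res) + 1, n_theta), dtype=int)
    for x, y in points:
        d = x * np.cos(thetas) + y * np.sin(thetas)   # sinusoid in Hough space
        rows = np.round((d + d_max) / d_res).astype(int)
        acc[rows, np.arange(n_theta)] += 1            # one vote per theta bin
    return acc, thetas

# collinear points on y = x all vote for theta = 3*pi/4, d = 0
acc, thetas = hough_vote([(0, 0), (1, 1), (2, 2), (3, 3)])
```

The accumulator cell with the most votes identifies the dominant line; real detectors then extract all cells above a vote threshold.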
29. How to tell if images are similar?
• Compare pixels directly?
• Not robust to noise, lighting, or position changes
• Idea: capture color statistics via 3D color histograms
Color Histograms
• Color statistics. Here: RGB as an example
Given: tristimulus values R, G, B for each pixel
Compute a 3D histogram
– H(R,G,B) = #(pixels with color (R,G,B))
[Swain & Ballard, 1991]
B. Leibe
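A minimal sketch of H(R,G,B) with NumPy, plus histogram intersection as a similarity measure in the spirit of Swain & Ballard. The bin count of 8 per channel and the function names are illustrative.

```python
import numpy as np

def color_histogram(img, bins=8):
    """3D color histogram H(R, G, B): count pixels per color bin,
    normalized so the histogram sums to 1."""
    pixels = img.reshape(-1, 3).astype(np.float64)
    hist, _ = np.histogramdd(pixels, bins=(bins,) * 3,
                             range=((0, 256),) * 3)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1 means identical color statistics."""
    return np.minimum(h1, h2).sum()
```

Because the histogram discards pixel positions, it is robust to translation and rotation of the object, at the cost of ignoring spatial layout entirely.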
35. Do you see the match?
Harder Still?
NASA Mars Rover images
36. Do you see the match?
Answer Below (Look for tiny colored squares)
B. Leibe
NASA Mars Rover images with SIFT feature matches
(Figure by Noah Snavely)
Slide credit: Steve Seitz
39. 2. Distinctive
Should be easy to find correspondence!
• Problem 1: Detect the same point independently in both images
• Problem 2: For each point, correctly recognize the corresponding one
We need a reliable and distinctive descriptor!
Slide credit: Darya Frolova, Denis Simakov
40. Corners as keypoints
• Repeatable
• Distinctive
• Easy to find via image gradient filtering
Corners as Distinctive Interest Points
• Design criteria
We should easily recognize the point by looking through a small window (locality)
Shifting the window in any direction should give a large change in intensity (good localization)
"Flat" region: no change in any direction
"Edge": no change along the edge direction
"Corner": significant change in all directions
41. Hessian detector
Hessian Detector [Beaudet78]
• Use 2nd derivatives of the image
• Hessian matrix:
Hessian(I) = [ Ixx  Ixy
               Ixy  Iyy ]
• Hessian determinant:
det(Hessian(I)) = Ixx Iyy - (Ixy)^2
In Matlab: Ixx .* Iyy - Ixy .^ 2
Hessian Detector – Responses [Beaudet78]
Slide credit: Krystian Mikolajczyk
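The determinant response above can be sketched with Gaussian-derivative filters for the second derivatives. The function name and the choice of sigma are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hessian_response(img, sigma=2.0):
    """Hessian detector response det(Hessian(I)) = Ixx*Iyy - Ixy^2,
    with second derivatives taken via Gaussian-derivative filters."""
    img = img.astype(np.float64)
    Ixx = gaussian_filter(img, sigma, order=(0, 2))  # d2/dx2
    Iyy = gaussian_filter(img, sigma, order=(2, 0))  # d2/dy2
    Ixy = gaussian_filter(img, sigma, order=(1, 1))  # d2/dxdy
    return Ixx * Iyy - Ixy ** 2
```

Keypoints are then taken at local maxima of this response, where intensity curves strongly in both directions at once.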
42. Descriptors
• OK, now we know how to localise keypoints
• But how do we actually compare them?
Local Descriptors
• We know how to detect points
• Next question: how to describe them for matching?
A point descriptor should be:
1. Invariant
2. Distinctive
43. SIFT
• Scale Invariant Feature Transform
• A 16x16 neighbourhood around the keypoint is taken and divided into 16 sub-blocks of 4x4 size. For each sub-block, an 8-bin orientation histogram is created, giving a total of 16 x 8 = 128 bin values. These are concatenated into a vector to form the keypoint descriptor.
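A toy version of the 16x16 / 4x4 / 8-bin layout described above. This illustrates only the histogram structure: real SIFT additionally applies Gaussian weighting, rotation normalization, and trilinear interpolation, all omitted here.

```python
import numpy as np

def sift_like_descriptor(patch):
    """128-D descriptor from a 16x16 patch: 16 sub-blocks of 4x4,
    each contributing an 8-bin gradient-orientation histogram."""
    assert patch.shape == (16, 16)
    gy, gx = np.gradient(patch.astype(np.float64))
    mag = np.hypot(gx, gy)                       # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)  # orientation in [0, 2*pi)
    desc = []
    for by in range(4):
        for bx in range(4):
            block = (slice(4 * by, 4 * by + 4), slice(4 * bx, 4 * bx + 4))
            hist, _ = np.histogram(ang[block], bins=8,
                                   range=(0, 2 * np.pi),
                                   weights=mag[block])
            desc.append(hist)
    desc = np.concatenate(desc)          # 16 sub-blocks * 8 bins = 128
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```

Two keypoints are then matched by comparing their 128-D vectors, e.g. with Euclidean distance.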
44. SIFT
• Very robust to illumination changes
• Fast: real-time capable
• Became very popular and widespread
45. SIFT
Can handle significant changes in illumination
– Sometimes even day vs. night (below)
Fast and efficient: can run in real time
Lots of code available
– http://people.csail.mit.edu/albert/ladypack/wiki/index.php/Known_implementations_of_SIFT
B. Leibe
Slide credit: Steve Seitz
Image credit: Steve Seitz
46. AdaBoost
• Machine-learning classification method
• Have lots (hundreds) of weak classifiers
• Every weak classifier is correct in > 50% of cases
• Calculate weights for all of them
• Combine them into one strong classifier
Two-class classification problem
Given: training set X = {x1, ..., xN} with target values T = {t1, ..., tN}, tn ∈ {-1, 1}.
Associated weights W = {w1, ..., wN} for each training point.
Basic steps
In each iteration, AdaBoost trains a new weak classifier hm(x) based on the current weighting coefficients W(m).
We then adapt the weighting coefficients for each point
– Increase wn if xn was misclassified by hm(x).
– Decrease wn if xn was classified correctly by hm(x).
Make predictions using the final combined model:
H(x) = sign( Σ_{m=1}^{M} αm hm(x) )
B. Leibe
Image credit: Kristen Grauman
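The steps above can be sketched with simple threshold stumps as the weak classifiers hm(x). This is a minimal illustration of the reweight-and-combine loop, not an optimized implementation; all function names are made up for the example.

```python
import numpy as np

def fit_stump(X, t, w):
    """Weak learner: best (feature, threshold, sign) under weights w."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for s in (1, -1):
                pred = s * np.where(X[:, j] >= thr, 1, -1)
                err = w[pred != t].sum()       # weighted error
                if err < best[0]:
                    best = (err, j, thr, s)
    return best

def adaboost_train(X, t, n_rounds=10):
    N = len(X)
    w = np.ones(N) / N                         # uniform initial weights
    model = []
    for _ in range(n_rounds):
        err, j, thr, s = fit_stump(X, t, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)  # classifier weight
        pred = s * np.where(X[:, j] >= thr, 1, -1)
        w = w * np.exp(-alpha * t * pred)      # up-weight misclassified points
        w = w / w.sum()
        model.append((alpha, j, thr, s))
    return model

def adaboost_predict(model, X):
    """Strong classifier H(x) = sign(sum_m alpha_m * h_m(x))."""
    score = sum(a * s * np.where(X[:, j] >= thr, 1, -1)
                for a, j, thr, s in model)
    return np.sign(score)
```

Each round focuses the next weak classifier on the points the current ensemble still gets wrong, which is exactly the reweighting step described above.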
47. Viola–Jones
• Use AdaBoost to detect objects (faces)
• Every weak classifier is just a sum of binary filters
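The "binary filters" are Haar-like rectangle features, which Viola–Jones evaluates in constant time using an integral image. A minimal sketch of that mechanism (function names are illustrative; a real detector combines many such features in a cascade):

```python
import numpy as np

def integral_image(img):
    """ii[r, c] = sum of img[:r+1, :c+1]."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in O(1) from the integral image."""
    total = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        total -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        total -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

def haar_two_rect(ii, r, c, h, w):
    """Two-rectangle Haar feature: left half minus right half."""
    left = rect_sum(ii, r, c, r + h, c + w // 2)
    right = rect_sum(ii, r, c + w // 2, r + h, c + w)
    return left - right
```

Because each feature costs only a handful of array lookups regardless of its size, thousands of features can be evaluated per window fast enough for real-time face detection.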