How to Come Up With New
    Research Ideas?
          Jia-Bin Huang
     jbhuang0604@gmail.com


            Taiwan
           May, 2010




                             1 / 94
What is this talk about?
   Five approaches to come up with new ideas in computer vision.
   Extensive case studies (i.e., more than one hundred papers).
   A common-sense talk. No complicated theories or equations.
       I wish someone had told me this before.

Reference
   The content of this talk is greatly inspired by the "Raskar Idea
   Hexagon".




                                                                   2 / 94
Outline


1   Introduction

2   Five ways to come up with new ideas
       Seek different dimensions                        neXt = X^d
       Combine two or more topics                    neXt = X + Y
       Re-think the research directions                 neXt = \bar{X}
       Use powerful tools, find suitable problems       neXt = X ↑
       Add an appropriate adjective                neXt = Adj + X

3   What is a bad idea?



                                                                3 / 94
Active Topics in Computer Vision
[Szeliski Computer Vision: Algorithms and Applications 2010]

     Digital image processing                Blocks world, line labeling
       Generalized cylinders                    Pictorial structures
      Stereo correspondence                       Intrinsic images
            Optical flow                        Structure from motion
          Image pyramids                      Scale-space processing
           Shape from X                      Physically-based modeling
           Regularization                     Markov Random Fields
           Kalman filters                     3D range data processing
        Projective invariants                       Factorization
       Physics-based vision                          Graph cuts
          Particle filtering                 Energy-based segmentation
   Face recognition and detection               Subspace methods
  Image-based modeling/rendering            Texture synthesis/inpainting
    Computational photography                Feature-based recognition
     MRF inference algorithms                         Learning

                                                                           5 / 94
What can we learn from the past?
   The topics are diverse and evolve over time.




   The ways to come up with new ideas are similar. There are
   patterns to follow.




                                                               6 / 94
Seek different dimensions   neXt = X^d




   The only difference between a rut
   and a grave is their dimensions. -
            Ellen Glasgow



                                         9 / 94
Seek different dimensions                     neXt = X^d




Idea
     Can we increase/replace/transform the dimensions of the original
     problem to get new problems/solutions?

What kind of dimensions can we work on?
 1   Concrete dimensions (e.g., space, time, frequency)
 2   Abstract dimensions (e.g., properties)




                                                                   10 / 94
EX 1-1. Content-Aware Media Resizing
[Avidan et al. SIGGRAPH 07] [Rubinstein et al. SIGGRAPH 08]




Ideas
     Extend dimensions from 2D image to 3D video: image re-targeting
     ⇒ video re-targeting
     Other dimensions? E.g., 4D light field, infrared image, range
     image.
                                                                    11 / 94
EX 1-2. Video Stitching
[Rav-Acha et al. CVPR 05]




         Input video                Dynamic Panorama
Ideas
     Extend dimensions from image to video, i.e., Image Panorama ⇒
     Video Mosaics with Non-Chronological Time
     Increase the time dimension in both input and output



                                                                12 / 94
EX 1-3. Multi-Image Fusion
[Agarwala et al. SIGGRAPH 04]




Ideas
     Extend from single input image to multiple input images ⇒ Digital
     Photomontage
     Increase the dimension in input only.
                                                                    13 / 94
EX 1-4. Computational Photography (Coded
Photography)
[Raskar et al. SIGGRAPH 04, 06, 08] [Levin et al. SIGGRAPH 07]




Ideas
     Coded Photography: reversibly encode information about the
     scene in a single photograph
     Coding in Time (Exposure), Coded Illumination, Coding in Space
     (aperture), and Coded Wavelength
     Replace the dimension to code information of the light field


                                                                   14 / 94
EX 1-1. Photography in Low Light Conditions




       Flash                 Blurred                  Noisy

What can we do?
   Flash → Changes the overall scene appearance (cold and gray)
   Long exposure time (hand shake) → Blurred image
   Short exposure time (insufficient light) → Noisy image




                                                              15 / 94
EX 1-1-1. Flash/non-Flash Photography
[Petschnigg et al. SIGGRAPH 2004]




        Flash                No flash      Detail transfer with denoising

Ideas
     The original problem (taking a good photo in low light
     environments from a single image) is difficult.
     Increasing the dimension of the input (flash/no-flash image pair)
     makes the problem much easier.

                                                                     16 / 94
EX 1-1-2. Image Deblurring with Blurred/Noisy Image
Pairs
[Yuan et al. SIGGRAPH 2007]




     Blurred            Noisy      Enhanced noisy    Deblurred result

Ideas
     The original problem (taking a good photo in low light and
     flash-prohibited environments from a single image) is difficult.
     Increasing the dimension of the input (blurred/noisy image pair)
     makes the problem much easier.


                                                                       17 / 94
EX 1-1-3. Robust Flash Deblurring
[Zhou et al. CVPR 2010]




Ideas
     The original problem (taking a good photo in low light
     environments from a single image) is difficult.
     Increasing the dimension of the input (blurred/flash image pair)
     makes the problem much easier.



                                                                   18 / 94
EX 1-1-4. Dark Flash Photography
[Krishnan et al. SIGGRAPH 2009]




Ideas
     The original problem (taking a good photo in low light
     environments from a single image) is difficult.
     Increasing the dimension of the input (dark flash/noisy image pair)
     makes the problem much easier.
                                                                     19 / 94
EX 1-2. Brute-Force Vision
[Hays and Efros SIGGRAPH 07] [Dale et al. ICCV 09] [Agarwal et al. ICCV 09]
[Furukawa et al. ICCV 09]




Ideas
     Utilize a large collection of photos.
                                                                              20 / 94
EX 2-1. X Alignment/Registration (pixel, object, scene)
[Liu et al. CVPR 08, ECCV 08] [Berg et al. CVPR 05]




                                                      21 / 94
EX 2-2. Shape from X (shading, texture, specular)
[Lobay and Forsyth IJCV 06] [Fleming et al JOV 04] [Adato et al ICCV 07]




                shading                                 specular




                texture                               specular flow
                                                                           22 / 94
EX 2-3. Depth from X (stereo, (de-)focus, coded
aperture, diffusion, occlusion, semantic label)
[Levin et al. SIGGRAPH 07] [Hoiem et al. ICCV 07] [Liu et al. CVPR 10] [Zhou et al.
CVPR 10]




         Coded Aperture                           Semantic Labels




           Occlusion                                 Diffusion

                                                                                  23 / 94
EX 2-4. Infer X from a single image (geometric,
geography, illumination)
[Hoiem et al. ICCV 05] [Hays and Efros CVPR 08] [Lalonde et al. ICCV 09]




                                                                     Geometric




                                                                    Geography




                                                                    Illumination
                                                                              24 / 94
Combine two or more topics   neXt = X + Y




   To steal ideas from one person is
   plagiarism. To steal from many is
       research. - Wilson Mizner



                                            26 / 94
Combine two or more topics                 neXt = X + Y



Idea
     Can we combine two or more topics to get new problems or
     solutions?

What kind of topics can we combine?
 1   X, Y are methods
 2   X, Y are problems
 3   X, Y are areas




                                                                27 / 94
EX 1-1. Viola-Jones Object Detection Framework
[Viola and Jones CVPR 2001]




 Simple feature      Integral img   Boosting      Cascade structure

Ideas
     Paper title: Rapid Object Detection using a Boosted Cascade of
     Simple Features
     Viola-Jones object detection framework = Integral Images (simple
     feature)(1984) + AdaBoost(1997) + Cascade Architecture(long
     time ago)
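
A minimal sketch of the integral-image ingredient, assuming a grayscale image stored as a NumPy array. The function names and the particular two-rectangle feature are illustrative assumptions, not the detector from the paper. It shows why simple rectangle features are cheap: once the table is built, any rectangle sum costs four lookups.

```python
import numpy as np

def integral_image(img):
    """Summed-area table padded with a zero row/column: ii[r, c] == img[:r, :c].sum()."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom, left:right] using four table lookups."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(ii, top, left, height, width):
    """Hypothetical two-rectangle feature: left half minus right half of a window."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, top + height, left + half)
    right_sum = rect_sum(ii, top, left + half, top + height, left + 2 * half)
    return left_sum - right_sum

img = np.arange(20).reshape(4, 5)   # toy "image"
ii = integral_image(img)
window_sum = rect_sum(ii, 1, 1, 3, 4)
feature = two_rect_feature(ii, 0, 0, 4, 4)
```

Boosting and the cascade then operate on many such feature values; those parts are not sketched here.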


                                                                      28 / 94
EX 1-2. SIFT Flow = SIFT + Optical Flow
[Liu et al. ECCV 08 CVPR 09]




    Motion hallucination
                                   Label transfer
Ideas
     Dense sampling in time : optical flow :: dense sampling in world
     images : SIFT flow
                                                                       29 / 94
EX 1-3. Visual Tracking with Online Multiple Instance
Boosting
[Babenko et al. CVPR 09]




Ideas
     MILTrack = Multiple Instance Boosting (2005) + Online Boosting
     Tracking (2006)
                                                                      30 / 94
EX 2-1. High Dynamic Range Image Reconstruction
from Hand-held Cameras
[Lu et al. CVPR 2009]




Ideas
     HDR from Hand-held Cameras = High Dynamic Range
     Image Reconstruction + Image Deblurring
                                                            31 / 94
EX 2-2. Human Body Understanding
[Guan et al. ICCV 09]




Ideas
     Human Body Understanding = Shape Reconstruction + Pose
     Estimation

                                                              32 / 94
EX 2-3. Image Understanding
detection, tracking, recognition, segmentation, reconstruction, scene classification,
event recognition




                                                                                       33 / 94
EX 2-3-1. Detection + Tracking
[Andriluka et al. CVPR 08]




Ideas
      People detection and people tracking are highly correlated
      problems.
      Combining the two problems can potentially improve
      performance on each individual task.



                                                                   34 / 94
EX 2-3-2. Object Attribute + Recognition
[Farhadi et al. CVPR 09] [Lampert et al. CVPR 09]




Ideas
     Describe image by attributes
     Enable knowledge transfer to recognize classes with no visual
     examples
                                                                     35 / 94
EX 2-3-2. Object Recognition + Detection
[Yeh et al. CVPR 09]




Ideas
     Concurrent object localization and recognition
                                                      36 / 94
EX 2-3-3. Image Segmentation + Object Recognition
+ Event Recognition
[Li et al. CVPR 09]




Ideas
      Combine scene classification, image segmentation, image
      annotation
      All three tasks are mutually beneficial
                                                               37 / 94
EX 3-1. SixthSense - A Wearable Gestural Interface
[Mistry and Maes TED 2009]




Ideas
     SixthSense = Computer Vision (e.g., tracking, recognition) +
     Internet
                                                                    38 / 94
EX 3-2. Sikuli: Picture-driven computing
[Yeh et al. UIST 09] [Chang et al. CHI 10]




Ideas
      1. Readability/usability, 2. GUI serialization, 3. Computer vision
      on computer-generated figures
                                                                           39 / 94
Re-think the research directions   neXt = \bar{X}




If at first, the idea is not absurd, then
         there is no hope for it -
              Albert Einstein



                                              41 / 94
Re-think the research directions                  neXt = \bar{X}



Ideas
     Do the current research directions really make sense? What's the
     key problem?

What could we do?
 1   Re-formulate the original problem.
 2   Analyze, compare existing approaches. Provide insight to the
     problems.




                                                                    42 / 94
EX 1-1. Beyond Sliding Windows
[Lampert et al. CVPR 08]




             Rectangle set              Branch and bound search

Ideas
     Sliding window search ⇔ branch-and-bound search
     Represent a set of rectangles with 4 intervals
     Use branch-and-bound to find the optimal rectangle (object
     localization) efficiently

                                                                  43 / 94
EX 1-2. Beyond Categories
[Malisiewicz and Efros CVPR 08, NIPS 09]




Ideas
     Explicit categorization ⇔ Implicit categorization
     Ask "what is this like?" (association), instead of "what is it?"
     (categorization)
                                                                        44 / 94
EX 1-3. Motion-Invariant Photography
[Levin et al. SIGGRAPH 08] [Cho et al. ICCP 10]




Ideas
     Still camera ⇔ Moving camera (parabolic exposures)
     Enable the use of spatially invariant blur kernel estimation


                                                                  45 / 94
EX 1-4. Super-resolution from Single Image
[Glasner et al. ICCV 09]




Ideas
      Classical multi-image SR / example-based SR ⇔ Single SR
      framework
                                                             46 / 94
EX 2-1. In Defense of ...
[Boiman et al. CVPR 08] [Hartley PAMI 97]



Nearest-Neighbor Based Image Classification
     Two common practices degrade nearest-neighbor (NN) classifiers:
         Quantization of local image descriptors (used to generate
         "bags-of-words", codebooks).
         Computation of "Image-to-Image" distance, instead of
         "Image-to-Class" distance.
     Without them, NN performance ranks among the top leading
     learning-based image classifiers
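
A rough sketch of the "Image-to-Class" distance idea, assuming each image is already described by a set of local descriptor vectors. The function name, the dictionary layout, and the synthetic descriptors are assumptions made for illustration. Descriptors are used unquantized, and each query descriptor is matched against the pooled descriptors of a whole class rather than against a single image.

```python
import numpy as np

def image_to_class_classify(query_desc, class_desc):
    """Pick the class with the smallest summed nearest-descriptor distance.

    query_desc: (m, d) array of descriptors from the query image.
    class_desc: dict mapping label -> (n_label, d) array pooling the
                descriptors of all training images of that class.
    """
    costs = {}
    for label, pooled in class_desc.items():
        # Squared distances between every query and every pooled descriptor.
        sq = ((query_desc[:, None, :] - pooled[None, :, :]) ** 2).sum(axis=2)
        # Image-to-class distance: nearest pooled descriptor per query descriptor.
        costs[label] = float(sq.min(axis=1).sum())
    return min(costs, key=costs.get), costs

rng = np.random.default_rng(0)
class_desc = {
    "a": rng.normal(0.0, 1.0, (50, 8)),   # synthetic stand-ins for real descriptors
    "b": rng.normal(5.0, 1.0, (50, 8)),
}
query = rng.normal(5.0, 1.0, (10, 8))
label, costs = image_to_class_classify(query, class_desc)
```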

The 8-point Algorithm for the fundamental matrix
     Normalization, Normalization, Normalization!
     Performs almost as well as the best iterative algorithm
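
The normalization step can be sketched as follows, assuming 2D image points in an (n, 2) array. The function name is an assumption; the recipe (move the centroid to the origin, scale so the mean distance from it is sqrt(2)) is the usual form of this normalization. The fundamental-matrix estimation itself is not shown.

```python
import numpy as np

def normalize_points(pts):
    """Return normalized points and the 3x3 similarity transform T that produced them."""
    centroid = pts.mean(axis=0)
    mean_dist = np.linalg.norm(pts - centroid, axis=1).mean()
    s = np.sqrt(2.0) / mean_dist
    T = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    normed = (T @ homog.T).T[:, :2]
    return normed, T

rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 640.0, (30, 2))   # synthetic pixel coordinates
normed, T = normalize_points(pts)
```

An estimate computed from normalized points in both images has to be mapped back with the two transforms before use.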


                                                                  47 / 94
EX 2-2. Understanding blind deconvolution
[Levin et al. CVPR 2009]




Ideas
     Blind deconvolution: recover the sharp image x from the blurred one
     (y = k ⊗ x + n).
     MAP_{x,k} estimation often favors no-blur explanations.
     MAP_k can be estimated accurately since the kernel size is often
     smaller than the image size.
     Blind deconvolution should be addressed in this way: MAP_k +
     non-blind deconvolution.
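
A small 1D sketch of the forward model y = k ⊗ x + n, with an assumed box kernel and noise level. It only illustrates one ingredient of the problem, not the paper's analysis: the trivial no-blur pair (x = y, k = delta) reproduces the observation exactly, so the data-fit term alone cannot rule it out.

```python
import numpy as np

rng = np.random.default_rng(0)

x = np.zeros(64)
x[20:40] = 1.0                           # sharp 1D "image"
k = np.ones(5) / 5.0                     # assumed box blur kernel
n = 0.01 * rng.standard_normal(x.size)   # assumed sensor noise
y = np.convolve(x, k, mode="same") + n   # observed blurred, noisy signal

# No-blur explanation: the "sharp image" is y itself and the kernel is a delta.
delta = np.zeros(5)
delta[2] = 1.0
residual_noblur = np.linalg.norm(np.convolve(y, delta, mode="same") - y)

# True explanation: the residual equals the noise, so it is small but not zero.
residual_true = np.linalg.norm(np.convolve(x, k, mode="same") - y)
```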




                                                                       48 / 94
EX 2-3. Understanding camera trade-offs
[Levin et al. ECCV 08]




Ideas
      Traditional optics evaluation: 2D image sharpness (e.g., Modulation
      Transfer Function)
      Modern camera evaluation: How well does the recorded data
      allow us to estimate the visual world, i.e., the light field?
                                                                    49 / 94
EX 2-4. What is a good image segment?
[Bagon et al. ECCV 08]




Ideas
     A good image segment is one that can be easily composed using
     its own pieces, but is difficult to compose using pieces from other
     parts of the image
                                                                    50 / 94
EX 2-5. Lambertian Reflectance and Linear
Subspaces
[Basri and Jacobs PAMI 03]




Ideas
     The set of all Lambertian reflectance functions (the mapping from
     surface normals to intensities) obtained with arbitrary distant light
     sources lies close to a 9D linear subspace.
     Explain prior empirical results using linear subspace methods.

                                                                        51 / 94
Use powerful tools, find suitable problems neXt = X ↑




If the only tool you have is a hammer,
  you tend to see every problem as a
        nail. - Abraham Maslow



                                                  53 / 94
Use powerful tools, find suitable problems neXt = X ↑


What kinds of tools should we understand?
   Calculus of Variations
   Dimensionality Reduction
   Spectral Methods (specifically, spectral clustering)
   Probabilistic Graphical Model
   Structured Prediction
   Bilateral Filtering
   Sparse Representation
   and more ... spectral method/theory, information theory, (convex)
   optimization, etc



                                                                   54 / 94
EX 1. Calculus of Variations (1/2)
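From Calculus to Calculus of Variations
      Calculus                                  Calculus of Variations
      Functions                                 Functionals (functions of functions)
      $f: \mathbb{R}^n \to \mathbb{R}$          $f: F \to \mathbb{R}$, $f(u) = \int_{x_1}^{x_2} L(x, u(x), u'(x))\,dx$
      Derivative $\frac{df(x)}{dx}$             Variation $\frac{df(u)}{du}$
      $\lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$
                                                $\lim_{\epsilon \to 0} \frac{f(u + \epsilon\,\delta u) - f(u)}{\epsilon} = \left.\frac{\partial}{\partial \epsilon} f(u + \epsilon\,\delta u)\right|_{\epsilon = 0}$
      Local extremum                            Local extremum
      $\frac{df(x)}{dx} = 0$                    Euler-Lagrange equation

Total Variation (TV)
      $\mathrm{TV}(y) = \int_{x_0}^{x_1} |y'|\,dx$: the "oscillation strength" of y(x)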
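
A discrete sketch of the total variation, assuming y is sampled on a grid so that the integral of |y'| becomes the sum of absolute differences between neighboring samples. The test signals and noise level are assumptions chosen to show that oscillation raises TV.

```python
import numpy as np

def total_variation(y):
    """Discrete TV of a sampled 1D signal: sum of |y[i+1] - y[i]|."""
    return float(np.sum(np.abs(np.diff(y))))

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 201)
smooth = x                                        # monotone ramp from 0 to 1
noisy = x + 0.05 * rng.standard_normal(x.size)    # same ramp with added noise

tv_smooth = total_variation(smooth)   # equals the total rise of the ramp
tv_noisy = total_variation(noisy)     # larger, since the noise adds oscillation
```

TV denoising exploits this gap by looking for a signal close to the noisy one with small total variation.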




                                                                           55 / 94
EX 1. Calculus of Variations (2/2)
Total Variation Denoising/Inpainting




Applications in computer vision
    Optical flow [Horn and Schunck AI 81]
    Shape from shading [Horn and Brooks CVGIP 86]
    Edge detection [PAMI 87]
    Anisotropic diffusion [Perona and Malik PAMI 90]
    Active contours model [Kass et al. IJCV 88]
    Image segmentation [Morel and Solimini 95]
    Image restoration [Aubert and Vese SIAM Journal on NA 97]
EX 2. Dimensionality Reduction (1/2)

Why do we need dimensionality reduction?
Since high-dimensional data is everywhere (e.g., images, human gene
distributions, weather prediction), we need dimensionality reduction for
 1   processing data efficiently
 2   estimating the distributions of data accurately (curse of
     dimensionality)
 3   finding meaningful representations of data

Classification of dimensionality reduction methods
               Global structure preserved    Local structure preserved
  Linear               PCA, LDA                      LPP, NPE
 Nonlinear     ISOMAP, Kernel PCA, DM              LLE, LE, HE
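
As one concrete entry from the table, here is a minimal PCA sketch (linear, global structure preserved), computed from the SVD of the centered data. The function name and the synthetic data are assumptions for illustration.

```python
import numpy as np

def pca(X, n_components):
    """Project rows of X onto the top principal directions.

    Returns the projected data, the principal directions (as rows), and the
    variance explained by every direction.
    """
    Xc = X - X.mean(axis=0)                        # center the data
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                 # rows are principal directions
    explained_var = S ** 2 / (len(X) - 1)
    return Xc @ components.T, components, explained_var

# Synthetic 3D data that mostly varies along the direction (1, 2, -1).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2 * t, -t]) + 0.01 * rng.normal(size=(200, 3))

Z, comps, var = pca(X, n_components=1)
ratio = var[0] / var.sum()   # share of variance kept by one dimension
```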


                                                                     57 / 94
EX 2. Dimensionality Reduction (2/2)
Applications in computer vision
    Subspace as constraints
        Structure from motion [Tomasi and Kanade IJCV 92], Optical flow
        [Irani IJCV 02], Layer extraction [Ke and Kanade CVPR 01], Face
        alignment [Saragih et al. ICCV 09]
    Face recognition (e.g., PCA, LDA, LPP)
        PCA [Turk and Pentland PAMI 91], LDA [Belhumeur et al. PAMI 97],
        LPP [He et al. PAMI 05], Random [Wright et al. PAMI 09]
    Motion segmentation
        subspace separation [Kanatani ICCV 01] [Yan and Pollefeys ECCV
        06] [Rao et al. CVPR 08] [Lauer and Schnorr ICCV 09]
    Lighting
        linear subspace [Belhumeur and Kriegman IJCV 98] [Georghiades
        et al. PAMI 01] [Lee et al. PAMI 05] [Basri and Jacobs PAMI 03]
    Visual tracking
        incremental subspace learning [Ross et al. IJCV 08] [Li et al. CVPR
        08]
                                                                        58 / 94
EX 3. Spectral Clustering (1/3)
Why is spectral clustering popular?
      Can be solved efficiently by standard linear algebra software
      Very often outperforms traditional clustering algorithms

Spectral clustering algorithm
Input: a set of n data points and the number of clusters k
  1   Construct a similarity graph, e.g., ε-neighborhood, k-nearest neighbor,
      fully connected
  2   Construct the graph Laplacian, e.g., unnormalized (L) or normalized (L_rw, L_sym)
  3   Compute the first k eigenvectors of L (those with the smallest eigenvalues),
      v_1, ..., v_k
  4   Let V ∈ R^{n×k} be the matrix containing v_1, ..., v_k as columns
  5   Cluster the row vectors y_i of V (i = 1, ..., n) with the k-means algorithm
      into clusters C_1, ..., C_k
Output: Clusters A_1, ..., A_k with A_i = {j | y_j ∈ C_i}
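
The five steps can be sketched with NumPy alone. The choices below are assumptions made to keep the example short: a fully connected graph with Gaussian weights, the unnormalized Laplacian, a bandwidth sigma = 1, and a tiny k-means with farthest-point initialization.

```python
import numpy as np

def kmeans(Y, k, n_iter=50):
    """Tiny Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [Y[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

def spectral_clustering(X, k, sigma=1.0):
    # Step 1: fully connected similarity graph with Gaussian weights.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Step 2: unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # Step 3: eigh returns eigenvalues in ascending order, so the first k
    # columns are the eigenvectors with the smallest eigenvalues.
    _, vecs = np.linalg.eigh(L)
    # Step 4: stack them as columns of an n x k matrix.
    V = vecs[:, :k]
    # Step 5: cluster the rows of V with k-means.
    return kmeans(V, k)

# Two well-separated synthetic blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])
labels = spectral_clustering(X, k=2)
```

For large n, a sparse k-nearest-neighbor graph and a sparse eigensolver would replace the dense matrices used here.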
                                                                         59 / 94
EX 3. Spectral Clustering (1/3)
Why spectral clustering is popular?
      Can be solved efficiently by standard linear algebra software
      Very often outperform traditional clustering algorithms

Spectral clustering algorithm
Input: a set of data points
  1   Construct a similarity graph, e.g., -neighbor, k-nearest neighbor,
      fully connected
  2   Construct graph Laplacian, e.g., (un)normalized (L, Lrw , Lsym )
  3   Compute the first k (with smallest eigenvalues) eigenvectors of L,
      v1 , · · · , vk
  4   Let V ∈ Rn×k be a matrix contains v1 , ·, vk as columns
  5   Cluster the row vectors yi with the k-means algorithm into cluster
      C1 , · · · , Ck
Output: Clusters A1 , · · · , Ak with Ai = {j|yj ∈ Ci }
                                                                         59 / 94
EX 3. Spectral Clustering (1/3)
Why spectral clustering is popular?
      Can be solved efficiently by standard linear algebra software
      Very often outperform traditional clustering algorithms

Spectral clustering algorithm
Input: a set of data points
  1   Construct a similarity graph, e.g., -neighbor, k-nearest neighbor,
      fully connected
  2   Construct graph Laplacian, e.g., (un)normalized (L, Lrw , Lsym )
  3   Compute the first k (with smallest eigenvalues) eigenvectors of L,
      v1 , · · · , vk
  4   Let V ∈ Rn×k be a matrix contains v1 , ·, vk as columns
  5   Cluster the row vectors yi with the k-means algorithm into cluster
      C1 , · · · , Ck
Output: Clusters A1 , · · · , Ak with Ai = {j|yj ∈ Ci }
                                                                         59 / 94
EX 3. Spectral Clustering (2/3)


Why does it work?
   Graph Cut Point of View: Construct a partition that minimizes the
   weight across the cut (the well-known mincut problem) while
   balancing the clusters (e.g., RatioCut, Normalized cut).
   Random Walks Point of View: When minimizing Ncut, we
   actually look for a cut through the graph such that a random walk
   seldom transitions from one cluster to another.
   Perturbation Theory Point of View: The distance between
   eigenvectors from the ideal and nearly ideal graph Laplacian is
   bounded by a constant times a norm of the error matrix. If the
   perturbations are not too large, then the k-means algorithm
   will still separate the groups from each other.
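
To make the graph cut view concrete, the sketch below compares the plain cut weight with the normalized cut, Ncut(A, B) = cut(A, B)/vol(A) + cut(A, B)/vol(B), on a small graph. The graph, its weights, and the function names are made up for illustration.

```python
def cut(W, A, B):
    """Total weight of the edges running between node sets A and B."""
    return sum(W[i][j] for i in A for j in B)

def vol(W, A):
    """Sum of the degrees of the nodes in A."""
    return sum(sum(W[i]) for i in A)

def ncut(W, A, B):
    """Normalized cut: penalizes cuts that isolate small-volume node sets."""
    c = cut(W, A, B)
    return c / vol(W, A) + c / vol(W, B)

# Two tight triangles joined by one edge, plus a pendant node 6 attached
# to node 5 by a weak edge.
W = [[0, 1, 1, 0, 0, 0, 0],
     [1, 0, 1, 0, 0, 0, 0],
     [1, 1, 0, 0.5, 0, 0, 0],
     [0, 0, 0.5, 0, 1, 1, 0],
     [0, 0, 0, 1, 0, 1, 0],
     [0, 0, 0, 1, 1, 0, 0.2],
     [0, 0, 0, 0, 0, 0.2, 0]]

balanced = ([0, 1, 2], [3, 4, 5, 6])
lopsided = ([0, 1, 2, 3, 4, 5], [6])

for name, (A, B) in (("balanced", balanced), ("lopsided", lopsided)):
    print(name, cut(W, A, B), ncut(W, A, B))
```

Mincut alone prefers cutting off the single pendant node, while Ncut prefers the balanced split between the two triangles.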



                                                                  60 / 94
EX 3. Spectral Clustering (3/3)
[Shi and Malik PAMI 02]

                  Eigenvectors carry contour information.




                                                            61 / 94
EX 4. Probabilistic Graphical Model (1/2)




What are probabilistic graphical models?
    A marriage between probability theory and graph theory.
    A natural tool for dealing with uncertainty and complexity.
    Provide a way to view all probabilistic systems (e.g., mixture
    models, factor analysis, hidden Markov models, Kalman filters, and
    Ising models) as instances of a common underlying formalism.
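
As a small illustration of the common formalism, the sketch below writes a hidden Markov model as a product of local factors on a chain-shaped graph and checks that the joint distribution sums to one. All probabilities, state names, and function names are hypothetical.

```python
from itertools import product

states = ("rain", "dry")          # hidden variables z_t
symbols = ("umbrella", "none")    # observed variables x_t

prior = {"rain": 0.3, "dry": 0.7}                    # p(z_1)
trans = {"rain": {"rain": 0.6, "dry": 0.4},          # p(z_t | z_{t-1})
         "dry": {"rain": 0.2, "dry": 0.8}}
emit = {"rain": {"umbrella": 0.9, "none": 0.1},      # p(x_t | z_t)
        "dry": {"umbrella": 0.2, "none": 0.8}}

def joint(z, x):
    """p(z, x) as a product of one local factor per node of the chain."""
    p = prior[z[0]] * emit[z[0]][x[0]]
    for t in range(1, len(z)):
        p *= trans[z[t - 1]][z[t]] * emit[z[t]][x[t]]
    return p

T = 3
total = sum(joint(z, x)
            for z in product(states, repeat=T)
            for x in product(symbols, repeat=T))

# Likelihood of one observation sequence: sum out the hidden states.
obs = ("umbrella", "umbrella", "none")
likelihood = sum(joint(z, obs) for z in product(states, repeat=T))
print(total, likelihood)
```

Brute-force enumeration is used only to keep the example short; the graph structure is what allows efficient inference on larger models.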




                                                                 62 / 94
EX 4. Probabilistic Graphical Model (2/2)




                                            63 / 94
EX 5. Structured Prediction (1/2)


What is structured prediction?
    Structured prediction is a framework for solving problems of
    classification or regression in which the output variables are
    mutually dependent or constrained.
    Lots of examples
        Natural language parsing
        Machine translation
        Object segmentation
        Gene prediction
        Protein alignment
        Numerous tasks in computational linguistics, speech, vision,
        biology.
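
A minimal sketch of why dependent outputs change the problem: in sequence labeling the best label at one position depends on its neighbors, so the labels are decoded jointly (here with the Viterbi algorithm) instead of one at a time. The scores and the function name are hypothetical.

```python
def viterbi(unary, pairwise):
    """Highest-scoring label sequence under unary + pairwise scores.

    unary[t][y]    : score of label y at position t
    pairwise[a][b] : score of label a being followed by label b
    """
    n_labels = len(unary[0])
    best = list(unary[0])      # best score of any path ending in each label
    back = []                  # back-pointers for recovering the path
    for scores in unary[1:]:
        prev = [max(range(n_labels), key=lambda a: best[a] + pairwise[a][b])
                for b in range(n_labels)]
        best = [best[prev[b]] + pairwise[prev[b]][b] + scores[b]
                for b in range(n_labels)]
        back.append(prev)
    path = [max(range(n_labels), key=lambda y: best[y])]
    for prev in reversed(back):
        path.append(prev[path[-1]])
    return path[::-1]

# Position 2 weakly prefers label 0 on its own, but neighbors prefer agreement.
unary = [[0.0, 1.0], [0.0, 1.0], [0.6, 0.4], [0.0, 1.0]]
pairwise = [[0.5, 0.0], [0.0, 0.5]]

independent = [max(range(2), key=lambda y: u[y]) for u in unary]
structured = viterbi(unary, pairwise)
print(independent, structured)
```

Predicting each position independently flips the label at position 2, while joint decoding keeps the sequence consistent.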




                                                                       64 / 94
EX 5. Structured Prediction (2/2)
Applications [Lampert et al. ECCV 08] [Desai et al. ICCV 09]




                                                               65 / 94
EX 6. Bilateral Filtering (1/3)
What’s Bilateral Filtering?
    A technique to smooth images while preserving edges
    Ubiquitous in image processing, computational photography
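
A minimal 1-D sketch of the idea: each output sample is a weighted average of its neighbors, and the weight is the product of a spatial Gaussian (closeness in position) and a range Gaussian (closeness in value). Samples on the other side of an edge differ strongly in value, receive almost no weight, and so the edge survives. The parameter names `sigma_s`, `sigma_r`, `radius` and the test signal are illustrative assumptions.

```python
import math

def bilateral_1d(signal, sigma_s=2.0, sigma_r=0.1, radius=4):
    """Edge-preserving smoothing of a 1-D list of samples."""
    out = []
    for i, center in enumerate(signal):
        num = den = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w_space = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2))
            w_range = math.exp(-((signal[j] - center) ** 2) / (2 * sigma_r ** 2))
            num += w_space * w_range * signal[j]
            den += w_space * w_range
        out.append(num / den)
    return out

# Step edge from 0 to 1 with a small alternating ripple standing in for noise.
noisy = [(0.0 if i < 10 else 1.0) + (0.03 if i % 2 else -0.03) for i in range(20)]
smooth = bilateral_1d(noisy)
print([round(v, 3) for v in smooth])
```

A plain Gaussian blur corresponds to dropping `w_range`; it would reduce the ripple as well but also smear the step.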




                                                                66 / 94
EX 6. Bilateral Filtering (2/3)
[Bennett and McMillan SIGGRAPH 05] [Eisemann and Durand SIGGRAPH 04] [Jones
et al. SIGGRAPH 03] [Winnemöller et al. SIGGRAPH 06] [Bae et al. SIGGRAPH 02]




                                                                            67 / 94
EX 6. Bilateral Filtering (3/3)
How does the bilateral filter relate to other methods?




Interpretation
    The bilateral filter is equivalent to mode filtering in local histograms
    The bilateral filter can be interpreted in terms of robust statistics,
    since it is related to a cost function
    The bilateral filter is a discretization of a particular kind of
    PDE-based anisotropic diffusion
                                                                         68 / 94
EX 7. Sparse Representation (1/4)

Ideas
   Natural signals (e.g., audio, images) usually admit a sparse
   representation (i.e., they can be well represented by a linear
   combination of a few atom signals)
   Successfully applied to various areas in signal/image processing,
   vision and graphics.
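
The phrase "a linear combination of a few atom signals" can be illustrated with matching pursuit, one simple greedy sparse coding method. It is chosen here only for brevity; the slides do not prescribe a solver, and the dictionary and signal are made up.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(x, atoms, n_nonzero):
    """Greedily approximate x with at most n_nonzero unit-norm atoms.

    Returns ({atom index: coefficient}, residual).
    """
    residual = list(x)
    coeffs = {}
    for _ in range(n_nonzero):
        # Pick the atom most correlated with what is still unexplained.
        k = max(range(len(atoms)), key=lambda i: abs(dot(residual, atoms[i])))
        c = dot(residual, atoms[k])
        coeffs[k] = coeffs.get(k, 0.0) + c
        residual = [r - c * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual

s = 1 / math.sqrt(2)
atoms = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
         [s, s, 0, 0], [0, 0, s, s]]       # overcomplete: 6 atoms in R^4
x = [3 * s, 3 * s, 0.0, 2.0]               # built as 3*atoms[4] + 2*atoms[3]
coeffs, residual = matching_pursuit(x, atoms, n_nonzero=2)
print(coeffs, residual)
```

With the standard basis alone this signal needs three nonzero coefficients; the overcomplete dictionary represents it with two.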




                                                                  69 / 94
EX 7. Sparse Representation (2/4)
Image Restoration [Aharon et al. TSP 06] [Julien et al. TIP 08]




                           denoising                              Inpainting




                          Demosaicing                             Inpainting

                                                                               70 / 94
EX 7. Sparse Representation (3/4)
Classification [Wright et al. PAMI 09] [Julien et al. CVPR ECCV NIPS 08]




            face recognition                         edge detection




         texture classification                     pixel classification



                                                                          71 / 94
EX 7. Sparse Representation (4/4)
Compressive sensing [Donoho TIT 06] [Candès and Tao TIT 05, 06]




        and more (e.g., low-rank matrix completion, robust PCA)
                                                                  72 / 94
Add an appropriate adjective   neXt = Adj + X




  "There is only one religion, though there are a hundred versions of it."
                          George Bernard Shaw



                                                74 / 94
Add an appropriate adjective                    neXt = Adj + X

What kinds of adjectives can we use?
   linear ⇔ non-linear
   generative / reconstructive ⇔ discriminative
   rule-based / hand-designed ⇔ learning-based
   single scale ⇔ multi-scale
   single step ⇔ progressive
   batch processing ⇔ incremental / online processing
   fixed ⇔ adaptive / dynamic to data
   parametric ⇔ non-parametric
   Z - invariant (Z = translation / scale / rotation / noise / facial
   expression / pose / lighting / occlusion)
   Z - aware (Z = motion / content / semantic / context / occlusion)


                                                                       75 / 94
EX 1. Linear ⇔ Non-linear




    Hard to find a straight line to separate them into two clusters?

Ideas
   Linear methods may not capture the nonlinear structure in the
   original data representation
   Nonlinear methods
        Kernel tricks (e.g., Kernel PCA, Kernel LDA, Kernel SVM, etc.)
        Manifold learning (e.g., ISOMAP, LLE, Laplacian eigenmap, etc.)
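
A small sketch of the kernel trick on a "two rings" data set, which no straight line separates. The degree-2 polynomial kernel k(x, z) = (x · z)^2 equals an inner product in the feature space φ(x) = (x_1^2, √2 x_1 x_2, x_2^2), and in that space a linear rule separates the rings. The data, threshold, and names are illustrative assumptions.

```python
import math

def phi(x):
    """Explicit feature map of the kernel k(x, z) = (x . z)^2 in 2-D."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def poly_kernel(x, z):
    """The same inner product, computed without ever forming phi."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def ring(radius, n=8):
    """n points evenly spaced on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]

inner, outer = ring(1.0), ring(3.0)

# A linear rule in feature space: w . phi(x) = x1^2 + x2^2 against a threshold.
w, threshold = (1.0, 0.0, 1.0), 4.0

def is_outer(x):
    return sum(wi * fi for wi, fi in zip(w, phi(x))) > threshold

print([is_outer(x) for x in inner], [is_outer(x) for x in outer])
```

Kernel PCA, kernel LDA, and kernel SVMs rely on the same identity: they only ever need `poly_kernel`-style evaluations, never the feature vectors themselves.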

                                                                         76 / 94
EX 2. Generative ⇔ Discriminative


Classification task: X → Y
    Generative classifiers estimate the class-conditional pdfs P(X|Y) and
    the prior probabilities P(Y)
        Naive Bayes, Mixtures of Gaussians, Mixtures of experts, Hidden
        Markov Models (HMM), Sigmoidal belief networks, Bayesian
        networks, Markov random fields (MRF)
    Discriminative classifiers estimate the posterior probabilities P(Y|X)
        Logistic regression, SVMs, Traditional neural networks, Nearest
        neighbor, Conditional Random Fields (CRF)
    Bayes’ rule
                           P(Y|X) = P(X|Y) P(Y) / P(X)
    Two different perspectives for viewing a problem
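
The two perspectives can be sketched side by side on a toy 1-D problem. The generative model fits one Gaussian per class plus a prior and reaches P(Y|X) through Bayes' rule; the discriminative model fits P(Y|X) directly with logistic regression. The data, learning rate, and function names are assumptions made for the example.

```python
import math

xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 1, 1]

def fit_generative(xs, ys):
    """Estimate (mean, variance, prior) of a Gaussian P(X|Y=c) for each class c."""
    params = {}
    for c in set(ys):
        pts = [x for x, y in zip(xs, ys) if y == c]
        mean = sum(pts) / len(pts)
        var = sum((x - mean) ** 2 for x in pts) / len(pts)
        params[c] = (mean, var, len(pts) / len(xs))
    return params

def posterior_generative(params, x):
    """P(Y=1|X=x) via Bayes' rule; the denominator is P(X=x)."""
    def gauss(mean, var):
        return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    joint = {c: gauss(m, v) * prior for c, (m, v, prior) in params.items()}
    return joint[1] / sum(joint.values())

def fit_discriminative(xs, ys, lr=0.1, steps=500):
    """Logistic regression P(Y=1|X=x) = sigmoid(w*x + b), batch gradient descent."""
    w = b = 0.0
    for _ in range(steps):
        errs = [1 / (1 + math.exp(-(w * x + b))) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errs, xs)) / len(xs)
        b -= lr * sum(errs) / len(xs)
    return w, b

def posterior_discriminative(w, b, x):
    return 1 / (1 + math.exp(-(w * x + b)))

params = fit_generative(xs, ys)
w, b = fit_discriminative(xs, ys)
print(posterior_generative(params, 1.2), posterior_discriminative(w, b, 1.2))
```

Both models agree on which side of zero belongs to which class; only the generative one could also be used to sample new values of X.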


                                                                          77 / 94
EX 3. Rule-based / Hand-designed ⇔ Learning-based




               Hard to find rules to recognize digits?

Ideas
   It may be difficult to design a set of rules for certain tasks, such as
   handwritten digit recognition
   Turn to machine learning methods instead


                                                                      78 / 94
EX 4. Single scale ⇔ Multi-scale
[Zelnik-Manor and Perona NIPS 04]




Ideas
     We live in a multi-scale world (atom ↔ universe)
     Image pyramids / scale-space theory / wavelet representation →
     all attempt to capture the multi-scale properties of signals/images.
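
A minimal sketch of an image pyramid, shown on a 1-D signal to keep it short: each level is a blurred and subsampled version of the previous one, so coarse levels keep only large-scale structure. The [1, 2, 1] / 4 blur kernel, the edge handling, and the function names are illustrative choices.

```python
def downsample(signal):
    """Blur with [1, 2, 1] / 4 (edge samples replicated), then keep every other sample."""
    n = len(signal)
    blurred = [(signal[max(i - 1, 0)] + 2 * signal[i] + signal[min(i + 1, n - 1)]) / 4
               for i in range(n)]
    return blurred[::2]

def pyramid(signal, levels):
    """Progressively coarser versions of the signal, finest level first."""
    out = [list(signal)]
    for _ in range(levels - 1):
        out.append(downsample(out[-1]))
    return out

fine = [0, 0, 0, 0, 8, 8, 8, 8, 0, 0, 0, 0, 8, 8, 8, 8]
levels = pyramid(fine, levels=4)
for level in levels:
    print(len(level), level)
```

A multi-scale method would run the same analysis on every level and combine the results, instead of committing to a single resolution.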



                                                                       79 / 94
EX 5. Single step ⇔ Progressive
[Yuan et al. SIGGRAPH 08]




Ideas
     Some problems are difficult to solve in one step → solve them
     progressively
                                                                  80 / 94
EX 6. Batch processing ⇔ Incremental / Online processing
Ideas
   Online methods can handle potentially infinite data samples and
   time-varying data

Examples
   PCA → Incremental PCA (many variants)
   LDA → Incremental LDA (many variants)
   SVM → Incremental and decremental SVM [Cauwenberghs and
   Poggio NIPS 01]
   Dictionary learning (e.g., K-SVD) [Aharon and Elad TSP 06] →
   Online dictionary learning [Mairal et al. ICML/JMLR 09]
   AdaBoost → Online boosting [Grabner and Bischof CVPR 06]
   Multiple instance boosting → Online multiple instance boosting
   [Babenko et al. CVPR 09]
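
The batch → online move can be sketched on the simplest possible statistic. The code below is not any of the algorithms cited above; it uses Welford's update for a running mean and variance, where each sample is seen once and then discarded. The class name and the sample stream are made up.

```python
class RunningStats:
    """Mean and variance updated one sample at a time (Welford's method)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0      # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of the samples seen so far."""
        return self._m2 / self.n if self.n else 0.0


stream = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stats = RunningStats()
for sample in stream:
    stats.update(sample)        # constant memory, no second pass over the data

# Batch reference: needs the whole data set in memory.
batch_mean = sum(stream) / len(stream)
batch_var = sum((x - batch_mean) ** 2 for x in stream) / len(stream)
print(stats.mean, stats.variance, batch_mean, batch_var)
```

Incremental PCA and the other online variants follow the same pattern: keep a small summary of the past and update it as each new sample arrives.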
                                                                    81 / 94
EX 7. Fixed ⇔ Adaptive / Dynamic
[Elad and Aharon TIP 06]




Ideas
     Adaptive approaches usually outperform the predefined/fixed
     ones.
                                                                 82 / 94
EX 8. Parametric ⇔ Non-parametric
Probability density estimation
    Parametric
        Assumes a specific functional form with parameter θ
             e.g., Gaussian distribution with unknown mean and variance, mixture
             of Gaussians
        Parameter estimation
             Estimative approach: p(x) = p(x|θ_best)
             Bayesian approach: p(x) = ∫ a(θ) p(x|θ) dθ
    Non-parametric
        Does not assume a specific form for the probability distribution
             e.g., histogram, kernel density estimation (or Parzen window method)
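
The contrast can be sketched on data with two well-separated groups. The parametric estimate fits a single Gaussian by maximum likelihood (the estimative approach with θ = (mean, std)); the non-parametric estimate is a kernel density estimate with Gaussian kernels. The data, the bandwidth of 0.3, and the function names are illustrative assumptions.

```python
import math

def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def fit_gaussian(data):
    """Parametric: theta = (mean, std), estimated by maximum likelihood."""
    mean = sum(data) / len(data)
    std = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))
    return mean, std

def kde(x, data, bandwidth):
    """Non-parametric: average of Gaussian bumps centered on the samples."""
    return sum(gaussian_pdf(x, xi, bandwidth) for xi in data) / len(data)

data = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]     # two well-separated groups
mean, std = fit_gaussian(data)
for x in (0.0, 2.0):
    print(x, gaussian_pdf(x, mean, std), kde(x, data, bandwidth=0.3))
```

The single Gaussian puts its peak at 0, where there is no data at all, while the kernel estimate follows the two groups; the price is a bandwidth that has to be chosen.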




                                                                              83 / 94
EX 9. Z - invariant

Make your method robust to factors that can degrade performance
   noise, e.g., Gaussian additive noise, impulse noise, non-uniform
   noise (e.g., image restoration)
   translation shift (e.g., near-duplicate image/video detection, image
   search)
   scale change (e.g., object detection, feature extraction)
   perspective distortion (e.g., feature extraction)
   deformation (e.g., non-rigid registration, part-based object
   detection)
   pose variation (e.g., human pose estimation)
   lighting variation (e.g., face recognition)
   partial occlusion (e.g., object detection and recognition)


                                                                    84 / 94
EX 10. Z - aware
[Wang et al. SIGGRAPH Asia 09] [Wang et al. SIGGRAPH 10]




                       motion-aware video resizing
Make your method aware of potential failure cases
     Motion (e.g., video processing)
     Content (e.g., image processing)
     Semantic (e.g., image and video indexing/retrieval)
     Context (e.g., image understanding)
     Occlusion (e.g., detection/tracking)
                                                           85 / 94
Outline


1   Introduction

2   Five ways to come up with new ideas
       Seek different dimensions                   neXt = X^d
       Combine two or more topics                  neXt = X + Y
       Re-think the research directions            neXt = X̄
       Use powerful tools, find suitable problems  neXt = X↑
       Add an appropriate adjective                neXt = Adj + X

3   What is a bad idea?




                                                               86 / 94
What is a bad idea?



   Naive combination of two or more methods
       Avoid writing a "pipeline system" paper
   Blind application of tools
       Using feature X and classifier Y without motivation or justification
   Following the hype
       Too many competitors
   Doing something just because it can be done
       Do the right things; don't just do things right




                                                                           87 / 94
Thank you for your kind attention.
             Questions?
For more complete materials, please visit my blog
http://jbhuang0604.blogspot.com/




                                                    94 / 94

Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfPrerana Jadhav
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxkarenfajardo43
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxSayali Powar
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptxmary850239
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 

Recently uploaded (20)

Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped data
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 Database
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHS
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 

How to come up with new research ideas

  • 1. How to Come Up With New Research Ideas? Jia-Bin Huang jbhuang0604@gmail.com Taiwan May , 2010 1 / 94
  • 2. What is this talk about? Five approaches to coming up with new ideas in computer vision. Extensive case studies (i.e., more than one hundred papers). A common-sense talk. No complicated theories or equations. I wish someone had told me this before. Reference: The content of this talk is greatly inspired by the "Raskar Idea Hexagon". 2 / 94
  • 5. Outline 1 Introduction 2 Five ways to come up with new ideas: Seek different dimensions (neXt = $X^d$); Combine two or more topics (neXt = $X + Y$); Re-think the research directions (neXt = $\bar{X}$); Use powerful tools, find suitable problems (neXt = $X^{\uparrow}$); Add an appropriate adjective (neXt = Adj + $X$) 3 What is a bad idea? 3 / 94
  • 7. Active Topics in Computer Vision [Szeliski Computer Vision: Algorithms and Applications 2010] Digital image processing Blocks world, line labeling Generalized cylinders Pictorial structures Stereo correspondence Intrinsic images Optical flow Structure from motion Image pyramids Scale-space processing Shape from X Physically-based modeling Regularization Markov Random Fields Kalman filters 3D range data processing Projective invariants Factorization Physics-based vision Graph cuts Particle filtering Energy-based segmentation Face recognition and detection Subspace methods Image-based modeling/rendering Texture synthesis/inpainting Computational photography Feature-based recognition MRF inference algorithms Learning 5 / 94
  • 8. What can we learn from the past? The topics are diverse and evolve over time. The ways to come up with new ideas are similar. There are patterns to follow. 6 / 94
  • 11. Seek different dimensions (neXt = $X^d$) "The only difference between a rut and a grave is their dimensions." - Ellen Glasgow 9 / 94
  • 12. Seek different dimensions (neXt = $X^d$) Idea: Can we increase/replace/transform the dimensions of the original problem to get new problems/solutions? What kind of dimensions can we work on? 1 Concrete dimensions (e.g., space, time, frequency) 2 Abstract dimensions (e.g., properties) 10 / 94
  • 13. EX 1-1. Content-Aware Media Resizing [Avidan et al. SIGGRAPH 07] [Rubinstein et al. SIGGRAPH 08] Ideas Extend dimensions from 2D image to 3D video: image re-targeting ⇒ video re-targeting Other dimensions? E.g., 4D light field, infrared image, range image. 11 / 94
  • 14. EX 1-2. Video Stitching [Rav-Acha et al. CVPR 05] Input video Dynamic Panorama Ideas Extend dimensions from image to video, i.e., Image Panorama ⇒ Video Mosaics with Non-Chronological Time Increase the time dimension in both input and output 12 / 94
  • 15. EX 1-3. Multi-Image Fusion [Agarwala et al. SIGGRAPH 04] Ideas Extend from single input image to multiple input images ⇒ Digital Photomontage Increase the dimension in input only. 13 / 94
  • 16. EX 1-4. Computational Photography (Coded Photography) [Raskar et al. SIGGRAPH 04, 06, 08] [Levin et al. SIGGRAPH 07] Ideas: Coded Photography: reversibly encode information about the scene in a single photograph. Coding in Time (Exposure), Coded Illumination, Coding in Space (aperture), and Coded Wavelength. Replace the dimension used to code information about the light field. 14 / 94
  • 17. EX 1-1. Photography in Low Light Conditions Flash Blurred Noisy What can we do? Flash → changes the overall scene appearance (cold and gray). Long exposure time (hand shake) → blurred image. Short exposure time (insufficient light) → noisy image. 15 / 94
  • 18. EX 1-1-1. Flash/non-Flash Photography [Petschnigg et al. SIGGRAPH 2004] Flash No flash Detail transfer with denoising Ideas: The original problem (taking a good photo in a low-light environment from a single image) is difficult. Increasing the dimension of the input (flash/no-flash image pair) makes the problem much easier. 16 / 94
  • 19. EX 1-1-2. Image Deblurring with Blurred/Noisy Image Pairs [Yuan et al. SIGGRAPH 2007] Blurred Noisy Enhanced noisy Deblurred result Ideas: The original problem (taking a good photo in a low-light, flash-prohibited environment from a single image) is difficult. Increasing the dimension of the input (blurred/noisy image pair) makes the problem much easier. 17 / 94
  • 20. EX 1-1-3. Robust Flash Deblurring [Zhou et al. CVPR 2010] Ideas: The original problem (taking a good photo in a low-light environment from a single image) is difficult. Increasing the dimension of the input (blurred/flash image pair) makes the problem much easier. 18 / 94
  • 21. EX 1-1-4. Dark Flash Photography [Krishnan et al. SIGGRAPH 2009] Ideas: The original problem (taking a good photo in a low-light environment from a single image) is difficult. Increasing the dimension of the input (dark flash/noisy image pair) makes the problem much easier. 19 / 94
  • 22. EX 1-2. Brute-Force Vision [Hays and Efros SIGGRAPH 07] [Dale et al. ICCV 09] [Agarwal et al. ICCV 09] [Furukawa et al. ICCV 09] Ideas Utilize a large collection of photos. 20 / 94
  • 23. EX 2-1. X Alignment/Registration (pixel, object, scene) [Liu et al. CVPR 08, ECCV 08] [Berg et al. CVPR 05] 21 / 94
  • 24. EX 2-2. Shape from X (shading, texture, specular) [Lobay and Forsyth IJCV 06] [Fleming et al JOV 04] [Adato et al ICCV 07] shading specular texture specular flow 22 / 94
  • 25. EX 2-3. Depth from X (stereo, (de-)focus, coded aperture, diffusion, occlusion, semantic label) [Levin et al. SIGGRAPH 07] [Hoiem et al. ICCV 07] [Liu et al. CVPR 10] [Zhou et al. CVPR 10] Coded Aperture Semantic Labels Occlusion Diffusion 23 / 94
  • 26. EX 2-4. Infer X from a single image (geometric, geography, illumination) [Hoiem et al. ICCV 05] [Hays and Efros CVPR 08] [Lalonde et al. ICCV 09] Geometric Geography Illumination 24 / 94
  • 28. Combine two or more topics neXt = X + Y To steal ideas from one person is plagiarism. To steal from many is research. - Wilson Mizner 26 / 94
  • 29. Combine two or more topics neXt = X + Y Idea Can we combine two or more topics to get new problems or solutions? What kind of topics can we combine? 1 X, Y are methods 2 X, Y are problems 3 X, Y are areas 27 / 94
  • 30. EX 1-1. Viola-Jones Object Detection Framework [Viola and Jones CVPR 2001] Simple feature Integral img Boosting Cascade structure Ideas Paper title: Rapid Object Detection using a Boosted Cascade of Simple Features Viola-Jones object detection framework = Integral Images (simple feature)(1984) + AdaBoost(1997) + Cascade Architecture(long time ago) 28 / 94
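The integral image is the ingredient that makes the "simple features" cheap to evaluate. The following is a minimal sketch of that one component only, not the Viola-Jones detector itself; the function names and the two-rectangle feature layout are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border: ii[r, c] = sum of img[:r, :c]."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom, left:right] from four table lookups."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(ii, top, left, height, width):
    """Hypothetical Haar-like feature: left half minus right half of a window."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, top + height, left + half)
    right_sum = rect_sum(ii, top, left + half, top + height, left + 2 * half)
    return left_sum - right_sum

# Toy 4x4 "image" used only to exercise the functions.
img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
inner = rect_sum(ii, 1, 1, 3, 3)             # sum of the central 2x2 block
feature = two_rect_feature(ii, 0, 0, 4, 4)   # left two columns minus right two
```

Once the table is built, every rectangle sum costs four lookups regardless of the rectangle size, which is why many such features can be evaluated per window.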
  • 31. EX 1-2. SIFT Flow = SIFT + Optical Flow [Liu et al. ECCV 08 CVPR 09] Motion hallucination Label transfer Ideas Dense sampling in time : optical flow :: dense sampling in world images : SIFT flow 29 / 94
  • 32. EX 1-3. Visual Tracking with Online Multiple Instance Boosting [Babenko et al. CVPR 09] Ideas MILTrack = Multiple Instance Boosting (2005) + Online Boosting Tracking (2006) 30 / 94
  • 33. EX 2-1. High Dynamic Range Image Reconstruction from Hand-held Cameras [Lu et al. CVPR 2009] Ideas: HDR from Hand-held Cameras = High Dynamic Range Image Reconstruction + Image Deblurring 31 / 94
  • 34. EX 2-2. Human Body Understanding [Guan et al. ICCV 09] Ideas Human Body Understanding = Shape Reconstruction + Pose Estimation 32 / 94
  • 35. EX 2-3. Image Understanding detection, tracking, recognition, segmentation, reconstruction, scene classification, event recognition 33 / 94
  • 36. EX 2-3-1. Detection + Tracking [Andriluka et al. CVPR 08] Ideas: People detection and people tracking are highly correlated problems. Combining the two problems can potentially improve performance on each individual task. 34 / 94
  • 37. EX 2-3-2. Object Attribute + Recognition [Farhadi et al. CVPR 09] [Lampert et al. CVPR 09] Ideas: Describe images by their attributes. This enables knowledge transfer to recognition classes with no visual examples. 35 / 94
  • 38. EX 2-3-2. Object Recognition + Detection [Yeh et al. CVPR 09] Ideas Concurrent object localization and recognition 36 / 94
  • 39. EX 2-3-3. Image Segmentation + Object Recognition + Event Recognition [Li et al. CVPR 09] Ideas Combine scene classification, image segmentation, image annotation All three tasks are mutually beneficial 37 / 94
  • 40. EX 3-1. SixthSense - A Wearable Gestural Interface [Mistry and Maes TED 2009] Ideas SixthSense = Computer Vision (e.g., tracking, recognition) + Internet 38 / 94
  • 41. EX 3-2. Sikuli:Picture-driven computing [Yeh et al. UIST 09] [Chang et al. CHI 10] Ideas 1. Readability/usability, 2. GUI serialization, 3. Computer vision on computer-generated figures 39 / 94
  • 43. Re-think the research directions (neXt = $\bar{X}$) "If at first the idea is not absurd, then there is no hope for it." - Albert Einstein 41 / 94
  • 44. Re-think the research directions (neXt = $\bar{X}$) Ideas: Do the current research directions really make sense? What is the key problem? What could we do? 1 Re-formulate the original problem. 2 Analyze and compare existing approaches. Provide insight into the problems. 42 / 94
  • 45. EX 1-1. Beyond Sliding Windows [Lampert et al. CVPR 08] Rectangle set Branch and bound search Ideas: Sliding window search ⇔ branch-and-bound search. Represent a set of rectangles with 4 intervals. Use branch-and-bound to find the optimal rectangle (object localization) efficiently. 43 / 94
  • 46. EX 1-2. Beyond Categories [Malisiewicz and Efros CVPR 08, NIPS 09] Ideas Explicit categorization ⇔ Implicit categorization Ask "what is this like?" (association), instead of "what is it?" (categorization) 44 / 94
  • 47. EX 1-3. Motion-Invariant Photography [Levin et al. SIGGRAPH 08] [Cho et al. ICCP 10] Ideas Still camera ⇔ Moving camera (parabolic exposures) Enable the use of spatial-invariant blur kernel estimation 45 / 94
  • 48. EX 1-4. Super-resolution from a Single Image [Glasner et al. ICCV 09] Ideas: Classical multi-image SR / Example-based SR ⇔ Single SR framework 46 / 94
  • 49. EX 2-1. In Defense of ... [Boiman et al. CVPR 08] [Hartley PAMI 97] Nearest-Neighbor Based Image Classification Quantization of local image descriptors (used to generate "bags-of-words", codebooks). Computation of "Image-to-Image" distance, instead of "Image-to-Class" distance The performance ranks among the top leading learning-based image classifiers The 8-point Algorithm for the fundamental matrix Normalization, Normalization, Normalization! Performs almost as well as the best iterative algorithm 47 / 94
  • 50. EX 2-2. Understanding blind deconvolution [Levin et al. CVPR 2009] Ideas: Blind deconvolution: recover the sharp image $x$ from the blurred one ($y = k \otimes x + n$). $\mathrm{MAP}_{x,k}$ estimation often favors no-blur explanations. $\mathrm{MAP}_k$ can be estimated accurately since the kernel size is often smaller than the image size. Blind deconvolution should be addressed in this way: $\mathrm{MAP}_k$ + non-blind deconvolution. 48 / 94
  • 51. EX 2-3. Understanding camera trade-offs [Levin et al. ECCV 08] Ideas Traditional optics evaluation: 2D image sharpness (eg, Modulation Transfer Function) Modern camera evaluation: How well does the recorded data allow us to estimate the visual world - the lightfield? 49 / 94
  • 52. EX 2-4. What is a good image segment? [Bagon et al. ECCV 08] Ideas: A good image segment is one that can be easily composed using its own pieces, but is difficult to compose using pieces from other parts of the image. 50 / 94
  • 53. EX 2-5. Lambertian Reflectance and Linear Subspaces [Basri and Jacobs PAMI 03] Ideas The set of all Lambertian reflectance functions (the mapping from surface normals to intensities) obtained with arbitrary distant light sources lies close to a 9D linear subspace. Explain prior empirical results using linear subspace methods. 51 / 94
  • 55. Use powerful tools, find suitable problems neXt = X ↑ If the only tool you have is a hammer, you tend to see every problem as a nail. - Abraham Maslow 53 / 94
  • 56. Use powerful tools, find suitable problems neXt = X ↑ What kinds of tools should we understand? Calculus of Variations Dimensionality Reduction Spectral Methods (specifically, spectral clustering) Probabilistic Graphical Model Structured Prediction Bilateral Filtering Sparse Representation and more ... spectral method/theory, information theory, (convex) optimization, etc 54 / 94
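  • 57. EX 1. Calculus of Variations (1/2) From Calculus to Calculus of Variations
      Calculus: functions $f: \mathbb{R}^n \to \mathbb{R}$. Derivative: $\frac{df(x)}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$. Local extremum: $\frac{df(x)}{dx} = 0$.
      Calculus of Variations: functionals (functions of functions) $f: F \to \mathbb{R}$, $f(u) = \int_{x_1}^{x_2} L(x, u(x), u'(x))\,dx$. Variation: $\frac{df(u)}{du} = \lim_{\epsilon \to 0} \frac{f(u + \epsilon\,\delta u) - f(u)}{\epsilon} = \left.\frac{\partial}{\partial \epsilon} f(u + \epsilon\,\delta u)\right|_{\epsilon = 0}$. Local extremum: Euler-Lagrange equation.
      Total Variation (TV): $\mathrm{TV}(y) = \int_{x_0}^{x_1} |y'|\,dx$, the "oscillation strength" of $y(x)$. 55 / 94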
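To make the "oscillation strength" reading of TV concrete, here is a small sketch that approximates $\mathrm{TV}(y) = \int |y'|\,dx$ on a sampled signal by summing absolute differences between neighbouring samples. The test signal, the noise level, and the function name are assumptions chosen for illustration.

```python
import numpy as np

def total_variation(y):
    """Discrete TV of a 1D signal: sum of |y[i+1] - y[i]|.

    This approximates the integral of |y'| dx; the sample spacing cancels
    because each finite difference is divided and multiplied by it.
    """
    return np.abs(np.diff(y)).sum()

x = np.linspace(0.0, 2.0 * np.pi, 201)
clean = np.sin(x)                      # rises by 1, falls by 2, rises by 1
rng = np.random.default_rng(0)
noisy = clean + 0.1 * rng.standard_normal(x.size)

tv_clean = total_variation(clean)
tv_noisy = total_variation(noisy)
```

The noisy signal has a much larger TV than the clean one even though the two look similar at a coarse scale, which is the intuition behind using TV as a penalty in denoising.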
  • 58. EX 1. Calculus of Variations (2/2) Total Variation Denoising/Inpainting Applications in computer vision: Optical flow [Horn and Schunck AI 81], Shape from shading [Horn and Brooks CVGIP 86], Edge detection [PAMI 87], Anisotropic diffusion [Perona and Malik PAMI 90], Active contours model [Kass et al. IJCV 88], Image segmentation [Morel and Solimini 95], Image restoration [Aubert and Vese SIAM Journal on NA 97] 56 / 94
  • 60. EX 2. Dimensionality Reduction (1/2) Why do we need dimensionality reduction? Since high-dimensional data is everywhere (e.g., images, human gene distributions, weather prediction), we need dimensionality reduction for 1 processing data efficiently, 2 estimating the distributions of data accurately (curse of dimensionality), 3 finding meaningful representations of data. Classification of dimensionality reduction methods:
      Linear, global structure preserved: PCA, LDA
      Linear, local structure preserved: LPP, NPE
      Nonlinear, global structure preserved: ISOMAP, Kernel PCA, DM
      Nonlinear, local structure preserved: LLE, LE, HE
    57 / 94
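As one concrete instance of the linear, global-structure entry in the classification above, the sketch below computes PCA through a singular value decomposition of the centered data. The synthetic data and the function name are assumptions for illustration; the slides do not prescribe an implementation.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto their top principal directions.

    Returns the projected data, the principal directions (as rows),
    and the variance explained by each kept direction.
    """
    Xc = X - X.mean(axis=0)                       # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                # directions of largest variance
    explained_var = S[:n_components] ** 2 / (X.shape[0] - 1)
    return Xc @ components.T, components, explained_var

# Synthetic 3D data that lies close to the line spanned by (1, 2, -1).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2 * t, -t]) + 0.01 * rng.normal(size=(200, 3))

Z, components, var = pca(X, 1)
direction = np.array([1.0, 2.0, -1.0]) / np.sqrt(6.0)
alignment = abs(components[0] @ direction)        # close to 1 if recovered
```

The sign of a principal direction is arbitrary, so the alignment is measured with an absolute value.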
  • 62. EX 2. Dimensionality Reduction (2/2) Applications in computer vision Subspace as constraints Structure from motion [Tomasi and Kanade IJCV 92], Optical flow [Irani IJCV 02], Layer extraction [Ke and Kanade CVPR 01], Face alignment [Saragih et al. ICCV 09] Face recognition (e.g., PCA, LDA, LPP) PCA [Turk and Pentland PAMI 91], LDA [Belhumeur et al. PAMI 97], LPP [He et al. PAMI 05], Random [Wright et al. PAMI 09] Motion segmentation subspace separation [Kanatani ICCV 01] [Yan and Pollefeys ECCV 06] [Rao et al. CVPR 08] [Lauer and Schnorr ICCV 09] Lighting linear subspace [Belhumeur and Kriegman IJCV 98] [Georghiades et al. PAMI 01] [Lee et al. PAMI 05] [Basri and Jacobs PAMI 02] Visual tracking incremental subspace learning [Ross et al. IJCV 08] [Li et al. CVPR 08] 58 / 94
  • 67. EX 3. Spectral Clustering (1/3) Why is spectral clustering popular? It can be solved efficiently by standard linear algebra software. It very often outperforms traditional clustering algorithms. Spectral clustering algorithm Input: a set of data points 1 Construct a similarity graph, e.g., ε-neighborhood, k-nearest neighbor, fully connected 2 Construct the graph Laplacian, e.g., unnormalized or normalized ($L$, $L_{rw}$, $L_{sym}$) 3 Compute the first k eigenvectors of $L$ (those with the smallest eigenvalues), $v_1, \dots, v_k$ 4 Let $V \in \mathbb{R}^{n \times k}$ be the matrix containing $v_1, \dots, v_k$ as columns, and let $y_i$ be the i-th row of $V$ 5 Cluster the row vectors $y_i$ with the k-means algorithm into clusters $C_1, \dots, C_k$ Output: Clusters $A_1, \dots, A_k$ with $A_i = \{j \mid y_j \in C_i\}$ 59 / 94
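The five steps above can be sketched in a few lines of NumPy. This version assumes a fully connected graph with Gaussian similarities and the unnormalized Laplacian $L = D - W$; the bandwidth `sigma`, the small k-means helper, and the toy data are illustrative choices, not part of the slides.

```python
import numpy as np

def kmeans(Y, k, n_iter=50):
    """Plain k-means; farthest-point initialisation keeps it deterministic."""
    centers = [Y[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

def spectral_clustering(X, k, sigma=1.0):
    # Step 1: fully connected similarity graph with Gaussian weights.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Step 2: unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # Step 3: eigh returns eigenvalues in ascending order, so the first
    # k columns are the eigenvectors with the smallest eigenvalues.
    _, eigvecs = np.linalg.eigh(L)
    # Step 4: V has those eigenvectors as columns; its rows are the y_i.
    V = eigvecs[:, :k]
    # Step 5: cluster the rows with k-means.
    return kmeans(V, k)

# Two well-separated blobs of 20 points each (toy data).
rng = np.random.default_rng(0)
A = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(20, 2))
B = rng.normal(loc=(5.0, 5.0), scale=0.3, size=(20, 2))
X = np.vstack([A, B])
labels = spectral_clustering(X, 2)
```

For large data sets one would typically use a sparse graph and a sparse eigensolver instead of the dense `eigh` call used here.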
  • 68. EX 3. Spectral Clustering (1/3) Why spectral clustering is popular? Can be solved efficiently by standard linear algebra software Very often outperform traditional clustering algorithms Spectral clustering algorithm Input: a set of data points 1 Construct a similarity graph, e.g., -neighbor, k-nearest neighbor, fully connected 2 Construct graph Laplacian, e.g., (un)normalized (L, Lrw , Lsym ) 3 Compute the first k (with smallest eigenvalues) eigenvectors of L, v1 , · · · , vk 4 Let V ∈ Rn×k be a matrix contains v1 , ·, vk as columns 5 Cluster the row vectors yi with the k-means algorithm into cluster C1 , · · · , Ck Output: Clusters A1 , · · · , Ak with Ai = {j|yj ∈ Ci } 59 / 94
  • 69. EX 3. Spectral Clustering (1/3) Why spectral clustering is popular? Can be solved efficiently by standard linear algebra software Very often outperform traditional clustering algorithms Spectral clustering algorithm Input: a set of data points 1 Construct a similarity graph, e.g., -neighbor, k-nearest neighbor, fully connected 2 Construct graph Laplacian, e.g., (un)normalized (L, Lrw , Lsym ) 3 Compute the first k (with smallest eigenvalues) eigenvectors of L, v1 , · · · , vk 4 Let V ∈ Rn×k be a matrix contains v1 , ·, vk as columns 5 Cluster the row vectors yi with the k-means algorithm into cluster C1 , · · · , Ck Output: Clusters A1 , · · · , Ak with Ai = {j|yj ∈ Ci } 59 / 94
  • 70. EX 3. Spectral Clustering (1/3) Why spectral clustering is popular? Can be solved efficiently by standard linear algebra software Very often outperform traditional clustering algorithms Spectral clustering algorithm Input: a set of data points 1 Construct a similarity graph, e.g., -neighbor, k-nearest neighbor, fully connected 2 Construct graph Laplacian, e.g., (un)normalized (L, Lrw , Lsym ) 3 Compute the first k (with smallest eigenvalues) eigenvectors of L, v1 , · · · , vk 4 Let V ∈ Rn×k be a matrix contains v1 , ·, vk as columns 5 Cluster the row vectors yi with the k-means algorithm into cluster C1 , · · · , Ck Output: Clusters A1 , · · · , Ak with Ai = {j|yj ∈ Ci } 59 / 94
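The five steps of the spectral clustering algorithm can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation behind the slide: it assumes a fully connected graph with Gaussian weights, the unnormalized Laplacian L = D - W, and a small hand-written k-means with farthest-point initialization. The function name `spectral_clustering` and the parameters `sigma` and `n_iter` are hypothetical.

```python
import numpy as np

def spectral_clustering(X, k, sigma=1.0, n_iter=50):
    """Unnormalized spectral clustering on a fully connected Gaussian graph (sketch)."""
    # Step 1: similarity graph with weights w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Step 2: unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # Step 3: eigh returns eigenvalues in ascending order, so the first k
    # columns are the eigenvectors with the smallest eigenvalues.
    _, eigvecs = np.linalg.eigh(L)
    # Step 4: V has v_1..v_k as columns; its rows are the embedded points y_i.
    V = eigvecs[:, :k]
    # Step 5: k-means on the rows of V (assumed farthest-point initialization).
    centers = [V[0]]
    for _ in range(1, k):
        d = np.min([((V - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(V[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((V[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dists, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = V[labels == c].mean(axis=0)
    return labels

# Two small, well-separated groups of 2-D points.
X = np.array([[0, 0], [0.5, 0], [0, 0.5], [0.5, 0.5],
              [5, 5], [5.5, 5], [5, 5.5], [5.5, 5.5]], dtype=float)
labels = spectral_clustering(X, k=2)
```

Swapping in a k-nearest-neighbor graph or a normalized Laplacian changes only steps 1 and 2; the rest of the pipeline stays the same.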
  • 74. EX 3. Spectral Clustering (2/3)
      Why does it work?
        Graph cut point of view: construct a partition that minimizes the weight across the cut (the well-known mincut problem) while balancing the clusters (e.g., RatioCut, Normalized cut).
        Random walk point of view: when minimizing Ncut, we actually look for a cut through the graph such that a random walk seldom transitions from one cluster to another.
        Perturbation theory point of view: the distance between the eigenvectors of the ideal and the nearly ideal graph Laplacian is bounded by a constant times a norm of the error matrix. If the perturbation is small enough, the k-means algorithm will still separate the groups from each other.
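For reference, the balanced-cut objectives named in the graph cut point of view are commonly written as follows. The notation is an assumption, not taken from the slides: w_ij are the edge weights, d_i is the degree of vertex i, and \bar{A} is the complement of A.

```latex
W(A, B) = \sum_{i \in A,\; j \in B} w_{ij}, \qquad
\operatorname{vol}(A) = \sum_{i \in A} d_i, \qquad
\operatorname{cut}(A_1, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} W(A_i, \bar{A}_i)

\operatorname{RatioCut}(A_1, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} \frac{W(A_i, \bar{A}_i)}{|A_i|}, \qquad
\operatorname{Ncut}(A_1, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} \frac{W(A_i, \bar{A}_i)}{\operatorname{vol}(A_i)}
```

Plain mincut minimizes only the numerator and tends to split off single vertices; dividing by |A_i| or vol(A_i) is what enforces the balance.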
  • 77. EX 3. Spectral Clustering (3/3) [Shi and Malik PAMI 02] Eigenvectors carry contour information.
  • 78. EX 4. Probabilistic Graphical Model (1/2)
      What are probabilistic graphical models?
        A marriage between probability theory and graph theory
        A natural tool for dealing with uncertainty and complexity
        They provide a way to view many probabilistic systems (e.g., mixture models, factor analysis, hidden Markov models, Kalman filters, and Ising models) as instances of a common underlying formalism
  • 79. EX 4. Probabilistic Graphical Model (2/2)
  • 80. EX 5. Structured Prediction (1/2)
      What is structured prediction?
        A framework for solving classification or regression problems in which the output variables are mutually dependent or constrained
      Lots of examples
        Natural language parsing
        Machine translation
        Object segmentation
        Gene prediction
        Protein alignment
        Numerous tasks in computational linguistics, speech, vision, and biology
  • 82. EX 5. Structured Prediction (2/2) Applications [Lampert et al. ECCV 08] [Desai et al. ICCV 09]
  • 83. EX 6. Bilateral Filtering (1/3)
      What is bilateral filtering?
        A technique to smooth images while preserving edges
        Ubiquitous in image processing and computational photography
  • 84. EX 6. Bilateral Filtering (2/3) [Bennett and McMillan SIGGRAPH 05] [Eisemann and Durand SIGGRAPH 04] [Jones et al. SIGGRAPH 03] [Winnemöller et al. SIGGRAPH 06] [Bae et al. SIGGRAPH 02]
  • 85. EX 6. Bilateral Filtering (3/3)
      How does the bilateral filter relate to other methods?
      Interpretations
        The bilateral filter is equivalent to mode filtering in local histograms
        The bilateral filter can be interpreted in terms of robust statistics, since it can be related to minimizing a robust cost function
        The bilateral filter is a discretization of a particular kind of PDE-based anisotropic diffusion
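To make "smoothing while preserving edges" concrete, here is a minimal 1-D bilateral filter in plain Python. It is a sketch under assumptions: both the spatial and the range kernels are taken to be Gaussian, and the names `bilateral_filter_1d`, `radius`, `sigma_s`, and `sigma_r` are hypothetical. Each output sample is a weighted average of its neighbors, where the weight drops both with spatial distance and with difference in value.

```python
import math

def bilateral_filter_1d(signal, radius=2, sigma_s=1.0, sigma_r=0.1):
    """Edge-preserving smoothing of a 1-D signal (illustrative sketch)."""
    n = len(signal)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Spatial weight: nearby samples count more.
            w_space = math.exp(-((i - j) ** 2) / (2.0 * sigma_s ** 2))
            # Range weight: samples with similar values count more,
            # so samples across an edge are almost ignored.
            w_range = math.exp(-((signal[i] - signal[j]) ** 2) / (2.0 * sigma_r ** 2))
            num += w_space * w_range * signal[j]
            den += w_space * w_range
        out.append(num / den)
    return out

# A step edge: the filter leaves the discontinuity essentially untouched.
step = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
smoothed = bilateral_filter_1d(step)
```

Letting `sigma_r` grow very large makes the range weight nearly constant, and the filter degenerates into an ordinary Gaussian blur that smears the edge.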
  • 87. EX 7. Sparse Representation (1/4)
      Ideas
        Natural signals (e.g., audio, images) usually admit a sparse representation, i.e., they can be well represented by a linear combination of a few atom signals
        Successfully applied to various areas in signal/image processing, vision, and graphics
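One common way to compute such a sparse representation is a greedy pursuit. The sketch below uses orthogonal matching pursuit (OMP), which is a choice made here for illustration and is not named on the slide. It assumes a dictionary `D` whose columns (atoms) have unit norm; the function name `omp` and the parameter `n_nonzero` are hypothetical.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Greedy sparse coding of x over dictionary D (orthogonal matching pursuit sketch)."""
    x = np.asarray(x, dtype=float)
    residual = x.copy()
    support = []
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break  # no new atom helps; stop early
        support.append(idx)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
        coeffs = np.zeros(D.shape[1])
        coeffs[support] = sol
    return coeffs

# A signal built from two atoms of a trivial (identity) dictionary.
D = np.eye(4)
x = np.array([2.0, 0.0, -3.0, 0.0])
code = omp(D, x, n_nonzero=2)
```

With a learned or overcomplete dictionary the call is the same; only `D` changes.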
  • 88. EX 7. Sparse Representation (2/4) Image restoration [Aharon et al. TSP 06] [Mairal et al. TIP 08]: denoising, inpainting, demosaicing, inpainting
  • 89. EX 7. Sparse Representation (3/4) Classification [Wright et al. PAMI 09] [Mairal et al. CVPR, ECCV, NIPS 08]: face recognition, edge detection, texture classification, pixel classification
  • 90. EX 7. Sparse Representation (4/4) Compressive sensing [Donoho TIT 06] [Candes and Tao TIT 05, 06], and more (e.g., low-rank matrix completion, robust PCA)
  • 91. Outline
      1 Introduction
      2 Five ways to come up with new ideas
          Seek different dimensions: neXt = X^d
          Combine two or more topics: neXt = X + Y
          Re-think the research directions: neXt = X̄
          Use powerful tools, find suitable problems: neXt = X↑
          Add an appropriate adjective: neXt = Adj + X
      3 What is a bad idea?
  • 92. Add an appropriate adjective: neXt = Adj + X
      "There is only one religion, though there are a hundred versions of it." - George Bernard Shaw
  • 93. Add an appropriate adjective: neXt = Adj + X
      What kinds of adjectives can we use?
        linear ⇔ non-linear
        generative / reconstructive ⇔ discriminative
        rule-based / hand-designed ⇔ learning-based
        single scale ⇔ multi-scale
        single step ⇔ progressive
        batch processing ⇔ incremental / online processing
        fixed ⇔ adaptive / dynamic to data
        parametric ⇔ non-parametric
        Z-invariant (Z = translation / scale / rotation / noise / facial expression / pose / lighting / occlusion)
        Z-aware (Z = motion / content / semantic / context / occlusion)
  • 103. EX 1. Linear ⇔ Non-linear
      Hard to find a straight line that separates the data into two clusters?
      Ideas
        Linear methods may not capture the nonlinear structure in the original data representation
      Nonlinear methods
        Kernel tricks (e.g., kernel PCA, kernel LDA, kernel SVM)
        Manifold learning (e.g., ISOMAP, LLE, Laplacian eigenmaps)
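A small numeric check shows what the kernel trick buys. The sketch assumes the degree-2 polynomial kernel k(p, q) = (p · q)^2 on 2-D points, chosen here only as an example; the names `phi`, `poly_kernel`, and `ring_score` are hypothetical. The kernel equals an inner product in an explicit feature space, and in that space two concentric rings, which no straight line separates in the plane, are split by a linear function.

```python
import math

def phi(p):
    """Explicit feature map for the kernel k(p, q) = (p . q)^2 in 2-D."""
    x1, x2 = p
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly_kernel(p, q):
    """Same inner product, computed without ever forming phi."""
    return (p[0] * q[0] + p[1] * q[1]) ** 2

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ring_score(p):
    """Linear function of phi(p): phi_1 + phi_3 - 4 = x1^2 + x2^2 - 4."""
    f = phi(p)
    return f[0] + f[2] - 4.0

angles = [2.0 * math.pi * t / 8 for t in range(8)]
inner_ring = [(math.cos(a), math.sin(a)) for a in angles]            # radius 1
outer_ring = [(3.0 * math.cos(a), 3.0 * math.sin(a)) for a in angles]  # radius 3

# The kernel value matches the inner product in feature space.
same = abs(dot(phi((1.0, 2.0)), phi((3.0, -1.0))) - poly_kernel((1.0, 2.0), (3.0, -1.0))) < 1e-9
```

Kernel PCA, kernel LDA, and kernel SVM exploit the same identity: they only ever need `poly_kernel`-style evaluations, never the feature map itself.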
  • 105. EX 2. Generative ⇔ Discriminative
      Classification task: X → Y
      Generative classifiers estimate the class-conditional pdfs P(X|Y) and the prior probabilities P(Y)
        Naive Bayes, mixtures of Gaussians, mixtures of experts, hidden Markov models (HMM), sigmoidal belief networks, Bayesian networks, Markov random fields (MRF)
      Discriminative classifiers estimate the posterior probabilities P(Y|X)
        Logistic regression, SVMs, traditional neural networks, nearest neighbor, conditional random fields (CRF)
      Bayes' rule: P(Y|X) = P(X|Y) P(Y) / P(X)
      Two different perspectives on the same problem
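The generative route can be sketched in a few lines: estimate P(Y) and P(X|Y) from labelled data, then obtain P(Y|X) through Bayes' rule. The example assumes 1-D features and a single Gaussian per class, with nonzero within-class variance; the names `fit_generative` and `posterior` are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_generative(xs, ys):
    """Estimate the prior P(Y) and a Gaussian P(X|Y) for each class."""
    model = {}
    for label in set(ys):
        pts = [x for x, y in zip(xs, ys) if y == label]
        mean = sum(pts) / len(pts)
        var = sum((p - mean) ** 2 for p in pts) / len(pts)
        model[label] = (len(pts) / len(xs), mean, var)
    return model

def posterior(model, x):
    """Bayes' rule: P(Y|X) = P(X|Y) P(Y) / P(X)."""
    joint = {label: prior * gaussian_pdf(x, mean, var)
             for label, (prior, mean, var) in model.items()}
    evidence = sum(joint.values())  # P(X), summed over the classes
    return {label: j / evidence for label, j in joint.items()}

xs = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0]
ys = [0, 0, 0, 1, 1, 1]
model = fit_generative(xs, ys)
p_near_class0 = posterior(model, 1.0)
p_midpoint = posterior(model, 5.0)
```

A discriminative classifier such as logistic regression would skip `fit_generative` entirely and fit the mapping from x to P(Y|X) directly.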
  • 108. EX 3. Rule-based / Hand-designed ⇔ Learning-based
      Hard to find rules to recognize digits?
      Ideas
        It may be difficult to design a set of rules for certain tasks, such as handwritten digit recognition
        Turn to machine learning methods instead
  • 109. EX 4. Single scale ⇔ Multi-scale [Zelnik-Manor and Perona NIPS 04]
      Ideas
        We live in a multi-scale world (atom ↔ universe)
        Image pyramids, scale-space theory, and wavelet representations all attempt to capture the multi-scale properties of signals and images
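As a minimal illustration of going from single scale to multi-scale, the sketch below builds a pyramid for a 1-D signal in plain Python. The [1, 2, 1] / 4 smoothing kernel, the replicated borders, and the names `downsample` and `pyramid` are assumptions made for this example; image pyramids apply the same blur-then-subsample step along both axes.

```python
def downsample(signal):
    """Blur with a [1, 2, 1] / 4 kernel (borders replicated), then keep every other sample."""
    n = len(signal)
    blurred = [
        (signal[max(i - 1, 0)] + 2 * signal[i] + signal[min(i + 1, n - 1)]) / 4.0
        for i in range(n)
    ]
    return blurred[::2]

def pyramid(signal, levels):
    """Return a list of progressively coarser versions of the signal."""
    out = [list(signal)]
    for _ in range(levels - 1):
        if len(out[-1]) < 2:
            break  # nothing left to subsample
        out.append(downsample(out[-1]))
    return out

levels = pyramid([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0], levels=3)
```

A method that runs at every level of `levels` sees both fine detail and coarse structure without changing its own window size.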
  • 110. EX 5. Single step ⇔ Progressive [Yuan et al. SIGGRAPH 08]
      Ideas
        Some problems are difficult to solve in one step → solve them progressively
  • 111. EX 6. Batch processing ⇔ Incremental / Online processing
      Ideas
        Online methods can handle potentially infinite streams of samples and time-varying data
      Examples
        PCA → Incremental PCA (many variants)
        LDA → Incremental LDA (many variants)
        SVM → Incremental and decremental SVM [Cauwenberghs and Poggio NIPS 01]
        Dictionary learning (e.g., K-SVD) [Aharon and Elad TSP 06] → Online dictionary learning [Mairal et al. ICML/JMLR 09]
        AdaBoost → Online boosting [Grabner and Bischof CVPR 06]
        Multiple instance boosting → Online multiple instance boosting [Babenko et al. CVPR 09]
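The batch-to-online move is easiest to see on the simplest statistic there is. The sketch below is not any of the cited methods; it is Welford's running mean and variance, used here as a stand-in to show the pattern they share: keep a small state, update it once per sample, and never revisit old data. The class name `RunningStats` is hypothetical.

```python
class RunningStats:
    """Online mean and (population) variance via Welford's update."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / self.n if self.n else 0.0

stats = RunningStats()
for sample in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(sample)  # each sample is seen once and then discarded
```

Memory use is constant regardless of how many samples arrive, which is what lets online methods cope with unbounded or drifting data streams.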
  • 117. EX 7. Fixed ⇔ Adaptive / Dynamic [Elad and Aharon TIP 06]
      Ideas
        Adaptive approaches usually outperform the predefined/fixed ones.
  • 118. EX 8. Parametric ⇔ Non-parametric
      Probability density estimation
      Parametric
        Assumes a specific functional form with parameter θ, e.g., a Gaussian distribution with unknown mean and variance, or a mixture of Gaussians
        Parameter estimation
          Estimative approach: p(x) = p(x | θ_best)
          Bayesian approach: p(x) = ∫ a(θ) p(x | θ) dθ
      Non-parametric
        Does not assume a specific form for the probability distribution, e.g., histograms, kernel density estimation (Parzen window method)
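The difference shows up clearly on bimodal data. The sketch below fits a single Gaussian (parametric, estimative approach with the maximum-likelihood mean and variance) and a Parzen window estimate with a Gaussian kernel (non-parametric) to the same samples. The function names and the `bandwidth` value are assumptions made for this example.

```python
import math

def gaussian_fit_density(samples):
    """Parametric: one Gaussian with maximum-likelihood mean and variance."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)

    def density(x):
        return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

    return density

def parzen_density(samples, bandwidth):
    """Non-parametric: average of Gaussian bumps centered on the samples."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-(((x - s) / bandwidth) ** 2) / 2.0) for s in samples) / norm

    return density

# Two clumps of samples, around -2 and +2.
samples = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]
p_gauss = gaussian_fit_density(samples)
p_parzen = parzen_density(samples, bandwidth=0.3)
```

The single Gaussian puts its peak at 0, where there is no data, while the Parzen estimate peaks near the two clumps. The price is that the Parzen estimate must store every sample and needs a bandwidth choice.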
  • 120. EX 9. Z-invariant
      Make your method robust to common sources of performance degradation
        Noise such as additive Gaussian, impulse, or non-uniform noise (e.g., image restoration)
        Translation shift (e.g., near-duplicate image/video detection, image search)
        Scale change (e.g., object detection, feature extraction)
        Perspective distortion (e.g., feature extraction)
        Deformation (e.g., non-rigid registration, part-based object detection)
        Pose variation (e.g., human pose estimation)
        Lighting variation (e.g., face recognition)
        Partial occlusion (e.g., object detection and recognition)
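A classic textbook instance of translation invariance, offered here only as an illustration and not taken from the slides, is the magnitude of the discrete Fourier transform: circularly shifting a signal changes the phase of each coefficient but not its magnitude. The function name `dft_magnitude` is hypothetical, and the invariance holds exactly only for circular shifts.

```python
import cmath

def dft_magnitude(signal):
    """Magnitudes of the discrete Fourier transform, computed directly in O(n^2)."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]

signal = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
shifted = signal[-3:] + signal[:-3]  # circular shift by three samples

features = dft_magnitude(signal)
features_shifted = dft_magnitude(shifted)
```

A descriptor built from `features` therefore cannot be fooled by where the pattern sits in the window, at the cost of discarding the phase information that encodes position.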
  • 128. EX 10. Z-aware [Wang et al. SIGGRAPH Asia 09] [Wang et al. SIGGRAPH 10] (motion-aware video resizing)
      Make your method aware of potential failure cases
        Motion (e.g., video processing)
        Content (e.g., image processing)
        Semantics (e.g., image and video indexing/retrieval)
        Context (e.g., image understanding)
        Occlusion (e.g., detection/tracking)
  • 133. Outline
      1 Introduction
      2 Five ways to come up with new ideas
          Seek different dimensions: neXt = X^d
          Combine two or more topics: neXt = X + Y
          Re-think the research directions: neXt = X̄
          Use powerful tools, find suitable problems: neXt = X↑
          Add an appropriate adjective: neXt = Adj + X
      3 What is a bad idea?
  • 134. What is a bad idea?
      Naive combination of two or more methods
        Avoid writing a "pipeline system" paper
      Blind application of tools
        Using feature X and classifier Y without motivation or justification
      Following the hype
        Too many competitors
      Doing something just because it can be done
        Do the right things, not just do things right
  • 141. Thank you for your kind attention. Questions? For more complete materials, please visit my blog: http://jbhuang0604.blogspot.com/