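12. Expectations and Possibilities for Image Recognition / Hironobu Fujiyoshi, Department of Computer Science, College of Engineering, Chubu University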
• A Discriminatively Trained, Multiscale, Deformable Part Model [Felzenszwalb2008]
‒ Part-based object detection using a latent SVM
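DPM: part-based object detection
Key points
‒ An object is represented as a collection of parts (Deformable Parts Model)
‒ Modeling the spatial relationships among the parts makes detection robust to pose variation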
[Figure: root filter, part filters, and the spatial arrangement of the part filters]
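As a rough illustration of how a root filter, part filters, and part placement combine into one detection score, here is a minimal Python sketch. It is a simplification and not the implementation from [Felzenszwalb2008]: it assumes the root and part response maps share one resolution, uses a purely quadratic deformation penalty, and searches part displacements by brute force. The function name `dpm_score` and all parameters are hypothetical.

```python
import numpy as np

def dpm_score(root_resp, part_resps, anchors, deform_costs, radius=1):
    """Combine a root-filter response map with part-filter response maps.

    For every root location, each part may shift by up to `radius` cells
    around its anchor; the best (response - deformation cost) is added.
    """
    height, width = root_resp.shape
    score = root_resp.astype(float)  # astype returns a fresh copy
    for resp, (ay, ax), (cy, cx) in zip(part_resps, anchors, deform_costs):
        for y in range(height):
            for x in range(width):
                best = -np.inf
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        py, px = y + ay + dy, x + ax + dx
                        if 0 <= py < height and 0 <= px < width:
                            # Quadratic penalty for moving away from the anchor.
                            penalty = cy * dy * dy + cx * dx * dx
                            best = max(best, resp[py, px] - penalty)
                score[y, x] += best
    return score

# Toy example: one part whose response peaks one cell to the right of (2, 2).
root = np.zeros((5, 5))
part = np.zeros((5, 5))
part[2, 3] = 1.0
score = dpm_score(root, [part], anchors=[(0, 0)], deform_costs=[(0.1, 0.1)])
print(score[2, 2], score[2, 3])
```

The root location (2, 2) still receives most of the part's response because the part is allowed to shift by one cell at a small cost. This is how the model tolerates pose variation.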
13.
• Fast, Accurate Detection of 100,000 Object Classes on a Single Machine [Dean2013]
‒ Detects 100,000 object classes in 20 seconds or less
Detecting 100,000 object classes by hashing with binary codes
Locality-sensitive Hashing with WTA
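[Figure: processing pipeline]
HOG features → WTA code (e.g. 111101010011) → split the WTA code into P sub-codes → look up the hash table for each of the P sub-codes → build a score histogram for each class → build a filter response map for each class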
Key points
‒ Speeds up multi-class DPM
‒ Applies WTA hashing to the set of part filters to detect a very large number of classes
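The hashing pipeline on this slide can be sketched in a few lines of Python. This is an illustrative sketch under assumptions, not the system from [Dean2013]: random vectors stand in for HOG windows and part filters, and the names `wta_hash`, `band_keys`, `build_tables`, and `vote` as well as all sizes are hypothetical. A WTA code records, for each random permutation, which of the first k permuted entries is largest. The code is split into bands, each band is a key into its own hash table, and table hits are accumulated as votes per filter.

```python
from collections import defaultdict
import numpy as np

def wta_hash(x, perms, k):
    """For each permutation, keep the first k permuted entries and record the argmax index."""
    return np.array([int(np.argmax(x[p[:k]])) for p in perms])

def band_keys(code, num_bands):
    """Split a WTA code into num_bands equal-length bands, one hashable key per band."""
    return [tuple(int(v) for v in band) for band in np.split(code, num_bands)]

def build_tables(filters, perms, k, num_bands):
    """One hash table per band, mapping a band key to the ids of filters that share it."""
    tables = [defaultdict(list) for _ in range(num_bands)]
    for filter_id, w in enumerate(filters):
        keys = band_keys(wta_hash(w, perms, k), num_bands)
        for table, key in zip(tables, keys):
            table[key].append(filter_id)
    return tables

def vote(x, tables, perms, k, num_filters):
    """Count, for every filter, how many bands of x's code collide with the filter's code."""
    votes = np.zeros(num_filters, dtype=int)
    keys = band_keys(wta_hash(x, perms, k), len(tables))
    for table, key in zip(tables, keys):
        for filter_id in table.get(key, []):
            votes[filter_id] += 1
    return votes

# Toy data: 10 random "filters" and a query that is a scaled copy of filter 3.
rng = np.random.default_rng(0)
dim, num_perms, k, num_bands = 64, 16, 4, 4
perms = [rng.permutation(dim) for _ in range(num_perms)]
filters = rng.normal(size=(10, dim))
tables = build_tables(filters, perms, k, num_bands)
votes = vote(2.0 * filters[3], tables, perms, k, len(filters))
print(votes)
```

Because WTA codes depend only on the rank order of the entries, the scaled query collides with filter 3 in every band. In a detector, filters with many votes would be the candidates worth scoring exactly. The cost of a lookup depends on the number of bands rather than on the number of filters.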
22.
Image recognition trends five years from now: Prof. Tae-kyun Kim (Imperial College London)
• Combination of Random Forests (RF) and deep learning
‒ Example: Decision Jungles [Shotton2013]
• Long-term continuous learning
‒ never-ending image learning: realizing a framework that keeps learning from images without end
Figure 1: Motivation and notation. (a) An example use of a rooted decision DAG for classifying
image patches as belonging to grass, cow or sheep classes. Using DAGs instead of trees reduces the
number of nodes and can result in better generalization. For example, differently coloured patches
of grass (yellow and green) are merged together into node 4, because of similar class statistics. This
may encourage generalization by representing the fact that grass may appear as a mix of yellow and
green. (b) Notation for a DAG, its nodes, features and branches. See text for details.
... input instance that reaches that node should progress through the left or right branch emanating from
the node. Prediction in binary decision trees involves every input starting at the root and moving
down as dictated by the split functions encountered at the split nodes. Prediction concludes when
the instance reaches a leaf node, each of which contains a unique prediction. For classification trees,
this prediction is a normalized histogram over class labels.
Rooted binary decision DAGs. Rooted binary DAGs have a different architecture compared to
decision trees and were introduced by Platt et al. [26] as a way of combining binary classifiers for
multi-class classification tasks. More specifically a rooted binary DAG has: (i) one root node, with
in-degree 0; (ii) multiple split nodes, with in-degree ≥ 1 and out-degree 2; (iii) multiple leaf nodes, ...
‒ Binary trees are connected into a network-like structure (a DAG)
‒ This reduces memory use and helps avoid overfitting
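The prediction procedure described in the excerpt can be sketched as follows. This is a minimal Python illustration with hypothetical node ids, thresholds, and histograms, not code from [Shotton2013]. Each split node sends the input left or right, and prediction ends at a leaf holding a normalized class histogram. The only difference from a tree is that a node may have more than one parent.

```python
import numpy as np

def predict(x, splits, leaves, root=0):
    """Route x from the root to a leaf and return that leaf's class histogram."""
    node = root
    while node in splits:
        feature, threshold, left, right = splits[node]
        node = left if x[feature] <= threshold else right
    return leaves[node]

# Split nodes: id -> (feature index, threshold, left child, right child).
# Node 4 has two parents (nodes 1 and 2), which a tree would not allow.
splits = {
    0: (0, 0.5, 1, 2),
    1: (1, 0.5, 3, 4),
    2: (1, 0.5, 4, 5),
}
# Leaves: id -> normalized histogram over (grass, cow, sheep); values are made up.
leaves = {
    3: np.array([0.1, 0.8, 0.1]),
    4: np.array([0.9, 0.05, 0.05]),
    5: np.array([0.1, 0.1, 0.8]),
}

a = predict(np.array([0.2, 0.9]), splits, leaves)
b = predict(np.array([0.8, 0.1]), splits, leaves)
print(a, b)
```

The two inputs take different paths but end at the same merged node, so the DAG needs three leaves where a full binary tree of the same depth would need four.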
24.
• D. G. Lowe, "Distinctive image features from scale-invariant keypoints", IJCV, Vol.60, No.2, pp.91-110, 2004.
• J. Matas, O. Chum, M. Urban, T. Pajdla, "Robust wide baseline stereo from maximally stable extremal regions", BMVC, pp.384-396, 2002.
• K. Mikolajczyk, C. Schmid, "Scale & affine invariant interest point detectors", IJCV, Vol.60, No.1, pp.63-86, 2004.
• S. N. Sinha, J. Frahm, M. Pollefeys, Y. Genc, "GPU-based Video Feature Tracking And Matching", Workshop on Edge Computing Using New Commodity Architectures, 2006.
• H. Bay, T. Tuytelaars, L. Van Gool, "SURF: Speeded Up Robust Features", ECCV, pp.404-417, 2006.
• E. Rosten, R. Porter, T. Drummond, "Faster and Better: A Machine Learning Approach To Corner Detection", PAMI, pp.105-119, 2010.
• M. Ozuysal, M. Calonder, V. Lepetit, P. Fua, "Fast keypoint recognition using random ferns", PAMI, Vol.32, pp.448-461, 2010.
• M. Calonder, V. Lepetit, C. Strecha, P. Fua, "BRIEF: Binary Robust Independent Elementary Features", ECCV, pp.778-792, 2010.
• E. Rublee, V. Rabaud, K. Konolige, G. Bradski, "ORB: an efficient alternative to SIFT or SURF", ICCV, 2011.
• M. Ambai, Y. Yoshida, "CARD: Compact And Real-time Descriptors", ICCV, 2011.
• G. Koutaki, K. Uchimura, "Application of spectral theory to pattern matching" (in Japanese), 17th Meeting on Image Recognition and Understanding, 2012.
• T. Trzcinski, V. Lepetit, "Efficient Discriminative Projections for Compact Binary Descriptors", ECCV, pp.228-242, 2012.
• T. Trzcinski, M. Christoudias, P. Fua, V. Lepetit, "Boosting Binary Keypoint Descriptors", CVPR, 2013.
References (keypoint detection and description)
25.
• K. Maeda, S. Watanabe, "A pattern matching method with local structure" (in Japanese), IEICE Transactions D, Vol.J68, pp.345-352, 1985.
• H. Murase, S. K. Nayar, "Illumination planning for object recognition using parametric eigenspace", PAMI, Vol.16, pp.1219-1227, 1994.
• T. Ojala, M. Pietikainen, T. Maenpaa, "Multiresolution gray-scale and rotation invariant texture classification with local binary patterns", PAMI, Vol.24, pp.971-987, 2002.
• K. Fukui, O. Yamaguchi, K. Suzuki, K. Maeda, "Face image recognition robust to environmental variation using the constrained mutual subspace method: learning a constrained mutual subspace that suppresses the influence of illumination changes" (in Japanese), IEICE Transactions D-II, Vol.J82, pp.613-620, 1999.
• N. Cristianini, J. Shawe-Taylor, "An introduction to support vector machines and other kernel-based learning methods", Cambridge University Press, 2000.
• P. Viola, M. Jones, "Rapid object detection using a boosted cascade of simple features", CVPR, Vol.1, pp.511-518, 2001.
• Y. Satoh, S. Kaneko, Y. Niwa, K. Yamamoto, "Robust object detection using Radial Reach Filter (RRF)" (in Japanese), IEICE Transactions D-II, Vol.J86, pp.616-624, 2003.
• G. Csurka, C. R. Dance, L. Fan, J. Willamowski, C. Bray, "Visual Categorization with Bags of Keypoints", ECCV, Vol.1, pp.1-2, 2004.
• T. Kobayashi, N. Otsu, "Action and Simultaneous Multiple-Person Identification Using Cubic Higher Order Local Auto-Correlation", ICPR, Vol.4, pp.741-744, 2004.
• N. Dalal, B. Triggs, "Histograms of Oriented Gradients for Human Detection", CVPR, pp.886-893, 2005.
References (feature extraction and pattern matching)
26.
• Y. Matsubara, T. Shakunaga, "Sparse template matching and its application to real-time object tracking" (in Japanese), IPSJ Transactions on CVIM, Vol.46, pp.60-71, 2005.
• T. Kawahara, M. Nishiyama, O. Yamaguchi, "Face recognition using the orthogonal mutual subspace method" (in Japanese), CVIM, pp.17-24, 2005.
• C. Huang, H. Ai, Y. Li, S. Lao, "Learning sparse features in granular space for multi-view face detection", FG, 2006.
• F. Perronnin, C. Dance, "Fisher kernels on visual vocabularies for image categorization", CVPR, 2007.
• T. Watanabe, S. Ito, K. Yokoi, "Co-occurrence histograms of oriented gradients for pedestrian detection", Advances in Image and Video Technology, pp.37-47, 2009.
• H. Jegou, M. Douze, C. Schmid, P. Perez, "Aggregating local descriptors into a compact image representation", CVPR, 2010.
• L. J. Li, H. Su, E. P. Xing, F. Li, "Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification", NIPS, Vol.2, p.5, 2010.
• M. Hashimoto, T. Fujiwara, H. Koshimizu, H. Okuda, K. Sumi, "Extraction of Unique Pixels based on Co-occurrence Probability for High-speed Template Matching", Proceedings of International Symposium on Optomechatronic Technologies, MVI-3, 2010.
• S. Hinterstoisser, V. Lepetit, S. Ilic, P. Fua, N. Navab, "Dominant Orientation Templates for Real-Time Detection of Texture-Less Objects", CVPR, pp.2257-2264, 2010.
• G. Koutaki, K. Uchimura, "An eigenvalue-decomposition template method robust to brightness variation and noise" (in Japanese), IEEJ Transactions C, Vol.131, No.9, pp.1625-1632, 2011.
• J. Deng, J. Krause, F. Li, "Fine-grained crowdsourcing for fine-grained recognition", CVPR, pp.580-587, 2013.
References (feature extraction and pattern matching)
27.
• D. E. Rumelhart, G. E. Hinton, R. J. Williams, "Learning Internal Representations by Error Propagation", Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1: Foundations, MIT Press, 1986.
• C. Cortes, V. Vapnik, "Support-vector networks", Machine Learning, Vol.20, No.3, pp.273-297, 1995.
• Y. Freund, R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting", Journal of Computer and System Sciences, Vol.55, No.1, pp.119-139, 1997.
• L. Breiman, "Random Forests", Machine Learning, Vol.45, No.1, pp.5-32, 2001.
• P. Geurts, D. Ernst, L. Wehenkel, "Extremely randomized trees", Machine Learning, Vol.63, No.1, pp.3-42, 2006.
• K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, Y. Singer, "Online passive-aggressive algorithms", Journal of Machine Learning Research, pp.551-585, 2006.
• M. Ozuysal, P. Fua, V. Lepetit, "Fast keypoint recognition in ten lines of code", CVPR, pp.1-8, 2007.
• P. Felzenszwalb, D. McAllester, D. Ramanan, "A discriminatively trained, multiscale, deformable part model", CVPR, pp.1-8, 2008.
• J. Hamm, D. D. Lee, "Grassmann discriminant analysis: a unifying view on subspace-based learning", ICML, pp.376-383, 2008.
• R. Collobert, J. Weston, "A unified architecture for natural language processing: Deep neural networks with multitask learning", ICML, pp.160-167, 2008.
• C. H. Lampert, H. Nickisch, S. Harmeling, "Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer", CVPR, 2009.
• T. Malisiewicz, A. Gupta, A. A. Efros, "Ensemble of exemplar-SVMs for object detection and beyond", ICCV, pp.89-96, 2011.
References (statistical learning methods and nearest-neighbor search)
28.
• H. Jegou, M. Douze, C. Schmid, "Product quantization for nearest neighbor search", PAMI, Vol.33, pp.117-128, 2011.
• D. Parikh, K. Grauman, "Relative attributes", ICCV, pp.503-510, 2011.
• J. Shotton, T. Sharp, P. Kohli, S. Nowozin, J. Winn, A. Criminisi, "Decision Jungles: Compact and Rich Models for Classification", NIPS, pp.234-242, 2013.
References (statistical learning methods and nearest-neighbor search)