Text Extraction from Infographics
Ansgar Scherp,
Kiel University and ZBW – Leibniz Information Centre for Economics, Germany
Falk Böschen,
Kiel University, Germany
LWA 2015 (KDML), Trier, Germany
Infographics Challenges
• Text with different font sizes
• Text with varying emphasis
• Text in different colors
• Text on different background colors
• Text rotated at different angles
• Text occluded by graphic elements
Slide [ 01 / 18 ]Falk Böschen and Ansgar Scherp
Initial presentation at [DocEng’15] → Now: Improve comparability and extensibility
Abstract Pipeline Idea
Input: Information Graphic
1. RE: Extract regions from graphic
2. RC: Cluster regions into text and non-text elements
3. LC: Computation of text lines for orientation estimation
4. PRE: Preprocessing of text elements for OCR
5. OCR: Optical Character Recognition
6. POST: Post-correction of OCR result
Output: Text
[Figure: pipeline diagram. Region Extraction (RE) → Region Clustering (RC) → Text Line Computation (LC) → Preprocessing (PRE) → OCR → Postprocessing (POST)]
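The six stages can be read as a chain of interchangeable functions, which is one way to think about the comparability and extensibility goal. The sketch below is only an illustration under that assumption: `run_pipeline`, `passthrough` and the dictionary layout are hypothetical names, not part of the presented system.

```python
from typing import Any, Callable, Dict

# Stage abbreviations as listed on the slide.
STAGE_ORDER = ["RE", "RC", "LC", "PRE", "OCR", "POST"]


def run_pipeline(infographic: Any, stages: Dict[str, Callable[[Any], Any]]) -> Any:
    """Feed the input through all six stages in the fixed order."""
    data = infographic
    for name in STAGE_ORDER:
        data = stages[name](data)
    return data


def passthrough(data: Any) -> Any:
    """Hypothetical placeholder stage that returns its input unchanged."""
    return data


# A configuration is just a mapping from stage name to implementation,
# so a single stage can be swapped without touching the others.
default_stages = {name: passthrough for name in STAGE_ORDER}
custom_stages = dict(default_stages, POST=str.strip)
```

With this layout, two configurations that differ in only one stage can be run on the same input and compared directly.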
Excerpt of Related Work
| Authors | Title | Stage marks as extracted (columns: RE, RC, LC, Pre, OCR, Post) |
|---|---|---|
| Chiang & Knoblock | Recognizing text in raster maps | ✍ ✍ ✔ ✔ ✔ ✔ |
| Jayant et al. | Automated tactile graphics translation: in the field | ✍ ✍ ✍ ✔ ✔ |
| Sas & Zolnierek | Three-Stage Method of Text Region Extraction from Diagram Raster Images | ✔ ✔ ✔ ✔ |
| Huang et al. | Associating Text and Graphics for Scientific Chart Understanding | ✔ ✍ ✔ ✔ ✔ ✍ |
| Lu et al. | Automated analysis of images in documents for intelligent document search | ✔ ✔ ✔ |
| Xu & Krauthammer | A New Pivoting and Iterative Text Detection Algorithm for Biomedical Images | ✔ ✔ |
| Chen et al. | DiagramFlyer: A Search Engine for Data-Driven Diagrams | ? ? ? ? ? ? |
| Böschen & Scherp | Multi-oriented Text Extraction from Information Graphics | ✔ ✔ ✔ ✔ ✔ |
| Gllavata et al. | Adaptive Fuzzy Text Segmentation in Images with Complex Backgrounds Using Color and Texture | ✔ ✔ |
| Fraz et al. | Exploiting colour information for better scene text detection and recognition | ✔ ✔ ✔ ✔ ✔ |
| Liu & Samarabandu | Multiscale Edge-Based Text Extraction from Complex Images | ✔ ✔ |
| Olszewska | Active contour based optical character recognition for automated scene understanding | ✔ ✔ ✔ |
| Lu et al. | Scene text extraction based on edges and support vector regression | ✔ ✔ ✍ |
Example: Adaptive Binarization and Labeling
• Binarization based on Otsu's method
• Extended by hierarchical computation using edge images for split-decision
• Connected Component Labeling with 8-neighbors
• Noise removal by region size thresholding
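A minimal sketch of the last two bullets, assuming the image has already been binarized into rows of 0/1 values: it labels 8-connected foreground regions with a breadth-first search and drops regions below a size threshold. The function name `label_regions` and the default `min_size` are hypothetical, and the hierarchical extension of the binarization is not covered here.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int]  # (row, column)


def label_regions(binary: List[List[int]], min_size: int = 2) -> List[List[Pixel]]:
    """Return the 8-connected foreground regions with at least min_size pixels."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions: List[List[Pixel]] = []
    for row in range(height):
        for col in range(width):
            if binary[row][col] != 1 or seen[row][col]:
                continue
            # Breadth-first search over the 8-neighborhood of each pixel.
            seen[row][col] = True
            queue = deque([(row, col)])
            region: List[Pixel] = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Noise removal: keep only regions that are large enough.
            if len(region) >= min_size:
                regions.append(region)
    return regions


image = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
regions = label_regions(image, min_size=2)
```

In the toy image, the two diagonal pixels in the top-left corner form one region only because diagonal neighbors count as connected; the two isolated pixels are removed as noise.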
Example: Grouping Regions
• Number of clusters unknown
• Text is “dense”
→ DBSCAN
• DBSCAN does not necessarily produce text lines, which are required for reliable orientation estimation
Feature vector: $f = (x, y, w, h, r)^\top$
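To make the clustering step concrete, here is a small, self-contained DBSCAN over five-dimensional feature tuples. It is a sketch only: the slide does not spell out what each component of the feature vector means or which parameters were used, so the sample tuples, `eps` and `min_pts` below are purely illustrative assumptions.

```python
import math
from typing import List, Optional, Sequence

NOISE = -1


def dbscan(points: Sequence[Sequence[float]], eps: float, min_pts: int) -> List[int]:
    """Return one cluster label per point; -1 marks noise."""
    n = len(points)
    labels: List[Optional[int]] = [None] * n

    def neighbors(i: int) -> List[int]:
        # All points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # core point: keep growing the cluster
    return [label if label is not None else NOISE for label in labels]


# Illustrative region features; the values are made up for the example.
features = [
    (0, 0, 5, 8, 0), (6, 0, 5, 8, 0), (12, 0, 5, 8, 0),   # dense group
    (100, 100, 5, 8, 0), (106, 100, 5, 8, 0),             # second group
    (300, 300, 5, 8, 0),                                  # isolated region
]
labels = dbscan(features, eps=10.0, min_pts=2)
```

The number of clusters is not given in advance; it falls out of the density parameters, which is the property the slide highlights.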
Example: Computing Text Lines
• Compute a Minimum Spanning Tree (MST) for each DBSCAN cluster using a reduced feature vector
• Split each MST (if necessary) by using the edge orientations
Reduced feature vector: $f' = (x, y)^\top$
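The following sketch builds a Euclidean minimum spanning tree over the (x, y) positions of one cluster with Prim's algorithm and computes the orientation of each edge. The slide does not say how edge orientations drive the split, so `keep_aligned_edges`, its reference angle and its tolerance are hypothetical stand-ins for that rule.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Edge = Tuple[int, int]


def minimum_spanning_tree(points: List[Point]) -> List[Edge]:
    """Prim's algorithm on the complete Euclidean graph over the points."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n
    best_from = [-1] * n
    best_dist[0] = 0.0
    edges: List[Edge] = []
    for _ in range(n):
        # Pick the cheapest point that is not yet part of the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v], best_from[v] = d, u
    return edges


def edge_orientation(points: List[Point], edge: Edge) -> float:
    """Undirected orientation of an edge in degrees, in [0, 180)."""
    (x1, y1), (x2, y2) = points[edge[0]], points[edge[1]]
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def keep_aligned_edges(points: List[Point], edges: List[Edge],
                       reference_deg: float, tolerance_deg: float) -> List[Edge]:
    """Hypothetical split rule: drop edges that deviate from a reference angle."""
    kept = []
    for edge in edges:
        diff = abs(edge_orientation(points, edge) - reference_deg) % 180.0
        if min(diff, 180.0 - diff) <= tolerance_deg:
            kept.append(edge)
    return kept


centers = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (20.0, 30.0)]
tree = minimum_spanning_tree(centers)
line_edges = keep_aligned_edges(centers, tree, reference_deg=0.0, tolerance_deg=15.0)
```

Removing the vertical edge separates the fourth element from the three that lie on one horizontal line, which is the kind of split the second bullet describes.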
Example: Estimating the Orientation of Text Lines
• Transform the center-of-mass coordinates of each element of every cluster into a discretized Hough space (one for each cluster)
→ a line/curve for each center of mass in Hough space
• Hough space discretized to 180 degrees in 1-degree steps
• Find the maximal value to obtain the orientation of the cluster
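A compact sketch of the voting idea, assuming the standard normal form of the Hough transform (rho = x cos θ + y sin θ) with θ sampled in 1-degree steps over 180 degrees. The distance bin width `rho_step` and the function name are assumptions; the slides do not state how the distance axis was discretized.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def estimate_orientation(centers: Sequence[Tuple[float, float]],
                         rho_step: float = 1.0) -> int:
    """Return the dominant line orientation of the centers in degrees (0..179)."""
    if not centers:
        raise ValueError("at least one center of mass is required")
    best_theta, best_votes = 0, -1
    for theta in range(180):  # 180 degrees in 1-degree steps
        angle = math.radians(theta)
        cos_t, sin_t = math.cos(angle), math.sin(angle)
        # Each center votes for the (theta, rho) cell its sinusoid passes through.
        votes = Counter(round((x * cos_t + y * sin_t) / rho_step) for x, y in centers)
        top = max(votes.values())
        if top > best_votes:
            best_theta, best_votes = theta, top
    # theta is the direction of the line's normal; the line is perpendicular to it.
    return (best_theta + 90) % 180


horizontal = [(0, 5), (40, 5), (80, 5), (120, 5)]
diagonal = [(0, 0), (40, 40), (80, 80), (120, 120)]
```

Angles here follow the coordinate system of the input points; with image coordinates, where y grows downward, the sign convention of the resulting angle flips accordingly.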
Example: Rotating Text Lines and Applying OCR
• Cut each text element out of the original image
• Rotate it according to the estimated angle
• Send it to an OCR engine for recognition
• Reasonable OCR engine: Tesseract (also used in Google Books)
Ground Truth Generation
Evaluation Setup
| | item 1 | Item 1 |
|---|---|---|
| Unigrams | {e, i, m, t, 1} | {e, m, t, I, 1} |
| Bigrams | {em, it, te} | {em, te, It} |
| Trigrams | {ite, tem} | {tem, Ite} |
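The example above can be reproduced with a few lines of code. This sketch assumes that n-grams are taken inside each whitespace-separated token and compared as sets, which is what the braces in the example suggest; treating "item 1" as ground truth and "Item 1" as OCR output is likewise an assumption made only for the demonstration.

```python
from typing import Set, Tuple


def char_ngrams(text: str, n: int) -> Set[str]:
    """Set of character n-grams taken inside each whitespace-separated token."""
    grams: Set[str] = set()
    for token in text.split():
        for i in range(len(token) - n + 1):
            grams.add(token[i:i + n])
    return grams


def precision_recall_f1(extracted: Set[str], truth: Set[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall and F1 of extracted n-grams against the truth."""
    hits = len(extracted & truth)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(truth) if truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f1


truth_unigrams = char_ngrams("item 1", 1)
ocr_unigrams = char_ngrams("Item 1", 1)
scores = precision_recall_f1(ocr_unigrams, truth_unigrams)
```

Because the comparison is case-sensitive, the single capitalization error costs one unigram, one bigram and one trigram.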
Preliminary Evaluation Setup: Baselines
Baseline #1:
• OCR engine Tesseract with layout analysis
• Single execution on the whole infographic
Baseline #2:
• OCR engine Tesseract with layout analysis
• Multiple executions on the whole infographic at various angles
• Merging of the results of the different executions
Evaluation Set: 121 Infographics (Domain Economics)
Dataset/Result set Characteristics
Values are AVG (SD).

| | # 1-grams | # 2-grams | # 3-grams | # Words | Word Length |
|---|---|---|---|---|---|
| TX Pipeline | 177.20 (128.20) | 127.34 (100.51) | 89.34 (79.35) | 50.07 (31.95) | 3.63 (2.69) |
| Baseline #1 | 106.30 (87.71) | 80.17 (69.12) | 60.79 (54.54) | 25.21 (22.12) | 4.15 (2.25) |
| Baseline #2 | 135.08 (125.56) | 100.20 (98.20) | 75.08 (78.10) | 35.25 (33.94) | 4.08 (1.95) |
| Ground Truth | 150.65 (122.28) | 115.93 (103.09) | 84.95 (85.61) | 35.46 (22.24) | 4.22 (1.48) |
• Our pipeline extracts more characters and words than are present in the data
→ Increased chance to recognize all the textual information
• The baselines extract fewer characters and words than are present in the data
→ They miss some text components
• The standard deviation is high in general
→ Infographics are very heterogeneous
Preliminary Evaluation Results
Values are AVG (SD).

| Method | n-gram | Precision | Recall | F1-measure |
|---|---|---|---|---|
| TX Pipeline | 1 | 0.50 (0.41) | 0.68 (0.36) | 0.47 (0.39) |
| TX Pipeline | 2 | 0.58 (0.39) | 0.54 (0.38) | 0.54 (0.34) |
| TX Pipeline | 3 | 0.52 (0.39) | 0.48 (0.37) | 0.49 (0.37) |
| Baseline #1 | 1 | 0.37 (0.36) | 0.48 (0.36) | 0.36 (0.35) |
| Baseline #1 | 2 | 0.42 (0.33) | 0.42 (0.34) | 0.42 (0.33) |
| Baseline #1 | 3 | 0.42 (0.31) | 0.42 (0.31) | 0.36 (0.33) |
| Relative Improvement | 1 | 35.14 % | 41.67 % | 30.06 % |
| Relative Improvement | 2 | 38.10 % | 28.57 % | 28.57 % |
| Relative Improvement | 3 | 23.81 % | 14.29 % | 36.11 % |
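The relative improvement figures appear to be the difference between the pipeline's average and the baseline's average, divided by the baseline's average. The slide does not state the formula, so this reading is inferred from the numbers; values recomputed from the rounded averages may differ slightly from the reported percentages.

```python
def relative_improvement(pipeline: float, baseline: float) -> float:
    """Relative improvement of the pipeline over the baseline, in percent."""
    return (pipeline - baseline) / baseline * 100.0


# Unigram precision and unigram recall of the TX pipeline vs. Baseline #1.
precision_gain = round(relative_improvement(0.50, 0.37), 2)
recall_gain = round(relative_improvement(0.68, 0.48), 2)
```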
Values are AVG (SD).

| Method | n-gram | Precision | Recall | F1-measure |
|---|---|---|---|---|
| TX Pipeline | 1 | 0.50 (0.41) | 0.68 (0.36) | 0.47 (0.39) |
| TX Pipeline | 2 | 0.58 (0.39) | 0.54 (0.38) | 0.54 (0.34) |
| TX Pipeline | 3 | 0.52 (0.39) | 0.48 (0.37) | 0.49 (0.37) |
| Baseline #2 | 1 | 0.37 (0.37) | 0.51 (0.38) | 0.36 (0.36) |
| Baseline #2 | 2 | 0.42 (0.34) | 0.42 (0.35) | 0.42 (0.34) |
| Baseline #2 | 3 | 0.42 (0.32) | 0.42 (0.32) | 0.42 (0.32) |
| Relative Improvement | 1 | 35.14 % | 33.33 % | 30.06 % |
| Relative Improvement | 2 | 38.10 % | 28.57 % | 28.57 % |
| Relative Improvement | 3 | 23.81 % | 14.29 % | 16.67 % |
Preliminary Evaluation: Orientation Distributions
Here, horizontal means within ±15°, based on Tesseract's rotation tolerance
Preliminary Evaluation: Levenshtein Distance
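For reference, the Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. The sketch below is the textbook dynamic-programming version; how the distances were normalized or aggregated for the evaluation is not stated on the slide, so that part is left out.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b using two rows of the DP table."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


distance = levenshtein("item 1", "Item 1")
```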
Extreme Examples
Best Result

| P/R/F | TX | BL1 | BL2 |
|---|---|---|---|
| Unigram | 0.95/0.95/0.95 | 0.02/0.26/0.02 | 0.02/0.26/0.02 |
| Bigram | 0.92/0.92/0.92 | 0.00/0.00/0.00 | 0.00/0.00/0.00 |
| Trigram | 0.92/0.92/0.92 | 0.00/0.00/0.00 | 0.00/0.00/0.00 |
| Levenshtein | 0.14 | 3.69 | 3.21 |

Worst Result

| P/R/F | TX | BL1 | BL2 |
|---|---|---|---|
| Unigram | 0.02/0.45/0.02 | 0.00/0.00/0.00 | 0.00/0.00/0.00 |
| Bigram | 0.00/0.00/0.00 | 0.00/0.00/0.00 | 0.00/0.00/0.00 |
| Trigram | 0.00/0.00/0.00 | 0.00/0.00/0.00 | 0.00/0.00/0.00 |
| Levenshtein | 3.47 | 0.14 | 0.14 |
Conclusion and Future Work
Conclusion
• Automated pipeline for text extraction from infographics
• Independent of infographic type (no special knowledge required)
Future Work
• Improvements necessary for individual/broken characters, occlusion, dotted lines, shading, super-/subscripts, …
• Make different approaches comparable (implementations)
• Improved evaluation framework for different configurations
• Test of alternative OCR engines
• Expanding the ground truth set for extensive evaluation
Questions?
Ansgar Scherp
ZBW – Leibniz Information
Centre for Economics
and Kiel University
Germany
asc@informatik.uni-kiel.de
Falk Böschen
Kiel University
Germany
fboe@informatik.uni-kiel.de
http://www.kd.informatik.uni-kiel.de/en
The Road Ahead …
Structure of our Text Extraction Pipeline
1. Adaptive Binarization and Labeling
2. Grouping Regions into Text Elements
3. Computing of Text Lines
4. Estimating the Orientation of Text Lines
5. Rotation of Text Lines and Applying OCR
6. Evaluation
Phase 1: Text Line Localization
Phase 2: Text Extraction and Evaluation
Otsu's Method
Input Image Output Image
Source: https://en.wikipedia.org/wiki/Otsu's_method
• Assumes two classes of pixels following a bimodal histogram (foreground pixels and background pixels)
• Calculates the optimum threshold separating the two classes so that their combined spread (intra-class variance) is minimal or, equivalently, so that their inter-class variance is maximal
• Extensions of the original method to multi-level thresholding exist
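A minimal sketch of the single-threshold case, assuming 8-bit gray values given as a flat sequence of integers. It scans all candidate thresholds and keeps the one that maximizes the between-class variance; the function name and the convention that values less than or equal to the threshold form the first class are choices made for this example.

```python
from typing import Sequence


def otsu_threshold(pixels: Sequence[int], levels: int = 256) -> int:
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form one class, all others the second class.
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_t, best_between = 0, -1.0
    weight0 = 0  # number of pixels in the first class so far
    sum0 = 0     # sum of gray values in the first class so far
    for t in range(levels):
        weight0 += histogram[t]
        sum0 += t * histogram[t]
        weight1 = total - weight0
        if weight0 == 0 or weight1 == 0:
            continue  # one class would be empty
        mean0 = sum0 / weight0
        mean1 = (total_sum - sum0) / weight1
        # Between-class variance, up to the constant factor 1 / total**2.
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_between:
            best_t, best_between = t, between
    return best_t


gray_values = [10] * 50 + [200] * 50
threshold = otsu_threshold(gray_values)
foreground = [value > threshold for value in gray_values]
```

Maximizing the between-class variance is used here because it needs only running sums over the histogram, while giving the same threshold as minimizing the intra-class variance.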
Falk Böschen and Ansgar Scherp

More Related Content

What's hot

Streaming Algorithms
Streaming AlgorithmsStreaming Algorithms
Streaming AlgorithmsJoe Kelley
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAAlbert Bifet
 
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...Jen Aman
 
MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 Albert Bifet
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streamsKrish_ver2
 
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...Ian Foster
 
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...Databricks
 
Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream miningAlbert Bifet
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLFlink Forward
 
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATLParikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATLMLconf
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real TimeAlbert Bifet
 
Mining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDTMining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDTDavide Gallitelli
 
Magellan-Spark as a Geospatial Analytics Engine by Ram Sriharsha
Magellan-Spark as a Geospatial Analytics Engine by Ram SriharshaMagellan-Spark as a Geospatial Analytics Engine by Ram Sriharsha
Magellan-Spark as a Geospatial Analytics Engine by Ram SriharshaSpark Summit
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersAlbert Bifet
 
Signals from outer space
Signals from outer spaceSignals from outer space
Signals from outer spaceGraphAware
 
Moa: Real Time Analytics for Data Streams
Moa: Real Time Analytics for Data StreamsMoa: Real Time Analytics for Data Streams
Moa: Real Time Analytics for Data StreamsAlbert Bifet
 
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...Spark Summit
 
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017StampedeCon
 
LSH for
 Prediction Problem in Recommendation
LSH for
 Prediction Problem in RecommendationLSH for
 Prediction Problem in Recommendation
LSH for
 Prediction Problem in RecommendationMaruf Aytekin
 

What's hot (20)

Streaming Algorithms
Streaming AlgorithmsStreaming Algorithms
Streaming Algorithms
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
 
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
 
MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streams
 
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
 
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
 
Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream mining
 
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
 
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATLParikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
Mining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDTMining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDT
 
Magellan-Spark as a Geospatial Analytics Engine by Ram Sriharsha
Magellan-Spark as a Geospatial Analytics Engine by Ram SriharshaMagellan-Spark as a Geospatial Analytics Engine by Ram Sriharsha
Magellan-Spark as a Geospatial Analytics Engine by Ram Sriharsha
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream Classifiers
 
Signals from outer space
Signals from outer spaceSignals from outer space
Signals from outer space
 
Moa: Real Time Analytics for Data Streams
Moa: Real Time Analytics for Data StreamsMoa: Real Time Analytics for Data Streams
Moa: Real Time Analytics for Data Streams
 
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
 
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
 
LSH for
 Prediction Problem in Recommendation
LSH for
 Prediction Problem in RecommendationLSH for
 Prediction Problem in Recommendation
LSH for
 Prediction Problem in Recommendation
 

Viewers also liked

About Multimedia Presentation Generation and Multimedia Metadata: From Synthe...
About Multimedia Presentation Generation and Multimedia Metadata: From Synthe...About Multimedia Presentation Generation and Multimedia Metadata: From Synthe...
About Multimedia Presentation Generation and Multimedia Metadata: From Synthe...Ansgar Scherp
 
SchemEX -- Building an Index for Linked Open Data
SchemEX -- Building an Index for Linked Open DataSchemEX -- Building an Index for Linked Open Data
SchemEX -- Building an Index for Linked Open DataAnsgar Scherp
 
Linked Open Data (Entwurfsprinzipien und Muster für vernetzte Daten)
Linked Open Data (Entwurfsprinzipien und Muster für vernetzte Daten)Linked Open Data (Entwurfsprinzipien und Muster für vernetzte Daten)
Linked Open Data (Entwurfsprinzipien und Muster für vernetzte Daten)Ansgar Scherp
 
A Framework for Iterative Signing of Graph Data on the Web
A Framework for Iterative Signing of Graph Data on the WebA Framework for Iterative Signing of Graph Data on the Web
A Framework for Iterative Signing of Graph Data on the WebAnsgar Scherp
 
Smart photo selection: interpret gaze as personal interest
Smart photo selection: interpret gaze as personal interestSmart photo selection: interpret gaze as personal interest
Smart photo selection: interpret gaze as personal interestAnsgar Scherp
 
Events in Multimedia - Theory, Model, Application
Events in Multimedia - Theory, Model, ApplicationEvents in Multimedia - Theory, Model, Application
Events in Multimedia - Theory, Model, ApplicationAnsgar Scherp
 

Viewers also liked (6)

About Multimedia Presentation Generation and Multimedia Metadata: From Synthe...
About Multimedia Presentation Generation and Multimedia Metadata: From Synthe...About Multimedia Presentation Generation and Multimedia Metadata: From Synthe...
About Multimedia Presentation Generation and Multimedia Metadata: From Synthe...
 
SchemEX -- Building an Index for Linked Open Data
SchemEX -- Building an Index for Linked Open DataSchemEX -- Building an Index for Linked Open Data
SchemEX -- Building an Index for Linked Open Data
 
Linked Open Data (Entwurfsprinzipien und Muster für vernetzte Daten)
Linked Open Data (Entwurfsprinzipien und Muster für vernetzte Daten)Linked Open Data (Entwurfsprinzipien und Muster für vernetzte Daten)
Linked Open Data (Entwurfsprinzipien und Muster für vernetzte Daten)
 
A Framework for Iterative Signing of Graph Data on the Web
A Framework for Iterative Signing of Graph Data on the WebA Framework for Iterative Signing of Graph Data on the Web
A Framework for Iterative Signing of Graph Data on the Web
 
Smart photo selection: interpret gaze as personal interest
Smart photo selection: interpret gaze as personal interestSmart photo selection: interpret gaze as personal interest
Smart photo selection: interpret gaze as personal interest
 
Events in Multimedia - Theory, Model, Application
Events in Multimedia - Theory, Model, ApplicationEvents in Multimedia - Theory, Model, Application
Events in Multimedia - Theory, Model, Application
 

Similar to Formalization and Preliminary Evaluation of a Pipeline for Text Extraction From Infographics

Deep Learning Introduction - WeCloudData
Deep Learning Introduction - WeCloudDataDeep Learning Introduction - WeCloudData
Deep Learning Introduction - WeCloudDataWeCloudData
 
Next-generation sequencing format and visualization with ngs.plot
Next-generation sequencing format and visualization with ngs.plotNext-generation sequencing format and visualization with ngs.plot
Next-generation sequencing format and visualization with ngs.plotLi Shen
 
Enhanced characterness for text detection in the wild
Enhanced characterness for text detection in the wildEnhanced characterness for text detection in the wild
Enhanced characterness for text detection in the wildPrerana Mukherjee
 
Decision Forests and discriminant analysis
Decision Forests and discriminant analysisDecision Forests and discriminant analysis
Decision Forests and discriminant analysispotaters
 
Implementation of Computer Vision Applications using OpenCV in C++
Implementation of Computer Vision Applications using OpenCV in C++Implementation of Computer Vision Applications using OpenCV in C++
Implementation of Computer Vision Applications using OpenCV in C++IRJET Journal
 
150807 Fast R-CNN
150807 Fast R-CNN150807 Fast R-CNN
150807 Fast R-CNNJunho Cho
 
Introduction to R for Learning Analytics Researchers
Introduction to R for Learning Analytics ResearchersIntroduction to R for Learning Analytics Researchers
Introduction to R for Learning Analytics ResearchersVitomir Kovanovic
 
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...MLconf
 
Temporal Superpixels Based on Proximity-Weighted Patch Matching
Temporal Superpixels Based on Proximity-Weighted Patch MatchingTemporal Superpixels Based on Proximity-Weighted Patch Matching
Temporal Superpixels Based on Proximity-Weighted Patch MatchingNAVER Engineering
 
Detecting text from natural images with Stroke Width Transform
Detecting text from natural images with Stroke Width TransformDetecting text from natural images with Stroke Width Transform
Detecting text from natural images with Stroke Width TransformPooja G N
 
License Plate Recognition
License Plate RecognitionLicense Plate Recognition
License Plate RecognitionGilbert
 
Creating a Custom Serialization Format (Gophercon 2017)
Creating a Custom Serialization Format (Gophercon 2017)Creating a Custom Serialization Format (Gophercon 2017)
Creating a Custom Serialization Format (Gophercon 2017)Scott Mansfield
 
Miniproject final group 14
Miniproject final group 14Miniproject final group 14
Miniproject final group 14Ashish Mundhra
 
Tomoya Sato Master Thesis
Tomoya Sato Master ThesisTomoya Sato Master Thesis
Tomoya Sato Master Thesispflab
 
Lane detection by use of canny edge
Lane detection by use of canny edgeLane detection by use of canny edge
Lane detection by use of canny edgebanz23
 

Similar to Formalization and Preliminary Evaluation of a Pipeline for Text Extraction From Infographics (20)

Deep Learning Introduction - WeCloudData
Deep Learning Introduction - WeCloudDataDeep Learning Introduction - WeCloudData
Deep Learning Introduction - WeCloudData
 
Aocr Hmm Presentation
Aocr Hmm PresentationAocr Hmm Presentation
Aocr Hmm Presentation
 
Next-generation sequencing format and visualization with ngs.plot
Next-generation sequencing format and visualization with ngs.plotNext-generation sequencing format and visualization with ngs.plot
Next-generation sequencing format and visualization with ngs.plot
 
Enhanced characterness for text detection in the wild
Enhanced characterness for text detection in the wildEnhanced characterness for text detection in the wild
Enhanced characterness for text detection in the wild
 
Decision Forests and discriminant analysis
Decision Forests and discriminant analysisDecision Forests and discriminant analysis
Decision Forests and discriminant analysis
 
Implementation of Computer Vision Applications using OpenCV in C++
Implementation of Computer Vision Applications using OpenCV in C++Implementation of Computer Vision Applications using OpenCV in C++
Implementation of Computer Vision Applications using OpenCV in C++
 
Thesis presentation
Thesis presentationThesis presentation
Thesis presentation
 
150807 Fast R-CNN
150807 Fast R-CNN150807 Fast R-CNN
150807 Fast R-CNN
 
Introduction to R for Learning Analytics Researchers
Introduction to R for Learning Analytics ResearchersIntroduction to R for Learning Analytics Researchers
Introduction to R for Learning Analytics Researchers
 
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
 
Temporal Superpixels Based on Proximity-Weighted Patch Matching
Temporal Superpixels Based on Proximity-Weighted Patch MatchingTemporal Superpixels Based on Proximity-Weighted Patch Matching
Temporal Superpixels Based on Proximity-Weighted Patch Matching
 
Detecting text from natural images with Stroke Width Transform
Detecting text from natural images with Stroke Width TransformDetecting text from natural images with Stroke Width Transform
Detecting text from natural images with Stroke Width Transform
 
Temporal Segment Network
Temporal Segment NetworkTemporal Segment Network
Temporal Segment Network
 
License Plate Recognition
License Plate RecognitionLicense Plate Recognition
License Plate Recognition
 
Creating a Custom Serialization Format (Gophercon 2017)
Creating a Custom Serialization Format (Gophercon 2017)Creating a Custom Serialization Format (Gophercon 2017)
Creating a Custom Serialization Format (Gophercon 2017)
 
Miniproject final group 14
Miniproject final group 14Miniproject final group 14
Miniproject final group 14
 
Tomoya Sato Master Thesis
Tomoya Sato Master ThesisTomoya Sato Master Thesis
Tomoya Sato Master Thesis
 
Explainable AI
Explainable AIExplainable AI
Explainable AI
 
CAMSAP19
CAMSAP19CAMSAP19
CAMSAP19
 
Lane detection by use of canny edge
Lane detection by use of canny edgeLane detection by use of canny edge
Lane detection by use of canny edge
 

More from Ansgar Scherp

Analysis of GraphSum's Attention Weights to Improve the Explainability of Mul...
Analysis of GraphSum's Attention Weights to Improve the Explainability of Mul...Analysis of GraphSum's Attention Weights to Improve the Explainability of Mul...
Analysis of GraphSum's Attention Weights to Improve the Explainability of Mul...Ansgar Scherp
 
STEREO: A Pipeline for Extracting Experiment Statistics, Conditions, and Topi...
STEREO: A Pipeline for Extracting Experiment Statistics, Conditions, and Topi...STEREO: A Pipeline for Extracting Experiment Statistics, Conditions, and Topi...
STEREO: A Pipeline for Extracting Experiment Statistics, Conditions, and Topi...Ansgar Scherp
 
Text Localization in Scientific Figures using Fully Convolutional Neural Netw...
Text Localization in Scientific Figures using Fully Convolutional Neural Netw...Text Localization in Scientific Figures using Fully Convolutional Neural Netw...
Text Localization in Scientific Figures using Fully Convolutional Neural Netw...Ansgar Scherp
 
A Comparison of Approaches for Automated Text Extraction from Scholarly Figures
A Comparison of Approaches for Automated Text Extraction from Scholarly FiguresA Comparison of Approaches for Automated Text Extraction from Scholarly Figures
A Comparison of Approaches for Automated Text Extraction from Scholarly FiguresAnsgar Scherp
 
Can you see it? Annotating Image Regions based on Users' Gaze Information
Can you see it? Annotating Image Regions based on Users' Gaze InformationCan you see it? Annotating Image Regions based on Users' Gaze Information
Can you see it? Annotating Image Regions based on Users' Gaze InformationAnsgar Scherp
 
Linked open data - how to juggle with more than a billion triples
Linked open data - how to juggle with more than a billion triplesLinked open data - how to juggle with more than a billion triples
Linked open data - how to juggle with more than a billion triplesAnsgar Scherp
 
SchemEX -- Building an Index for Linked Open Data
SchemEX -- Building an Index for Linked Open DataSchemEX -- Building an Index for Linked Open Data
SchemEX -- Building an Index for Linked Open DataAnsgar Scherp
 
A Model of Events for Integrating Event-based Information in Complex Socio-te...
A Model of Events for Integrating Event-based Information in Complex Socio-te...A Model of Events for Integrating Event-based Information in Complex Socio-te...
A Model of Events for Integrating Event-based Information in Complex Socio-te...Ansgar Scherp
 
strukt - A Pattern System for Integrating Individual and Organizational Knowl...
strukt - A Pattern System for Integrating Individual and Organizational Knowl...strukt - A Pattern System for Integrating Individual and Organizational Knowl...
strukt - A Pattern System for Integrating Individual and Organizational Knowl...Ansgar Scherp
 
Identifying Objects in Images from Analyzing the User‘s Gaze Movements for Pr...
Identifying Objects in Images from Analyzing the User‘s Gaze Movements for Pr...Identifying Objects in Images from Analyzing the User‘s Gaze Movements for Pr...
Identifying Objects in Images from Analyzing the User‘s Gaze Movements for Pr...Ansgar Scherp
 




Formalization and Preliminary Evaluation of a Pipeline for Text Extraction From Infographics

  • 1. Text Extraction from Infographics
      Ansgar Scherp, Kiel University and ZBW – Leibniz Information Centre for Economics, Germany
      Falk Böschen, Kiel University, Germany
      LWA 2015 (KDML), Trier, Germany
  • 2. Infographics Challenges
      • Text with different font sizes
      • Text with varying emphasis
      • Text in different colors
      • Text on different background colors
      • Text rotated at different angles
      • Text occluded by graphic elements
      Initial presentation at [DocEng’15] → Now: Improve comparability and extensibility
  • 3. Abstract Pipeline Idea
      Input: Information Graphic
      1. RE: Extract regions from graphic
      2. RC: Cluster regions into text and non-text elements
      3. LC: Computation of text lines for orientation estimation
      4. PRE: Preprocessing of text elements for OCR
      5. OCR: Optical Character Recognition
      6. POST: Post-correction of OCR result
      Output: Text
      Diagram: Region Extraction (RE) → Region Clustering (RC) → Text Line Computation (LC) → Preprocessing (PRE) → OCR → Postprocessing (POST)
  • 4. Excerpt of Related Work
      Authors | Title | Marks (columns: RE, RC, LC, Pre, OCR, Post)
      Chiang & Knoblock | Recognizing text in raster maps | ✍ ✍ ✔ ✔ ✔ ✔
      Jayant et al. | Automated tactile graphics translation: in the field | ✍ ✍ ✍ ✔ ✔
      Sas & Zolnierek | Three-Stage Method of Text Region Extraction from Diagram Raster Images | ✔ ✔ ✔ ✔
      Huang et al. | Associating Text and Graphics for Scientific Chart Understanding | ✔ ✍ ✔ ✔ ✔ ✍
      Lu et al. | Automated analysis of images in documents for intelligent document search | ✔ ✔ ✔
      Xu & Krauthammer | A New Pivoting and Iterative Text Detection Algorithm for Biomedical Images | ✔ ✔
      Chen et al. | DiagramFlyer: A Search Engine for Data-Driven Diagrams | ? ? ? ? ? ?
      Böschen & Scherp | Multi-oriented Text Extraction from Information Graphics | ✔ ✔ ✔ ✔ ✔
      Gllavata et al. | Adaptive Fuzzy Text Segmentation in Images with Complex Backgrounds Using Color and Texture | ✔ ✔
      Fraz et al. | Exploiting colour information for better scene text detection and recognition | ✔ ✔ ✔ ✔ ✔
      Liu & Samarabandu | Multiscale Edge-Based Text Extraction from Complex Images | ✔ ✔
      Olszewska | Active contour based optical character recognition for automated scene understanding | ✔ ✔ ✔
      Lu et al. | Scene text extraction based on edges and support vector regression | ✔ ✔ ✍
  • 5. Example: Adaptive Binarization and Labeling
      • Binarization based on Otsu's method
      • Extended by hierarchical computation using edge images for split-decision
      • Connected Component Labeling with 8-neighbors
      • Noise removal by region size thresholding
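The labeling and noise-removal steps on this slide can be illustrated with a minimal sketch. It assumes the image is already binarized into a list of rows with 1 for foreground; the function name `label_regions` and the `min_size` parameter are hypothetical, and the slides do not state the size threshold the authors use.

```python
from collections import deque

def label_regions(binary, min_size=2):
    """Label 8-connected foreground regions; drop regions smaller than min_size.

    `binary` is a list of rows containing 0 (background) or 1 (foreground).
    `min_size` is an assumed noise threshold, not a value from the slides.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    regions = {}
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue
            # Breadth-first flood fill over the 8-neighborhood.
            queue = deque([(r, c)])
            labels[r][c] = next_label
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            if len(pixels) >= min_size:
                regions[next_label] = pixels
            else:
                for y, x in pixels:
                    labels[y][x] = 0  # treat tiny regions as noise
            next_label += 1
    return labels, regions

image = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
labels, regions = label_regions(image, min_size=2)
```

With 8-connectivity the two diagonal pixels form one region, while the isolated pixel is discarded as noise.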
  • 6. Example: Grouping Regions
      • Number of clusters unknown
      • Text is “dense” → DBSCAN
      • DBSCAN does not necessarily produce text lines, which are required for reliable orientation estimation
      Feature vector: $f = (x, y, w, h, r)^T$
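As a rough illustration of why DBSCAN fits this step (no cluster count is needed, only a density criterion), here is a small pure-Python sketch of the textbook algorithm. The parameters `eps` and `min_pts` and the sample points are assumptions; the slides do not give the authors' settings, nor how the components of the feature vector are scaled.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one cluster id per point; -1 marks noise.

    `points` are equal-length tuples (for example feature vectors).
    `eps` and `min_pts` are assumed values, not taken from the slides.
    """
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within distance eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # core point: keep growing
                queue.extend(j_neighbors)
        cluster += 1
    return labels

points = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10), (12, 10), (50, 50)]
cluster_ids = dbscan(points, eps=1.5, min_pts=2)
```

The two dense groups become separate clusters and the isolated point is labeled as noise, without the number of clusters being specified in advance.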
  • 7. Example: Computing Text Lines
      • Compute a Minimum Spanning Tree (MST) for each DBSCAN cluster using a reduced feature vector
      • Split each MST (if necessary) by using the edge orientations
      Reduced feature vector: $f' = (x, y)^T$
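The MST step can be sketched with Prim's algorithm on the region centers. This is an illustration under assumptions: the slides do not say which MST algorithm is used, and they do not give the criterion for splitting a tree, so the sketch only computes the tree and the orientation of each edge. The function names are hypothetical.

```python
import math

def minimum_spanning_tree(centers):
    """Prim's algorithm on the complete Euclidean graph; returns edges (i, j)."""
    n = len(centers)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known connection to the tree
    best_from = [-1] * n         # tree node providing that connection
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(centers[u], centers[v])
                if d < best_dist[v]:
                    best_dist[v], best_from[v] = d, u
    return edges

def edge_orientation(p, q):
    """Undirected orientation of the segment p-q in degrees, in [0, 180)."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0])) % 180.0

centers = [(0, 0), (1, 0), (2, 0), (2, 5)]
edges = minimum_spanning_tree(centers)
orientations = [edge_orientation(centers[i], centers[j]) for i, j in edges]
```

In this toy input the last edge is vertical while the others are horizontal; an orientation change of this kind is the sort of signal a split rule could act on.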
  • 8. Example: Estimating the Orientation of Text Lines
      • Transform the center-of-mass coordinates of each element of every cluster into a discretized Hough space (one per cluster) → a line/curve for each center of mass in Hough space
      • Hough space is discretized to 180 degrees in 1-degree steps
      • Find the maximal value to obtain the orientation of the cluster
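A minimal sketch of the Hough-based estimate follows, assuming the usual normal parameterization rho = x cos(theta) + y sin(theta). The bin width `rho_step` and the function name are assumptions; only the 180 one-degree angle bins come from the slide.

```python
import math
from collections import Counter

def estimate_orientation(centers, rho_step=1.0):
    """Return the dominant line orientation in degrees, in [0, 180).

    Each center votes for every (theta, rho) cell it could lie on.
    `rho_step` is an assumed bin width for the distance axis.
    """
    votes = Counter()
    for x, y in centers:
        for theta in range(180):  # 1-degree steps
            t = math.radians(theta)
            rho = x * math.cos(t) + y * math.sin(t)
            votes[(theta, round(rho / rho_step))] += 1
    (theta, _), _ = max(votes.items(), key=lambda kv: kv[1])
    # theta is the angle of the line's normal; the line itself is rotated by 90 degrees.
    return (theta + 90) % 180

horizontal = estimate_orientation([(0, 5), (10, 5), (20, 5), (30, 5)])
vertical = estimate_orientation([(5, 0), (5, 10), (5, 20), (5, 30)])
```

Centers that lie on a common line all vote for the same cell, so the cell with the most votes identifies the orientation of the text line.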
  • 9. Example: Rotating Text Lines and Applying OCR
      • Cut each text element out of the original image
      • Rotate it according to the estimated angle
      • Send it to an OCR engine for recognition
      • Reasonable OCR engine: Tesseract (also used in Google Books)
  • 10. Ground Truth Generation
  • 11. Evaluation Setup
      Text     | Unigrams        | Bigrams      | Trigrams
      item 1   | {e, i, m, t, 1} | {em, it, te} | {ite, tem}
      Item 1   | {e, m, t, I, 1} | {em, te, It} | {tem, Ite}
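The n-gram comparison on this slide can be sketched as follows. The sketch assumes case-sensitive character n-grams collected as sets within each whitespace-separated token, which reproduces the sets shown for "item 1" and "Item 1"; whether the actual evaluation uses sets or counts is not stated in the slides, and the function names are hypothetical.

```python
def char_ngrams(text, n):
    """Set of character n-grams taken within each whitespace-separated token."""
    grams = set()
    for token in text.split():
        for i in range(len(token) - n + 1):
            grams.add(token[i:i + n])
    return grams

def precision_recall_f1(extracted, truth, n):
    """Compare the n-gram sets of an extracted text and a reference text."""
    ext, ref = char_ngrams(extracted, n), char_ngrams(truth, n)
    hits = len(ext & ref)
    precision = hits / len(ext) if ext else 0.0
    recall = hits / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

unigram_scores = precision_recall_f1("Item 1", "item 1", 1)
bigram_scores = precision_recall_f1("Item 1", "item 1", 2)
trigram_scores = precision_recall_f1("Item 1", "item 1", 3)
```

For this pair, the single capitalization difference costs one of five unigrams, one of three bigrams, and one of two trigrams, which shows how longer n-grams penalize a single character error more strongly.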
  • 12. Preliminary Evaluation Setup: Baselines
      Baseline #1:
      • OCR engine Tesseract with layout analysis
      • Single execution on the whole infographic
      Baseline #2:
      • OCR engine Tesseract with layout analysis
      • Multiple executions on the whole infographic at various angles
      • Merging of the results of the different executions
  • 13. Images or graphics
      Evaluation Set: 121 Infographics (Domain: Economics)
  • 14. Dataset/Result Set Characteristics
      All values are given as AVG / SD.
                   | # 1-grams       | # 2-grams       | # 3-grams     | # Words       | Word Length
      TX Pipeline  | 177.20 / 128.20 | 127.34 / 100.51 | 89.34 / 79.35 | 50.07 / 31.95 | 3.63 / 2.69
      Baseline #1  | 106.30 / 87.71  | 80.17 / 69.12   | 60.79 / 54.54 | 25.21 / 22.12 | 4.15 / 2.25
      Baseline #2  | 135.08 / 125.56 | 100.20 / 98.20  | 75.08 / 78.10 | 35.25 / 33.94 | 4.08 / 1.95
      Ground Truth | 150.65 / 122.28 | 115.93 / 103.09 | 84.95 / 85.61 | 35.46 / 22.24 | 4.22 / 1.48
      • Our pipeline extracts more characters and words than are present in the data → Increased chance to recognize all the textual information
      • The baselines extract fewer characters and words than are present in the data → They miss some text components
      • The standard deviation is high in general → Infographics are very heterogeneous
  • 15. Preliminary Evaluation Results
      All scores are given as AVG / SD.
      Comparison with Baseline #1:
                           | n-gram | Precision   | Recall      | F1-measure
      TX Pipeline          | 1      | 0.50 / 0.41 | 0.68 / 0.36 | 0.47 / 0.39
      TX Pipeline          | 2      | 0.58 / 0.39 | 0.54 / 0.38 | 0.54 / 0.34
      TX Pipeline          | 3      | 0.52 / 0.39 | 0.48 / 0.37 | 0.49 / 0.37
      Baseline #1          | 1      | 0.37 / 0.36 | 0.48 / 0.36 | 0.36 / 0.35
      Baseline #1          | 2      | 0.42 / 0.33 | 0.42 / 0.34 | 0.42 / 0.33
      Baseline #1          | 3      | 0.42 / 0.31 | 0.42 / 0.31 | 0.36 / 0.33
      Relative Improvement | 1      | 35.14 %     | 41.67 %     | 30.06 %
      Relative Improvement | 2      | 38.10 %     | 28.57 %     | 28.57 %
      Relative Improvement | 3      | 23.81 %     | 14.29 %     | 36.11 %
      Comparison with Baseline #2:
                           | n-gram | Precision   | Recall      | F1-measure
      TX Pipeline          | 1      | 0.50 / 0.41 | 0.68 / 0.36 | 0.47 / 0.39
      TX Pipeline          | 2      | 0.58 / 0.39 | 0.54 / 0.38 | 0.54 / 0.34
      TX Pipeline          | 3      | 0.52 / 0.39 | 0.48 / 0.37 | 0.49 / 0.37
      Baseline #2          | 1      | 0.37 / 0.37 | 0.51 / 0.38 | 0.36 / 0.36
      Baseline #2          | 2      | 0.42 / 0.34 | 0.42 / 0.35 | 0.42 / 0.34
      Baseline #2          | 3      | 0.42 / 0.32 | 0.42 / 0.32 | 0.42 / 0.32
      Relative Improvement | 1      | 35.14 %     | 33.33 %     | 30.06 %
      Relative Improvement | 2      | 38.10 %     | 28.57 %     | 28.57 %
      Relative Improvement | 3      | 23.81 %     | 14.29 %     | 16.67 %
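The "Relative Improvement" rows appear to follow the usual formula (pipeline − baseline) / baseline. This is an inference from the numbers, not a definition given in the slides; the short sketch below, with a hypothetical function name, checks it against two table entries.

```python
def relative_improvement(pipeline, baseline):
    """Relative gain of the pipeline score over the baseline score, in percent."""
    return (pipeline - baseline) / baseline * 100.0

# Unigram precision, TX Pipeline (0.50) vs. Baseline #1 (0.37)
precision_gain = round(relative_improvement(0.50, 0.37), 2)
# Unigram recall, TX Pipeline (0.68) vs. Baseline #1 (0.48)
recall_gain = round(relative_improvement(0.68, 0.48), 2)
```

Most entries can be reproduced this way from the rounded averages. The unigram F1 entry (30.06 %) cannot, possibly because the slide values were computed from unrounded scores.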
  • 16. Preliminary Evaluation: Orientation Distributions
      Here, horizontal means ±15°, based on Tesseract's rotation tolerance.
  • 17. Preliminary Evaluation: Levenshtein Distance
  • 18. Extreme Examples
      Best Result:
      P/R/F       | TX             | BL1            | BL2
      Unigram     | 0.95/0.95/0.95 | 0.02/0.26/0.02 | 0.02/0.26/0.02
      Bigram      | 0.92/0.92/0.92 | 0.00/0.00/0.00 | 0.00/0.00/0.00
      Trigram     | 0.92/0.92/0.92 | 0.00/0.00/0.00 | 0.00/0.00/0.00
      Levenshtein | 0.14           | 3.69           | 3.21
      Worst Result:
      P/R/F       | TX             | BL1            | BL2
      Unigram     | 0.02/0.45/0.02 | 0.00/0.00/0.00 | 0.00/0.00/0.00
      Bigram      | 0.00/0.00/0.00 | 0.00/0.00/0.00 | 0.00/0.00/0.00
      Trigram     | 0.00/0.00/0.00 | 0.00/0.00/0.00 | 0.00/0.00/0.00
      Levenshtein | 3.47           | 0.14           | 0.14
  • 19. Conclusion and Future Work
      Conclusion
      • Automated pipeline for text extraction from infographics
      • Independent of infographic type (no special knowledge required)
      Future Work
      • Improvements necessary for individual/broken characters, occlusion, dotted lines, shading, super-/subscripts, …
      • Make different approaches comparable (implementations)
      • Improved evaluation framework for different configurations
      • Test of alternative OCR engines
      • Expanding the ground truth set for extensive evaluation
  • 20. Questions?
      Ansgar Scherp, ZBW – Leibniz Information Centre for Economics and Kiel University, Germany, asc@informatik.uni-kiel.de
      Falk Böschen, Kiel University, Germany, fboe@informatik.uni-kiel.de
      http://www.kd.informatik.uni-kiel.de/en
  • 21. The Road Ahead …
  • 22. Structure of our Text Extraction Pipeline
      Phase 1: Text Line Localization
      Phase 2: Text Extraction and Evaluation
      Steps, in order:
      • Adaptive Binarization and Labeling
      • Grouping Regions into Text Elements
      • Computing of Text Lines
      • Estimating the Orientation of Text Lines
      • Rotation of Text Lines and Applying OCR
      • Evaluation
  • 23. Otsu's Method
      Figure: input image and output image. Source: https://en.wikipedia.org/wiki/Otsu's_method
      • Assumes two classes of pixels following a bimodal histogram (foreground pixels and background pixels)
      • Calculates the optimum threshold separating the two classes so that their combined spread (intra-class variance) is minimal, or equivalently so that their inter-class variance is maximal
      • Extensions of the original method to multi-level thresholding exist
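The threshold search described on this slide can be sketched in a few lines. This is a plain textbook version of Otsu's method for a flat list of integer gray values, not the hierarchical, edge-guided extension used in the pipeline; the function name and the `levels` parameter are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the inter-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    `pixels` is a flat sequence of integer gray values in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray values in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, no valid split
        mean0, mean1 = sum0 / w0, (sum_all - sum0) / w1
        # Inter-class variance up to the constant factor 1 / total**2.
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

pixels = [10, 12, 11, 200, 205, 198]
t = otsu_threshold(pixels)
binary = [1 if p > t else 0 for p in pixels]
```

For this small bimodal sample the threshold falls between the dark and the bright group, so the binarized output separates them cleanly.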

Editor's Notes

  1. No uniform use of terms: Biomedical Image Topographic/Geographic/Raster Map Scientific Chart Chart Image Chart Diagram Diagram [Raster] Image Information Graphic Infographic Mathematical/Scholarly Figure Flow/Pie/Bar/Column Chart Column/Bar/Line Graph 2D Plot Scatterplot
     • No (automated) complete pipeline from infographic to text described
     • Technical description in many cases insufficient for reproduction
     • Comparison is difficult due to missing formalization
  2. In computer vision and image processing, Otsu's method, named after Nobuyuki Otsu (大津展之, Ōtsu Nobuyuki), is used to automatically perform clustering-based image thresholding, that is, the reduction of a gray-level image to a binary image. The algorithm assumes that the image contains two classes of pixels following a bimodal histogram (foreground pixels and background pixels). It then calculates the optimum threshold separating the two classes so that their combined spread (intra-class variance) is minimal, or equivalently (because the sum of pairwise squared distances is constant), so that their inter-class variance is maximal. Consequently, Otsu's method is roughly a one-dimensional, discrete analog of Fisher's Discriminant Analysis. The extension of the original method to multi-level thresholding is referred to as the Multi Otsu method. https://en.wikipedia.org/wiki/Otsu's_method
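The equivalence mentioned in this note can be written out briefly. The symbols below are introduced here for illustration and do not appear in the slides: for a threshold $t$, $\omega_0(t)$ and $\omega_1(t)$ are the class probabilities, $\mu_0(t)$ and $\mu_1(t)$ the class means, and $\sigma_0^2(t)$ and $\sigma_1^2(t)$ the class variances.

```latex
\begin{align}
  \sigma_w^2(t) &= \omega_0(t)\,\sigma_0^2(t) + \omega_1(t)\,\sigma_1^2(t)
    && \text{(intra-class variance)} \\
  \sigma_b^2(t) &= \omega_0(t)\,\omega_1(t)\,\bigl(\mu_0(t) - \mu_1(t)\bigr)^2
    && \text{(inter-class variance)} \\
  \sigma^2 &= \sigma_w^2(t) + \sigma_b^2(t)
    && \text{(total variance, independent of } t\text{)}
\end{align}
```

Because the total variance $\sigma^2$ does not depend on $t$, the threshold that minimizes $\sigma_w^2(t)$ is the same one that maximizes $\sigma_b^2(t)$.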