Graph Convolution for Multimodal
Information Extraction from Visually
Rich Documents
Xiaojing Liu, Feiyu Gao, Qiong Zhang, Huasha Zhao
Alibaba Group
Presented by Chloé Laurent
WiMLDS Paper Study Session - April 16th 2020
Abstract
How to extract pre-defined entities from
VRDs?
• Visually Rich Documents (VRDs): purchase
receipts, insurance policy documents,
customs declaration forms...
• Visual and layout information is essential for
document understanding: text serialized
into a classic one-dimensional sequence is not
enough.
• Introduction of a graph convolution based
model to combine textual and visual
information for information extraction.
Challenges of IE from VRDs
• How to effectively incorporate visual cues from the
document?
• What about the scalability of the task?
Contributions of this paper
• Computes graph embeddings for each
text segment with graph convolutions
• Graph embeddings are combined with
text embeddings to feed into a
standard BiLSTM-CRF for Information
Extraction
This method (March 2019)
outperforms BiLSTM-CRF baselines
on two real-world datasets.
Differs from the baseline
Information Extraction
• Process of extracting structured information from unstructured documents
• Recent progress in this area has mostly been made on plain-text documents
Shaolei Wang, Yue Zhang, Wanxiang Che, and Ting Liu. 2018. Joint extraction of entities and relations based on a novel graph scheme. In IJCAI, pages 4461–4467.
Document Modeling
• Text segments are generated by an Optical
Character Recognition (OCR) system
• Each text segment consists
of its position and the text
within it
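
A text segment as described above could be represented along these lines. This is only a minimal sketch: the `TextSegment` name, the (x, y, width, height) box convention and the sample values are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSegment:
    """One OCR output unit: a bounding box plus the text inside it."""
    x: float       # left edge of the box (hypothetical convention)
    y: float       # top edge of the box (hypothetical convention)
    width: float
    height: float
    text: str

    @property
    def center(self):
        """Center point of the bounding box."""
        return (self.x + self.width / 2, self.y + self.height / 2)


# Two made-up segments, as an OCR system might return them.
segments = [
    TextSegment(x=10, y=20, width=80, height=10, text="Invoice No."),
    TextSegment(x=100, y=20, width=60, height=10, text="12345678"),
]
```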
Node Embedding
• Nodes represent text
segments
• Embedded using a single layer
of BiLSTM
Feature Extraction
Edges represent
visual dependencies between
two nodes (relative shapes
and distance)
Horizontal and vertical distance between the 2 text boxes
Aspect ratio of width and height of the 2 text boxes
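
The edge features listed above (distances and shape ratios) could be computed roughly as follows. This is a sketch under assumptions: the `Box` type, the five-value layout and the choice of measuring distances between top-left corners are illustrative and may differ from the paper's exact definition.

```python
from collections import namedtuple

# Hypothetical box type: top-left corner (x, y), width w, height h.
Box = namedtuple("Box", "x y w h")


def edge_features(bi, bj):
    """Visual features of the edge from box bi to box bj.

    Assumed layout: [dx, dy, w_i/h_i, h_j/h_i, w_j/h_i], i.e. horizontal and
    vertical distance, the aspect ratio of bi, and the height and width of bj
    relative to the height of bi.
    """
    dx = bj.x - bi.x
    dy = bj.y - bi.y
    return [dx, dy, bi.w / bi.h, bj.h / bi.h, bj.w / bi.h]


key = Box(x=10, y=20, w=80, h=10)
value = Box(x=100, y=20, w=60, h=10)
features = edge_features(key, value)
```

Dividing by the height of the first box keeps the shape features independent of the scan resolution, which is one reason to prefer ratios over raw pixel sizes.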
Graph Convolution
• Convolution is defined on the node-edge-node triplets (ti, rij, tj) instead of on the node
alone
• For node ti, the features hij for each neighbor tj are extracted using a multi-layer perceptron
(MLP) network.
Node-Edge-Node triplet
• Combines visual features directly into
the neighbor representation
• The information of the current node is
copied across the neighbors
hij = g(ti, rij, tj) = MLP([ti || rij || tj]), where || is the concatenation operation
The neighbor features can
potentially learn where to
attend given the current
node
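
The triplet feature extraction can be sketched in plain Python as below. The layer sizes, the random initialisation, the ReLU activation and the helper names (`linear`, `make_layer`, `triplet_features`) are all assumptions for illustration; only the idea of feeding the concatenation [ti || rij || tj] through an MLP comes from the slides.

```python
import random


def linear(weights, bias, x):
    """Dense layer: weights is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def make_layer(n_in, n_out, rng):
    """Randomly initialised (weights, bias) pair; untrained, for shape checks only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def triplet_features(t_i, r_ij, t_j, layers):
    """h_ij = MLP([t_i || r_ij || t_j]) for one neighbor t_j of node t_i."""
    h = t_i + r_ij + t_j  # list concatenation plays the role of ||
    for weights, bias in layers:
        h = relu(linear(weights, bias, h))
    return h


rng = random.Random(0)
node_dim, edge_dim, hidden = 4, 5, 6
layers = [make_layer(2 * node_dim + edge_dim, hidden, rng),
          make_layer(hidden, hidden, rng)]

t_i = [0.1, 0.2, 0.3, 0.4]          # embedding of the current node
t_j = [0.5, 0.1, 0.0, 0.2]          # embedding of one neighbor
r_ij = [0.9, 0.0, 0.8, 0.1, 0.6]    # edge features between the two
h_ij = triplet_features(t_i, r_ij, t_j, layers)
```

Because t_i appears in every triplet of node i, each neighbor feature is computed with knowledge of the current node, which is what lets the later attention step depend on it.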
Focus on Graph Convolution Networks
• This model defines convolution
directly on the graph to model
the text segment graph of VRDs
• Edge embeddings are introduced
explicitly into the graph convolution
network to model the relationship
between nodes
Source: Zonghan Wu et al., 2019
Self-Attention Mechanism
• In this model, graph convolution
is defined based on the self-
attention mechanism.
• Compute the output hidden
representation of each node by
attending to its neighbors
• Outputs are fed as inputs to the
next layer of graph convolution.
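
The attention step can be illustrated with the sketch below, which aggregates the neighbor features hij of one node into its new embedding. It assumes a single shared attention vector `w_a`, a LeakyReLU on the scores and a ReLU on the output, in the style of graph attention networks; these choices and all names are assumptions, not a statement of the paper's exact formulation.

```python
import math


def leaky_relu(v, slope=0.2):
    return v if v > 0 else slope * v


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(neighbor_features, w_a):
    """Return the new node embedding and the attention weights.

    Scores are LeakyReLU(w_a . h_ij), normalised with a softmax over the
    neighbors; the output is a ReLU of the attention-weighted sum of h_ij.
    """
    scores = [leaky_relu(sum(w * h for w, h in zip(w_a, h_ij)))
              for h_ij in neighbor_features]
    alphas = softmax(scores)
    dim = len(neighbor_features[0])
    mixed = [sum(a * h_ij[k] for a, h_ij in zip(alphas, neighbor_features))
             for k in range(dim)]
    return [max(0.0, v) for v in mixed], alphas


# Three neighbor features for one node (made-up values).
h_i = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_a = [1.0, 1.0]
new_t_i, alphas = attend(h_i, w_a)
```

Stacking layers then amounts to using `new_t_i` for every node as the input node embeddings of the next graph convolution.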
BiLSTM-CRF with Graph Embeddings
xi : Input token sequence of text segment
e(xi) : Word2Vec vectors as token embeddings
t'i: Graph embedding of the node
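
One simple way to combine the two inputs named above is to append the segment's graph embedding to every token embedding before the BiLSTM-CRF. The sketch below assumes that the combination is a plain concatenation; the function name and the toy vectors are hypothetical.

```python
def bilstm_crf_inputs(token_embeddings, graph_embedding):
    """Concatenate the graph embedding t'_i to each token embedding e(x_i)."""
    return [list(e) + list(graph_embedding) for e in token_embeddings]


# Toy values: three tokens with 2-dimensional embeddings,
# one 3-dimensional graph embedding for the whole segment.
token_embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
graph_embedding = [0.9, 0.8, 0.7]
inputs = bilstm_crf_inputs(token_embeddings, graph_embedding)
```

Every token of a segment receives the same graph embedding, so the sequence tagger sees the layout context of the segment at each time step.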
Training
• Custom annotation system to facilitate the labelling of ground truth
data
• Labelling of the values for each pre-defined entity and their locations
(bounding boxes)
• IOB tagging format; the label O is assigned to all tokens in empty text segments
• Graph convolution layers and BiLSTM-CRF extractors are trained
together
• Multi-task learning approach to improve prediction accuracy
(segment classification task)
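
The IOB labelling mentioned above can be sketched as follows. The function name, the entity type `PRICE` and the reading of "empty" as "no annotated entity in the segment" are assumptions for illustration.

```python
def iob_tags(tokens, entity_span=None, entity_type=None):
    """Tag the tokens of one text segment in IOB format.

    entity_span is a (start, end) token range with end exclusive, or None
    when the segment holds no entity, in which case every token gets O.
    """
    tags = ["O"] * len(tokens)
    if entity_span is not None:
        start, end = entity_span
        tags[start] = f"B-{entity_type}"      # first token of the entity
        for k in range(start + 1, end):
            tags[k] = f"I-{entity_type}"      # remaining tokens of the entity
    return tags


price_tags = iob_tags(["Total", ":", "12", ".", "50"], (2, 5), "PRICE")
empty_tags = iob_tags(["Thank", "you"])
```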
Results
• Performs about as well as the baselines
on "simple" entities (invoice
number and date)
• Clearly outperforms them on entities
which cannot be represented by
text alone (price, tax, buyer,
seller)
Value-Added Invoices (Chinese)
International Purchase Receipts (English)
Thank you for listening!
All figures are taken from the main article discussed during this talk,
unless otherwise mentioned
Sources
• Main article: https://arxiv.org/pdf/1903.11279v1.pdf
• Information Extraction: https://arxiv.org/pdf/1708.03743.pdf
• Graph Convolutional Network: https://tkipf.github.io/graph-convolutional-networks/
• Information Extraction from graphs: https://arxiv.org/pdf/1810.13083.pdf
• Graph Convolution Survey: https://arxiv.org/pdf/1901.00596.pdf
• Node classification by GCN: https://www.experoinc.com/post/node-classification-by-graph-convolutional-network
• Graph Embedding: https://www-cs.stanford.edu/people/jure/pubs/graphrepresentation-ieee17.pdf
• Similar approach: https://clgiles.ist.psu.edu/pubs/CVPR2017-connets.pdf
• Neural Architectures for Named Entity Recognition: https://arxiv.org/pdf/1603.01360.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 

"Graph Convolution for Multimodal Information Extraction from Visually Rich Documents" presented by Chloé Laurent (MLM Conseil)

  • 1. Graph Convolution for Multimodal Information Extraction from Visually Rich Documents Xiaojing Liu, Feiyu Gao, Qiong Zhang, Huasha Zhao Alibaba Group Presented by Chloé Laurent WiMLDS Paper Study Session - April 16th 2020
  • 2. Abstract How to extract pre-defined entities from VRDs? • Visually Rich Documents (VRDs): purchase receipts, insurance policy documents, customs declaration forms... • Visual and layout information is essential for document understanding: text serialized into a classic one-dimensional sequence is not enough. • Introduction of a graph convolution based model to combine textual and visual information for information extraction.
  • 3. Challenges of IE from VRDs • How to effectively incorporate visual cues from the document? • What about the scalability of the task?
  • 4. Contributions of this paper • Computes graph embeddings for each text segment with graph convolutions • Graph embeddings are combined with text embeddings to feed into a standard BiLSTM-CRF for Information Extraction • This method (March 2019) outperforms BiLSTM-CRF baselines on two real-world datasets. Differ from baseline
  • 5. Information Extraction • Process of extracting structured information from unstructured documents • Recent progress in this area has essentially been on plain-text documents Shaolei Wang, Yue Zhang, Wanxiang Che, and Ting Liu. 2018. Joint extraction of entities and relations based on a novel graph scheme. In IJCAI, pages 4461–4467.
  • 6. Document Modeling • Generated by an Optical Character Recognition (OCR) system • Each text segment comprises its position and the text within it
  • 7. Nodes Embedding • Nodes represent text segments • Embedded using a single layer of BiLSTM
  • 8. Feature Extraction • Edges represent visual dependencies between two nodes (relative shapes and distance) • Horizontal and vertical distance between the two text boxes • Aspect ratio of width and height of the two text boxes
  • 9. Graph Convolution • Convolution is defined on the node-edge-node triplets (t_i, r_ij, t_j) instead of on the node alone • For node t_i, the features h_ij for each neighbor t_j are extracted using a multi-layer perceptron (MLP) network.
  • 10. Node-Edge-Node triplet • Combines visual features directly into the neighbor representation • The information of the current node is copied across the neighbors • h_ij = MLP([t_i || r_ij || t_j]), where || is the concatenation operation • The neighbor features can potentially learn where to attend given the current node
  • 11. Focus on Graph Convolution Networks • This model applies convolution directly on the graph to model the text segment graph of VRDs • Explicit edge embedding into the graph convolution network, which models the relationship between nodes Source: Zonghan Wu et al., 2019
  • 12. Self-Attention Mechanism • In this model, graph convolution is defined based on the self-attention mechanism. • The output hidden representation of each node is computed by attending to its neighbors. • Outputs are fed as inputs to the next layer of graph convolution.
  • 13. BiLSTM-CRF with Graph Embeddings • x_i: input token sequence of the text segment • e(x_i): Word2Vec vectors as token embeddings • t'_i: graph embedding of the node
  • 14. Training • Custom annotation system to facilitate the labelling of ground truth data • Labelling of the values for each pre-defined entity and their locations (bounding boxes) • IOB tagging format; the label O is assigned to all tokens in empty text segments • Graph convolution layers and BiLSTM-CRF extractors are trained together • Multi-task learning approach to improve prediction accuracy (segment classification task)
  • 15. Results • Performs about the same as the baselines on "simple" entities (invoice number and date) • Clearly outperforms the baselines on entities that cannot be represented by text alone (price, tax, buyer, seller) • Datasets: Value-Added Invoices (Chinese) and International Purchase Receipts (English)
  • 16. Thank you for listening! All figures are taken from the main article discussed during this talk, unless mentioned otherwise.
  • 17. Sources • Main article: https://arxiv.org/pdf/1903.11279v1.pdf • Information Extraction: https://arxiv.org/pdf/1708.03743.pdf • Graph Convolutional Network: https://tkipf.github.io/graph-convolutional-networks/ • Information Extraction from graphs: https://arxiv.org/pdf/1810.13083.pdf • Graph Convolution Survey: https://arxiv.org/pdf/1901.00596.pdf • Node classification by GCN: https://www.experoinc.com/post/node-classification-by-graph-convolutional-network • Graph Embedding: https://www-cs.stanford.edu/people/jure/pubs/graphrepresentation-ieee17.pdf • Similar approach: https://clgiles.ist.psu.edu/pubs/CVPR2017-connets.pdf • Neural Architectures for Named Entity Recognition: https://arxiv.org/pdf/1603.01360.pdf
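
The edge features (slide 8), the node-edge-node triplets (slides 9 and 10) and the attention-weighted aggregation over neighbors (slide 12) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the exact five-element edge vector, the fully connected neighborhood, the one-layer ReLU MLP, the LeakyReLU attention score, the random weights and toy boxes, and all function names are assumptions made for the example.

```python
import numpy as np

def edge_features(box_i, box_j):
    """Edge vector r_ij between two text boxes given as (x, y, w, h).

    Assumed layout: horizontal distance, vertical distance, then
    width/height ratios normalised by the height of box i.
    """
    xi, yi, wi, hi = box_i
    xj, yj, wj, hj = box_j
    return np.array([xj - xi, yj - yi, wi / hi, hj / hi, wj / hi])

def graph_conv_layer(T, boxes, W, b, w_a):
    """One graph convolution layer over node embeddings T of shape (n, d).

    W has shape (2*d + 5, d_out), b has shape (d_out,), w_a has shape (d_out,).
    Every node attends to every other node (assumed fully connected graph).
    """
    n = T.shape[0]
    out = np.zeros((n, W.shape[1]))
    for i in range(n):
        neighbors = [j for j in range(n) if j != i]
        # Triplet [t_i || r_ij || t_j]: the current node is copied into each row.
        triplets = np.stack([
            np.concatenate([T[i], edge_features(boxes[i], boxes[j]), T[j]])
            for j in neighbors
        ])
        H = np.maximum(triplets @ W + b, 0.0)        # h_ij from a one-layer MLP
        scores = H @ w_a                             # one attention score per neighbor
        scores = np.where(scores > 0, scores, 0.2 * scores)  # LeakyReLU
        alpha = np.exp(scores - scores.max())        # numerically stable softmax
        alpha /= alpha.sum()
        out[i] = np.maximum(alpha @ H, 0.0)          # weighted sum, then ReLU
    return out

# Toy example: 4 text segments with 8-dimensional embeddings (random stand-ins
# for the BiLSTM segment embeddings of slide 7).
rng = np.random.default_rng(0)
n, d = 4, 8
T = rng.normal(size=(n, d))
boxes = [(10, 10, 80, 20), (120, 10, 60, 20), (10, 50, 40, 20), (120, 50, 90, 25)]
W = rng.normal(size=(2 * d + 5, d)) * 0.1
b = np.zeros(d)
w_a = rng.normal(size=d)

T_next = graph_conv_layer(T, boxes, W, b, w_a)
print(T_next.shape)  # (4, 8)
```

Because the output has the same dimension as the input here, `T_next` could be passed to a second call of `graph_conv_layer` with its own weights, matching the stacking described on slide 12. In practice the raw pixel distances would likely need normalising before being mixed with the embeddings.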