A Graph-based Cross-lingual Projection Approach for Spoken Language Understanding Portability to a New Language


Seokhwan Kim
Human Language Technology Department, Institute for Infocomm Research, Singapore

Introduction

- Statistical approaches to spoken language understanding (SLU) require a sufficient number of training examples to obtain good results.
- Cross-lingual SLU using statistical machine translation (SMT) can improve the portability of SLU to a new language.
- Previous work on cross-lingual SLU has focused on filtering out or correcting noisy translations as a post-processing step.
- We propose a graph-based projection approach that improves robustness to translation errors in cross-lingual SLU.

Cross-lingual SLU Using SMT

[Figure: TrainOnTarget pipeline. Training: Dataset in L_s -> SMT from L_s to L_t -> Translated Dataset in L_t -> SLU in L_t. Test: User Input in L_t -> SLU in L_t.]

Annotations for a given word sequence $x = \{x_1, \dots, x_n\}$:

- NE: a named entity (NE) tag sequence $y = \{y_1, \dots, y_n\}$
- DA: a dialog act (DA) class variable $z$

Example:

  x_s  Show me flights to New York on Nov 18th
  y_s  o o o o to.city-b to.city-i o month-b day-b
  z_s  show_flight
  x_t  [Korean translation of x_s; the characters are garbled in the source]
  y_t  to.city-b, month-b, day-b, and o for the remaining six tokens (token order not recoverable)
  z_t  show_flight

Direct Projection

- The simplest way of projection.
- It propagates the annotations using only the word alignments themselves.
- It considers only the translation of each single utterance.
- It is performed as a single-pass process.
- The results of direct projection can be unreliable because of erroneous translations and word alignments.

Graph-based Projection

Graph Construction for NE

- Nodes: all trigrams in the dataset.
- Edges:
  - Monolingual: $w(v_i, v_j) = \mathrm{sim}_{\mathrm{cosine}}(f(v_i), f(v_j)) = \frac{f(v_i) \cdot f(v_j)}{|f(v_i)|\,|f(v_j)|}$
  - Bilingual: $w(v_s^k, v_t^l) = \frac{\mathrm{count}(v_s^k, v_t^l)}{\sum_{v_t^m} \mathrm{count}(v_s^k, v_t^m)}$
- Initial values: based on the manual annotations of NE in L_s.

[Figure: NE graph with source-language trigram nodes v_s and target-language trigram nodes v_t.]

Graph Construction for DA

- Nodes: utterance nodes $U = \{u_1, \dots, u_m\}$ and trigram nodes $V$.
- Edges: the edge between $u_i$ and $v_j$ has a binary weight indicating whether $v_j$ occurs in $u_i$.
- Initial values: based on the manual annotations of DA in L_s.

[Figure: DA graph with utterance nodes u_s, u_t and trigram nodes v_s, v_t.]

Label Propagation

- A graph-based semi-supervised learning algorithm.
- It induces labels for all of the unlabeled nodes on the graph.

Experimental Settings

- Data: 3,351 pairs of bi-utterances in English and Korean, manually annotated with 30 DA classes and 30 NE classes.
- Toolkits: Moses and SRILM for SMT; the Junto toolkit for graph-based projection; Maximum Entropy for DA identification; Conditional Random Fields for NE recognition.
- Measures: 5-fold cross-validation against the manual annotations in L_t; precision/recall/F-measure for NE recognition; accuracy for DA identification.

Experimental Results

NE recognition (P = precision, R = recall, F = F-measure):

                Korean→English        English→Korean
                P      R      F       P      R      F
  Supervised    97.6   95.4   96.4    97.1   96.9   97.0
  TestOnSource  45.2   16.4   24.0    63.8   19.9   30.3
  Direct        43.1   11.9   18.7    50.9   14.8   23.0
  Graph-based   50.7   39.8   44.6    67.2   43.4   52.7

DA identification, accuracy (%):

                Korean→English   English→Korean
  Supervised    87.7             83.3
  TestOnSource  58.9             70.2
  Direct        56.5             69.6
  Graph-based   63.5             74.3

Conclusion

- This paper presented a graph-based projection approach for cross-lingual SLU using SMT.
- Our approach runs a label propagation algorithm on a graph defined over the translations of the whole dataset.
- The feasibility of the approach was demonstrated with English and Korean SLU models.
- Experimental results show that graph-based projection improved cross-lingual SLU performance over the TestOnSource and direct projection baselines.

Contact

1 Fusionopolis Way, #21-01 Connexis (South Tower), Singapore 138632
Email: kims@i2r.a-star.edu.sg
WWW: http://hlt.i2r.a-star.edu.sg/
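The edge weights and the label propagation step described in the poster can be illustrated with a small Python sketch. It is not the authors' implementation and does not use the Junto toolkit: the function names (`cosine`, `bilingual_weights`, `propagate`), the node identifiers (`s:...`, `t:T1`), the feature vectors, and the toy alignments are all hypothetical, and the propagation loop is a plain iterative averaging scheme with clamped seeds, chosen only because it is easy to follow. Edges are treated as undirected, which is a simplification.

```python
from collections import Counter, defaultdict
from math import sqrt


def cosine(f_i, f_j):
    """Monolingual edge weight: cosine similarity of two sparse feature dicts."""
    dot = sum(v * f_j.get(k, 0.0) for k, v in f_i.items())
    norm_i = sqrt(sum(v * v for v in f_i.values()))
    norm_j = sqrt(sum(v * v for v in f_j.values()))
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0
    return dot / (norm_i * norm_j)


def bilingual_weights(aligned_pairs):
    """Bilingual edge weight: count(s, t) divided by the total count of s."""
    counts = Counter(aligned_pairs)
    totals = Counter()
    for (s, _t), c in counts.items():
        totals[s] += c
    return {(s, t): c / totals[s] for (s, t), c in counts.items()}


def propagate(edges, seeds, labels, iterations=20):
    """Simplified label propagation (an assumption, not Junto's algorithm).

    Seed nodes keep their label distribution; every other node repeatedly
    takes the weighted, normalised average of its neighbours' distributions.
    """
    neighbours = defaultdict(list)
    for (a, b), w in edges.items():
        neighbours[a].append((b, w))
        neighbours[b].append((a, w))
    nodes = set(neighbours) | set(seeds)
    zero = {label: 0.0 for label in labels}
    dist = {n: dict(seeds.get(n, zero)) for n in nodes}
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            if n in seeds:
                updated[n] = dict(seeds[n])  # clamp manual annotations
                continue
            scores = dict(zero)
            for m, w in neighbours[n]:
                for label in labels:
                    scores[label] += w * dist[m].get(label, 0.0)
            total = sum(scores.values())
            updated[n] = (
                {label: s / total for label, s in scores.items()}
                if total > 0.0 else dict(zero)
            )
        dist = updated
    return dist


# Toy data (hypothetical): aligned source/target trigram pairs.
aligned = [
    ("s:to New York", "t:T1"),
    ("s:to New York", "t:T1"),
    ("s:to New York", "t:T2"),
    ("s:on Nov 18th", "t:T3"),
]
edges = bilingual_weights(aligned)
# One monolingual edge between two target trigrams with made-up features.
edges[("t:T2", "t:T4")] = cosine({"a": 1.0}, {"a": 1.0, "b": 1.0})

labels = ["to.city-b", "month-b", "o"]
seeds = {
    "s:to New York": {"to.city-b": 1.0, "month-b": 0.0, "o": 0.0},
    "s:on Nov 18th": {"to.city-b": 0.0, "month-b": 1.0, "o": 0.0},
}
result = propagate(edges, seeds, labels)
predicted = {n: max(d, key=d.get) for n, d in result.items() if n.startswith("t:")}
```

In this toy graph, `t:T4` has no bilingual edge at all and receives its label only through its monolingual neighbour `t:T2`. That is the behaviour direct projection cannot provide, since it looks at one utterance's alignments at a time.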


