METHOD FOR AN AUTOMATIC GENERATION OF A SEMANTIC-LEVEL CONTEXTUAL TRANSLATIONAL DICTIONARY

Dmitry Kan
Faculty of Applied Mathematics and Control Processes,
Department of Technology of Programming,
St. Petersburg State University, Peterhof, Russia
dmitry.kan@gmail.com


Abstract

In this paper we present a semantic feature machine translation (MT) system that combines two fundamental approaches: the rule-based side is supported by the functional model of the Russian language, and the statistical side uses statistical word alignment. The MT system relies on a semantic-level contextual translational dictionary as its key component. We present a method for the automatic generation of this dictionary, in which disambiguation is done on the semantic level.

Computer semantics theory

Thesis 1. Language is an algebraic system {f_1, ..., f_n, M}, where f_i is a basis function and M is the data structure (the set of basis concepts) of a natural language L.

Thesis 2. Each word in a sentence S is the name of its semantic function:

\[
S = F\bigl(f_1(w_{11}, \ldots, w_{1k}), \ldots, f_n(w_{n1}, \ldots, w_{nl})\bigr),
\qquad w_{ij} \neq w_{hm}, \quad i \neq h, \; j \neq m
\]

Thesis 3. Grammar is linked with semantics and can be incorporated into the semantic dictionary.

Semantic Machine Translation Model

\[
\mathrm{SMTM}\ P = \arg\max_{i=1,n} \varphi_i^{S}(t_1, \ldots, t_m)
= \arg\max_{i=1,n} \sum_{\substack{k=1,m-1 \\ l=2,m}} \varphi_i^{s}(t_k, t_l)
\]

where

\[
\varphi_i^{s}(t_k, t_l) =
\begin{cases}
1, & t_k t_l \in LM_2 \\
0, & t_k t_l \notin LM_2
\end{cases}
\]

Word alignment

English sentence:
Desperate to hold onto power , Pervez Musharraf has discarded Pakistan ' s constitutional framework and declared a state of emergency .

Russian sentence, each word followed by the positions of the English tokens aligned to it:
NULL ({20}) В ({}) отчаянном ({1 3 4}) стремлении ({2}) удержать ({}) власть ({5}) , ({6}) Первез ({7}) Мушарраф ({8}) отверг ({9 10}) конституционную ({14 15}) систему ({}) Пакистана ({11 12 13}) и ({16}) объявил ({17}) о ({18}) введении ({}) чрезвычайного ({19 21}) положения ({}) . ({22})

Table 1: Word alignment for English and Russian sentences

Russian            English
NULL               of
отчаянном          Desperate hold onto
стремлении         to
власть             power
,                  ,
Первез             Pervez
Мушарраф           Musharraf
отверг             has discarded
конституционную    constitutional framework
Пакистана          Pakistan ' s
и                  and
объявил            declared
о                  a
чрезвычайного      state emergency
.                  .

Translational dictionary

Parallel corpus: UMC 0.1
86,000 sentence pairs
1.3 million phrase pairs
~18,000 resulting dictionary entries

В Y1>HabU(Y1:,ПРЕД:Z1) <149>--->Within
В Y1>Loc(Y1:,ВНУТРИ$12/313/05(ПРЕД:Z1)) <146>--->at
В Y1>Loc(Y1:,Oper01(#,ПРЕД:Z1)) <208>--->In
В Y1>Loc(Y1:,ПРЕД:Z1) <224>--->Throughout
...
НА Y1>Direkt(Y1:,ВЕРХ$12/141/05(ВИН:Z1)) <67>--->at
НА Y1>Direkt(Y1:,РОД:Z1) <100>--->on
НА Y1>Direkt(Y1:,РОД:Z1) <69>--->for
...
ОБРАЗ (РОД:Z1) <2>--->a way
ОБЩЕМИРОВОЙ A1>Rel(A1:НЕЧТО$1,ПОЛНЫЙ$12/207/05(МИР$1227)) <1>--->global
...
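The dictionary entries listed under Translational dictionary share a regular shape: a Russian headword, a semantic formula, a number in angle brackets, and an English translation after the arrow. The following Python sketch reads such lines into records and groups them by headword. The field names (`headword`, `formula`, `number`, `translation`), the regular expression, and the helper `parse_entry` are assumptions made for illustration only. The poster does not state what the bracketed number means, so the sketch merely stores it.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    headword: str      # Russian word being translated
    formula: str       # semantic formula that encodes the context
    number: int        # bracketed number; its meaning is not given on the poster
    translation: str   # English translation for this context


# Assumed line shape: HEADWORD FORMULA <NUMBER>--->TRANSLATION
ENTRY_RE = re.compile(r"^(\S+)\s+(.*?)\s*<(\d+)>\s*-+>\s*(.*)$")


def parse_entry(line: str) -> Entry:
    """Split one dictionary line into its four fields."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised entry: {line!r}")
    headword, formula, number, translation = match.groups()
    return Entry(headword, formula, int(number), translation.strip())


# Sample lines copied from the poster.
SAMPLE = [
    "В Y1>HabU(Y1:,ПРЕД:Z1) <149>--->Within",
    "В Y1>Loc(Y1:,Oper01(#,ПРЕД:Z1)) <208>--->In",
    "В Y1>Loc(Y1:,ПРЕД:Z1) <224>--->Throughout",
    "НА Y1>Direkt(Y1:,ВЕРХ$12/141/05(ВИН:Z1)) <67>--->at",
    "НА Y1>Direkt(Y1:,РОД:Z1) <100>--->on",
    "НА Y1>Direkt(Y1:,РОД:Z1) <69>--->for",
    "ОБРАЗ (РОД:Z1) <2>--->a way",
]

# Group the parsed entries by headword, keeping the order of the input.
by_headword = defaultdict(list)
for entry in map(parse_entry, SAMPLE):
    by_headword[entry.headword].append(entry)

for headword, entries in by_headword.items():
    print(headword, [(e.formula, e.translation) for e in entries])
```

Grouping by headword makes the purpose of the semantic formula visible: the same preposition (В or НА) maps to different English words depending on the formula attached to it.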


Features of Machine Translation System

• dictionary entries contain semantic attributes of the Russian words;
• each entry represents a sample of a context extracted using statistical word alignment and coded with the corresponding semantic formula;
• the MT system is automatically extendable through acquiring new parallel corpora and applying the method of word alignment with semantic analysis of sentences on the source language side.
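The extraction of context samples from word alignment can be illustrated with the sentence pair from the Word alignment panel. The alignment string there uses a `word ({positions})` notation that resembles the output of statistical aligners such as GIZA++. The Python sketch below decodes it into Russian-to-English pairs, assuming the positions are 1-based indices into the whitespace-tokenized English sentence. The function name `aligned_pairs` and the regular expression are illustrative assumptions, not part of the described system.

```python
import re

ENGLISH = (
    "Desperate to hold onto power , Pervez Musharraf has discarded "
    "Pakistan ' s constitutional framework and declared a state of emergency ."
)

ALIGNMENT = (
    "NULL ({20}) В ({}) отчаянном ({1 3 4}) стремлении ({2}) удержать ({}) "
    "власть ({5}) , ({6}) Первез ({7}) Мушарраф ({8}) отверг ({9 10}) "
    "конституционную ({14 15}) систему ({}) Пакистана ({11 12 13}) и ({16}) "
    "объявил ({17}) о ({18}) введении ({}) чрезвычайного ({19 21}) "
    "положения ({}) . ({22})"
)

# One Russian token followed by its (possibly empty) list of English positions.
TOKEN_RE = re.compile(r"(\S+) \(\{([\d ]*)\}\)")


def aligned_pairs(english: str, alignment: str) -> list[tuple[str, str]]:
    """Return (russian_word, english_phrase) pairs, skipping unaligned words."""
    words = english.split()
    pairs = []
    for russian, index_text in TOKEN_RE.findall(alignment):
        positions = [int(i) for i in index_text.split()]
        if positions:  # an empty set means the Russian word has no English match
            phrase = " ".join(words[p - 1] for p in positions)  # 1-based positions
            pairs.append((russian, phrase))
    return pairs


for russian, phrase in aligned_pairs(ENGLISH, ALIGNMENT):
    print(f"{russian}\t{phrase}")
```

Russian words with an empty position set (В, удержать, систему, введении, положения) produce no pair, and the special NULL token collects English words that have no Russian counterpart.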

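One possible reading of the Semantic Machine Translation Model is a bigram-based selection: among n candidate token sequences, pick the one with the largest number of token pairs found in a bigram language model LM_2. The Python sketch below follows that reading under explicit assumptions: pairs are taken to be adjacent (l = k + 1), LM_2 is represented as a plain set of token pairs, and the names `bigram_score` and `best_candidate` as well as the toy data are hypothetical.

```python
from typing import Sequence, Set, Tuple

Bigram = Tuple[str, str]


def bigram_score(tokens: Sequence[str], lm2: Set[Bigram]) -> int:
    """Count adjacent pairs (t_k, t_{k+1}) that occur in the bigram set lm2."""
    return sum(1 for pair in zip(tokens, tokens[1:]) if pair in lm2)


def best_candidate(candidates: Sequence[Sequence[str]], lm2: Set[Bigram]) -> Sequence[str]:
    """Return the highest-scoring candidate; on a tie the earliest one wins."""
    return max(candidates, key=lambda tokens: bigram_score(tokens, lm2))


# Toy bigram model and candidates, invented for this illustration.
LM2 = {
    ("declared", "a"),
    ("a", "state"),
    ("state", "of"),
    ("of", "emergency"),
}
CANDIDATES = [
    ["declared", "a", "state", "emergency"],
    ["declared", "a", "state", "of", "emergency"],
]

for tokens in CANDIDATES:
    print(bigram_score(tokens, LM2), " ".join(tokens))
print("selected:", " ".join(best_candidate(CANDIDATES, LM2)))
```

A raw count favours longer candidates, so a practical system would likely normalise the score or combine it with other evidence; the sketch keeps the indicator sum unchanged to stay close to the formula.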
Poster: Method for an automatic generation of a semantic-level contextual translational dictionary
