Dodging AI Biases in Future-Proof Machine Translation Solutions
Konstantin Savenkov, Intento CEO
© Intento, Inc. / October 2020
GlobalSaké
AGENDA
Some context on Intento
—
Future of Work, AI and company culture
—
Using MT for multilingual communication
—
Case Study 1: Gender Bias
—
Case Study 2: Tone of Voice
—
Case Study 3: Data Locality
—
Key Takeaways
SOME CONTEXT ON INTENTO
ENTERPRISES MASSIVELY FAIL TO ADOPT AI
20%*
Wrong vendor selected
Failed integrations
Failed pilots
Failed to deliver ROI
* Share of US companies with successful AI deployment (Deloitte State of Cognitive Survey 2017)
BRIDGING THE GAP BETWEEN MT CAPABILITIES AND ADOPTION
MT Procurement
—
MT Curation
—
Multi-Engine MT
—
Multi-Purpose MT
—
Continuous Improvement
MT Need MT Systems
Localization
Customer Service
Office Productivity
Global Community
FUTURE OF WORK, AI, AND COMPANY CULTURE
FUTURE OF WORK AND AI
FROM TOOLS TO COLLEAGUES
AI is more than just a tool
—
experience (pre-training)
—
test assignments (evaluations on your data)
—
onboarding (domain adaptation)
—
continuous learning
RETHINKING TEAMS
A team as a cooperation of cognitive models, both human and artificial
AI AND COMPANY CULTURE
Can we afford to have only a part of the company aligned with its culture and values?
MT FOR MULTILINGUAL COMMUNICATION
MT ADOPTION BOTTLENECKS
technical (integration)
—
linguistic (domain adaptation)
—
economic (supply chain)
—
cultural (biases)
—
security & legal (privacy)
MT FOR COMMUNICATION
[Diagram: Intento MT Hub]
GETTING COMMUNICATION RIGHT
pre-moderation is not feasible
—
right communication = right culture
—
adding MT to the mix
THINGS TO LOOK AFTER
Gender
—
Tone of Voice
—
Privacy
WORKING AROUND THE GENDER BIAS
GENDER BIAS IN MACHINE TRANSLATION
Gender bias in MT as evaluated in the WinoMT challenge [1]
—
the benchmark carefully measures the bias
—
in practice, other cases create more issues (see next slide)
[1] “Evaluating Gender Bias in Machine Translation”, Gabriel Stanovsky, Noah A. Smith, Luke Zettlemoyer, 2019, https://arxiv.org/abs/1906.00591
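The kind of measurement WinoMT reports can be sketched roughly as follows. This is a minimal illustration, not the benchmark's own code: it assumes each test sentence has a gold gender label and a gender label extracted from the translation, and it computes overall accuracy plus the gap between masculine and feminine F1 scores. The label names and the toy data are made up for the example.

```python
from typing import List, Tuple


def f1_for(label: str, gold: List[str], pred: List[str]) -> float:
    """F1 score for one gender label."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def gender_scores(gold: List[str], pred: List[str]) -> Tuple[float, float]:
    """Return (accuracy, masculine F1 minus feminine F1)."""
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    f1_gap = f1_for("male", gold, pred) - f1_for("female", gold, pred)
    return accuracy, f1_gap


# Toy labels (hypothetical): one feminine entity was rendered as masculine.
gold = ["male", "female", "female", "male"]
pred = ["male", "male", "female", "male"]
accuracy, f1_gap = gender_scores(gold, pred)
```

A positive gap means the system handles masculine entities better than feminine ones on this data.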
GENDER BIAS IN COMMUNICATION
Source text (English)   Machine Translation (French)   Comment
Are you ready?          Es-tu prêt?                    MASCULINE
Are you ready?          Es-tu prête?                   FEMININE
Are you surprised?      Tu es surpris?                 MASCULINE
Are you surprised?      Tu es surprise?                FEMININE
Lack of context
—
Defaults to either feminine or masculine
—
Baseline MT engines are not consistent
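One simple way to surface this inconsistency is to collect the outputs of several engines (or several runs) for the same source text and flag every source that received more than one distinct translation. The sketch below uses the four sample pairs from this slide; the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def inconsistent_sources(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each source text to its translations when more than one variant exists."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for source, translation in pairs:
        seen[source].add(translation)
    return {src: sorted(variants) for src, variants in seen.items() if len(variants) > 1}


# Sample pairs taken from the slide above.
samples = [
    ("Are you ready?", "Es-tu prêt?"),
    ("Are you ready?", "Es-tu prête?"),
    ("Are you surprised?", "Tu es surpris?"),
    ("Are you surprised?", "Tu es surprise?"),
]
report = inconsistent_sources(samples)
```

Here both source sentences are flagged, because each was translated once in the masculine and once in the feminine form.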
GENDER BIAS
MOSTLY MASCULINE BY DEFAULT
English to French
—
31 segments
—
stock models
—
mostly masculine
[Chart: default gender distribution, stock models A to G]
GENDER BIAS CONTROL
HOW TO FIX IT?
Option 1: Copy & paste from Google Translate Web App (supports gender control)
Not for French, not secure, cumbersome, no customization.
—
Option 2: Use long phrases, adding some context
You can instruct support operators, but not employees.
—
Option 3: MT-agnostic NLP
Works to a certain extent, provides a wider choice of MT engines.
GENDER CONTROL
ADJUST TO FEMININE
English to French
—
31 segments
—
stock models
—
let’s make it more FEMININE
[Chart: gender distribution after adjustment to feminine, stock models A to G]
GENDER CONTROL
ADJUST TO MASCULINE
English to French
—
31 segments
—
stock models
—
let’s make it more MASCULINE
[Chart: gender distribution after adjustment to masculine, stock models A to G]
CONTROLLING TONE OF VOICE
TONE OF VOICE CONTROL
SAMPLES FROM SUPPORT CHATS
Source text (English)                       Machine Translation (German)                                 Comment
Can you share your screen?                  Können Sie Ihren Bildschirm freigeben?                       FORMAL
Could you help me?                          Kannst du mir helfen?                                        INFORMAL
Make sure you report any of these issues.   Stellen Sie sicher, dass Sie eines dieser Probleme melden.   FORMAL
Can you give an example?                    Kannst du ein Beispiel geben?                                INFORMAL
Formal vs. informal
—
Crucial for live chats
—
Baseline MT engines are not consistent
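A rough way to measure this inconsistency is to label each German output as formal or informal by the pronouns it uses and then tally the labels. The sketch below is a crude heuristic written for illustration, not the method used for the charts in this deck: it treats du/dich/dir/dein as informal markers and a capitalized Sie/Ihnen/Ihr that is not the first word as formal markers. The function name and word lists are assumptions.

```python
import re
from collections import Counter

INFORMAL_PRONOUNS = {"du", "dich", "dir"}
FORMAL_PRONOUNS = {"Sie", "Ihnen"}


def classify_formality(german: str) -> str:
    """Label a German segment as formal, informal, mixed, or unknown."""
    tokens = re.findall(r"\w+", german)
    informal = any(
        t.lower() in INFORMAL_PRONOUNS or t.lower().startswith("dein") for t in tokens
    )
    # Skip the first token: a sentence-initial "Sie" may also mean "she" or "they".
    formal = any(t in FORMAL_PRONOUNS or t.startswith("Ihr") for t in tokens[1:])
    if formal and informal:
        return "mixed"
    if formal:
        return "formal"
    if informal:
        return "informal"
    return "unknown"


# German outputs from the sample table above.
outputs = [
    "Können Sie Ihren Bildschirm freigeben?",
    "Kannst du mir helfen?",
    "Stellen Sie sicher, dass Sie eines dieser Probleme melden.",
    "Kannst du ein Beispiel geben?",
]
tally = Counter(classify_formality(text) for text in outputs)
```

On these four samples the tally is split evenly between formal and informal. The heuristic only inspects the first word of the whole segment, so multi-sentence segments and imperatives without pronouns need more care.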
TONE OF VOICE CONTROL
DEFAULT MT OUTPUT
English to German
—
210 segments
—
stock models
[Chart: default tone of voice, stock models A to G]
TONE OF VOICE CONTROL
HOW TO MAKE IT INFORMAL?
Option 1: Use DeepL with formality=less (99.5% accuracy)
What if you need a custom model and terminology, or another MT has better linguistic quality for you?
—
Option 2: Generate synthetic training data, hoping translations become more informal
Expensive and time-consuming, also introduces bias into the model
—
Option 3: MT-agnostic NLP
Works to a certain extent, provides a wider choice of MT engines
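For Option 1, a request with the formality setting might look like the sketch below. It only builds the HTTP request and does not send it. The endpoint, the form-encoded body, and the `DeepL-Auth-Key` header reflect the public DeepL REST API as far as I know, so check the current DeepL documentation before relying on them; the helper name and the placeholder key are hypothetical.

```python
import urllib.parse
import urllib.request

# Assumed endpoint of the paid DeepL API; free plans use a different host.
DEEPL_URL = "https://api.deepl.com/v2/translate"


def build_deepl_request(
    text: str, auth_key: str, target_lang: str = "DE", formality: str = "less"
) -> urllib.request.Request:
    """Build (but do not send) a translation request asking for informal output."""
    body = urllib.parse.urlencode(
        {"text": text, "target_lang": target_lang, "formality": formality}
    ).encode("utf-8")
    return urllib.request.Request(
        DEEPL_URL,
        data=body,
        headers={"Authorization": f"DeepL-Auth-Key {auth_key}"},
        method="POST",
    )


# "YOUR_KEY" is a placeholder; urllib.request.urlopen(request) would send the call.
request = build_deepl_request("Could you help me?", auth_key="YOUR_KEY")
```

The limitation noted on the slide still applies: this setting is specific to one engine, so it does not help when another engine or a custom model is the better fit.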
TONE OF VOICE CONTROL
MT-AGNOSTIC ADJUSTMENT
English to German
—
210 segments
—
stock models
—
let’s make it more INFORMAL
[Chart: tone of voice after adjustment to informal, stock models A to G]
TONE OF VOICE CONTROL
MT-AGNOSTIC ADJUSTMENT
English to German
—
210 segments
—
stock models
—
let’s make it more FORMAL
[Chart: tone of voice after adjustment to formal, stock models A to G]
PRIVACY PROTECTION
DATA PROTECTION LAWS
[World map of data protection laws]
Source: DLA Piper, https://www.dlapiperdataprotection.com/index.html?t=world-map, as fetched on 2020-10-20
CLOUD MT DEPLOYMENTS
Alibaba
Amazon
Baidu
DeepL
Globalese*
Google
GTCom*
IBM
Microsoft
Mirai
ModernMT*
Naver
Niutrans*
PROMT
Rozetta
SDL*
Sogou
Systran*
Tencent
Tilde*
Yandex
Youdao
* On-premise and private cloud deployment available
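As a small illustration of how this list could feed a vendor shortlist, the sketch below keeps only the vendors marked with an asterisk, that is, those offering on-premise or private cloud deployment. The vendor names come from the slide (with "ModernMT" spelled out); the function name is hypothetical.

```python
from typing import List

# Vendor list from the slide; "*" marks on-premise / private cloud availability.
VENDORS = (
    "Alibaba Amazon Baidu DeepL Globalese* Google GTCom* IBM Microsoft Mirai "
    "ModernMT* Naver Niutrans* PROMT Rozetta SDL* Sogou Systran* Tencent Tilde* "
    "Yandex Youdao"
).split()


def private_deployment_options(vendors: List[str]) -> List[str]:
    """Return vendors flagged as deployable on-premise or in a private cloud."""
    return [name.rstrip("*") for name in vendors if name.endswith("*")]


shortlist = private_deployment_options(VENDORS)
```

The shortlist reflects the slide as of October 2020 and would need to be refreshed against current vendor offerings.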
DATA AND PRIVACY PROTECTION
Communication may contain PII and healthcare, HR, or financial data.
—
Option 1:
- select proper MT vendor for every region
- when in doubt, use private-cloud deployments
—
Option 2:
- proper DPA and data protection clauses + insurance
—
Option 3: pseudonymization to remove PII
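Option 3 can be sketched as a mask, translate, restore round trip. The example below is a minimal illustration that handles e-mail addresses only: each address is replaced by a placeholder before the text leaves your environment and is put back afterwards. The placeholder format, the function names, and the assumption that the MT engine passes placeholders through unchanged are all hypothetical; real PII detection needs far broader coverage.

```python
import re
from typing import Callable, Dict, Tuple

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def pseudonymize(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace e-mail addresses with numbered placeholders; return text and mapping."""
    mapping: Dict[str, str] = {}

    def mask(match: "re.Match[str]") -> str:
        token = f"__PII_{len(mapping)}__"
        mapping[token] = match.group(0)
        return token

    return EMAIL.sub(mask, text), mapping


def restore(text: str, mapping: Dict[str, str]) -> str:
    """Put the original values back in place of the placeholders."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


def translate_safely(text: str, translate: Callable[[str], str]) -> str:
    """Send only the masked text to the MT engine, then restore the PII locally."""
    masked, mapping = pseudonymize(text)
    return restore(translate(masked), mapping)


masked, mapping = pseudonymize("Contact ks@inten.to or hr@example.com today.")
```

Any callable that takes and returns a string can stand in for the MT engine, which keeps the masking logic independent of the vendor.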
KEY TAKEAWAYS
Machine translation is becoming ubiquitous, and it is more like a coworker than a tool.
—
When it’s biased, it may damage our work environment and culture.
—
As of today, its output is mostly masculine by default and quite inconsistent in tone of voice.
—
It’s possible to dodge those biases using NLP paired with MT.
—
Make sure you know where your MT sits so that you stay compliant.
THANKS!
Konstantin Savenkov, CEO
ks@inten.to
2150 Shattuck Ave
Berkeley CA 94705
INTENTO
https://inten.to
