Dodging AI biases in future-proof Machine Translation solutions
Konstantin Savenkov
© Intento, Inc. / October 2020
2. AGENDA
Some context on Intento
—
Future of Work, AI and company culture
—
Using MT for multilingual communication
—
Case Study 1: Gender Bias
—
Case Study 2: Tone of Voice
—
Case Study 3: Data Locality
—
Key Takeaways
4. ENTERPRISES MASSIVELY FAIL TO ADOPT AI
20%*
* Share of US companies with successful AI deployment (Deloitte State of Cognitive Survey 2017)
Wrong vendor selected
Failed integrations
Failed pilots
Failed to deliver ROI
5. BRIDGING THE GAP BETWEEN MT CAPABILITIES AND ADOPTION
MT Procurement
MT Need MT Systems
Localization
10. BRIDGING THE GAP BETWEEN MT CAPABILITIES AND ADOPTION
MT Procurement
—
MT Curation
—
Multi-Engine MT
—
Multi-Purpose MT
—
Continuous Improvement
MT Need MT Systems
Localization
Customer Service
Office Productivity
Global Community
11. FUTURE OF WORK, AI, AND COMPANY CULTURE
12. FUTURE OF WORK AND AI: FROM TOOLS TO COLLEAGUES
AI is more than just a tool
—
experience (pre-training)
—
test assignments (evaluations on your data)
—
onboarding (domain adaptation)
—
continuous learning
13. RETHINKING TEAMS
Team as a cooperation of cognitive models, both human and artificial
14. AI AND COMPANY CULTURE
Can we afford to have only a part of the company aligned with its culture and values?
16. MT ADOPTION BOTTLENECKS
technical (integration)
—
linguistic (domain adaptation)
—
economic (supply chain)
—
cultural (biases)
—
security & legal (privacy)
17. MT FOR COMMUNICATION
[Diagram: Intento MT Hub]
19. GETTING COMMUNICATION RIGHT
pre-moderation is not feasible
—
right communication = right culture
—
adding MT to the mix
20. THINGS TO LOOK AFTER
Gender
—
Tone of Voice
—
Privacy
22. GENDER BIAS IN MACHINE TRANSLATION
Gender Bias in MT as evaluated in WinoMT Challenge [1]
—
carefully measures the bias
—
in practice, other cases create more issues (see next slide)
[1] “Evaluating Gender Bias in Machine Translation”, Gabriel Stanovsky, Noah A. Smith, Luke Zettlemoyer, 2019, https://arxiv.org/abs/1906.00591
23. GENDER BIAS IN COMMUNICATION
Source text (English) | Machine Translation (French) | Comment
Are you ready? | Es-tu prêt? | MASCULINE
Are you ready? | Es-tu prête? | FEMININE
Are you surprised? | Tu es surpris? | MASCULINE
Are you surprised? | Tu es surprise? | FEMININE
Lack of context
—
Defaults to either feminine or masculine
—
Baseline MT engines are not consistent
24. GENDER BIAS: MOSTLY MASCULINE BY DEFAULT
English to French
—
31 segments
—
stock models
—
mostly masculine
[Chart: Default gender distribution, MT engines A to G]
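A distribution like the one charted here could be tallied along the following lines. This is a minimal sketch and not necessarily how the evaluation on this slide was done: it assumes each source segment comes with one masculine and one feminine reference translation, and the helper names `label_gender` and `gender_distribution` are hypothetical.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def label_gender(mt_output: str, masculine_ref: str, feminine_ref: str) -> str:
    """Label an MT output by the gendered reference it matches (assumed method)."""
    out = normalize(mt_output)
    if out == normalize(masculine_ref):
        return "masculine"
    if out == normalize(feminine_ref):
        return "feminine"
    return "other"


def gender_distribution(rows):
    """rows: iterable of (mt_output, masculine_ref, feminine_ref) tuples."""
    return Counter(label_gender(*row) for row in rows)


# Example rows reuse the French samples from the previous slide.
rows = [
    ("Es-tu prêt?", "Es-tu prêt?", "Es-tu prête?"),
    ("Tu es surprise?", "Tu es surpris?", "Tu es surprise?"),
    ("Tu es surpris?", "Tu es surpris?", "Tu es surprise?"),
]
distribution = gender_distribution(rows)
```

Running the same tally once per engine gives one bar per engine. Exact matching is deliberately strict; outputs that paraphrase the reference land in "other" and need manual review.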
25. GENDER BIAS CONTROL: HOW TO FIX IT?
Option 1: Copy & paste from Google Translate Web App (supports gender control)
- Not for French, not secure, cumbersome, no customization.
—
Option 2: Use long phrases, adding some context
- You can instruct support operators, but not employees.
—
Option 3: MT-agnostic NLP
- Works to a certain extent, provides a wider choice of MT engines
26. GENDER CONTROL: ADJUST TO FEMININE
English to French
—
31 segments
—
stock models
—
let’s make it more FEMININE
[Chart: Gender adjustment => feminine, MT engines A to G]
27. GENDER CONTROL: ADJUST TO MASCULINE
English to French
—
31 segments
—
stock models
—
let’s make it more MASCULINE
[Chart: Gender adjustment => masculine, MT engines A to G]
29. TONE OF VOICE CONTROL: SAMPLES FROM SUPPORT CHATS
Source text (English) | Machine Translation (German) | Comment
Can you share your screen? | Können Sie Ihren Bildschirm freigeben? | FORMAL
Could you help me? | Kannst du mir helfen? | INFORMAL
Make sure you report any of these issues. | Stellen Sie sicher, dass Sie eines dieser Probleme melden. | FORMAL
Can you give an example? | Kannst du ein Beispiel geben? | INFORMAL
Formal vs. Informal
—
Crucial for Live Chats
—
Baseline MT engines are not consistent
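The formal/informal labels in the samples above follow from the second-person pronoun the engine chose (Sie vs. du). A rough way to label German output automatically is sketched below. It is an illustrative heuristic, not the method used for these slides: the pronoun lists are not exhaustive, and the sentence-initial token is skipped because "Sie" or "Ihr" at the start of a sentence is capitalized regardless of meaning.

```python
import re
from collections import Counter

# Second-person pronoun forms; illustrative, not exhaustive.
INFORMAL = {"du", "dich", "dir", "dein", "deine", "deinen", "deinem", "deiner", "deines"}
FORMAL = {"Sie", "Ihnen", "Ihr", "Ihre", "Ihren", "Ihrem", "Ihrer", "Ihres"}


def classify_formality(sentence: str) -> str:
    """Return 'formal', 'informal', 'mixed' or 'unknown' for a German sentence."""
    tokens = re.findall(r"[A-Za-zÄÖÜäöüß]+", sentence)
    informal = any(tok.lower() in INFORMAL for tok in tokens)
    # Formal pronouns are capitalized mid-sentence; skip the first token,
    # where capitalization carries no information.
    formal = any(tok in FORMAL for tok in tokens[1:])
    if formal and informal:
        return "mixed"
    if formal:
        return "formal"
    if informal:
        return "informal"
    return "unknown"


samples = [
    "Können Sie Ihren Bildschirm freigeben?",
    "Kannst du mir helfen?",
    "Stellen Sie sicher, dass Sie eines dieser Probleme melden.",
    "Kannst du ein Beispiel geben?",
]
labels = [classify_formality(s) for s in samples]
summary = Counter(labels)
```

Counting the labels per engine over a test set gives a formality profile for that engine. Sentences without a second-person pronoun come back as "unknown" and are best excluded from the ratio.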
30. TONE OF VOICE CONTROL: DEFAULT MT OUTPUT
English to German
—
210 segments
—
stock models
[Chart: tone of voice of default MT output, MT engines A to G]
31. TONE OF VOICE CONTROL: HOW TO MAKE IT INFORMAL?
Option 1: Use DeepL with formality=less (99.5% accuracy)
- What if you need a custom model and terminology, or another MT has better linguistic quality for you?
—
Option 2: Generate synthetic training data, hoping translations become more informal
- Expensive and time-consuming, also introduces bias into the model
—
Option 3: MT-agnostic NLP
- Works to a certain extent, provides a wider choice of MT engines
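For Option 1, a request with the formality setting might be assembled roughly as follows. This is a sketch under assumptions: the endpoint host shown is the free-tier one, the parameter names (`text`, `target_lang`, `formality`) and the `DeepL-Auth-Key` header are as recalled from DeepL's public REST API and should be checked against the current documentation, and `build_deepl_request` is a hypothetical helper. The request is only built here, not sent.

```python
import urllib.parse
import urllib.request

# Assumption: free-tier endpoint; paid plans use a different host.
DEEPL_URL = "https://api-free.deepl.com/v2/translate"


def build_deepl_request(texts, auth_key, target_lang="DE", formality="less"):
    """Build (but do not send) a translation request asking for informal output."""
    params = [("text", t) for t in texts]
    params += [("target_lang", target_lang), ("formality", formality)]
    body = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(
        DEEPL_URL,
        data=body,
        headers={"Authorization": f"DeepL-Auth-Key {auth_key}"},
        method="POST",
    )


request = build_deepl_request(["Could you help me?"], auth_key="YOUR_KEY")
# To send it: urllib.request.urlopen(request), then parse the JSON response.
```

The limitation noted on the slide still applies: this ties tone control to one vendor, which is the gap the MT-agnostic option is meant to close.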
32. TONE OF VOICE CONTROL: MT-AGNOSTIC ADJUSTMENT
English to German
—
210 segments
—
stock models
—
let’s make it more INFORMAL
[Chart: tone of voice after adjustment to informal, MT engines A to G]
33. TONE OF VOICE CONTROL: MT-AGNOSTIC ADJUSTMENT
English to German
—
210 segments
—
stock models
—
let’s make it more FORMAL
[Chart: tone of voice after adjustment to formal, MT engines A to G]
35. DATA PROTECTION LAWS
[World map of data protection laws, legend A to F]
Source: DLA Piper, https://www.dlapiperdataprotection.com/index.html?t=world-map (fetched on 2020-10-20)
36. CLOUD MT DEPLOYMENTS
A B C D E F
Alibaba
Amazon
Baidu
DeepL
Globalese*
Google
GTCom*
IBM
Microsoft
Mirai
ModernMT*
Naver
Niutrans*
PROMT
Rozetta
SDL*
Sogou
Systran*
Tencent
Tilde*
Yandex
Youdao
* On-premise and private cloud deployment available
37. DATA AND PRIVACY PROTECTION
Communication may contain PII and healthcare, HR, or financial data.
—
Option 1:
- select proper MT vendor for every region
- when in doubt, use private-cloud deployments
—
Option 2:
- proper DPA and data protection clauses + insurance
—
Option 3: pseudonymization to remove PII
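Option 3 can be sketched as a mask-translate-restore round trip. The code below is a minimal illustration under assumptions: it only catches pattern-based PII (emails and phone-like numbers), the regular expressions and the placeholder format are hypothetical, and it assumes the MT engine passes the placeholder tokens through unchanged, which has to be verified per engine. Names and other free-text PII would need named-entity recognition on top.

```python
import re

# Simplified patterns for illustration; real deployments need broader coverage.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def pseudonymize(text: str):
    """Replace emails and phone numbers with placeholders; return text and mapping."""
    mapping = {}

    def replacer(kind):
        def _sub(match):
            token = f"__{kind}_{len(mapping)}__"
            mapping[token] = match.group(0)
            return token
        return _sub

    text = EMAIL.sub(replacer("EMAIL"), text)
    text = PHONE.sub(replacer("PHONE"), text)
    return text, mapping


def restore(text: str, mapping: dict) -> str:
    """Put the original values back after translation."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


source = "Please call +1 (555) 010-9999 or write to anna@example.com."
masked, mapping = pseudonymize(source)
# masked is what would be sent to the MT engine; restore() runs on its output.
round_trip = restore(masked, mapping)
```

The mapping stays on the sender's side, so the MT provider never sees the original values.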
38. KEY TAKEAWAYS
Machine Translation is becoming more and more ubiquitous. It is becoming more like a coworker than a tool.
—
When it’s biased, it may damage our work environment and culture.
—
As of today, stock MT defaults mostly to masculine forms and is quite inconsistent in tone of voice.
—
It’s possible to dodge those biases using NLP paired with MT.
—
Make sure you know where your MT sits so that you stay compliant.