#UXPA2016 Session Survey: http://www.uxpa2016.org/sessionsurvey?sessionid=321 © 2016 Versay Solutions
Where’s Jarvis?
The Future of Voice Recognition and Natural Language User Interfaces
Crispin Reedy, Versay Solutions
@crispinTX crispinreedy.com
#UXPA2016
From the session description
• What is voice recognition?
• What is natural language understanding?
• What are the common technologies in the market today?
• How does this fit with IoT?
• What are design considerations / methods to evaluate these types of interfaces?
• Implied: Should I speech-enable my ___?
• Bonus Q: Why doesn’t it work the way we want it to, and when will it?
Should I Speech-Enable My ___?
Iron Man 2: Marvel Studios, Paramount Pictures
Star Trek Voyager: Paramount Television
“Tomato soup”
“Tomato soup. Ok, what kind?”
“Just plain”
“Coming right up!”
Implicit confirmation
Second-level open-ended prompting
Cultural context: plain = hot
Terms & Technologies
• Speech Recognition
• Natural Language Understanding
• Voice Verification (Biometrics)
• Text to Speech
Speech Recognition “ASR”
“See the cat.”
Natural Language Understanding
• Extracting meaning from natural text
“Hello, yes, I’d like to pay my water bill. Can you help me with that?”
Intent = BillPay
Entity (Bill Type) = Water
Voice Verification
“My voice is my password.”
“Authenticated. Welcome, Mr. Smith.” ✓
Text To Speech
What Is Good TTS?
• Phonemes change based on location
• “Cat”
• “Alligator”
• Elision
• “I’m. Awaiting. You.”
• “I’m awaiting you.”
• Intonation
• “Do you want coffee?”
• “Do you want soda, tea, or coffee?”
• Most TTS isn’t “Movie Quality”
IMDB
SSML Example
SSML
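The slide's SSML example did not survive in this transcript. As a stand-in, here is a minimal sketch of what SSML markup can look like, built with Python's standard library. The `speak`, `s`, `break`, and `prosody` elements come from the W3C SSML specification; the function name `build_ssml`, the wording, and the pause and rate values are illustrative assumptions, not the author's original example.

```python
import xml.etree.ElementTree as ET


def build_ssml(question: str, options: list[str]) -> str:
    """Return an SSML string that reads a question, pauses, then lists options."""
    # Assumption: a bare <speak version="1.1"> root; a standalone SSML document
    # would normally also declare xmlns and xml:lang.
    speak = ET.Element("speak", {"version": "1.1"})
    sentence = ET.SubElement(speak, "s")
    sentence.text = question
    # A short pause so the options do not run into the question.
    ET.SubElement(speak, "break", {"time": "300ms"})
    for i, option in enumerate(options):
        is_last = i == len(options) - 1
        # Slow the last option slightly to signal the end of the list.
        rate = "slow" if is_last else "medium"
        prosody = ET.SubElement(speak, "prosody", {"rate": rate})
        prosody.text = option
        if not is_last:
            ET.SubElement(speak, "break", {"strength": "medium"})
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("What would you like to drink?", ["soda", "tea", "coffee"])
print(ssml)
```

Markup like this is how a designer can nudge a TTS engine toward the elision and intonation described under "What Is Good TTS?". How much of it a given engine honors varies by platform.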
Speech Recognition
• Hands-free command / control
• Dictation
• Input text
• Small form factor device, etc.
Text To Speech
• Output text dynamically
• Respond to input
• Useful when no display is available
Natural Language Understanding
• Necessary for all language-based input
• Extract meaning
• Parse large volumes of text
Voice Verification
• Security
[Diagram: how the pieces fit together in a voice application]
• Components: Verification (voiceprints), ASR, NLU, Application, Data, TTS
• Steps: Sign-In, Interaction, Request, Action, Meaning, Access Data, Output
• With Touch and Keyboard added: Manage I/O Modality; Determine Meaning in Context
• Visual Context!
ASR
[Diagram: levels of knowledge behind ASR and NLU]
• Knowledge: World Knowledge, Semantics, Syntax, Lexicon, Morphology, Phonetics, Acoustics, Linguistics, Physiology
• Units: Concepts, Phrases, Words, Phonemes, Sounds
• Labels: ASR, NLU
Speech is ambiguous
Language is ambiguous
Everything is ambiguous
[Chart: dimensions of the speech recognition problem]
• Speaker Independence: Speaker Dependent, Multiple Speakers, Speaker Independent
• Speech style: Isolated Words, Connected Words, Natural Speech
• Vocabulary Size: 10 words, 1000 words, 100,000 words, Unlimited
• Humanlike
AUDREY: Automatic Digit Recognizer, Bell Labs, 1952
[Figure: hidden Markov model]
• X: states
• y: possible observations
• a: state transition probabilities
• b: output probabilities
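To make the legend concrete, here is a minimal sketch of the forward algorithm, which scores how likely a sequence of observations is under a hidden Markov model. The names `a` and `b` follow the legend above. The start probabilities `start_p` and all of the numbers are made-up assumptions for illustration, and no particular recognition engine is claimed to work exactly this way.

```python
def forward(observations, states, start_p, a, b):
    """Return P(observations) under the HMM, summed over all hidden state paths."""
    # alpha[s] = probability of the observations so far, ending in state s
    alpha = {s: start_p[s] * b[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: sum(alpha[prev] * a[prev][s] for prev in states) * b[s][obs]
            for s in states
        }
    return sum(alpha.values())


# Toy model (invented numbers): two hidden states, two observable symbols.
states = ["X1", "X2"]
start_p = {"X1": 0.6, "X2": 0.4}                     # assumption: not in the legend
a = {"X1": {"X1": 0.7, "X2": 0.3},                   # state transition probabilities
     "X2": {"X1": 0.4, "X2": 0.6}}
b = {"X1": {"y1": 0.9, "y2": 0.1},                   # output probabilities
     "X2": {"y1": 0.2, "y2": 0.8}}

likelihood = forward(["y1", "y2"], states, start_p, a, b)
print(round(likelihood, 3))  # 0.209
```

In a recognizer the observations would be acoustic features rather than two symbols, and the engine compares scores like this across competing word sequences.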
"HiddenMarkovModel" by Tdunningvectorization: Wikimedia
Training
• Speech Recognition Engine
• Acoustic Model
• SLM and/or Grammar
• Pronunciation Model
Recognition Event
• Utterance: Noise levels? Barge-in?
• Feature Extraction
• Endpointing
• Speech Recognition Engine, using a Grammar or SLM
• Returns: probabilities, n-best list, literal return, tokens
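A recognition event hands the application an n-best list of hypotheses with probabilities, and the application has to decide what to do with it. The sketch below shows one common pattern: accept a confident result, confirm a middling one, reject the rest. The `Hypothesis` class, the `decide` function, and the threshold values are hypothetical. Real engines expose their own result formats, and thresholds are tuned per application.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    literal: str       # the words the recognizer thinks it heard
    token: str         # the value the grammar maps those words to
    confidence: float  # 0.0 to 1.0


def decide(n_best, accept_at=0.8, confirm_at=0.5):
    """Pick an action from an n-best list: accept, confirm, or reject."""
    if not n_best:
        return ("reject", None)
    best = max(n_best, key=lambda h: h.confidence)
    if best.confidence >= accept_at:
        return ("accept", best.token)
    if best.confidence >= confirm_at:
        # Not sure enough to act, but sure enough to ask "Did you say ...?"
        return ("confirm", best.token)
    return ("reject", None)


n_best = [
    Hypothesis("checking", "CHECKING", 0.62),
    Hypothesis("check in", "CHECK_IN", 0.21),
]
action, value = decide(n_best)
print(action, value)  # confirm CHECKING
```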
Early Commercial Adoptions
• Interactive Voice Response
• “Those Phone Menus”
• Server-based ASR
• Nuance
• Microsoft
• Voice-Enabled Handheld Devices
• Industrial / Productivity applications
• Device-based ASR
• Network not needed
Note: Call center is still an important customer touchpoint!
Today’s Speech Agents vs. APIs
• Siri / Apple APIs
• Cortana / Cortana APIs
• Google Now / Google Voice Actions
• Amazon Echo (Alexa) / AVS API
• Jibo
• Ubi / Ubi Kit
• Assistant.ai / Api.ai
Alexa Skill vs. Amazon Voice Service
Amazon.com
Alexa Skill Example
Amazon.com
Amazon.com
CapitalOne.com
NLU
Natural Language Understanding
• Parsing input to extract meaning
• Covers a large field
• Commands
• Automatic classification of emails
• Newspaper articles, large chunks of text
• Bots
• Conversational agents
• Messaging apps
• Personal assistants
• Input could be via speech or via text
Levels of Meaning
“How can I help you?”
• Too Broad / Ambiguous: “I’m having a problem with my account.”
• Too Much: “Well, I was looking at my bill, because I do that every week, and I was reviewing everything on there, and I saw…”
• Just Right: “I’m seeing an unusual charge on my bill.”
NLU Tasks
http://www.conversational-technologies.com/nldemos/nlDemos.html
Intents and Entities
• “I’d like to transfer $50 from my checking account to my savings account.”
• ACTION = Transfer (Intent)
• FROM_ACCOUNT = Checking (Entity)
• TO_ACCOUNT = Savings (Entity)
• AMOUNT = $50 (Entity)
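The intent and entities above can be pulled out with a hand-crafted pattern, sketched below. This is only an illustration of the output shape: the regular expression and the `parse` function are assumptions, and the NLU APIs discussed in this deck use trained models rather than a single regex.

```python
import re

# Hand-written pattern for one phrasing of the transfer request (an assumption).
TRANSFER = re.compile(
    r"transfer\s+\$(?P<amount>\d+(?:\.\d{2})?)\s+"
    r"from\s+my\s+(?P<from_account>\w+)\s+account\s+"
    r"to\s+my\s+(?P<to_account>\w+)\s+account",
    re.IGNORECASE,
)


def parse(utterance: str) -> dict:
    """Return the intent and entities found in an utterance, or no intent."""
    # Collapse whitespace so line breaks in the input do not matter.
    text = " ".join(utterance.split())
    match = TRANSFER.search(text)
    if not match:
        return {"ACTION": None}
    return {
        "ACTION": "Transfer",
        "FROM_ACCOUNT": match["from_account"].capitalize(),
        "TO_ACCOUNT": match["to_account"].capitalize(),
        "AMOUNT": "$" + match["amount"],
    }


result = parse("I’d like to transfer $50 from my checking account to my savings account.")
print(result)
```

A pattern like this breaks as soon as the caller rephrases ("move fifty bucks into savings"), which is the gap that trained NLU models are meant to close.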
NLU APIs
• API.ai
• Alexa
• Microsoft LUIS
• Wit.ai
• Google Voice Actions
• Etc.
Today’s NLU APIs
• Microsoft LUIS (part of Project Oxford)
Microsoft.com
Today’s NLU APIs
API.ai
• API.ai
The Future Is Here
• DNN (Deep Neural Networks)
• Being applied to both ASR and NLU problems
• Requires large amounts of data to train the models
What’s The Glue Here?
Consistency Across Contexts?
“Omnichannel CX”
Data Is Everywhere
State Chart XML?
ASR vs. NLU: Wrap Up
ASR
• Spoken aloud
• Requires some NLU even if it’s hand-crafted (tagging)
• Useful in hands-free, eyes-free contexts
NLU
• Focuses on meaning extraction
• Could be used for chat bots, etc.
• Machine learning to train models
Design Considerations
Design Considerations
• What are you trying to build?
• What’s your platform?
• Existing guidelines / research
• User testing is key
• Especially if you’re trying to do something complicated
Should I Speech-Enable My ___?
What’s Your ASR/NLU Platform?
• Write an app (skill) for an agent such as Cortana / Alexa
• Use cloud APIs to add ASR / NLU to your app / device / page / gadget
• Download software and use full-featured capabilities for more robust recognition on a specific device
• Build your own
Network Availability
• Simply irritating… or totally unusable?
“What’s on my calendar today?”
“Sorry, I can’t complete that request right now.”
Appropriate Modality?
• Voice Only? Voice + Display?
• Is it possible for the user to switch modalities?
• Or would switching potentially be dangerous?
“How long is the flight from Dallas to Seattle?”
“I’ve got a few results to show you.”
Is State Maintained?
• Does your platform support a multiple-stage interaction?
• Does it remember what you did previously?
“Who is Barack Obama?”
“Barack Obama is the 44th president of the United States.”
“How old is he?”
“I’m sorry, I don’t understand your question.”
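The failed follow-up above is what happens when nothing is carried over between turns. Here is a minimal sketch of the alternative: a session object that remembers the last person mentioned, so that “he” can be resolved. The `Session` class, the `FACTS` table, and the exact phrasings it accepts are hypothetical, and real agents handle reference resolution with far more general machinery.

```python
FACTS = {
    "Barack Obama": {
        "summary": "Barack Obama is the 44th president of the United States.",
        "born": 1961,
    },
}

SORRY = "I’m sorry, I don’t understand your question."


class Session:
    """Keeps the last person mentioned so follow-up questions can refer to them."""

    def __init__(self, facts):
        self.facts = facts
        self.last_person = None

    def ask(self, question: str) -> str:
        text = question.strip().rstrip("?")
        if text.lower().startswith("who is "):
            name = text[len("who is "):]
            if name in self.facts:
                self.last_person = name  # remember for later turns
                return self.facts[name]["summary"]
        elif text.lower() in ("how old is he", "how old is she"):
            if self.last_person is not None:
                born = self.facts[self.last_person]["born"]
                return f"{self.last_person} was born in {born}."
        return SORRY


session = Session(FACTS)
first = session.ask("Who is Barack Obama?")
second = session.ask("How old is he?")
print(first)
print(second)
```

Without the stored `last_person`, the second question has nothing to attach to and falls through to the apology, which is the behavior shown on the slide.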
Wake-Up Words
• How many of these “Agents” will we be talking to?
“Jibo, take a picture.”
“Alexa, play music.”
“OK Google, set the temperature to 77 degrees.”
System Personality
• Are you writing for an “Agent” who has an existing style?
• What if your skill or app doesn’t match that style?
• If there is no existing style, should you create one?
“Hi, I’m Julie!”
Context
• Real-world context
• Digital context
• How much does your app know about where you are and what it can do?
“When I get home, remind me to take out the trash.”
“I’m sorry, your calendar doesn’t support location-based reminders.”
What Are You Trying To Recognize?
• Long utterances work better than short ones
• Letter names require extra work
“Start a session”
“Got it”
And So Much More….
• What will you do when the recognizer just can’t get it?
“I want my…. BARK BARK BARK Timmy STOP THAT NOW GET DOWN!”
????
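One possible answer to the question above is an escalating retry strategy: reprompt with more guidance each time, then hand off rather than loop forever. The sketch below assumes a hypothetical `recognize` callback that returns `None` when nothing usable was heard. The prompt wording and the limit of three attempts are illustrative, not a recommendation from the talk.

```python
def handle_turn(recognize, prompts, max_attempts=3):
    """Ask up to max_attempts times, escalating the prompt, then hand off."""
    for attempt in range(max_attempts):
        # Reuse the last prompt if there are more attempts than prompts.
        prompt = prompts[min(attempt, len(prompts) - 1)]
        result = recognize(prompt)
        if result is not None:
            return ("understood", result)
    return ("handoff", None)


prompts = [
    "How can I help you?",
    "Sorry, I didn’t get that. You can say things like ‘pay my bill’.",
    "I’m still having trouble. Please say ‘billing’ or ‘support’.",
]

# Stand-in recognizer for the demo: fails twice (the dog barking), then succeeds.
heard = []
answers = iter([None, None, "billing"])


def fake_recognize(prompt):
    heard.append(prompt)
    return next(answers)


outcome = handle_turn(fake_recognize, prompts)
print(outcome)  # ('understood', 'billing')
```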
Existing Guidelines / Research
• Caveat: Best practices evolved in one modality (e.g. voice-only) may not apply the same way in another (e.g. combined voice + touch)
• But they could be adapted
• Association for Voice Interaction Design (AVIxD.org)
• Wiki
• Peer-Reviewed Journal
• Virtual “Brown Bags”
• Academic Sources, Books
AVIxD.org
CUI Working Group is actively recruiting!
Specific Example: “Help”
• VoiceXML Standard (2004): “Help” should be a global command
• AVIxD Wiki (2014): Stop using “Help” as a global
• Agent API Doc (2015): Offer “Help”
Specific Example: “Help”
• Designers who tune applications have seen that the word “help” is a known “False Attractor”
• Other short things that you say get recognized as “help”
• People don’t voluntarily come up with “help” unless they are prompted
• Give callers a context-specific command only where help may truly be needed, and call it something besides “help”
• System: Say or enter your account number, or say, “where do I find it.”
Special Case: Car
• “Distracted Driver” is a hot topic!
• Richard Young, Wayne State University
• Paper: “Safe Interaction For Drivers”
• “Visual-Manual Mode” – What we do today
• “Auditory-Vocal Mode” – Speech only. NO GUI.
• “Mixed Mode” – Speech and GUI being used together
• Finding: If you give someone a graphic interface, they’re going to look at it
• And take their eyes off the road
Design Documents
Usability Studies / Research
• Special Challenges
• Technical setup
• Phone tap / Recording both sides
Warner Bros.
Early Stage Voice Only Prototype
Should I Speech-Enable My ___?
What’s the Use Case?
• Enabling application
• User can’t do it any other way
• New tasks
• Enhancing application
• User can do it now
• But speech makes it better
• Faster
• Safer
API-Based
Device-Based
Roll Your Own / Open-Source
• Flexibility
• Power
• Customization
• Time
• Difficulty
Cloud vs. Downloadable / Embedded
Cloud
• Easy to get started
• Lightweight
• Not much specialized knowledge
Downloadable / Embedded
• Customizable
• Probably better recognition
• Can be device-specific
• More features
• Higher powered
• May require specialized knowledge
– Speech scientist
Open Source ASR
• CMU Sphinx
• pocketsphinx
• Kaldi
• http://kaldi-asr.org/
• Github
• New updates include some pretty interesting stuff (DNN)
• Requires:
• Corpus
• Tech know-how
Should I Speech-Enable My ___?
Should I Speech-Enable My ___?
Maybe
Iron Man 2: Marvel Studios, Paramount Pictures
Where’s Jarvis?
Where’s Jarvis?
Gesture Based Interface
Artificial Intelligence
Voice Based Interface
Where’s Jarvis?
ASR
NLU
Voice Design
Context
Resources
• Handout / Web page

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 

Where's Jarvis? The Future of Voice Recognition and Natural Language User Interfaces UXPA 2016

Editor's Notes

  1. Voice User Interface Designer. 10 years in the field. English major, former coder; got interested in UX. President of the Association for Voice Interaction Design. Consultant for Versay Solutions. 2 weeks in a row for conferences.
  2. Jarvis: Audio and gestural. Perfect recognition; no error recovery needed. Great voice quality. Connected to vast amounts of data. Understands all the parts of the model: “Lose the landscape.” Context-sensitive; aware of the space around him. Sense of humor: “Am I to include the Belgian Waffle stands?” Takes initiative: “What is it you’re trying to achieve, sir?”
  3. Replicator: Good recognition; no error recovery needed. Good voice quality – understandable. Connected to data – perhaps too much so? Context-sensitive, but was this enough? A design failure (not a tech failure), specifically around excessive disambiguation.
  4. A Better Replicator Conversation
  5. “Speech to Text”: spoken language to machine-readable format.
  6. Not necessarily tied to speech recognition
  7. Also called voiceprints, biometrics, voice authentication, etc. Not going to discuss this one in a lot of detail today, but it’s important that you understand the difference between these technologies. Recognizes a person, not necessarily what they are saying. You can have ASR without Voice Verification, and vice versa.
  8. Human voice talent. Hundreds of hours of recording. Digitized. Phonemes: concatenated speech synthesis.
  9. Dynamic speech synthesis. Many commercial products are available, API-based or downloadable; quality varies. If possible, record audio: TTS has improved considerably, but is still noticeable, and high-quality TTS may not be available in all situations. If you have a lot of dynamic data, TTS is useful. You can mix recorded audio and TTS. You may have to use TTS. Voice agents (Alexa, Cortana, etc.): API-based; some of them do let you mark up your TTS with SSML. More phonemes = higher quality voice, but that also means a bigger download and install (if on device). Exceptions (addresses, names) can be iffy and may require a lot of work to handle well: “St. James St.” = “Saint James Street”. Punctuation: your data needs to be clean and ready to voice back; acronyms and incomplete sentences will not sound good. It is possible to build a custom voice, but it takes a lot of work!
  10. Speech Synthesis Markup Language (SSML). XML-based W3C standard. Not universally supported. Tags which allow you to produce a more natural-quality output: Emphasis, Break, Voice, Prosody, Pitch.
  11. World Knowledge: Concepts of the world around us, i.e. tables have four legs, what is left and right, what is a car, etc. This is the level before language. Semantics: The first level of language. Knowledge can be represented in structured meaningful elements. Example: semantics of a party invitation. Syntax: The rules that govern putting words together to form meaningful units. Lexicon: What words mean. Morphology: How words change their form to perform differently in a language, i.e. horse / horses. Phonetics: Phonemes and how words are built. Acoustics: What phonemes sound like and how to create them.
  12. Speech is never stationary. Coarticulation. Noisy environments. Accents. Different speakers have voices with different acoustic qualities. Goats. Challenges vary depending on what you are going to recognize. Spelling (short utterances) can be difficult even for humans. Phonetic alphabet (military).
  13. Humans can deduce meaning from context and unknown words. “How can I help you?” “I’m having a problem with my account.” “I’d like that one. No, not the green one, the red one.” “Time flies like an arrow. Fruit flies like a banana.”
  14. All modern speech recognition is probabilistic. GUI: Button clicked? true / false. VUI: There is an 85% chance that button was clicked.
  15. Three Dimensions of Speech Problems
  16. AUDREY: Davis, Biddulph, and Balashek, Bell Labs, 1952. Analog. Isolated digit recognition, with a pause between digits. Speaker-dependent. Speech recognition with vacuum tubes: how very steampunk. Her name was AUDREY (Automatic Digit Recognizer). Let that sink in a minute.
  17. 1980s: The Power of Statistics. The recognition of connected speech becomes a search for the best path in a large network. Problem of finding the probabilities. Statistical Language Models: not all sequences of words are equally probable; rank all permissible sentences in terms of probability; “correct” grammar is not applicable; restricted by domain. Hidden Markov Models (HMM): unified probabilistic model for speech.
  18. You’re only as good as what you’re trained on. Corpora: collections of speech used to train a recognizer. Acoustic and/or Pronunciation Model: associates sounds with symbols and words; created from a general speech corpus and a phonetic and orthographic transcription. Statistical Language Model (SLM): a probability distribution over sequences of words; created from a domain-specific speech corpus and a tagged transcription to extract meaning.
  19. Speech Agent: The “Person” who Distributed speech recognition. Collection and compression of speech is on the device. The language models are typically on the network. Phone can be speaker-dependent: trains itself on your voice and on the acoustic environments you are in most often. Many companies are providing APIs to use their speech recognition.
  20. “Alexa, ask Capital One: What’s my current credit card balance?”
  21. Observations to make: Represents the entirety of a VUI experience. Placement of Spanish prompt would vary depending on type of call. Confirmation is variable. Confirmation prompt is general.
  22. What do you need it for? What kind of device will you be running it on? Connectivity? Can you use cloud-based ASR? How much control do you need over the application / user interface?
  23. Jarvis: Audio and gestural. Perfect recognition; no error recovery needed. Great voice quality. Connected to vast amounts of data. Understands all the parts of the model: “Lose the landscape.” Context-sensitive; aware of the space around him. Sense of humor: “Am I to include the Belgian Waffle stands?” Takes initiative: “What is it you’re trying to achieve, sir?”
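
Note 9 mentions that addresses and names need cleanup before TTS reads them back (“St. James St.” versus “Saint James Street”). A minimal sketch of that kind of text normalization follows. The function name and the two regex rules are assumptions made for illustration, not anything from the talk or from a particular TTS product, and the heuristic is deliberately naive.

```python
import re

def expand_st(text: str) -> str:
    """Expand the abbreviation "St." before handing text to a TTS engine.

    Naive, hypothetical heuristic:
      - "St." followed by a capitalized word is read as "Saint"
      - any remaining "St." is read as "Street"
    """
    # "St. James" -> "Saint James" (abbreviation comes before a name)
    text = re.sub(r"\bSt\.(?=\s+[A-Z])", "Saint", text)
    # Whatever is left ("James St.") is treated as a street suffix
    text = re.sub(r"\bSt\.", "Street", text)
    return text

print(expand_st("St. James St."))
```

The heuristic gets “123 Main St. Apt 4” wrong (it would say “Saint Apt”), which is the point of the note: exceptions can take a lot of work to handle well.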
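
Note 10 lists the SSML tags (emphasis, break, prosody). As a rough sketch of what marked-up TTS input can look like, the snippet below builds a small SSML string with Python's standard library. The prompt wording and the `build_ssml` helper are made up for illustration, and since SSML is not universally supported, whether a given voice platform honors each tag is an assumption to check against that platform.

```python
import xml.etree.ElementTree as ET

def build_ssml(amount: str) -> str:
    """Return an SSML string that reads a balance with emphasis and a pause."""
    speak = ET.Element("speak", {"version": "1.0"})
    speak.text = "Your balance is "

    # <emphasis> stresses the important part of the prompt
    emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emphasis.text = amount
    emphasis.tail = "."

    # <break> inserts a half-second pause before the follow-up question
    pause = ET.SubElement(speak, "break", {"time": "500ms"})
    pause.tail = "Is there "

    # <prosody> adjusts rate and pitch for the wrapped words
    prosody = ET.SubElement(speak, "prosody", {"rate": "slow", "pitch": "low"})
    prosody.text = "anything else"
    prosody.tail = " I can help with?"

    return ET.tostring(speak, encoding="unicode")

print(build_ssml("42 dollars"))
```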
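
Note 14 contrasts a GUI's true / false with a VUI's “85% chance”. One common way a dialog can act on that is to branch on the recognizer's confidence score. The sketch below assumes a recognizer that returns a text hypothesis with a confidence between 0 and 1; the `Hypothesis` type and both threshold values are hypothetical and would need tuning for a real application.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    text: str          # what the recognizer thinks was said
    confidence: float  # 0.0 to 1.0, as reported by the recognizer

def next_action(h: Hypothesis, accept_at: float = 0.80, confirm_at: float = 0.50) -> str:
    """Decide how the dialog should respond to a probabilistic result."""
    if h.confidence >= accept_at:
        return "accept"    # confident enough: proceed (implicit confirmation)
    if h.confidence >= confirm_at:
        return "confirm"   # unsure: "Did you say ...?"
    return "reprompt"      # too uncertain: ask the caller to repeat

print(next_action(Hypothesis("pay my water bill", 0.85)))
print(next_action(Hypothesis("pay my water bill", 0.60)))
print(next_action(Hypothesis("pay my water bill", 0.20)))
```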
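
Notes 17 and 18 describe a statistical language model as a probability distribution over word sequences, trained on a domain-specific corpus. A toy illustration of the idea is a bigram model with add-one smoothing. The three-sentence corpus below is invented, and real recognizers use far larger corpora and more sophisticated models; this is only a sketch of how “not all sequences of words are equally probable” turns into a ranking.

```python
import math
from collections import Counter

def train_bigrams(sentences):
    """Count context words and word pairs in a tiny training corpus."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>"}
    for sentence in sentences:
        words = sentence.lower().split()
        vocab.update(words)
        tokens = ["<s>"] + words + ["</s>"]
        contexts.update(tokens[:-1])             # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    return contexts, bigrams, len(vocab)

def log_prob(sentence, model):
    """Log-probability of a sentence under the add-one smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )

# Hypothetical domain-specific corpus (bill-pay phone application)
corpus = ["pay my water bill", "pay my electric bill", "check my balance"]
model = train_bigrams(corpus)

# Same words, different order: the model ranks the familiar order higher
candidates = ["bill water my pay", "pay my water bill"]
ranked = sorted(candidates, key=lambda s: log_prob(s, model), reverse=True)
print(ranked[0])
```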