Fabio Ballati, Fulvio Corno, Luigi De Russis
Politecnico di Torino, Italy
Assessing Virtual Assistant Capabilities with Italian Dysarthric Speech
ASSETS 2018 - October 22-24, 2018 - Galway

Background and Motivation
• Usage of smartphone-based virtual assistants is growing worldwide
• Such assistants generally have a positive impact on device accessibility
• People with speech impairments such as dysarthria may be unable to use these assistants proficiently

Goal
• Investigate whether and how people with moderate dysarthria can be understood by three virtual assistants: Siri, Google Assistant, and Cortana
• Investigate which assistant provides the most coherent answer when the recognized speech is at least partially correct
• Propose a methodology for collecting dysarthric speech samples to evaluate smartphone-based virtual assistants
• We focused on ALS-induced dysarthria and the Italian language

Work Phases
• SENTENCE SELECTION: selection of 34 sentences suitable for virtual assistants
• DATA COLLECTION: to collect dysarthric speech samples, we designed a specific methodology and recorded the 34 sentences from 8 people with ALS
• ASSESSMENT: we played the collected speech samples to the assistants to assess (i) the accuracy of the transcription and (ii) the coherence of the answers

Sentence Selection
• Goal: to have a set of sentences to record, suitable for smartphone-based virtual assistants
• We extracted 34 sentences from the recommended questions for virtual assistants
• We then slightly modified them to include all the phonemes of the Italian language
Sample sentences (translated into English)
• Do I need to take an umbrella today?
• How many proteins are in two eggs?
• Add onion and tomatoes to my shopping list
• Who is the president of the Italian Republic?
• Set the home temperature to 22 degrees.
• Set an alarm for 8 am.
• …

Data Collection
Goal: to have a dataset of dysarthric speech samples that allows us to assess the behavior of virtual assistants
Participants
• 8 native Italian speakers with ALS-induced dysarthria (4M, 4F), aged 64-83
• Three types of dysarthria, within two speech intelligibility categories
  • Flaccid, Spastic, or Unilateral Upper Motor Neuron (Duffy classification)
  • "Intelligible with repeating" and "Detectable speech disturbance" (ALS Functional Rating Scale)

Procedure
• Simple process, designed to be easily reproduced
• Each participant read the 34 sentences from A4 sheets of paper (one sheet per sentence) placed in front of them, while we recorded them
• The recordings were made with a smartphone placed 30-40 centimeters from the participant

Assessment
Goal: to investigate the accuracy of the transcription and the coherence of the answers of the virtual assistants
• The assessment took place in a quiet room at our university
• The recorded speech samples were played on a laptop connected to an external high-quality speaker
• Each of the 272 recorded sentences (34 sentences × 8 participants) was played to Siri, Google Assistant, and Cortana separately, on three different smartphones
  • iPhone 7 (iOS 11.2), Samsung A5 (Android 8.1), and Lumia 910 (Windows 10 Mobile)
• The results of each operation (recognized request and related response) were noted down
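The count of 272 follows from 34 sentences read by each of 8 participants. As a minimal sketch of the resulting playback schedule, assuming hypothetical integer identifiers for participants and sentences (the slides do not describe how recordings were labeled):

```python
from itertools import product

N_SENTENCES = 34     # sentences selected for the study
N_PARTICIPANTS = 8   # speakers who were recorded
ASSISTANTS = ("Siri", "Google Assistant", "Cortana")

# One recording per (participant, sentence) pair; the integer ids are
# hypothetical labels used only for this illustration.
recordings = list(product(range(1, N_PARTICIPANTS + 1),
                          range(1, N_SENTENCES + 1)))

# Every recording is played once to each assistant.
playbacks = [(assistant, participant, sentence)
             for assistant in ASSISTANTS
             for participant, sentence in recordings]

print(len(recordings))  # 272
print(len(playbacks))   # 816
```

Each playback yields one recognized request and one response to note down, so the full assessment amounts to 816 observations.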

Measures: Question Comprehension (QC)
Given by the similarity between the original sentence and the provided transcription
Quantitative QC
• Word Error Rate (WER): WER = (S + I + D) / N, where S = number of substitutions, I = number of insertions, D = number of deletions, and N = number of words in the original sentence
Qualitative QC
• Classification of each provided transcription as:
  • Correct
  • Same semantic meaning
  • Incomplete
  • Wrong
  • Not recognized
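The WER formula can be sketched in code as a word-level edit distance divided by the length of the original sentence. This is an illustrative implementation, not the authors' tooling: the lowercasing and whitespace tokenization are assumptions, and the example transcription is invented rather than taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (S + I + D) / N, using the minimum number of word edits."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            )
        prev = curr
    return prev[-1] / len(ref)


# Invented example: one deletion ("an") and one substitution ("eight" -> "late").
print(word_error_rate("set an alarm at eight", "set alarm at late"))  # 0.4
```

Because insertions are counted against the length of the original sentence, WER can exceed 1 when the transcription is much longer than what was said.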

Measures: Consistency in Answers
• An indicator of the appropriateness of the assistants' responses
• Computed only for sentences whose transcription was correct or had the same semantic meaning
• Given as the number and percentage of times that a virtual assistant provided a certain type of answer:
  • Coherent answers, i.e., correct or logically consistent responses
  • Incoherent answers, i.e., logically incoherent responses
  • Default answers, i.e., responses that an assistant provides by default when it is not able to fully understand the request or extract any context
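One way to compute this measure is to keep only the eligible transcriptions and tally the answer types. The sketch below uses a handful of invented records for a single assistant; the record layout and label strings are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical records: (transcription class, answer type).
records = [
    ("correct", "coherent"),
    ("same semantic meaning", "default"),
    ("wrong", "incoherent"),      # excluded: request was not understood
    ("correct", "coherent"),
    ("incomplete", "default"),    # excluded
]

# Only these transcription classes count towards the measure.
ELIGIBLE = {"correct", "same semantic meaning"}


def consistency_summary(records):
    """Map each answer type to (count, percentage of eligible sentences)."""
    answers = [answer for qc, answer in records if qc in ELIGIBLE]
    total = len(answers)
    if total == 0:
        return {}
    counts = Counter(answers)
    return {kind: (n, round(100 * n / total, 2)) for kind, n in counts.items()}


summary = consistency_summary(records)
print(summary)  # {'coherent': (2, 66.67), 'default': (1, 33.33)}
```

Filtering first means the percentages describe how well an assistant answers requests it has understood, independently of how often it understands them.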

Results: Quantitative QC
• WER was highly dependent on the participant
• The average WER for Google Assistant was lower than for Cortana
• Siri performed the worst
• The same trend appeared in the results of individual participants

Results: Qualitative QC
Assistant         Correct       Same semantic meaning  Incomplete   Wrong         Not recognized
Google Assistant  135 (49.63%)  39 (14.33%)            39 (14.33%)  58 (21.32%)   1 (0.37%)
Cortana           85 (31.25%)   23 (8.45%)             20 (7.35%)   141 (51.83%)  3 (1.10%)
Siri              36 (13.23%)   7 (2.58%)              32 (11.76%)  149 (54.78%)  48 (17.65%)
Overall results are similar to those for Quantitative QC, with Google Assistant performing better than the other two
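The percentages in this table are each count divided by the 272 recordings played to every assistant. A short sketch that recomputes them from the reported counts (the variable and function names are illustrative):

```python
TOTAL = 34 * 8  # sentences x participants = 272 recordings per assistant

CATEGORIES = ["Correct", "Same semantic meaning", "Incomplete",
              "Wrong", "Not recognized"]

# Counts copied from the Qualitative QC table.
qualitative_qc = {
    "Google Assistant": [135, 39, 39, 58, 1],
    "Cortana": [85, 23, 20, 141, 3],
    "Siri": [36, 7, 32, 149, 48],
}


def shares(counts, total=TOTAL):
    """Return each count as a percentage of total, rounded to two decimals."""
    return [round(100 * c / total, 2) for c in counts]


for assistant, counts in qualitative_qc.items():
    # Every recording falls into exactly one category, so each row sums to 272.
    assert sum(counts) == TOTAL
    print(assistant, dict(zip(CATEGORIES, shares(counts))))
```

A few recomputed values differ from the slide in the last digit (for example, 39/272 rounds to 14.34%), presumably because of how the original figures were rounded.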
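
Results: Consistency in Answers
Assistant (n)           Coherent answer  Default answer  Incoherent answer
Google Assistant (174)  94 (54.02%)      78 (44.83%)     2 (1.15%)
Cortana (108)           26 (24.07%)      82 (75.93%)     0 (0.00%)
Siri (43)               26 (60.47%)      13 (30.23%)     4 (9.30%)
n = number of sentences transcribed correctly or with the same semantic meaning
Most of the answers provided by Google Assistant (54.02%) and Siri (60.47%) were coherent, while Cortana mostly returned default answers (75.93%)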
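The number in parentheses after each assistant is the number of sentences it transcribed as "Correct" or "Same semantic meaning" in the Qualitative QC results. A brief sketch that checks this link and recomputes the coherent share, with illustrative variable names:

```python
# Correct + Same semantic meaning, from the Qualitative QC table.
understood = {
    "Google Assistant": 135 + 39,
    "Cortana": 85 + 23,
    "Siri": 36 + 7,
}

# (coherent, default, third column) from the Consistency in Answers table.
answers = {
    "Google Assistant": (94, 78, 2),
    "Cortana": (26, 82, 0),
    "Siri": (26, 13, 4),
}

coherent_share = {}
for assistant, (coherent, default, other) in answers.items():
    total = coherent + default + other
    # The answer counts must add up to the understood sentences.
    assert total == understood[assistant]
    coherent_share[assistant] = round(100 * coherent / total, 2)

print(coherent_share)
# {'Google Assistant': 54.02, 'Cortana': 24.07, 'Siri': 60.47}
```

Siri's high coherent share rests on only 43 understood sentences, against 174 for Google Assistant, which is worth keeping in mind when comparing the two.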

Key Takeaways
• Google Assistant was the best at recognizing dysarthric speech and at providing suitable answers
• Each virtual assistant behaves differently
• The accuracy of the transcription is strictly related to the speaker
  • Some participants can use Google Assistant without any problems
• Siri performed the worst in transcription accuracy but provided a good number of suitable answers when it properly understood the request
• We plan to publicly release the collected dataset

Luigi De Russis
luigi.derussis@polito.it
https://elite.polito.it
