Hands and Speech in Space
Mark Billinghurst
mark.billinghurst@hitlabnz.org
The HIT Lab NZ, University of Canterbury
May 28th 2014
2012 – Iron Man 2
To Make the Vision Real..
  Hardware/software requirements
 Contact lens displays
 Free space hand/body tracking
 Speech/gesture recognition
 Etc..
  Most importantly
 Usability/User Experience
Natural Hand Interaction
  Using bare hands to interact with AR content
  MS Kinect depth sensing
  Real time hand tracking
  Physics based simulation model
Pros and Cons of Gesture Only Input
  Gesture-only good for
 Direct manipulation,
 Selection, Motion
 Rapid expressiveness
  Limitations
 Descriptions (e.g. temporal information)
 Operation on large numbers of objects
 Indirect manipulation, delayed actions
Multimodal Interaction
  Combined speech and gesture input
  Gesture and Speech complementary
  Speech: modal commands, quantities
  Gesture: selection, motion, qualities
  Previous work found multimodal interfaces
intuitive for 2D/3D graphics interaction
  However, few multimodal AR interfaces
Wizard of Oz Study
  What speech and gesture input
would people like to use?
  Wizard
  Perform speech recognition
  Command interpretation
  Domain
  3D object interaction/modelling
Lee, M., & Billinghurst, M. (2008, October). A Wizard of Oz study for an AR
multimodal interface. In Proceedings of the 10th international conference on
Multimodal interfaces (pp. 249-256). ACM.
System Architecture
System Set Up
Key Results
  Most commands multimodal
  Multimodal (63%), Gesture (34%), Speech (4%)
  Most spoken phrases short
  74% of phrases average 1.25 words long
  Sentences (26%) average 3 words
  Main gestures deictic (65%), metaphoric (35%)
  In multimodal commands gesture issued first
  94% of the time gesture began before speech
Free Hand Multimodal Input
  Use free hand to interact with AR content
  Recognize simple gestures
  Open hand, closed hand, pointing
Point Move Pick/Drop
Lee, M., Billinghurst, M., Baek, W., Green, R., & Woo, W. (2013). A usability study of
multimodal input in an augmented reality environment. Virtual Reality, 17(4), 293-305.
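The three recognised hand poses and the three labelled actions can be tied together in a small lookup. The sketch below is illustrative only: the pairing of pose to action (pointing to Point, open hand to Move, closed hand to Pick/Drop) is an assumption read off the slide labels, and the finger-count rule in `classify_pose` is a hypothetical stand-in for a real hand tracker, not the method used in the cited study.

```python
from enum import Enum


class Pose(Enum):
    OPEN_HAND = "open hand"
    CLOSED_HAND = "closed hand"
    POINTING = "pointing"


# Assumed pairing of pose to action; the slide lists both sets
# but does not state which pose triggers which action.
ACTION_FOR_POSE = {
    Pose.POINTING: "point",
    Pose.OPEN_HAND: "move",
    Pose.CLOSED_HAND: "pick/drop",
}


def classify_pose(extended_fingers: int) -> Pose:
    """Hypothetical rule: decide the pose from a count of extended fingers."""
    if extended_fingers <= 0:
        return Pose.CLOSED_HAND
    if extended_fingers == 1:
        return Pose.POINTING
    return Pose.OPEN_HAND


def action_for(extended_fingers: int) -> str:
    """Map a finger count to the action name for the recognised pose."""
    return ACTION_FOR_POSE[classify_pose(extended_fingers)]
```

For example, `action_for(1)` yields the pointing action, while a fist (`action_for(0)`) yields pick/drop.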
Speech Input
  MS Speech + MS SAPI (> 90% accuracy)
  Single word speech commands
Multimodal Architecture
Multimodal Fusion
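The fusion diagram from this slide is not recoverable, so the following is only a minimal sketch of one common approach, time-window fusion, and not the system's actual design. It assumes a single-word speech command is combined with the most recent gesture that started shortly before it, which matches the Wizard of Oz finding that gesture usually begins before speech. The class names, the `target` field, and the 2-second window are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GestureEvent:
    kind: str      # e.g. "pointing"
    target: str    # object the gesture refers to (hypothetical field)
    start: float   # start time in seconds


@dataclass
class SpeechEvent:
    word: str      # single-word command, e.g. "red"
    start: float   # start time in seconds


@dataclass
class Command:
    action: str
    target: str


def fuse(speech: SpeechEvent,
         gestures: List[GestureEvent],
         window: float = 2.0) -> Optional[Command]:
    """Pair a speech command with the latest gesture that began
    at most `window` seconds before the speech started."""
    candidates = [g for g in gestures
                  if 0.0 <= speech.start - g.start <= window]
    if not candidates:
        return None  # no gesture close enough; treat as speech-only input
    latest = max(candidates, key=lambda g: g.start)
    return Command(action=speech.word, target=latest.target)
```

Returning `None` when no gesture falls inside the window leaves the caller free to fall back to a unimodal interpretation.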
Hand Occlusion
Experimental Setup
Change object shape
and colour
User Evaluation
  Change object shape, colour and position
  Conditions
  (1) Speech only, (2) gesture only, (3) multimodal
  Measures
  performance time, errors, subjective survey
Results - Performance
  Average performance time
  Gesture: 15.44s
  Speech: 12.38s
  Multimodal: 11.78s
  Significant difference across conditions (p < 0.01)
  Difference between gesture and speech/MMI
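To put the reported times in relative terms, the short calculation below derives percentage reductions from the three averages on this slide. Only the three times come from the slide; the helper name is illustrative, and the slide reports significance only between gesture and the other two conditions, so the speech versus multimodal gap should be read with caution.

```python
# Average task completion times in seconds, as reported on the slide.
TIMES = {"gesture": 15.44, "speech": 12.38, "multimodal": 11.78}


def percent_faster(baseline: str, other: str) -> float:
    """Percentage reduction in average time of `other` relative to `baseline`."""
    saved = TIMES[baseline] - TIMES[other]
    return round(100.0 * saved / TIMES[baseline], 1)


if __name__ == "__main__":
    print(percent_faster("gesture", "multimodal"))  # 23.7
    print(percent_faster("gesture", "speech"))      # 19.8
    print(percent_faster("speech", "multimodal"))   # 4.8
```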
Subjective Results (Likert 1-7)
  User subjective survey
  Gesture significantly worse, MMI and Speech same
  MMI perceived as most efficient
  Preference
  70% MMI, 25% speech only, 5% gesture only
                 Gesture  Speech  MMI
Naturalness      4.60     5.60    5.80
Ease of Use      4.00     5.90    6.00
Efficiency       4.45     5.15    6.05
Physical Effort  4.75     3.15    3.85
Observations
  Significant difference in number of commands
  Gesture (6.14), Speech (5.23), MMI (4.93)
  MMI Simultaneous vs. Sequential commands
  79% sequential, 21% simultaneous
  Reaction to system errors
  Almost always repeated same command
  In MMI, users rarely changed modalities
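One way to label a multimodal command as sequential or simultaneous is to test whether the gesture and speech time intervals overlap. The slide does not define the two terms, so the overlap rule below is an assumption, and the function names are illustrative.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def timing(gesture: Interval, speech: Interval) -> str:
    """Label a command 'simultaneous' if the two intervals overlap,
    otherwise 'sequential' (assumed definition)."""
    g_start, g_end = gesture
    s_start, s_end = speech
    overlaps = g_start < s_end and s_start < g_end
    return "simultaneous" if overlaps else "sequential"


def share_sequential(commands: List[Tuple[Interval, Interval]]) -> float:
    """Fraction of commands labelled sequential."""
    labels = [timing(g, s) for g, s in commands]
    return labels.count("sequential") / len(labels)
```

Applied to logged interval pairs, `share_sequential` gives the kind of proportion reported in the sequential versus simultaneous split.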
Lessons Learned
  Multimodal interaction significantly better than
gesture alone in AR interfaces for 3D tasks
  Shorter task time, more efficient
  Multimodal input was more natural, easier,
and more effective than gesture/speech only
  Simultaneous input rarely used
  More studies need to be conducted
  What gesture/speech patterns? Richer input
3D Gesture Tracking
  3 Gear Systems
  Kinect/Primesense Sensor
  Two hand tracking
  http://www.threegear.com
Skeleton Interaction + AR
  HMD AR View
  Viewpoint tracking
  Two hand input
  Skeleton interaction, occlusion
AR Rift Display
Conclusions
  AR experiences need new interaction methods
  Combined speech and gesture more powerful
  Complementary input modalities
  Natural user interfaces possible
  Free hand gesture, speech, intelligent interfaces
  Important research directions for the future
  What gesture/speech commands should be used?
  Relationship between speech and gesture?
More Information
•  Mark Billinghurst
–  Email: mark.billinghurst@hitlabnz.org
–  Twitter: @marknb00
•  Website
–  http://www.hitlabnz.org/
