Talk given by Mark Billinghurst at the AWE 2014 conference on May 28th, 2014, on how to use multimodal speech and gesture interaction in Augmented Reality applications.
3. To Make the Vision Real...
Hardware/software requirements
Contact lens displays
Free space hand/body tracking
Speech/gesture recognition
Etc.
Most importantly
Usability/User Experience
4. Natural Hand Interaction
Using bare hands to interact with AR content
MS Kinect depth sensing
Real time hand tracking
Physics based simulation model
5. Pros and Cons of Gesture Only Input
Gesture-only good for
Direct manipulation, selection, motion
Rapid expressiveness
Limitations
Descriptions (e.g. temporal information)
Operation on large numbers of objects
Indirect manipulation, delayed actions
6. Multimodal Interaction
Combined speech and gesture input
Gesture and speech are complementary
Speech: modal commands, quantities
Gesture: selection, motion, qualities
Previous work found multimodal interfaces intuitive for 2D/3D graphics interaction
However, few multimodal AR interfaces exist
7. Wizard of Oz Study
What speech and gesture input would people like to use?
Wizard
Perform speech recognition
Command interpretation
Domain
3D object interaction/modelling
Lee, M., & Billinghurst, M. (2008, October). A Wizard of Oz study for an AR multimodal interface. In Proceedings of the 10th International Conference on Multimodal Interfaces (pp. 249-256). ACM.
10. Key Results
Most commands multimodal
Multimodal (63%), Gesture (34%), Speech (4%)
Most spoken phrases short
Phrases (74%) average 1.25 words
Sentences (26%) average 3 words
Main gestures deictic (65%), metaphoric (35%)
In multimodal commands, gesture issued first
94% of the time, gesture began before speech
11. Free Hand Multimodal Input
Use free hand to interact with AR content
Recognize simple gestures
Open hand, closed hand, pointing
Point Move Pick/Drop
Lee, M., Billinghurst, M., Baek, W., Green, R., & Woo, W. (2013). A usability study of multimodal input in an augmented reality environment. Virtual Reality, 17(4), 293-305.
12. Speech Input
MS Speech + MS SAPI (> 90% accuracy)
Single word speech commands
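As a minimal sketch of how the gesture and speech inputs on the last two slides could be combined, the Python below pairs a pointing gesture with a single-word speech command when the word arrives shortly after the gesture starts, and otherwise falls back to gesture-only commands. The event classes, the vocabulary, the gesture-to-action mapping, and the 2-second window are all illustrative assumptions, not the system described in the talk.

```python
from dataclasses import dataclass
from typing import Optional

# All names and values here are hypothetical, chosen for illustration only.
FUSION_WINDOW_S = 2.0  # assumed maximum gap between gesture start and speech

@dataclass
class GestureEvent:
    kind: str        # "point", "open_hand", or "closed_hand"
    target: str      # id of the AR object the hand is pointing at or holding
    start_s: float   # time the gesture began, in seconds

@dataclass
class SpeechEvent:
    word: str        # single recognized word, e.g. "red" or "cube"
    start_s: float   # time the word began, in seconds

@dataclass
class Command:
    action: str
    target: str
    value: Optional[str] = None

# Assumed vocabulary: each spoken word maps to the property it changes.
SPEECH_VOCAB = {"red": "colour", "blue": "colour", "cube": "shape", "sphere": "shape"}

def fuse(gesture: GestureEvent, speech: Optional[SpeechEvent]) -> Optional[Command]:
    """Combine a gesture with an optional speech word into one command."""
    if speech is not None and speech.word in SPEECH_VOCAB:
        gap = speech.start_s - gesture.start_s
        # Gesture-first ordering: accept speech that closely follows the gesture.
        if gesture.kind == "point" and 0.0 <= gap <= FUSION_WINDOW_S:
            return Command("set_" + SPEECH_VOCAB[speech.word], gesture.target, speech.word)
    # Fall back to gesture-only commands (assumed mapping).
    if gesture.kind == "closed_hand":
        return Command("pick", gesture.target)
    if gesture.kind == "open_hand":
        return Command("drop", gesture.target)
    return None
```

For example, `fuse(GestureEvent("point", "obj1", 0.0), SpeechEvent("red", 0.6))` returns a command that sets the colour of `obj1` to red, while the same word spoken 5 seconds later is ignored.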
17. User Evaluation
Change object shape, colour and position
Conditions
(1) Speech only, (2) gesture only, (3) multimodal
Measures
performance time, errors, subjective survey
18. Results - Performance
Average performance time
Gesture: 15.44s
Speech: 12.38s
Multimodal: 11.78s
Significant difference across conditions (p < 0.01)
Difference between gesture and speech/MMI
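To put the mean times above in relative terms, the short Python sketch below computes how much less time speech and multimodal input took compared with gesture-only input. It uses only the three means reported on this slide; the function and variable names are hypothetical.

```python
# Mean task times in seconds, taken from the slide above.
mean_time_s = {"gesture": 15.44, "speech": 12.38, "multimodal": 11.78}

def reduction_vs(baseline: str, condition: str) -> float:
    """Percentage reduction in mean time of `condition` relative to `baseline`."""
    base = mean_time_s[baseline]
    return 100.0 * (base - mean_time_s[condition]) / base

for condition in ("speech", "multimodal"):
    pct = reduction_vs("gesture", condition)
    print(f"{condition}: {pct:.1f}% less time than gesture")
```

This gives roughly a 20% reduction for speech and a 24% reduction for multimodal input relative to gesture alone.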
19. Subjective Results (Likert 1-7)
User subjective survey
Gesture significantly worse, MMI and Speech same
MMI perceived as most efficient
Preference
70% MMI, 25% speech only, 5% gesture only
Measure          Gesture  Speech  MMI
Naturalness      4.60     5.60    5.80
Ease of Use      4.00     5.90    6.00
Efficiency       4.45     5.15    6.05
Physical Effort  4.75     3.15    3.85
20. Observations
Significant difference in number of commands
Gesture (6.14), Speech (5.23), MMI (4.93)
MMI Simultaneous vs. Sequential commands
79% sequential, 21% simultaneous
Reaction to system errors
Almost always repeated the same command
In MMI, users rarely changed modalities
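One way the sequential versus simultaneous split could be measured from logged input is sketched below in Python. It assumes that a command counts as simultaneous when the gesture and speech intervals overlap in time; the talk does not state the criterion actually used, and the function name and interval format are hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start_s, end_s) of one input, in seconds

def classify(gesture: Interval, speech: Interval) -> str:
    """Label a multimodal command by whether its two inputs overlap in time."""
    g_start, g_end = gesture
    s_start, s_end = speech
    # Two intervals overlap when each one starts before the other ends.
    overlaps = g_start < s_end and s_start < g_end
    return "simultaneous" if overlaps else "sequential"
```

Under this assumption, a gesture spanning 0.0 to 1.0 s with speech from 0.5 to 1.5 s is simultaneous, and the same gesture with speech from 1.2 to 1.8 s is sequential.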
21. Lessons Learned
Multimodal interaction significantly better than gesture alone in AR interfaces for 3D tasks
Shorter task time, more efficient
Multimodal input was more natural, easier, and more effective than gesture-only or speech-only input
Simultaneous input rarely used
More studies need to be conducted
What gesture/speech patterns? Richer input
22. 3D Gesture Tracking
3 Gear Systems
Kinect/Primesense Sensor
Two hand tracking
http://www.threegear.com
23. Skeleton Interaction + AR
HMD AR View
Viewpoint tracking
Two hand input
Skeleton interaction, occlusion
27. Conclusions
AR experiences need new interaction methods
Combined speech and gesture more powerful
Complementary input modalities
Natural user interfaces possible
Free hand gesture, speech, intelligent interfaces
Important research directions for the future
What gesture/speech commands should be used?
Relationship between speech and gesture?
28. More Information
• Mark Billinghurst
– Email: mark.billinghurst@hitlabnz.org
– Twitter: @marknb00
• Website
– http://www.hitlabnz.org/