User experience research has traditionally relied upon qualitative techniques that entail users telling us their feelings, wants, and needs. This creates an inherent cognitive bias – data is filtered through the participant’s cognition. That is, we may not necessarily be hearing the participants’ true feelings. They may be trying to please the moderator or may just be unable to articulate the cause of their emotions. But researchers and stakeholders alike are thirsty for quantitative data that complements the qualitative. Luckily, we live in exciting times – there are two particular technologies that are becoming more accessible that will help usability researchers break through cognitive bias and provide that ever tantalizing quantitative data: eye tracking and biometrics. Eye tracking equipment has only recently started to become affordable to most anyone who wants to use it. Researchers must now get up-to-speed on eye tracking methodology and analysis. When is it appropriate? How can we turn the data into actionable findings? What the heck do I do with all of this new data?! More importantly, we should find new research techniques that will break through cognitive bias.
This is where the second technology comes in: biometrics. Psychophysiology is the study of how emotions produce changes in the body. Changes in heart rate, breathing rate, heart rate variability, and galvanic skin response (GSR), among other measures, have all been shown to be accurate indicators of a person’s emotions. Just as with eye tracking, the equipment to measure these biometrics is just now becoming accessible to usability researchers. Until very recently, the equipment to gather this data was rather obtrusive and invasive. This not only affected participant comfort, but also did not lend itself to conducting “discount” usability research. But new technology allows the collection of biometrics in non-invasive ways. For instance, Affectiva’s Q Sensor is worn on the wrist and wirelessly gathers a participant’s GSR. The problem with integrating psychophysiological data into usability research is that individual researchers need to come up with not only the algorithms to interpret the biometrics but also the technology to temporally marry the biometrics to the eye tracking data. These are no small tasks. There are companies out there that will collect and interpret the data for you for a hefty fee. But this technique should be in every usability researcher’s toolkit. As such, we should come together as a research community to figure this out. We need an open dialogue. We need to share techniques and stories.
Beyond Eye Tracking: Bringing Biometrics to Usability Research
1. Prepared by:
Daniel Berlin – Experience Research Director
May 29, 2013
UXPA Boston 2013 Conference
Psychophysiology
and Eye Tracking
NEW AND OLD TECHNOLOGIES THAT
CAN COMPLEMENT USABILITY
RESEARCH
2. Today’s Presentation
• History of Eye Tracking and Psychophysiology
• Traditional and modern Eye Tracking metrics and methodologies
• Eye Tracking as data for HCI optimization, not as an input device
• Available eye tracking equipment
• The need to evolve neuromarketing
• Psychophysiology in user experience
3. Hi! I’m Dan Berlin
• BA in psychology from Brandeis University
• Studies focused on visual space perception
• Seven years in technical support
• Sat as a participant for a usability study for a product I was working on
• Realized that user experience (UX) work is the perfect combination of computers and psychology
• Went to Bentley U. to earn an MBA and MS in Human Factors in Information Design
• Two years at an interactive agency performing usability and neuromarketing
research
• Then did some freelance UX consulting for about a year
• Two years as an Experience Research Director in Mad*Pow’s Boston office
4. What Will NOT Be Covered In This Presentation
• The validity of current eye tracking metrics and methodologies
• What Eye Tracking and Psychophysiology have already taught us about
human behavior
• Psychophysiological traces other than skin conductance:
• Heart rate variability
• Heart rate
• Breathing rate
• Skin temperature
• Neurological signals
5. Why This is an Important Topic
• UX researchers should be collecting objective data
• Lack of “discount” quantitative measures to complement our typically qualitative
methods
• Eye tracking metrics provide objective data based on participant
behavior
• But current methods are only the beginning
• Pairing eye tracking with psychophysiology is the next logical step
• New technology is bridging the gap to discount usability testing
6. Yes, I’m Saying “Discount” Research
• GASP!
• The “golden triad” of user experience dictates
that we must collect useful, actionable data
ON THE CHEAP
Golden triad: Technology, User Needs, Business Goals
• We typically don’t achieve statistically
significant results – it’s not worth the cost
• We’re not going to bring in 12 participants for
every Agile sprint
8. What is Eye Tracking?
• Observing and recording eye movements as a study participant traverses
a website or application
• Allows us to gain deeper insight into how users perform usability tasks
• Allows UX researchers to collect objective behavioral data
• Terminology
• Fixation – when a user stops to look at something for roughly 100ms or more
• Saccade – the path between fixations (searching)
• Scanpath – a set of fixations and saccades that indicate a trajectory
• “Modern” eye tracking began with Goldberg & Kotval (1999)
• Developed eye tracking metrics for on-screen tasks
• Eye tracking is NOT: pupil dilation, blink-rate, or facial recognition
Tobii 1750
9. History of Eye Tracking
• Has roots in reading research and is over 100(!) years old:
• Electrodes placed around the eye
• Various types of contact lenses
• Cameras mounted in plane cockpits
• Big, heavy helmets
• Became more “mainstream” in the 1950s with FAA studies done on pilots for cockpit design
• Modern Eye Tracking equipment is much less invasive
• They typically bounce infrared light off the retina to determine eye position
Yesterday
Today
10. Typical Eye Tracking Data Visualizations (the eye candy)
• Heat Map – # of fixations for all participants
• Gaze Plot – Order of fixations for one participant
11. Basic Eye Tracking Methodology
• Break the page up into separate “areas
of interest” or AOIs
• Compare the fixation data between
important areas and less important
ones
• Or compare data between designs
• You will always need things to compare
• Eye tracking data does not tell much of a
story without a comparison
• There are no absolute standards for eye
tracking metrics – human behavior
differs!
Areas of Interest
12. Basic Eye Tracking Interpretation
• Number of fixations
• Is there a searching pattern?
• Are fixations close together?
• Are users reading the content?
• Fixation duration
• Are users spending a long time
looking at a single link?
• Are they particularly engaged with
one of the design/content
elements?
• Time to 1st Fixation
• How long did it take for users to
look at a call to action?
Order of Gazes (gaze plot, fixations numbered 1–8)
# of fixations per Area of Interest (Areas 1–4), Design 1 vs. Design 2 (bar chart)
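The three basic measures above (number of fixations, fixation duration, and time to 1st fixation) can be sketched in a few lines of Python. This is only an illustration, assuming fixations have already been exported from the eye tracker as timestamped screen coordinates and that each AOI is a simple rectangle. The names Fixation, AOI, and aoi_metrics are hypothetical and do not come from any vendor's software.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    start_ms: float     # onset, relative to when the stimulus appeared
    duration_ms: float
    x: float            # screen coordinates in pixels
    y: float


@dataclass
class AOI:
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fx: Fixation) -> bool:
        # Assumption: AOIs are axis-aligned rectangles.
        return self.left <= fx.x <= self.right and self.top <= fx.y <= self.bottom


def aoi_metrics(fixations, aois):
    """Per AOI: fixation count, summed duration, and time to first fixation."""
    results = {}
    for aoi in aois:
        hits = [f for f in fixations if aoi.contains(f)]
        results[aoi.name] = {
            "fixation_count": len(hits),
            "total_duration_ms": sum(f.duration_ms for f in hits),
            # None means the participant never looked at this AOI.
            "time_to_first_fixation_ms": min((f.start_ms for f in hits), default=None),
        }
    return results


# Made-up data for one participant on one design.
fixations = [
    Fixation(0, 220, 100, 50),
    Fixation(300, 180, 400, 300),
    Fixation(550, 400, 420, 310),
    Fixation(1000, 150, 120, 60),
]
aois = [AOI("nav", 0, 0, 800, 100), AOI("call_to_action", 350, 250, 500, 350)]
metrics = aoi_metrics(fixations, aois)
```

Running the same function on a second design gives the comparison that the methodology calls for; the numbers mean little on their own.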
13. Eye Tracking Metrics (Fixations)
• Poole & Ball (2005) provide a great summary of Eye Tracking metrics, re-summarized here:
Description – What it Measures
• Overall # of fixations – Increased overall fixations indicate less efficient search
• Fixations per AOI – Increased fixations indicate increased noticeability or importance
• Fixations per AOI, adjusted for text length – For text-based AOIs, divide by the number of words
• Overall fixation duration – Increased fixation duration indicates confusion or engagement
• Gaze, dwell, or fixation cluster/cycle (sum of fixation durations within an AOI) – Used to compare attention between AOIs and to measure anticipation
• Fixation spatial density – Small fixation area indicates efficient searching
• Repeat fixations or post-target fixations – Increased off-target fixations after initial target fixation indicate low meaningfulness or visibility
• Time to first fixation on-target – Faster time to first fixation on-target indicates increased noticeability
• Percentage of participants fixating an area of interest – Higher percentages indicate increased noticeability
• On-target (all target fixations) (on-target fixations / total # of fixations) – Lower ratio indicates lower search efficiency
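Several of the ratio-style metrics in this summary are plain arithmetic once each fixation has been labeled with the AOI it landed in. The sketch below is a hedged illustration rather than a standard implementation: the function names and the sample data are invented, and it assumes each participant's fixations are available as an ordered list of AOI labels.

```python
def on_target_ratio(aoi_labels, target):
    """On-target fixations divided by total fixations for one participant."""
    if not aoi_labels:
        return 0.0
    return sum(1 for label in aoi_labels if label == target) / len(aoi_labels)


def percent_fixating(participants, target):
    """Percentage of participants with at least one fixation on the target AOI."""
    if not participants:
        return 0.0
    hit = sum(1 for labels in participants.values() if target in labels)
    return 100.0 * hit / len(participants)


def fixations_per_word(fixation_count, word_count):
    """Fixations on a text-based AOI, adjusted for the length of the text."""
    return fixation_count / word_count


# Made-up data: the AOI each fixation landed in, in order, per participant.
participants = {
    "P1": ["nav", "hero", "cta", "cta"],
    "P2": ["nav", "nav", "hero", "footer"],
    "P3": ["cta", "hero"],
    "P4": ["hero", "cta", "nav", "cta"],
}

p1_ratio = on_target_ratio(participants["P1"], "cta")   # 2 of 4 fixations
noticed = percent_fixating(participants, "cta")          # 3 of 4 participants
density = fixations_per_word(12, 48)                     # 12 fixations on a 48-word AOI
```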
14. Eye Tracking Metrics (Saccades and Scanpaths)
• Poole & Ball (2005) provide a great summary of Eye Tracking metrics, re-summarized here:
Description – What it Measures
• Overall # of saccades – Increased saccades indicate more searching
• Saccade amplitude – Larger saccades indicate meaningful cues – attention is drawn from a distance
• Regressive saccades – Indicate less meaningful cues
• Marked directional shifts – Saccades greater than 90 degrees may indicate a change in user goals or a breaking of user expectations
• Scanpath duration – Increased time indicates more searching
• Scanpath length – Increased length indicates more searching
• Spatial density – Smaller density indicates directed searching
• Fixation/saccade ratio – Higher ratio indicates less searching (more processing)
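The scanpath measures can be approximated from the ordered fixation coordinates alone. The following is a minimal sketch under stated assumptions: saccades are treated as straight lines between consecutive fixation points, a "marked directional shift" is read as a change of more than 90 degrees between consecutive saccades, and all function names are hypothetical.

```python
import math


def scanpath_length(points):
    """Sum of straight-line distances between consecutive fixation points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def marked_directional_shifts(points, threshold_deg=90.0):
    """Count consecutive saccade pairs whose direction changes by more than the threshold."""
    count = 0
    for a, b, c in zip(points, points[1:], points[2:]):
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - b[0], c[1] - b[1])
        n1, n2 = math.hypot(*v1), math.hypot(*v2)
        if n1 == 0 or n2 == 0:
            continue  # repeated point, so there is no direction to compare
        cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
        # Clamp to guard against floating-point values just outside [-1, 1].
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
        if angle > threshold_deg:
            count += 1
    return count


def fixation_saccade_ratio(fixation_durations_ms, saccade_durations_ms):
    """Time spent processing (fixating) relative to time spent searching (saccades)."""
    return sum(fixation_durations_ms) / sum(saccade_durations_ms)


# Made-up scanpath: two saccades in the same direction, then a sharp reversal.
points = [(0, 0), (3, 4), (6, 8), (6, 4)]
length = scanpath_length(points)
shifts = marked_directional_shifts(points)
ratio = fixation_saccade_ratio([200, 300, 250], [30, 40, 30])
```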
15. Eye Tracking Research
• Bojko (2006) shows how a combination of eye tracking and click data
can highlight differences in search behavior
• Increased time on task for the “old” website was caused by an increased
number of fixations before an on-target click
• Scanpaths showed that targets were more noticeable in the “new” design
(clicked upon 1st fixation)
• Some have looked into correlating eye-movement patterns with usability
problems (Ehmke & Wilson, 2007)
• Multiple, quick fixations may indicate missing information
• Promising patterns, but nothing concrete – more research is needed
• Journey mapping with head-mounted eye tracker (Alves, et al, 2012)
• “Real-world” tasks and scenarios
16. Using Eye Tracking in a Usability Study
• Use a within-subjects study design (all participants see all stimuli)
• People have different viewing behavior and the data needs to be comparable
• Expose participants to the stimuli in the course of performing a task
• Keeps the data relevant and contextual
• Think-aloud protocol may be distracting for the participant
• Some research has been done into “Retrospective Think-Aloud” (RTA)
• Studies that make use of Eye Tracking have special recruiting needs
• Over-recruit – you won’t be able to use the data from every participant
• Screen-out respondents with cornea or retina damage/disease
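When every participant sees every stimulus, the order of presentation can itself influence viewing behavior. One common way to spread that effect evenly is to rotate the order across participants. The slides do not prescribe this step, so treat the sketch below as an assumption about study setup; the function names and design labels are made up.

```python
def rotated_orders(stimuli):
    """Build a simple Latin square: each stimulus appears once in every position."""
    n = len(stimuli)
    return [[stimuli[(row + col) % n] for col in range(n)] for row in range(n)]


def order_for_participant(stimuli, participant_index):
    """Pick a presentation order by cycling through the rotated orders."""
    orders = rotated_orders(stimuli)
    return orders[participant_index % len(orders)]


designs = ["Design A", "Design B", "Design C"]
orders = rotated_orders(designs)
fifth_participant = order_for_participant(designs, 4)  # zero-based index 4
```

A plain rotation balances position but not which design follows which, so it is a starting point rather than full counterbalancing.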
17. Eye Tracking Equipment
Tobii T60/120
Tobii Glasses
SMI RED
SMI Glasses
• Tobii and SMI are the major
players
• Both offer:
• Remote (monitor based)
• Head-mounted (glasses)
• Flexible (use your own monitor/laptop)
• There are other, cheaper options
• But you get what you pay for
18. Going Beyond Eye Tracking Metrics
• Eye tracking metrics are just the tip of the iceberg
• We need to take a step back and remember what eye tracking does best:
It tells us where participants are looking at any given time
• So what other temporal, objective data can we use in conjunction with
eye tracking?
20. What is Psychophysiology?
• In the late 1800s, it was discovered that electrodermal activity (EDA)
changes based on a person’s feelings (Vigouroux, 1888)
• That is, the skin’s electrical conductance (or resistance) changes with positive or
negative arousal
• This allows us to observe a person’s psychological reaction without asking any
questions
• Galvanic skin response (GSR) is the typical metric used to measure EDA
• GSR measures the electrical conductivity of the skin
• Sweat glands are controlled by the sympathetic nervous system and you sweat when
aroused
• More sweat = more skin conductivity
• Psychophysiology is the process of analyzing physiological metrics to
determine a person’s psychological state
21. What is Psychophysiology?
• Other physiological traces can tell us what is happening in the mind, but
are beyond the scope of today’s presentation (Dirican & Göktürk, 2011):
Trace – Use
• Event Related Brain Potentials (ERP) – Mental workload
• Electroencephalography (EEG) – Task engagement and cognitive processes
• Heart Rate (HR) & Heart Rate Variability (HRV) – Arousal, mental workload, and valence
• Blood Pressure (BP) – Stress
• Electromyogram (EMG) – Motor preparation and emotional valence
• Respiration – Task demands and arousal
22. Wait, isn’t that Neuromarketing?
• Neuromarketing is a newer field whereby companies (typically) use
EEG/EMG data in marketing studies
fMRI – Blood oxygenation
EEG/EMG – Brain waves
23. But neuromarketing is NOT helping the UX community
• “Discount” usability testing dictates that we should be
able to run 12-16 participants in 3-4 days
• fMRI is expensive
• EEG is time consuming and commodity equipment is
unreliable
• Emotiv headset has potential, but is not ready for our world
quite yet
• Neuromarketing companies rely on their “special sauce”
algorithm, which is not shared with the research
community
24. Bringing Psychophysiology to UX
• So let’s do it ourselves!
• Biophysical signals can indicate usability problems
• Ward & Marsden (2002) built a “good” and “bad” interface and compared
subjects’ biometrics
• They found that the “bad” interface caused higher skin conductivity, lower blood volume,
and increased pulse rate
• Lin and Hu (2005) had subjects play a game and do increasingly frustrating tasks
– with similar results
• Understanding participants’ biometrics gives us insight into trends
• Stickel (2009) found that participants who did not do well on tasks maintained
high stress levels and continued to perform poorly on subsequent tasks
25. Bringing Psychophysiology to UX
• There are, of course, some caveats
• We want to mimic real-world experiences during a usability study
• A person sitting at a computer with wires protruding from various body parts
isn’t exactly real-world
• Participant comfort is paramount
• Think-aloud vs. Retrospective Think-aloud
• Employing psychophysiological methods during a usability study has the same problem as with eye
tracking: a talking participant is a distracted participant
• We want to minimize cost (time and money)
26. Bringing Psychophysiology to UX
• Focus the conversation on GSR
• Less invasive to measure
• Less subject to noise
• Fast response time to view event related changes
• Can run multiple sessions per day with minimal incremental cost
• Process is still tricky, but is promising
• GSR is one of the most promising biometric measures of arousal
(Henriques, et al, 2012)
• Though, there is the problem of valence: did the participant experience positive or negative
arousal?
• This can probably be alleviated by simply looking at what the user was doing – determine the
context of the GSR spike
• Heart Rate Variability has also been shown to measure emotional valence
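To make "GSR spike" concrete, here is one very simple way to flag event-related rises in a skin conductance trace. It is a sketch only, not a validated algorithm: it compares each sample against the mean of the preceding few samples, and the window size and rise threshold are placeholder values that would need tuning for a given sensor and sampling rate. The function name is hypothetical.

```python
def detect_gsr_spikes(samples, window=5, rise_threshold=0.05):
    """Return onset times of rises in skin conductance.

    samples: list of (time_s, conductance_in_microsiemens), in time order.
    A sample is flagged when it exceeds the mean of the preceding `window`
    samples by more than `rise_threshold`. Consecutive flagged samples are
    merged so that each rise is reported once, at its onset.
    """
    spikes = []
    in_spike = False
    for i in range(window, len(samples)):
        baseline = sum(value for _, value in samples[i - window:i]) / window
        time_s, value = samples[i]
        if value - baseline > rise_threshold:
            if not in_spike:
                spikes.append(time_s)
                in_spike = True
        else:
            in_spike = False
    return spikes


# Made-up trace sampled once per second, with one rise starting at t = 6 s.
trace = [2.00, 2.00, 2.01, 2.00, 2.01, 2.00, 2.20, 2.35, 2.30, 2.10, 2.05, 2.03]
samples = list(enumerate(trace))
spike_onsets = detect_gsr_spikes(samples)
```

The detector only says that arousal rose. Whether it was positive or negative still has to come from the context of what the participant was doing at the time.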
27. Available GSR Capture Equipment
• Affectiva Q Sensor
• Had great promise, but is going end of life in 2014
• Thought Technology ProComp Infiniti
• A workhorse for physiological data capture in academia
• NeuLog
• Seems like a promising alternative to ProComp
• For now, I think we’re stuck with a wired sensor
Affectiva Q Sensor
NeuLog GSR Logger
TT Procomp Infiniti
28. What Can We Expect From this Effort?
• We can expect to break through the participants’ cognitive bias that is
inherent in traditional usability studies
• Ever have a participant struggle through a task and rate it as easy?
• We can expect to get objective, quantitative data to which stakeholders
can more easily relate
• Explaining that people sweat when aroused is easier than explaining scanpaths
• We can expect to have a better understanding of what our participants
are feeling
• If a design is causing participants undue stress, it would be best if we knew
about it
29. In Conclusion
• Embrace “traditional” eye tracking
• Marry GSR and eye tracking data
• This is a VERY manual process right now – more tools are needed
• Scorn “secret sauce” – share your techniques and findings (both good
and bad) with the UX community
• This may be the quantitative measure for which we’ve been waiting!
• Join the conversation! Search for the “Psychophysiology in Usability”
group on LinkedIn
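As a starting point for marrying the two data streams, the sketch below looks up what a participant was fixating shortly before each GSR spike. It rests on several assumptions that are not from the slides: both recordings share one clock in milliseconds, fixations are already labeled with an AOI, and the skin response trails the triggering event by a fixed lag (2000 ms here is a placeholder to tune, not a recommended value). The function names are hypothetical.

```python
from bisect import bisect_right


def fixation_at(fixations, t_ms):
    """Return the most recent fixation that started at or before t_ms, or None.

    fixations: list of (start_ms, duration_ms, aoi_name), sorted by start_ms.
    """
    starts = [f[0] for f in fixations]
    i = bisect_right(starts, t_ms) - 1
    return fixations[i] if i >= 0 else None


def label_spikes(spike_times_ms, fixations, lag_ms=2000):
    """Pair each GSR spike with the AOI being viewed lag_ms earlier."""
    labeled = []
    for spike in spike_times_ms:
        fx = fixation_at(fixations, spike - lag_ms)
        labeled.append((spike, fx[2] if fx else None))
    return labeled


# Made-up session data on a shared clock.
fixations = [
    (0, 300, "nav"),
    (400, 250, "search_box"),
    (2500, 600, "error_message"),
    (3200, 200, "cta"),
]
labeled = label_spikes([4600, 1500], fixations)
```

A spike that maps to None fell before the first fixation once the lag is applied, which is a cue to check clock synchronization or the lag value.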
30. References
• Alves, R., Lim, V., Niforatos, E., Chen, M., Karapanos, E., & Nunes, NJ. (2012) Augmenting
Customer Journey Maps with quantitative empirical data: a case on EEG and eye tracking.
Retrieved from: http://arxiv.org/abs/1209.3155
• Bojko, A. (2006) Using Eye Tracking to Compare Web Page Designs: A Case Study. Journal of
Usability Studies, 1(3). Retrieved from:
http://www.upassoc.org/upa_publications/jus/2006_may/bojko_eye_tracking.html
• Dirican, AC., & Göktürk, M. (2011) Psychophysiological Measures of Human Cognitive States
Applied in Human Computer Interaction. Procedia Computer Science, 3, 1361-1367.
• Ehmke, C. & Wilson, S. (2007) Identifying Web Usability Problems from Eye-Tracking Data.
Proceedings of HCI 2007. Retrieved from:
http://www.bcs.org/upload/pdf/ewic_hc07_lppaper12.pdf
• Goldberg, J. & Kotval, X. (1999) Computer interface evaluation using eye movements: methods
and constructs. International Journal of Industrial Ergonomics, 24, 631-645.
• Henriques, R., Paiva, A., & Antunes, C. (2012) On the need of new methods to mine
electrodermal activity in emotion-centered studies. Retrieved from:
http://web.ist.utl.pt/claudia.antunes/artigos/henriques2012admi.aamas.pdf
• Lin, T. & Hu, W. (2005) Do Physiological Data Relate to Traditional Usability Indexes? Proceedings
of OZCHI 2005, Canberra, Australia.
• Poole, A. & Ball, L. (2005) Eye tracking in human-computer interaction and usability research. In
C. Ghaoui (ed.), Encyclopedia of human computer interaction. Idea Group, Pennsylvania, 211-219.
Retrieved from:
http://www.alexpoole.info/blog/wp-content/uploads/2010/02/PooleBall-EyeTracking.pdf
• Russell, Mark. (2005) Using Eye-Tracking Data to Understand First Impressions of a Website. In
B. Chaparro (ed.), Usability News. February 2005, 7(1). Wichita State University. Retrieved from:
http://psychology.wichita.edu/surl/usabilitynews/71/eye_tracking.asp
• Stickel, C., Ebner, M., Steinbach-Nordmann, S., Searle, G., & Holzinger, A. (2009) Emotion
Detection: Application of the Valence Arousal Space for Rapid Biological Usability Testing to
enhance Universal Access. HCII Conference San Diego, Springer Lecture Notes in Computer
Science. Retrieved from:
http://elearningblog.tugraz.at/scms/data/alt/publication/09_hci_emotion.pdf
• Vigouroux, R. (1888) The electrical resistance considered as a clinical sign. Progres Medicale, 3,
87-89.
• Ward, R., Marsden, P., Cahill, B., & Johnson, C. (2002) Physiological Responses to Well-Designed
and Poorly-Designed Interfaces. Proceedings of CHI 2002 Workshop on Physiological Computing.
Minneapolis, MN. Retrieved from:
http://physiologicalcomputing.net/chi2002/chi_papers/ward_physiological_responses_to_well_designed_and_poorly_designed_interfaces.pdf
32. Thank you! Any questions?
Dan Berlin
Experience Research Director, Mad*Pow
dberlin@madpow.net
@banderlin
Editor's Notes
Fixation = encode focal information, see periphery, plan next move
G&T = # of fixations, fixation duration, and fixation/saccade ratio
Eye-tracking allows us to see the unconscious decision-making process
FAA studies = In-dash cameras
Researchers don’t examine heat maps, we examine numbers
Heat maps are eye candy that only frame the story that the data tells
Gaze plots may be individually examined
RTA does not remove cognitive bias
TA removes the ability to do time on task
Screened out: retinal & corneal damage, eye cancer & tumors, macular degeneration, cataracts, conjunctivitis, and nystagmus
Okay to accept: amblyopia, glaucoma, and strabismus
Use in lie detection
Sympathetic system controls the sweat glands – the more you sweat, the more conductivity
Yes, businesses have the right to make money from intellectual property
But this inhibits bringing the technology to other fields that could benefit