Variation in speech tempo: Capt. Kirk, Mr. Spock, and all of us in between

Tyler Schnoebelen
Stanford University

Goals
• The social meanings of tempo
– Who uses it, how’s it understood?
• Indexical field construction
– New approaches to structuring indexical fields
• Between fast speech and slow speech: burstiness
– A way to measure William Shatner (and everybody else)

Social meaning
• Speech rates are not stable by demographic
category
• They vary all over the place
• Conveying and creating identities and
attitudes

Fast tempo in music
[Figure: "Fast tempo" linked to Happiness, Pleasantness, Activity, Surprise, (Potency), (Anger), (Fear)]

Psychology, phonetics, musicology
Juslin and Laukka (2003)—104 studies of speech, 41 studies of music

Indexical fields
• Variables aren’t fixed but are located in a
“constellation of ideologically related
meanings” (Eckert 2008)

3 steps to an indexical field
1. Statistically significant correlations with fast
speech rate in psychology and linguistics
literature
2. Corpora (“talk fast” and “are fast talkers”)
– Corpus of Contemporary American English
– Bing! search results
3. Survey
– 50 participants across the US via Mechanical Turk

The structure of indexical fields
• It should be possible to relate items within the field
• And this should allow us to understand constraints
• And how different meanings come to attach to
different variables
• My assumptions:
– Indexical fields expand and contract over time
– New meanings rely upon what’s already there
– The “no teleportation” hypothesis
– In principle, sadness and fast talk could come to be related
but the path is unlikely

Clustering
• Take the indexical field (41 items)
• Also add “fast-talkers” and “slow-talkers”
• Ask for pair-wise judgments of how much overlap
there is between each pair
– i.e., “New Yorkers” overlap with “Northerners”, but “teachers” don’t overlap with “con-men/hustlers”
• 20 judgments for each pair (840 pairs)
• 245 Americans surveyed via Amazon Mechanical Turk
• Hierarchical clustering based on correlation patterns
(but non-hierarchical methods give similar results)
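
A minimal sketch of the clustering step above, assuming each item is represented by its row of overlap scores and that "correlation patterns" means distance = 1 - Pearson correlation between rows. The average-linkage rule, the function names, and the toy scores are all assumptions for illustration; the slides do not say which linkage or software settings were used.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length score profiles (non-constant)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def cluster(profiles, n_clusters):
    """Agglomerative clustering (average linkage, an assumed choice)
    using 1 - correlation as the distance between items."""
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }
    clusters = [(name,) for name in names]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    # Repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
    return clusters


# Invented overlap profiles (NOT the survey data), one row per item.
profiles = {
    "New Yorkers": [2.0, 1.5, -0.5, -1.0],
    "Northerners": [1.8, 1.6, -0.4, -0.9],
    "auctioneers": [-0.8, -0.6, 2.1, 1.2],
    "con-men/hustlers": [-0.9, -0.5, 1.3, 2.0],
}
result = cluster(profiles, n_clusters=2)
```

Items whose overlap profiles rise and fall together end up in the same cluster, even if their absolute scores differ.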

Sample of the data
                    active    angry; in a rage    anxious
active               2.11        -0.373           -0.156
angry; in a rage    -0.433        2.08            -0.0549
anxious             -0.298       -0.0875           2.08
auctioneers          0.500       -1.331           -0.898

Predictions
• The main clusters that emerge will be
connected to two inter-related notions:
– Ideologies of time
– Emotional arousal

[Figure: clustering of the indexical field. Cluster labels: Overwhelmed, Other-oriented, Overwhelming, Authoritative, Persuading]

What’s this showing?
• Time and emotional arousal may well underlie
the field
• But a different axis is much more apparent:
– Speaker-orientation vs. listener-orientation
• Fast-speech is about time, but are you talking
fast for me or for yourself?
– There’s a parallel to IN’ vs. ING
• Do I take your IN’ as a sign of friendliness or as
evidence of laziness?

We’ve got to get Spock to Vulcan!

Leonard Nimoy’s Mr. Spock is even

Burstiness
• Variance / (syllables * 0.5)
– Variance gets us dispersion of the data
– The denominator helps us see how spread out the
data is
– The bigger the ratio, the more it is characterized
by clusters (“bursts”)
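
A minimal sketch of the ratio above. The slide does not say what the variance is taken over, so this assumes it is the (population) variance of the gaps between successive syllable onsets, measured in seconds; the function name and the onset times are invented for illustration.

```python
from statistics import pvariance


def burstiness(syllable_onsets):
    """Variance / (syllables * 0.5), following the slide's formula.

    Assumption: the variance is computed over the intervals between
    successive syllable onsets (in seconds); the slide does not specify.
    """
    n_syllables = len(syllable_onsets)
    intervals = [b - a for a, b in zip(syllable_onsets, syllable_onsets[1:])]
    return pvariance(intervals) / (n_syllables * 0.5)


# Invented onset times: six evenly spaced syllables vs. two bursts of three.
even = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
bursty = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]

even_score = burstiness(even)
bursty_score = burstiness(bursty)
```

Both toy utterances have the same overall speech rate (six syllables in one second), but only the second scores as bursty, which is the contrast the measure is meant to capture.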

Burstiness and emotionality
• 48 Americans judged the emotional intensity of
228 utterances
– Utterances taken from 8 episodes, focusing on:
• Captain Kirk
• Mr. Spock
• Lt. Sulu
• Dr. (Bones) McCoy
– Each utterance judged by 3-5 people
– Scores were normalized per judge and then averaged
– Top 30, bottom 30 and 63 randomly chosen in
between were analyzed for speech rate and burstiness
– Restricted to utterances that were at least 5 syllables
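
A minimal sketch of "normalized per judge and then averaged", assuming the normalization is a z-score within each judge (the slide does not name the method). The function name and the toy ratings are invented.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_and_average(ratings):
    """ratings: iterable of (judge, utterance, score) tuples.

    Assumption: each judge's scores are z-scored (mean 0, SD 1), and an
    utterance's final score is the mean of its z-scores across judges.
    """
    ratings = list(ratings)
    by_judge = defaultdict(list)
    for judge, _, score in ratings:
        by_judge[judge].append(score)
    stats = {j: (mean(s), pstdev(s)) for j, s in by_judge.items()}

    by_utterance = defaultdict(list)
    for judge, utterance, score in ratings:
        mu, sd = stats[judge]
        # A judge who gives every utterance the same score carries no signal.
        by_utterance[utterance].append((score - mu) / sd if sd else 0.0)
    return {u: mean(z) for u, z in by_utterance.items()}


# Invented ratings: judge A uses the low end of the scale, judge B the high end.
toy_ratings = [
    ("A", "u1", 1), ("A", "u2", 3), ("A", "u3", 5),
    ("B", "u1", 5), ("B", "u2", 6), ("B", "u3", 7),
]
scores = normalize_and_average(toy_ratings)
```

After normalization the two judges agree exactly on the ordering and spacing of the three utterances, even though their raw scores never overlap.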

Emotional speech in Star Trek is bursty speech

Better than speech rate
• Among factors tested:
– Burstiness
– Speech rate
– Syllable count
– Interactions among these
• Only burstiness is significant (in either a simple linear regression model or an ordinary least squares model, p ≈ 0.0125)
– But note that the r-squared is low: 0.05044
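
A minimal sketch of a one-predictor least-squares fit, to show how a slope and an r-squared like the ones reported here are obtained. The data are invented and the function name is hypothetical; this is not the talk's analysis, which was presumably run in R.

```python
from statistics import mean


def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; return slope, intercept, r-squared."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Invented values: burstiness scores and normalized emotionality ratings.
toy_burstiness = [0.00, 0.01, 0.02, 0.03, 0.04]
toy_emotionality = [-0.5, 0.4, -0.2, 0.6, 0.1]
slope, intercept, r_squared = simple_ols(toy_burstiness, toy_emotionality)
```

An r-squared of 0.05044 means the predictor accounts for about 5% of the variance in the ratings, so a significant slope can coexist with a weak overall fit.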

Mixed model
• A better approach is to use a mixed model, where speaker is a random effect.
– This allows us to see that Kirk and Bones use burstiness, while Sulu and Spock don’t.
• Kirk 0.4371045
• Bones 0.1710811
• Sulu -0.1518260
• Spock -0.4563595

Spock in reversal
Bursty, but not emotional
Emotional, but not bursty

Emotionality by Burstiness and Speaker
AIC    BIC    logLik  deviance  REMLdev
341.4  352.7  -166.7  336.7     333.4
Random effects:
 Groups    Name         Variance  Std.Dev.
 Speaker   (Intercept)  0.21810   0.46701
 Residual               0.87055   0.93303
Number of obs: 123, groups: Speaker, 4
Fixed effects:
             Estimate  Std. Error  t value
(Intercept)  -0.07646  0.27625     -0.2768
Burstiness    7.07226  3.07077      2.3031
> pvals.fnc(data.lmer)$fixed
             Estimate  MCMCmean  HPD95lower  HPD95upper  pMCMC   Pr(>|t|)
(Intercept)  -0.0765   -0.0845   -0.8558      0.6649     0.8174  0.7824
Burstiness    7.0723    7.1217    1.0825     13.1220     0.0194  0.0230
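
Read as a random-intercept model, the output above appears to correspond to the following form. The symbols are introduced here for illustration and are not the author's notation: i indexes utterances, j indexes the 4 speakers.

```latex
\begin{aligned}
\text{Emotionality}_{ij} &= \beta_0 + \beta_1\,\text{Burstiness}_{ij} + u_j + \varepsilon_{ij},\\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2),\\
\hat\beta_0 &= -0.07646, \quad \hat\beta_1 = 7.07226, \quad \hat\sigma_u^2 = 0.21810, \quad \hat\sigma^2 = 0.87055,\\
\frac{\hat\sigma_u^2}{\hat\sigma_u^2 + \hat\sigma^2} &= \frac{0.21810}{0.21810 + 0.87055} \approx 0.20.
\end{aligned}
```

On this reading, about a fifth of the variance left over after accounting for burstiness sits between speakers rather than within them. The per-speaker values listed for Kirk, Bones, Sulu, and Spock sum to roughly zero, which would be consistent with estimates of u_j, though the slides do not say so.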

Summary
• We can move beyond the “who” of variation and into
“how” and “why”
• Indexical fields are a useful conceptual tool and we can
use them to understand constraints on meaning
• It seems likely that many indexical fields are structured
by axes like self/other-orientation
– Which are made visible to listeners and appraised by them
• Rate is not the only thing that matters for emotion
– Burstiness also communicates the drama of the situation
– It is unlikely that people go to the extent that Shatner does
– But there’s reason to believe that tempo may be as useful as simple rates, or more so

Thank you!
• Collins, S. (1989). Subjective and autonomic responses to Western classical music. Unpublished doctoral dissertation, University of Manchester, UK.
• Eckert, P. (2008). Variation and the indexical field. Journal of Sociolinguistics, 12(4), 453–476.
• Huson, D., D. Richter, C. Rausch, T. Dezulian, M. Franz, and R. Rupp. (2007). Dendroscope: An interactive viewer for large phylogenetic trees. BMC Bioinformatics, 8:460. Software freely available from www.dendroscope.org
• de Jong, N. H., and T. Wempe. (2009). Praat script to detect syllable nuclei and measure speech rate automatically. Behavior Research Methods, 41(2), 385.
• Juslin, P. N., and P. Laukka. (2003). Communication of emotions in vocal expression and music performance: Different channels, same code? Psychological Bulletin, 129(5), 770–814.
• Kendall, T. (2010). Language Variation and Sequential Temporal Patterns of Talk. Linguistics Department, Stanford University: Palo Alto, CA. February.
• Kendall, T. (2009). Speech Rate, Pause, and Linguistic Variation: An Examination Through the Sociolinguistic Archive and Analysis Project. Doctoral dissertation. Durham, NC: Duke University.
• Scherer, K. (2003). Vocal communication of emotion: A review of research paradigms. Speech Communication, 40, 227–256.
• Scherer, K. R. (1981). Speech and emotional states. Speech evaluation in psychiatry, 189–220.
• Schnoebelen, T. (2009). The social meaning of tempo. http://www.stanford.edu/~tylers/notes/socioling/Social_meaning_tempo_Schnoebelen_3-23-09.pdf
• Scherer, K., and J. Oshinsky. (1977). Cue utilization in emotion attribution from auditory stimuli. Motivation and Emotion, 1, 331–346.
• Ververidis, D., and C. Kotropoulos. (2006). Emotional speech recognition: Resources, features, and methods. Speech Communication, 48(9), 1162–1181.
• Special thanks to Penny Eckert, John Rickford, Kate Geenberg, Kyuwon
Moon, Roey Gafter, and Mathew Lodge
• Also, fwiw, I’ve put together a lot of essays and reading notes about language and emotion here:
– http://www.stanford.edu/~tylers/emotions.shtml

Why look at TV/movies for style?
• Actors are a good source for studies of style since they make vivid the cues
that are more mixed and grey in real life. “The act that one does, the act
that one performs, is, in a sense, an act that has been going on before one
arrived on the scene” (Butler 1988: 526). Butler is talking about gender,
but this idea applies to acting as well. Actors don’t really create anything
out of whole cloth. They assemble bits and pieces. It would be difficult to
analyze the acoustic signal of wooden acting, since we’d be measuring
perceptions of lack, but even histrionic, scene-chewing samples offer us
speech cues associated with various social categories. Again, the
assumption is that actors use stylistic resources that their audiences can
be expected to understand. If audiences uniformly agree on what a
performance expresses, it doesn’t necessarily matter what the intention
was. We’re after that shared social meaning and the components that
make it up, though we may be giving up the psychophysiological effects
on the voice that happen under natural conditions.
• Writers create scenes of dramatic interest, so that there is also a higher
proportion of arousal in a scene than in daily life.
