"Debiasing" has become a much-debated topic in the world of AI and data analytics, and it's time to talk about it in publishing. Together with the software team Granthinka, we have begun developing tools that allow editors to assess their manuscripts for different kinds of bias: how much space men and women characters take up, the presence or absence of visible minorities within fiction, the familiarity of actions or descriptions ascribed to different types of characters, and even the titles that are used to estimate "comparability" when it comes to marketing.
Debiasing is important because it challenges the common stereotypes that govern how individuals are represented in creative works today. But it also raises a host of challenging questions: When is bias part of the story? Are we limiting authors' freedom? How might readers react to these changes? What kinds of new markets do less "familiar" books open up (or close down)? How can we work with authors to think about small changes that gradually add up to building new imaginary worlds?
This talk is geared towards editors, authors, and publishers. It's about conveying what is currently possible when it comes to content analytics in books, and about hearing how audience members feel about the important and complex questions surrounding bias when it comes to the content of books.
techforum.booknetcanada.ca
#TechForum
8. Diversity is a content issue as well
Using data analytics to better understand biases when it comes to the content of books.
9. What do I mean by bias?
• Any feature of a book that predictably adheres to norms that maintain social hierarchies
Su Lin Blodgett, Solon Barocas, Hal Daumé III, Hanna Wallach, "Language (Technology) is Power: A Critical Survey of ‘Bias’ in NLP"
https://arxiv.org/abs/2005.14050
10. What do I mean by bias?
• Any feature of a book that predictably adheres to norms that maintain social hierarchies
• This makes bias:
• Relational
11. What do I mean by bias?
• Any feature of a book that predictably adheres to norms that maintain social hierarchies
• This makes bias:
• Relational
• Measurable
12. What do I mean by bias?
• Any feature of a book that predictably adheres to norms that maintain social hierarchies
• This makes bias:
• Relational
• Measurable
• Perceptual and Normative
14. Stylistic Bias
Good old-fashioned over-used words and clichés.
For example, James Patterson has a habit of using the word “said” a lot, as well as “get/got.”
16. Stylistic Bias
Patterson uses “said” about 4x more often than your average novel.
Kafka’s over-used words are “but,” “probably,” and “perhaps.”
We want to fix one of these.
17. Stylistic Bias
• The important point is that these are predictable behaviors with respect to some norm, and our perception of their value (positive or negative) is tied to our own values (institutional, organizational, or personal).
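A comparison like "about 4x more often than your average novel" can be read as a ratio of relative word frequencies: the word's share of all tokens in one text, divided by its share in a reference corpus. The sketch below is only an illustration under that assumption, not the tooling described in this talk. The function names (`relative_frequency`, `overuse_ratio`), the simple tokenizer, and the two toy texts are all hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Optional


def relative_frequency(text: str) -> Dict[str, float]:
    """Return each word's share of all word tokens in `text`."""
    # Very rough tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    return {word: n / total for word, n in counts.items()}


def overuse_ratio(word: str, sample: str, reference: str) -> Optional[float]:
    """How many times more often `word` occurs in `sample` than in `reference`.

    Returns None when the word never occurs in the reference text,
    because the ratio is undefined in that case.
    """
    sample_freq = relative_frequency(sample).get(word, 0.0)
    reference_freq = relative_frequency(reference).get(word, 0.0)
    if reference_freq == 0.0:
        return None
    return sample_freq / reference_freq


# Toy texts invented for illustration; a real norm would be a large corpus of novels.
sample = "she said no and he said yes"  # 7 tokens, "said" appears twice
reference = "she said no and he left the room in a quiet hurry today ok"  # 14 tokens, "said" once

print(round(overuse_ratio("said", sample, reference), 2))  # prints 4.0
```

Whether a ratio like this counts as a problem is not something the number decides. As the slide says, that judgment depends on the norm chosen as the reference and on the values of whoever is reading the result.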
21. Social Bias
Main characters were balanced at 50.2% women. [1]
[1] There were no trans* characters in our sample.
22. Social Bias
Main characters were balanced at 50.2% women. [1]
[1] There were no trans* characters in our sample.
[Chart categories: mixed gender, same gender]
Character interactions are highly heteronormative
32. Two kinds of harms
• Harms of allocation
• Harms of representation
Kate Crawford. 2017. “The Trouble with Bias.” Keynote at NeurIPS.
https://www.youtube.com/watch?v=fMym_BKWQzk
33. Harms of Allocation
Weinberg DB, Kapelner A (2018). https://doi.org/10.1371/journal.pone.0195298
Men’s books are on average 38 pages longer when controlling for genre (N=5,839).
Men’s books are 2.3 times more likely to be represented in books longer than 500 pages, and 2.8 times more likely when it comes to just fiction.
[Chart panels: Price, Size]
Jessica Shattuck, “Why women should do more literary manspreading.”
https://electricliterature.com/why-women-should-do-more-literary-manspreading/
34. Two kinds of harms
• Harms of allocation
• Harms of representation
• Under-representation and the problem of invisibility
• Misrepresentation and the problem of stereotyping
Kate Crawford. 2017. “The Trouble with Bias.” Keynote at NeurIPS.
https://www.youtube.com/watch?v=fMym_BKWQzk
36. 3 things you can do right now
1. Start a conversation at your organization
• Establish a shared understanding of the issue and create benchmarks or goals to move towards.
• Addressing bias is about changing norms by creating new norms.
37. 3 things you can do right now
1. Start a conversation at your organization
2. Get the right tools
39. 3 things you can do right now
1. Start a conversation at your organization
2. Get the right tools
3. Aim for unpredictability
40. 3 things you can do right now
1. Start a conversation at your organization
2. Get the right tools
3. Aim for unpredictability
• This process isn’t a checklist but an ongoing nudge to be more creative
• Variance is power
According to a study by USC Annenberg, in the Hot 100 Billboard charts over the last five years, 31.5% of solo artists were women. In our study of Hollywood screenplays from the last 30 years, we found that the number of words apportioned to women actors was 30%. In a sample of 3,652 public university lectures, it was found that women academics gave 31% of talks during a single year.
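A figure such as "the number of words apportioned to women actors was 30%" is a share of total dialogue words. The sketch below shows one way such a share could be computed, assuming dialogue has already been split into (speaker group, line) pairs. It is not the method used in the studies cited above. The function name `word_share_by_group`, the group labels, and the sample lines are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def word_share_by_group(lines: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return each group's share of all spoken words.

    `lines` holds (group_label, dialogue_text) pairs. Words are counted
    by splitting on whitespace, which is a simplifying assumption.
    """
    totals = defaultdict(int)
    for group, text in lines:
        totals[group] += len(text.split())
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {group: n / grand_total for group, n in totals.items()}


# Invented dialogue for illustration only.
dialogue = [
    ("woman", "I will go"),                      # 3 words
    ("man", "No you should stay here with me"),  # 7 words
]

shares = word_share_by_group(dialogue)
for group, share in shares.items():
    print(f"{group}: {share:.0%}")
```

In practice the hard part is the step this sketch assumes away: reliably attributing each line to a speaker and each speaker to a group.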