Gramener's Arindam Roy presented this deck to the students of Goa Institute of Management. He spoke how data stories offer a seamless communication between insights and decision maker. Stories are memorable and viral. That's the reason they create more impact when compared to traditional data dashbaords.
Know more about gramener's data storytelling capabilities and workshop at https://gramener.com/data-storytelling-workshop
1. KNOW YOUR DATA
DATA STORYTELLING
GOA INSTITUTE OF MANAGEMENT
DATE: 29-AUG-2020
TIME: 6:30 P.M.
ARINDAM ROY
Associate Director –
Data Sciences
2. WE ARE A DESIGN-LED DATA SCIENCE ORGANIZATION DELIVERING INSIGHTS AS STORIES
Our Differentiated Process
Critical
Reasoning
Design
Thinking
We help clients build data
applications
Rapid insight
Analytics AI / ML
DashboardInfographic
IntuitiveUserInterface
We offer a platform for data science, pre-built
products & solutions, and custom services to
build your data application.
We started in 2011 and have over 100 clients,
200 employees and 5 offices across the globe.
Our solutions are recommended by analysts
(Gartner, Forrester) and our clients.
3. This is a dataset (1975 – 1990) that
has been around for several years and
has been studied extensively. Yet, a
visualization can reveal patterns that
are neither obvious nor well known.
For example,
• Are birthdays uniformly distributed?
• Do doctors or parents exercise the C-section option to move dates?
• Is there any day of the month that has unusually high or low births?
• Are there any months with relatively high or low births?
Very high births in September.
But this is fairly well known. Most
conceptions happen during the
winter holiday season
Relatively few births during the
Christmas and Thanksgiving
holidays, as well as New Year
and Independence Day.
Most people prefer not
to have children on the
13th of any month, given
that it’s an unlucky day
Some special days like April
Fool’s day are avoided, but
Valentine’s Day is quite
popular
More births Fewer births … on average, for each day of the year (from 1975 to 1990)
LET’S LOOK AT 15 YEARS OF US BIRTH DATA
4. THE PATTERN IN INDIA IS QUITE DIFFERENT
This is a birth date dataset that’s
obtained from school admission data
for over 10 million children. When we
compare this with births in the US, we
see none of the same patterns.
For example,
• Is there an aversion to the 13th or is there a local cultural nuance?
• Are holidays avoided for births?
• Which months have a higher propensity for births, and why?
• Are there any patterns not found in the US data?
Very few children are born in the
month of August, and thereafter.
Most births are concentrated in
the first half of the year
We see a large number of
children born on the 5th, 10th,
15th, 20th and 25th of each month
– that is, round numbered dates
Such round numbered patterns a
typical indication of fraud. Here,
birthdates are brought forward to
aid early school admission
More births Fewer births … on average, for each day of the year (from 2007 to 2013)
5. THIS ADVERSELY IMPACTS CHILDREN’S MARKS
It’s a well-established fact that older
children tend to do better at school in
most activities. Since many children
have had their birth dates brought
forward, these younger children suffer.
The average marks of children “born” on the 1st, 5th, 10th, 15th etc.. of
the month tend to score lower marks.
• Are holidays avoided for births?
• Which months have a higher propensity for births, and why?
• Are there any patterns not found in the US data?
Higher marks Lower marks … on average, for children born on a given day of the year (from 2007 to 2013)
Children “born” on round numbered days score lower marks on average,
due to a higher proportion of younger children
7. DATA STORYTELLING IS A CRITICAL SKILL FOR DATA SCIENTISTS, ANALYSTS & MANAGERS
Stories are memorable. They spread virally
People remember stories. They’ll act on them.
People share stories. That enables collective action.
For people to act on analysis, data stories are critical.
But analysts present analysis, not stories
We present what we did. Not what you need.
You need to know what happened, why, & what to do.
Narrated in an engaging way. As a story.
We’ll learn how do that in this session.
Storytelling has a 30X Return on Investment
Rob Walker and Joshua Glenn auctioned common
items like mugs, golf balls, toys, etc. The item
descriptions were stories purpose-written by 200+
contributing writers.
Items that were bought for $250 sold for over $8,000 –
a return of over 3,000% for storytelling!
Original price: $2.00.
Final price: $50.00.
This little statue stood on the window-sill
in my favorite aunt’s front hall. Perched
between plants of varying shapes and
sizes, surrounded by shards of broken
pottery and miniature ceramic elephants
from the Red Rose Tea box, dappled
with sunlight shining through the leaded
glass figures of St. Francis in his garden
and the mossy Celtic Cross, …
8. “Can’t I just show the data?”
“Why do I need charts?”
17. GLOBAL AIRLINE REDUCED CARGO TURNAROUND TIME BY 15% WITH SCENARIO MODELING
SEE LIVE DEMO
A global airline company took up a service level
agreement to deliver cargo from the flight to the
warehouse in under 1.5 hours. This target was 15%
lower than their current best.
Several factors affect cargo delay across airports.
Availability of forklifts, staff size, cargo type, part
shipment, and many others. Altering any of these is
expensive and takes long.
Gramener built a visual analytics solution that
showed where cargo was delayed. We built an ML
model that identified the drivers of delay (forklifts,
trained staff), and the impact of these on
turnaround time. What-if scenario modelling helped
pick the optimal combination that reduced TAT.
This allowed the airline to reduce the turnaround
time by 15% from 1.76 hrs to 1.5 hrs. The worst-
case turnaround time also reduced by 34% from
2.9 hrs to 1.92 hrs.
15% 34%
cargo turnaround time
reduction (from 1.76 to 1.5 hrs)
reduction in worst-case
turnaround time
17
Evening Morning Night
Fri Mon Sat Sun Thu Tue Wed
FAH N70 RPP TDS ZDH
20-40% 40-60% 60-80% <20% Full
Recovery times are neutral during the evening and morning shifts (mornings are slightly worse), night times are the best.
Recovery times are worst on Fridays, and best on Saturdays & Wednesdays.
Specifically, Friday mornings are particularly bad. So are Thursday mornings.
The FAH product category has the best recovery time, while ZDH is much worse.
However, RPP on Sundays is unusually slow.
Part shipped products tend to perform worse than full-shipments. Specifically the <20% and 40-60% part-shipments.
This is especially problematic for ZDH
Product category
Part shipment
Weekday
Shift
This slide is best viewed in slideshow mode. The animations tell a story that isn’t obvious on the static version.
18. INSIGHTS NEEDS TO BE BIG, USEFUL, SURPRISING
DO IT: Rate each analysis against B.U.S.
Filter the analyses using this checklist
IS THE INSIGHT
BIG
IS THE INSIGHT
USEFUL
IS THE INSIGHT
SURPRISING
We want a result that
substantially changes the
outcome.
Can they take an action that
improves their objective?
What should they do next?
Is it non-obvious?
Does it overturn an existing
belief, or bring consensus?
Example B U S
There are twice as many restaurants in NYC
than any other city
Sales increased in every region except our
largest branch, which dipped by 0.1%
Increase in rainfall increases the sale of
umbrellas, and is the biggest driver of our
sales
19. HERE ARE THE ANALYSES & FILTERS FOR SOME BUSINESS PROBLEMS
Purchasing Commodities B U S Cargo Delay B U S Customer Churn B U S
The most common commodity
was ordered 10 times a week
across 2.4 vendors
Fragile cargo is a big factor in the
delay, with a 20% impact
B S
Number of inbound calls does
not impact churn.
S
The number of orders is correlated
with the number of vendors.
Reducing one will reduce the other
U
Fridays are when cargo is delayed
the most
Customers who haven’t made
any calls in the last 15 days are
the most likely to churn
B
Plant P126 was the plant with the
most violations, especially on
largest commodity
B U
Trained staff and forklifts impact
delay the most
B U S
Customers making infrequent
calls, recharging small amounts
infrequently, are most at risk
B U S
22. THINK — SEE — DO FRAMEWORK
What do user
think & feel?
What do user need
to see?
What do user
need to do?
23. THINK — SEE — DO FRAMEWORK
What do user think & feel?
What makes them successful?
What are their pain points?
What are their wants?
What do user need to see?
What KPIs are most relevant to them?
What insights are they looking for?
What would they compare with?
What relationships do they want to see?
What do user need to do?
What are their most frequent actions?
What are common faceted filters?
What details would they want to see?
What scenarios would they
collaborate/delegate in?
24. THINK — SEE — DO FRAMEWORK
What do user think & feel?
Don’t know where I’m going
Don’t what I’m doing is effective
Unaware of risks around the globe
What do user need to see?
Global View
Compliance
Progress
(More Metrics)
What do user need to do?
Strategic
Tech Deployment
Fire fighting
Communication
Andrew
(CISO)
Reports to
CIO / CFO
25. SHOW
me what is happening
with the data
EXPLAIN
to me why it’s
happening
Allow me to
EXPLOR
E
and figure it out
Just
EXPOSE
the data to me
Low effort High effort
High effort
Low effort
Creator
Consumer
DATA STORYTELLING APPROACH
28. Persona Mapping
Empathy Mapping
Use case Exploration
Opportunity areas
Scoped problem
User scenarios
Information Architecture
Exploratory Data Analysis
Design Ideation
Wireframe design
Prototyping
Usability Testing
Iterative Development
User Testing
DISCOVER
(Research Phase)
DEFINE
(Synthesis Phase)
DERIVE
(Data Insights Phase)
DESIGN
(Ideation Phase)
DEVELOP
(Implementation Phase)
Solving the right problem… backed by the right data… in the right way.
DESIGN THINKING FOR EFFECTIVE DATA STORYTELLING
29. 29
Discover
Project prioritization
based on impact &
frequency of use
Persona & Empathy
mapping through
stakeholder interviews
Information
Experience Audit
Activities
Deliverables
Empathy maps for primary personas
Data Storytelling gaps
30. 30
Define Opportunity area
mapping
Scoped brief by
prioritizing high impact
user stories
User scenario &
Customer Journey
mapping
Activities
Deliverables
List of data storytelling opportunities as user stories
Customer journey map for high-impact workflows
35. WITH THE GROWTH OF SELF-SERVICE BI, 85% OF COMPANIES HAVE LOST TRACK OF HOW MANY
DASHBOARDS THEY GENERATED
What QUESTION does
the dashboard answer?
Is the ANSWER evident
from the dashboard?
What ACTION should the
user take now?
BUT 3 THINGS
ARE UNCLEAR ON
MOST DASHBOARDS
35
36. ANNOTATE TO EXPLAIN & ENGAGE. USE FOUR TYPES OF NARRATIVES
Remember “SEAR”: Summarize, Explain, Annotate, Recommend
0
5,000
10,000
15,000
20,000
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100
Marks
# students
Teachers add marks to stop some students from failing
This chart shows Class 10 students’ English marks
in Tamil Nadu, India, in 2011. The X-axis has the
mark a student has scored. The Y-axis has the #
of students who scored that mark.
Large number of
students score exactly
35 marks
Few (but not 0) students
fail at 31-34 marks
What’s unusual
Large number of students score
35 marks.
Few (but not 0) students score
between 30-35
Only some students get this benefit.
Identify a fair policy that will be applied consistently.
Summarize the visual in its title
Don’t describe the chart.
Don’t write the user’s question.
Write the answer itself. Like a headline.
Explain & interpret the visual
How should the user read it?
What do you say when you talk through it?
Explain what the visual is. Then the axes.
Then its contents. Then the inference.
Recommend an action
How should I act on this?
You need to change the audience.
(Otherwise, you made no difference.)
Annotate essential elements
What should the user focus their eyes on?
Point it out, or highlight it with colors
Interpret what they’re seeing – in words.
This is a bell curve. But the spike at 35 (the mark
at which students pass) is unusual. Teachers must
be adding marks to some of the students who are
likely to fail by a small margin.
No one scores 0-4
marks
37. Who is your
audience? They
determine the story
What is their
problem? That
defines your analysis
Find the right way to
solve the problem
Filter for big, useful,
surprising insights.
BUS
Start with takeaway.
Summarize your
story
Add supporting
analysis as a tree
Pick a format based
on how your
audience will
consume the story
Pick a visual design
based on the
takeaway
Annotate to explain
and engage
01 02 03
04 06 05
08 09 10
KEY TAKEAWAYS - 9 STEPS TO GO FROM DATA TO A DATA STORY
38. DATA STORY EXERCISE
Olivier
Hildebrand
Chief Sales Officer
Responsible for Top Line Growth of the Organization
In this meeting Olivier wants to look at this year’s revenue and identify what can be
improved.
You are one Olivier’s key data guy and you just attended this webinar on Data Storytelling.
How would you present the following data to Olivier?
Need: Identify pockets of growth in the Revenue
In Million Dollars
Product Sep-19 Oct-19 Nov-19 Dec-19 Jan-20 Feb-20 Mar-20 Apr-20 May-20 Jun-20 Jul-20 Aug-20
Product A 193 181 174 125 153 139 148 132 200 110 197 165
Product B 132 191 137 163 137 158 195 153 133 199 185 104
Product C 143 146 184 179 185 197 107 198 169 155 156 189
Product D 115 193 127 132 124 143 156 121 156 172 159 199
Product E 200 115 107 172 172 122 190 199 135 168 128 192
Product F 132 198 100 105 199 164 123 119 137 120 102 127
1.5