Opening the Black Box of User Profiles in Content-based Recommender Systems
1. Opening the black box of user profiles in content-based recommender systems
Fair and Transparent Machine Learning @ ICAI Meetup
April 12, 2019
David Graus | david.graus@fdmediagroep.nl | @dvdgrs 1
2. FD Mediagroup
The leading information provider in the financial-economic domain in the Netherlands
8. Why fair and transparent?
User studies have shown:
• Our users want personalized content
• Our users care about transparency
FD:
• Verifiability is one of the core values for FD Mediagroup
• Transparency for verifiability
19. Framework
• I want to be an expert
• I want to stay informed
• I want to broaden my horizon
• I want to discover the unexplored
Values: broadness, diversity, autonomy, objectivity, match with the user needs, controllability
28. User study
System
Evaluation
Your goal is to Broaden your Horizons. There may be topics you do not normally read about, but you may actually find interesting. Exploring this helps to build a broad perspective on the issues that matter to you.
Your goal is to Discover the Unexplored. There may be topics that you haven’t explored before that may actually become new interests. Exploring new topics can promote creativity and objectivity.
29. User study
Aim: study whether being offered a particular goal would influence the user’s intended reading behavior
30. User study
[Figure: study flow. Objective; pick a persona from four data-driven profiles (Persona 1 to Persona 4); randomly assign the goal order; explain the goals (Broaden Horizons, Discover the Unexplored); for Goal A and Goal B: show the visualization, then the questionnaire.]
32. Hypotheses
1. Goal Framework:
Broaden Horizon chosen topics are more similar and more familiar than Discover the unexplored
2. Broaden horizon:
Users choose topics that have high similarity and high familiarity compared to the non-selected topics.
3. Discover the unexplored:
Users choose topics that have low similarity and low familiarity compared to the non-selected topics.
34. Results
1. Goal Framework:
Broaden Horizon chosen topics are more similar and more familiar than Discover the unexplored
2. Broaden horizon:
Users choose topics that have high similarity and high familiarity compared to the non-selected topics.
3. Discover the unexplored:
Users choose topics that have low similarity and low familiarity compared to the non-selected topics.
41. Summary of achievements
• Novel, scalable and generalizable framework of user-profile explanations
• Exploration of FD reader data; domain-knowledge and data-driven ontology
• Interface mockup
• User study to evaluate the value-driven explanations
42. Future
• Further design the implementation of our framework
• Formalize the epistemic goals together with editors
• Extend user studies/focus groups to all value-driven goals
How do we provide information in the financial domain?
In many ways; we are probably best known for our all-day news radio station.
But here I’ll talk mostly about our daily financial newspaper, also known as het FD, the European newspaper of the year.
Getting the right information to the right person at the right time
Through summarization, personalization, and contextualization of our journalism
More specifically, for FD
We’re building a recommender system (as a first step)
The “Them and Us” slide
We monitor reading behavior
We infer a “model” on top of that
We generate recommendations
We focus our explanations on the input (explanations usually target the output).
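The three steps above (monitor, infer, recommend) and an input-side explanation could be sketched roughly as follows. This is a minimal illustration, not FD's actual system: the article structure, the topic-share profile, the cosine ranking, and the function names `build_profile`, `recommend`, and `explain_input` are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt

def build_profile(read_articles):
    """Infer a simple user model: the share of each topic in the reading history."""
    counts = Counter(t for article in read_articles for t in article["topics"])
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()} if total else {}

def cosine(a, b):
    """Cosine similarity between two sparse topic-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(profile, candidates, k=3):
    """Rank candidate articles by similarity between their topics and the profile."""
    def vector(article):
        return {t: 1.0 for t in article["topics"]}
    ranked = sorted(candidates, key=lambda a: cosine(profile, vector(a)), reverse=True)
    return ranked[:k]

def explain_input(profile, n=3):
    """Explain the input: show the user the strongest topics in their own profile."""
    return sorted(profile.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Hypothetical reading history and candidate articles (toy data).
history = [
    {"id": "a1", "topics": ["Energy", "Environment"]},
    {"id": "a2", "topics": ["Energy"]},
    {"id": "a3", "topics": ["Banks"]},
]
candidates = [
    {"id": "c1", "topics": ["Energy"]},
    {"id": "c2", "topics": ["Sport"]},
    {"id": "c3", "topics": ["Banks", "Care"]},
]

profile = build_profile(history)
top = recommend(profile, candidates, k=2)
```

Here `explain_input(profile)` exposes what the system believes about the reader, rather than justifying any single recommendation.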
That was the idea; we submitted a project proposal to ICT with Industry.
We sat together with a team of academics from different disciplines,
ranging from philosophers and political communication scientists to UX researchers and computer scientists.
An academic hackathon in a week.
Enough context, on to content.
These steps roughly correspond to the process we took to finish this project
Provide understandability of reading behavior for users (not the publisher)
Purpose:
Provide users with a framework to expand the utility of the platform and achieve their epistemic (knowledge) goals
With that purpose in mind we started looking at different types of explanations
Level 1: dashboards, overviews, patterns
Level 2: context; how are you ‘unique’, what do you do more/better than typical users, etc.
Final level: we combine insights to help users achieve specific goals
Epistemic (knowledge) goals
“Transactional”: get commitment by giving something back
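As an illustration of the "context" level, one way to show how a reader is "unique" is to compare their topic shares with those of all readers. The sketch below is an assumption made for this example, not the method used in the project: the lift measure, the `min_lift` threshold, and the function names are hypothetical.

```python
def topic_shares(read_counts):
    """Convert raw read counts per topic into shares that sum to 1."""
    total = sum(read_counts.values())
    return {t: n / total for t, n in read_counts.items()} if total else {}

def distinctive_topics(user_counts, population_counts, min_lift=1.5):
    """Return (topic, lift) pairs where the user reads a topic relatively more
    often than the typical reader; lift = user share / population share."""
    user = topic_shares(user_counts)
    population = topic_shares(population_counts)
    lifts = []
    for topic, share in user.items():
        base = population.get(topic, 0.0)
        if base > 0 and share / base >= min_lift:
            lifts.append((topic, share / base))
    return sorted(lifts, key=lambda kv: kv[1], reverse=True)

# Hypothetical counts (toy data, not FD reader data).
user_counts = {"Energy": 6, "Banks": 2, "Sport": 2}
population_counts = {"Energy": 100, "Banks": 500, "Sport": 400}

result = distinctive_topics(user_counts, population_counts)
```

In this toy data the reader spends a far larger share of their reading on Energy than the average reader does, so Energy is the only topic flagged as distinctive.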
How do you formulate these goals?
Take something we want, and something we think our users would want
How to measure whether/how we can do this
Answer what to explain
Many visualizations are possible; which one to pick? We need data analysis to find the overall structure.
Started looking into our data at the user behavior level
Authors
At the content level;
Fast-forward to our user-study.
- Related goals: both are about diversity in content
- A suitable test case for whether the specific goal leads the user to explore different degrees of diversity
We tested whether a particular goal, shown with a visualization, would influence intended reading behavior in terms of topics, using a data visualization to represent reading behavior.
Without going into too much detail: we set up an Amazon Mechanical Turk (MTurk) experiment with roughly 40 participants.
Participants picked ‘personas’ that reflected reading behavior,
represented as “topic word clouds”.
They were then presented with a goal and a visualization, and asked to pick which topics they would read next.
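The random assignment of goal order in such a setup could look roughly like the sketch below. The balanced split (half the participants see each goal first), the fixed seed, and the name `assign_goal_orders` are assumptions for illustration; the talk only states that goal order was randomly assigned.

```python
import random

GOALS = ("Broaden Horizons", "Discover the Unexplored")

def assign_goal_orders(participant_ids, seed=0):
    """Shuffle participants, then alternate the two possible goal orders so
    each order is used (almost) equally often."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)
    orders = {}
    for i, pid in enumerate(ids):
        orders[pid] = GOALS if i % 2 == 0 else GOALS[::-1]
    return orders

# Hypothetical participant ids 0..39.
orders = assign_goal_orders(range(40))
```

Showing both goals to every participant in a balanced order helps separate the effect of the goal from the effect of seeing it first or second.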
This is the visualization we presented. It shows the topics a user has read over X amount of time.
Real_Estate & Housing_Market are highly similar
Energy & Environment
Foods & Retail (we’re at Ahold)
Care & Banks are very dissimilar (Sport & Govt)
Hypotheses:
1. Comparison of the selected topics between the two goals
2. Broaden horizon will select MORE SIMILAR/FAMILIAR topics
3. Unexplored selects LESS SIMILAR/FAMILIAR topics
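The comparison behind hypotheses 2 and 3 can be sketched as a difference in means between selected and non-selected topics. This is only an illustration under assumptions: each topic is assumed to carry precomputed `similarity` and `familiarity` scores, the numbers are made up (not study data), and no significance test is included.

```python
from statistics import mean

def selected_vs_rest(topics, selected, measure):
    """Mean of `measure` over selected topics minus the mean over the rest.
    Positive: selected topics score higher; negative: lower.
    Raises statistics.StatisticsError if either group is empty."""
    chosen = [t[measure] for t in topics if t["name"] in selected]
    rest = [t[measure] for t in topics if t["name"] not in selected]
    return mean(chosen) - mean(rest)

# Made-up scores for one hypothetical persona (toy values only).
topics = [
    {"name": "Energy", "similarity": 0.9, "familiarity": 0.8},
    {"name": "Environment", "similarity": 0.8, "familiarity": 0.6},
    {"name": "Sport", "similarity": 0.2, "familiarity": 0.1},
    {"name": "Care", "similarity": 0.1, "familiarity": 0.1},
]

# Hypothetical selections under each goal.
broaden_choice = {"Energy", "Environment"}
discover_choice = {"Sport", "Care"}

broaden_sim = selected_vs_rest(topics, broaden_choice, "similarity")
discover_fam = selected_vs_rest(topics, discover_choice, "familiarity")
```

Hypothesis 2 predicts a positive difference under Broaden Horizon and hypothesis 3 a negative one under Discover the Unexplored; a real analysis would add a statistical test over all participants.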
We did not find evidence that people select more similar topics in Broaden Horizon than in Discover the Unexplored.
People do select more familiar topics in Broaden Horizon than in Discover the Unexplored.
H1: partial support
The second hypothesis is rejected: people don’t select more similar and familiar topics.
Hypotheses
Partial evidence for the third hypothesis:
people select topics that are less familiar when discovering the unexplored.