Hashtagger+: Real-time Social Tagging of Streaming News - Dr. Georgiana Ifrim
1. Hashtagger+: Real-time Social
Tagging of Streaming News
Georgiana Ifrim
(joint work with Bichen Shi, Gevorg Poghosyan, Neil Hurley)
Insight Centre for Data Analytics,
University College Dublin, Ireland
3. [Figure: timeline from Sep 21 to Oct 07 of Hong Kong protest headlines, with the hashtags #OccupyCentral, #UmbrellaRevolution and #HongKong]
Headlines shown along the timeline:
• Hong Kong students begin pro-democracy class boycott
• Thousands at Hong Kong protest as Occupy Central is launched
• Hong Kong protests: Thousands defy calls to go home
• Hong Kong students vow stronger protests if leader stays
• Hong Kong protests: Formal talks agreed as protests shrink
4. Motivation:
News articles – Hashtag – Twitter conversation
#IndyRef (Referendum on Scottish Independence)
• BBC: Scottish independence: Yes vote 'means big Scots EU boost'
• BBC: Could Scotland compete on tax with Westminster?
• IrishTimes: Brown promises more devolution for Scotland
• RTE: Lloyds could move south if Scots vote for independence
• Reuters: British PM heads to Scotland as independence campaign gathers steam
• TheGuardian: Scottish independence: No camp sends for Gordon Brown as polls tighten
April 2016
6. Problem Statement
Map a stream of articles to a stream of
hashtags in real-time, with high-precision
and high-coverage.
Joe Schmidt makes six changes to Irish side to face Japan #rugby
Paris Airshow: eight takeaways from the major aerospace event #business
The tortoise and the software: the human glitch in the machine #ux
Duke of Edinburgh leaves hospital #princephilip
7. Problem Statement
•Real-time recommendation: given an article, how quickly
can we recommend hashtags? (5 minutes is acceptable, 5 hours is not)
•High precision (focused hashtags):
✗ Deadly car bomb targets Afghan bank #news
✓ Deadly car bomb targets Afghan bank #afghanistan #helmand
•High coverage: what fraction of articles get any recommended
hashtags within 5 minutes? (9 out of 10 is acceptable, 1 out of 10 is not)
9. State-of-the-art:
• Multi-class classification (e.g., Naive Bayes, SVM,
LDA, CNN)
• One hashtag = one class
• Content-based features
#GE16
#ge16: Fine Gael and Fianna Fáil to discuss government options
Ruth Coppinger to be nominated for Taoiseach #GE16 #irishwater
…
#Germanwings
"No evidence" that co-pilot told anyone he was planning #Germanwings crash, prosecutor says
…
10. State-of-the-art:
Train (hashtagged tweets -> Model):
#GE16
#ge16: Fine Gael and Fianna Fáil to discuss government options
Ruth Coppinger to be nominated for Taoiseach #GE16 #irishwater
…
#Germanwings
"No evidence" that co-pilot told anyone he was planning #Germanwings crash, prosecutor says…
…
Apply (Model -> new tweets):
#PanamaPapers
#PanamaPapers: Mossack Fonseca leak reveals elite's tax havens
#PanamaPapers: How the World's Rich and Famous Hide Their Money Offshore
#Germanwings:
One year on, Haltern commemorates the crash
Nice flight from Manchester to Koln/Bonn this morning.
Weakness:
How about new hashtags? Concept-drift of old hashtags?
Re-train the model
11. Challenges:
•Many Classes: thousands of hashtags (e.g., 26k/day)
•Dynamic Classes: hashtags emerge and die off
•Concept Drift: the usage and meaning of hashtags change
•Efficiency/coverage: real-time tagging to capture
how the story moves over time
•Precision: state-of-the-art models have P@1 of ~50%
12. Hashtagger+ Model
•Modeling Approach:
• Learning-to-rank (L2R)
• Focus on the concept of hashtag relevance
• IR Framework: Article = query, Hashtags =
documents retrieved/ranked for the query
• Workflow: Article -> Tweets
14. Hashtagger+ Model
SOTA: Multi-class Classification
Object                  Class
Article1                Hashtagx, Hashtagy, Hashtagz
Article2                Hashtagx
Article3                Hashtagy, Hashtagz
Proposed L2R Model
Object                  Class
(Article1, Hashtagx)    Relevant
(Article1, Hashtagm)    Irrelevant
(Article1, Hashtagn)    Irrelevant
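The change of representation above can be sketched in a few lines of Python. This is only an illustration under assumptions: the article and hashtag names are the placeholders from the slide, while the helper `to_pairs` and the candidate set are hypothetical and not taken from the authors' system.

```python
# Multi-class view from the slide: each article maps to its set of hashtag classes.
multiclass = {
    "Article1": {"Hashtagx", "Hashtagy", "Hashtagz"},
    "Article2": {"Hashtagx"},
    "Article3": {"Hashtagy", "Hashtagz"},
}

def to_pairs(article, relevant, candidates):
    """Turn one article into labeled (article, hashtag) pairs (hypothetical helper)."""
    return [
        ((article, tag), "Relevant" if tag in relevant else "Irrelevant")
        for tag in sorted(candidates)
    ]

# Assumed candidate set for Article1: one relevant tag and two unrelated ones.
pairs = to_pairs("Article1", multiclass["Article1"], {"Hashtagx", "Hashtagm", "Hashtagn"})
for pair, label in pairs:
    print(pair, label)
```

In the pair view the label set is always {Relevant, Irrelevant}, so a hashtag that never appeared in training can still be scored.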
15. Hashtagger+ Model
• Pointwise L2R model
• Input feature vector x_{article,hashtag,time} describes a given
(Article, Hashtag) pair at a point in time
• Human-provided label y_{article,hashtag,time} tells if the hashtag is
relevant or irrelevant to the article, at that point in time
• Time-aware features capture how strongly a hashtag is
associated with an article:
• Content Similarity
• Hashtag Popularity, Specificity, Trending
• User Credibility
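As a rough illustration of a time-aware feature vector for one (Article, Hashtag) pair, the sketch below computes three stand-in features: content similarity, popularity and a trending signal. The function names, the time window and the exact feature definitions are assumptions made for this example. They are not the definitions used in Hashtagger+, and specificity and user credibility are left out.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def pair_features(article_text, tweets, hashtag, now, window=3600):
    """Illustrative features for one (article, hashtag) pair at time `now`.

    `tweets` is a list of (timestamp, lowercase text) collected for the article.
    """
    recent = [text for ts, text in tweets if now - window <= ts <= now]
    earlier = [text for ts, text in tweets if now - 2 * window <= ts < now - window]
    tagged_recent = [t for t in recent if hashtag in t.split()]
    tagged_earlier = [t for t in earlier if hashtag in t.split()]
    article_bag = Counter(article_text.lower().split())
    tag_bag = Counter(w for t in tagged_recent for w in t.split() if not w.startswith("#"))
    similarity = cosine(article_bag, tag_bag)                          # content similarity
    popularity = len(tagged_recent) / len(recent) if recent else 0.0   # share of recent tweets
    trending = len(tagged_recent) - len(tagged_earlier)                # change between windows
    return [similarity, popularity, trending]

article = "hong kong students begin pro-democracy class boycott"
tweets = [
    (100, "students boycott class in hong kong #occupycentral"),
    (200, "hong kong protest grows #occupycentral #hongkong"),
    (250, "great coffee this morning #monday"),
]
x = pair_features(article, tweets, "#occupycentral", now=300, window=300)
```

Because the vector is recomputed from the tweets available at `now`, the same pair can get different feature values as the conversation evolves.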
16. Hashtagger+ Model:
Train:
(Article1, #GE16)            0.34  0.73  0  …  Relevant
(Article1, #Germanwings)     0.01  0.23  0  …  Irrelevant
…
(Article2, #GE16)            0.02  0.48  0  …  Irrelevant
(Article2, #Germanwings)     0.76  0.45  1  …  Relevant
…
How about new hashtags? Concept-drift of old hashtags?
Train once, use model (no retraining needed)
Apply:
(Article1, #PanamaPapers)    0.66  0.82  1  …
(Article2, #PanamaPapers)    0.08  0.73  0  …
(Article1, #Germanwings)     0.28  0.45  0  …
(Article2, #Germanwings)     0.53  0.24  1  …
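The train-once idea can be sketched with a tiny pointwise model. The feature rows below are the values shown on the slide (its elided columns are ignored). The logistic regression, its hyperparameters and the helper names are assumptions chosen to keep the example dependency-free, and do not represent the model used in Hashtagger+.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, labels, epochs=2000, lr=0.5):
    """Fit a logistic-regression relevance model with batch gradient descent."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b

def score(model, x):
    """Relevance score in (0, 1) for one feature vector."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Training rows from the slide: three feature values per (article, hashtag) pair.
train_rows = [[0.34, 0.73, 0], [0.01, 0.23, 0], [0.02, 0.48, 0], [0.76, 0.45, 1]]
train_labels = [1, 0, 0, 1]  # 1 = Relevant, 0 = Irrelevant
model = train(train_rows, train_labels)

# Unseen pairs from the slide, scored with the same model and no retraining.
new_pairs = {
    ("Article1", "#PanamaPapers"): [0.66, 0.82, 1],
    ("Article1", "#Germanwings"): [0.28, 0.45, 0],
    ("Article2", "#PanamaPapers"): [0.08, 0.73, 0],
    ("Article2", "#Germanwings"): [0.53, 0.24, 1],
}
ranked = sorted(new_pairs, key=lambda p: score(model, new_pairs[p]), reverse=True)
```

The model never sees hashtag identities, only pair features, which is why a hashtag such as #PanamaPapers can be ranked even though it was absent from training.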
17. Two-Step L2R Approach
• Filtering: Article -> Set of Candidate Hashtags
• Efficient Data Collection
• Query generation from given article
• Retrieving relevant tweets for article/query
• Ranking Model: Article, Candidate Hashtags -> Ranked Hashtag List
• Apply pre-trained L2R model to rank candidate hashtags
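The two steps above can be sketched as follows. The function names and the stand-in scorer are hypothetical. In the real pipeline the scorer would be the pre-trained L2R model, and the tweets would come from the data collection step.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#\w+")

def candidate_hashtags(tweets, min_count=1):
    """Step 1 (filtering): collect hashtags from the tweets retrieved for an article."""
    counts = Counter(tag.lower() for t in tweets for tag in HASHTAG.findall(t))
    return {tag: n for tag, n in counts.items() if n >= min_count}

def rank_hashtags(candidates, relevance):
    """Step 2 (ranking): order candidates by a relevance score, highest first."""
    return sorted(candidates, key=relevance, reverse=True)

tweets = [
    "Deadly car bomb targets Afghan bank #Afghanistan #news",
    "Blast in Helmand province #helmand #afghanistan",
    "Morning headlines #news",
]
candidates = candidate_hashtags(tweets)
# Stand-in scorer: penalise the generic tag, otherwise use the tag frequency.
ranking = rank_hashtags(candidates, lambda tag: 0.0 if tag == "#news" else candidates[tag])
print(ranking)
```

Filtering keeps the ranking step cheap: only hashtags that actually co-occur with tweets about the article are scored.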
19. Query Generation: Article -> Query
•What is a good set of keywords to describe what
the article is about? (open research problem)
•How quickly can we generate the query?
•How good is the set of tweets retrieved with a
given query?
•We compare 4 methods for query generation and
the effect on quality & size of retrieved tweet set
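One simple way to turn an article into keyword queries is to rank its terms by tf.idf and pair the top ones. The sketch below is an assumption-laden stand-in for the tf.idf-based methods compared in the talk: it uses a stopword list instead of POS tagging or NER, and the background documents, stopwords and function names are made up for the example.

```python
import math
import re
from collections import Counter
from itertools import combinations

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def keyword_pair_queries(article, background_docs, top_k=3, stopwords=frozenset()):
    """Rank article terms by tf.idf and pair the top ones into 2-word queries."""
    tf = Counter(w for w in tokenize(article) if w not in stopwords)
    n_docs = len(background_docs) + 1  # background documents plus the article itself

    def idf(word):
        df = 1 + sum(word in set(tokenize(d)) for d in background_docs)
        return math.log(n_docs / df) + 1.0

    # Highest tf.idf first; ties are broken alphabetically for reproducibility.
    top = sorted(tf, key=lambda w: (-tf[w] * idf(w), w))[:top_k]
    return [" ".join(sorted(pair)) for pair in combinations(top, 2)]

article = ("Easyjet doubles number of female pilots. Easyjet says it has doubled "
           "the number of female pilots this year and is on the hunt for more.")
background = ["Number of flights rises this year", "Female athletes win more medals"]
stop = frozenset({"of", "it", "has", "the", "this", "and", "is", "on", "for", "says", "more"})
queries = keyword_pair_queries(article, background, top_k=3, stopwords=stop)
print(queries)
```

Short two-word queries are cheap to generate and to send to a search backend, which matters when the budget per article is a few minutes.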
20. Tweet Retrieval: Query -> Tweets
•Given a query (generated from an article), how do
we quickly collect a good set of tweets?
•Cold-start Search for new articles:
• Re-use tweets collected for older articles
• How do we do this efficiently/effectively?
•Twitter Streaming API to continuously update
tweet collection for each article
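Re-using tweets collected for older articles can be approximated with a local index that is searched before any new stream data arrives. The `TweetStore` class below is a hypothetical in-memory sketch, not the storage layer of the deployed system, and it does not call the Twitter API.

```python
from collections import defaultdict

class TweetStore:
    """In-memory store of already collected tweets with a word-level inverted index."""

    def __init__(self):
        self.tweets = []
        self.index = defaultdict(set)

    def add(self, text):
        tweet_id = len(self.tweets)
        self.tweets.append(text)
        for word in set(text.lower().split()):
            self.index[word].add(tweet_id)

    def search(self, query):
        """Return stored tweets that contain every keyword of the query."""
        words = query.lower().split()
        if not words:
            return []
        ids = set.intersection(*(self.index.get(w, set()) for w in words))
        return [self.tweets[i] for i in sorted(ids)]

store = TweetStore()
store.add("Easyjet doubles female pilots #aviation")
store.add("Amy Johnson initiative brings more female pilots to Easyjet #womeninaviation")
store.add("Hong Kong protests continue #hongkong")
hits = store.search("easyjet pilots")
print(hits)
```

A new article can get candidate hashtags from such a store immediately, while the streaming collection keeps adding fresher tweets in the background.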
22. Query Generation
… empirical study to evaluate the impact of each query type
on the amount/quality of data collected, as well as how this
influences the recommendation effectiveness.
TABLE 1
Example article and ranked article-keyphrases using 4 approaches.
Article Headline:     Easyjet doubles number of female pilots
Subheadline:          Easyjet says it has doubled the number of female pilots this year and is on the hunt for more.
First Sentence:       The Amy Johnson initiative, named after the first female pilot to fly solo from the UK to Australia, caused a surge in applications.
POS + Tf.idf:         (1) australia easyjet, (2) easyjet number, (3) easyjet uk, (4) australia number, (5) australia uk
POS + NER + Tf.idf:   (1) amy johnson, (2) australia easyjet, (3) easyjet uk, (4) australia uk, (5) easyjet number
AlchemyAPI:           (1) amy johnson initiative, (2) female pilots, (3) easyjet, (4) female pilot, (5) surge
URL:                  (1) bbc.com/news/business-38326523
23. Query Generation
TABLE 3
Precision@1, article coverage and running time for end-to-end hashtag recommendation, by query generation method.
            L2R (POS + Tf.idf)   L2R (POS + NER + Tf.idf)   L2R (AlchemyAPI)   L2R (URL)
P@1         0.930                0.947                      0.901              0.410
Coverage    67.3%                63.3%                      71.3%              22.1%
Time        301s                 200s                       588s               48s
TABLE 4
Average cosine similarity, number of tweets, number of candidate
hashtags and hashtag frequency using tweets collected using four
query generation methods.
            POS + Tf.idf   POS + NER + Tf.idf   AlchemyAPI   URL
Cosine      0.221          0.242                0.246        0.265
Tweets      3696.2         2982.9               5083.8       4.2
Hashtags    529            442                  976          1.5
Tag Freq    5.26           5.73                 5.81         1.49
24. Comparing L2R algorithms
TABLE 5
Comparing the P@1, NDCG@3 and running time of 16 ranking
methods using Ranklib, sklearn and Cornell’s RankSVM.
L2R Algorithm                    P@1     NDCG@3   Time(s)
Pointwise
RandomForest(sklearn)            0.852   0.848    2.75
MultilayerPerceptron(sklearn)    0.835   0.803    6.14
SVM(poly)(sklearn)               0.823   0.827    0.78
GradientBoosting(sklearn)        0.810   0.817    1.71
LinearRegression(sklearn)        0.803   0.824    0.16
AdaBoost(sklearn)                0.801   0.840    1.51
RandomForest(ranklib)            0.792   0.778    2.01
MART(ranklib)                    0.783   0.768    49.87
GaussianNaiveBayes(sklearn)      0.764   0.757    0.05
Pairwise
RankBoost(ranklib)               0.774   0.773    15.67
RankSVM(cornell)                 0.728   0.734    2.05
RankNet(ranklib)                 0.654   0.718    7.45
Listwise
CoordinateAscent(ranklib)        0.778   0.765    28.11
LambdaMART(ranklib)              0.769   0.766    54.48
ListNet(ranklib)                 0.751   0.756    14.56
AdaRank(ranklib)                 0.737   0.749    2.53
… listwise ranking algorithms, pointwise methods have higher …
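For readers unfamiliar with the two ranking metrics in Table 5, the following sketch computes P@1 and NDCG@3 for a single ranked hashtag list with binary relevance labels. It uses the common log2-discounted DCG. Whether the paper uses exactly this variant is an assumption, and the function names are illustrative.

```python
import math

def precision_at_1(ranked_labels):
    """1.0 if the top-ranked hashtag is relevant, else 0.0."""
    return float(ranked_labels[0]) if ranked_labels else 0.0

def ndcg_at_k(ranked_labels, k=3):
    """NDCG@k for binary relevance labels given in ranked order."""
    def dcg(labels):
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(labels[:k]))
    ideal = dcg(sorted(ranked_labels, reverse=True))
    return dcg(ranked_labels) / ideal if ideal else 0.0

# Example: the top hashtag is irrelevant, the next two are relevant.
labels = [0, 1, 1]
print(precision_at_1(labels), round(ndcg_at_k(labels), 4))
```

Averaging both functions over all test articles gives figures comparable in kind to the P@1 and NDCG@3 columns.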
25. Comparing to State-of-the-Art
•Multi-class Classification Methods:
• Use hashtagged tweets as labeled data (hashtag = class)
• Need to wait to collect enough training data (tweet history size, e.g., 2h or
4h of past tweets)
• Need to be retrained often to keep up with changes in tweet vocabulary and
emerging/dying hashtags (retraining time: the time required to train the
model decides how often we can re-train)
• Naive Bayes, Liblinear SVM, Neural Net
•L2R Methods:
• Trained once with hashtagged tweets or manually labeled (article, hashtag)
examples
• Pairwise L2R and Pointwise L2R (Hashtagger+)
26. Comparing to State-of-the-Art
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. , NO.
[Figure: Precision@1 (y-axis, 0.1 to 1) versus Article Coverage (x-axis, 0% to 90%), panel "All Articles". Annotated points: 66%, 0.94 (th=0.5); 77%, 0.89 (th=0.3).]
Methods compared: Hashtagger+ (search), Hashtagger (stream), PairwiseL2R, Liblinear (2h/30min), Naive Bayes (4h/5min), MultilayerPerc (1h/1h)
Fig. 7. P@1 and article coverage of the SOTA methods compared.
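The th values in the figure suggest a score threshold that trades coverage for precision. A minimal sketch of that trade-off follows, under the assumption that an article counts as covered when its top hashtag scores at least the threshold. The prediction scores are made up purely for illustration and are not results from the talk.

```python
def precision_and_coverage(predictions, threshold):
    """Return (P@1, coverage) for a given score threshold.

    predictions: list of (top_score, top_is_relevant), one entry per article.
    An article is covered when its best hashtag scores at least `threshold`;
    P@1 is the share of covered articles whose top hashtag is relevant.
    """
    covered = [relevant for score, relevant in predictions if score >= threshold]
    coverage = len(covered) / len(predictions) if predictions else 0.0
    p_at_1 = sum(covered) / len(covered) if covered else 0.0
    return p_at_1, coverage

# Made-up scores for five articles, only to show the direction of the trade-off.
predictions = [(0.9, True), (0.7, True), (0.45, False), (0.35, True), (0.1, False)]
for th in (0.5, 0.3):
    print(th, precision_and_coverage(predictions, th))
```

Lowering the threshold covers more articles but lets weaker recommendations through, which matches the direction of the two annotated points in the figure.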
27. Comparing to State-of-the-Art: Popular vs Niche Articles
[Figure: Precision@1 versus Article Coverage (0% to 90%) in separate panels for popular and niche articles. Niche Articles panel annotated points: 45%, 0.94 (th=0.5); 58%, 0.89 (th=0.3).]
Legend as extracted: Hashtagger (stream), PairwiseL2R, Liblinear (2h/30min), Naive Bayes (4h/5min), MultilayerPerc (1h/1h)
Fig. 8. P@1 and article coverage for popular versus niche articles.
28. Applications
•Hashtagger+ is deployed in a Web application
(http://insight4news.ucd.ie)
•Using the recommended hashtags:
• News Publishing on Twitter
• Story Detection & Tracking
30. [Figure: three bar charts from Twitter Analytics Stats comparing the groups No Hashtag, #News and Hashtagger: Sum of Impressions (axis 0 to 150000), Sum of Engagements (axis 0 to 1000), Sum of Url Clicks (axis 0 to 600)]
Twitter account (@insight4news3) automatically tweets article headlines.
Randomly allocate articles into 3 groups:
• No Hashtag: Article Headline + URL
• #News: Article Headline + URL + #News
• Hashtagger: Article Headline + URL + Recommended Hashtags
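The experimental setup described above can be sketched as a random allocation followed by per-group sums. The function names, the seed and the example URL are hypothetical, and the metric values would come from Twitter Analytics exports rather than from this code.

```python
import random
from collections import defaultdict

GROUPS = ("No Hashtag", "#News", "Hashtagger")

def compose_tweet(headline, url, group, recommended=()):
    """Build the tweet text for one article according to its experiment group."""
    parts = [headline, url]
    if group == "#News":
        parts.append("#News")
    elif group == "Hashtagger":
        parts.extend(recommended)
    return " ".join(parts)

def allocate(article_ids, seed=0):
    """Assign each article to one of the three groups uniformly at random."""
    rng = random.Random(seed)
    return {article_id: rng.choice(GROUPS) for article_id in article_ids}

def sum_by_group(allocation, metric):
    """Sum a per-article metric (e.g. impressions) within each group."""
    totals = defaultdict(int)
    for article_id, group in allocation.items():
        totals[group] += metric.get(article_id, 0)
    return dict(totals)

allocation = allocate(range(30))
# Hypothetical URL, used only to show the tweet format.
tweet = compose_tweet("Duke of Edinburgh leaves hospital", "http://example.com/a",
                      "Hashtagger", ["#princephilip"])
print(tweet)
```

Random allocation keeps the three groups comparable, so differences in the summed impressions, engagements and URL clicks can be attributed to the hashtags rather than to the articles.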
34. Conclusion
•Hashtagger+: a framework for real-time hashtag
recommendation for news.
•An L2R model trained with human-labeled data can
address the efficiency & precision challenges.
•By merging news and social media we can address
difficult problems: story & entity detection/
visualization/tracking/disambiguation/linking.
35. Thank you!
References
•Hashtagger+: Efficient High-Coverage Social Tagging of Streaming News,
B. Shi, G. Poghosyan, G. Ifrim, N. Hurley [2017, under review]
•Learning-to-Rank for Real-Time High-Precision Hashtag
Recommendation for Streaming News, B. Shi, G. Ifrim, N. Hurley [WWW16]
•Real-time News Story Detection and Tracking with Hashtags,
G. Poghosyan, G. Ifrim [CNewsStory16]
•Topy: Real-time Story Tracking via Social Tags, G. Poghosyan, A. Qureshi,
G. Ifrim [ECML/PKDD16]
•Insight4news: Connecting news to relevant social conversations, B. Shi,
G. Ifrim, N. Hurley [ECML/PKDD14]