InsideOut10 develops WordLift, a product that brings Artificial Intelligence to web publishers around the world by turning editorial content into actionable knowledge. WordLift integrates with WordPress, a well-known open source CMS. This presentation also introduces MICO (Media in Context), a research project co-funded by the European Commission.
2. This fine event is hosted by:
@multilingweb // LIDER
Hello, I am:
@cyberandy
This workshop is about:
future of journalism
opendata
@wordliftit v3
@mico_project
No.8 - MARK ROTHKO
6. “OK Hound, when will the
sun rise in Japan two days
before Christmas in 2021?”
Friendly, helpful and intelligent:
a completely new class of voice-enabled
assistants has just arrived
7. Beta Testing the Apocalypse - TOM KACZYNSKI
BANKS & INVESTORS: ANTI-MONEY LAUNDERING COMPLIANCE AND INVESTMENT STRATEGIES
LAW FIRMS: CHECKING IF THERE ARE ONGOING OR PAST LEGAL PROCEEDINGS
POLICY MAKERS: NEWS AS VALUABLE INPUT IN THE LAW-MAKING PROCESS
BUSINESS: CREATING BUSINESS VALUE AND TAKING DECISIONS BY READING NEWS
(Humans)…creating value with News
9. Text Generation Algorithms
can interpret your data and turn
it into meaningful, personalised
content.
Associated Press announced last year
that corporate earnings stories and
sports stories are written
automatically.
Logan Ingalls / Flickr
10. Analysts expect higher profit for
Paychex when the company reports
its fourth quarter results on Tuesday,
July 1, 2014. The consensus
estimate is calling for profit of 40
cents a share, reflecting a rise from
38 cents per share a year ago.
Your New Colleague…the Algorithm
has just written a new piece.
12. “If our role as journalists is to
help communities better
organize their knowledge and
themselves, then it is apparent
that we are in the service
business and that we must draw
on many tools, including
content, and place value on the
relationships we build with
members of our communities,
which will also take many forms.
Thus we are in the relationship
business.”
Jeff Jarvis
Human Factor is key!
14. MEANINGFULLY ORGANISE YOUR CONTENT
A Semantic Editor for WordPress
for journalists and bloggers to:
ASSIST THE WRITING PROCESS
WITH CONTEXTUAL INFORMATION
ADD STRUCTURED METADATA
ENRICH CONTENT SUGGESTING
IMAGES, LINKS AND WIDGETS
RECOMMEND RELEVANT CONTENT
TO READERS
BUILD AN OPEN DATASET
(ENTITIES + ANNOTATIONS + CONTENT)
15. ASSIST THE WRITING PROCESS
WITH CONTEXTUAL INFORMATION
Fact-based information is derived
from open datasets and are
contextually relevant to the article.
Editors can choose what datasets
will be used for the enrichment.
16. ENRICH CONTENT SUGGESTING
IMAGES, LINKS AND WIDGETS
Relevant and free-to-use photos and illustrations from the Commons community
Meaningful navigation systems for internal interlinking
17. Bringing to the audience an
overview of all the content
being written around a
specific topic using the chord
widget.
RECOMMEND RELEVANT CONTENT
content evolution over time
INTRODUCING THE NAVIGATOR WIDGET
WHERE /entity/earth (schema:Place)
WHO /entity/michael-caine (schema:Person)
WHO /entity/nasa (schema:Organization)
Post /2015/07/04/coopers-endurance-crew/ (type: BlogPosting)
Creates links to entity pages and related articles by using the
WHO, WHERE, WHAT and WHEN classifications.
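As a rough illustration of how such a classification could drive the links, here is a minimal Python sketch. The mapping from schema.org types to WHO / WHERE / WHAT / WHEN, the fallback to WHAT, and the function name are assumptions made for this example; they are not taken from WordLift's code.

```python
# Hypothetical mapping from schema.org types to the four W classifications.
W_BY_TYPE = {
    "schema:Person": "WHO",
    "schema:Organization": "WHO",
    "schema:Place": "WHERE",
    "schema:Event": "WHEN",
}


def navigator_links(entities):
    """Group entity page paths under WHO / WHERE / WHAT / WHEN."""
    groups = {"WHO": [], "WHERE": [], "WHAT": [], "WHEN": []}
    for path, schema_type in entities:
        # Types without an explicit mapping fall back to WHAT (an assumption).
        groups[W_BY_TYPE.get(schema_type, "WHAT")].append(path)
    return groups


# Entity paths as shown on the slide, paired with their schema.org types.
post_entities = [
    ("/entity/earth", "schema:Place"),
    ("/entity/michael-caine", "schema:Person"),
    ("/entity/nasa", "schema:Organization"),
]
links = navigator_links(post_entities)
```

With the three entities from the slide, the post links to two WHO pages and one WHERE page.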
18. ADD STRUCTURED METADATA
The blog post, entities (dct:references),
publishing information (schema:datePublished
and schema:dateModified), the author
(schema:author), and the number of comments
(schema:interactionCount) are published as
Linked Open Data and printed using schema.org
for on-page SEO.
http://data.redlink.io/91/be2/post/Interstellar.html
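To make the list of properties concrete, the following Python sketch builds a JSON-LD description of a blog post using the properties named above. It is an illustration, not WordLift's actual output: the example.org URLs, the author name, the dates, the comment count and the function name are invented, and the "UserComments:N" text format for schema:interactionCount is the convention assumed here.

```python
import json


def blog_post_jsonld(url, headline, author, published, modified, comments, entity_urls):
    """Describe a blog post with schema.org terms plus dct:references."""
    return {
        "@context": {
            "@vocab": "http://schema.org/",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "BlogPosting",
        "@id": url,
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": published,
        "dateModified": modified,
        # Assumed text format for the number of comments.
        "interactionCount": f"UserComments:{comments}",
        # One reference per entity annotated in the post.
        "dct:references": [{"@id": entity} for entity in entity_urls],
    }


# All values below are hypothetical placeholders.
doc = blog_post_jsonld(
    url="https://example.org/2015/07/04/coopers-endurance-crew/",
    headline="Cooper's Endurance crew",
    author="Jane Doe",
    published="2015-07-04",
    modified="2015-07-05",
    comments=3,
    entity_urls=["https://example.org/entity/nasa", "https://example.org/entity/earth"],
)
print(json.dumps(doc, indent=2))
```

The serialised document could be embedded in the page for on-page SEO and published as Linked Data from the same source.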
19. Editors identify the basic 'WHO, WHAT, WHEN and WHERE' of an
article and structure information around it by creating new
entities in their custom vocabulary.
Content, vocabulary and annotations constitute the
publisher’s knowledge graph, which can be queried via SPARQL.
BUILD AN OPEN DATASET
(ENTITIES + ANNOTATIONS + CONTENT)
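A query against such a graph might look like the sketch below, which asks for every blog post that references a given entity. The endpoint URL and entity URI are placeholders, and the graph shape (posts typed as schema:BlogPosting and linked to entities via dct:references) is assumed from the metadata slide rather than confirmed against a live dataset.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "https://example.org/sparql"  # hypothetical SPARQL endpoint


def posts_referencing(entity_uri):
    """Build a SPARQL query listing posts that reference one entity."""
    return f"""
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX schema: <http://schema.org/>
SELECT ?post ?published WHERE {{
  ?post a schema:BlogPosting ;
        dct:references <{entity_uri}> ;
        schema:datePublished ?published .
}}
ORDER BY DESC(?published)
""".strip()


def request_url(query):
    """The SPARQL Protocol accepts a query via GET in the 'query' parameter."""
    return f"{ENDPOINT}?{urlencode({'query': query})}"


query = posts_referencing("https://example.org/entity/nasa")
url = request_url(query)
print(query)
```

The resulting URL can be fetched with any HTTP client; the response format depends on the endpoint and the Accept header sent.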
20. How does a blog post look in the knowledge graph?
Special thanks to @dvcama :)
owl:sameAs connects entities detected in the blog post, such as
Wormhole, with the same entity
on DBpedia and Freebase.
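The effect of owl:sameAs can be sketched with a toy in-memory list of triples. The local entity URI is a placeholder, and only the DBpedia link is shown because the Freebase identifier is not given in the slides.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Toy triples (subject, predicate, object); the local URI is hypothetical.
triples = [
    ("https://example.org/entity/wormhole", RDFS_LABEL, "Wormhole"),
    ("https://example.org/entity/wormhole", OWL_SAME_AS,
     "http://dbpedia.org/resource/Wormhole"),
]


def same_as(entity, graph):
    """Return the external URIs declared equivalent to a local entity."""
    return [o for s, p, o in graph if s == entity and p == OWL_SAME_AS]


equivalents = same_as("https://example.org/entity/wormhole", triples)
```

Following these links is what lets a publisher's entity page pull in facts and images from the open datasets.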
21. Starting this coming September, WordLift and the technologies of MICO (for
cross-media analysis) are going to be used and validated by Greenpeace Italy
on their subscribers' magazine website (magazine.greenpeace.it).
Let’s move now to a real-world use case
where ecologists, journalists and visionaries
stand to defend the natural world and to
promote peace.
22. Technology Stack
1 CONTENT ANALYSIS
2 CONTENT DISCOVERY
3 LINKED DATA PUBLISHING
Text
Legacy Data
Audio/Images
MICO is a three-year EU-funded research project
(grant no. 610480) that
brings to the platform:
Cross-Media Extraction
Cross-Media Metadata Publishing
Cross-Media Querying
Cross-Media Recommendation
• Enterprise Linked Data
• Content Analysis
• Semantic Search
• Semantic Media Analysis and Search
Media extractors available in MICO today:
Animal detection, video quality, temporal segmentation,
automatic speech recognition, speech-music discrimination,
face detection and audio tampering detection.
23. Multimedia Retrieval
Cross-Media Querying:
Introducing the SPARQL extension SPARQL-MM, which adds
multimedia specific features to the standard query
language for the Semantic Web.
How can we help
Greenpeace Italy?
• Connect videos with text using
cross-media recommendations
• Provide compact contextual
information for media assets
• Create new discovery paths for
their readers and subscribers
Spatio-Temporal Object Model in SPARQL-MM
“Point me to scenes within
videos where Barack
Obama is standing to the left
of the MD of Greenpeace
while talking about whale
hunting”
Find out more on the SPARQL extension SPARQL-MM by reading this presentation by Thomas Kurz
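The spatial and temporal part of such a question can be illustrated in plain Python. This sketch does not use SPARQL-MM syntax or function names; the Fragment class, its fields and the sample values are assumptions made for illustration, and the "talking about" condition, which would rely on speech recognition output, is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A detected object in a video: a time span plus a horizontal extent."""
    label: str
    start: float  # seconds
    end: float    # seconds
    x: float      # left edge of the bounding box, in pixels
    width: float  # bounding box width, in pixels


def overlaps_in_time(a, b):
    """True when the two fragments share at least one instant."""
    return a.start < b.end and b.start < a.end


def left_of(a, b):
    """True when a's bounding box ends before b's begins."""
    return a.x + a.width <= b.x


def scenes(fragments, left_label, right_label):
    """Time spans where left_label appears to the left of right_label."""
    return [
        (max(a.start, b.start), min(a.end, b.end))
        for a in fragments if a.label == left_label
        for b in fragments if b.label == right_label
        if overlaps_in_time(a, b) and left_of(a, b)
    ]


# Hypothetical detections for two people in one video.
fragments = [
    Fragment("person_a", 10, 25, 100, 80),
    Fragment("person_b", 20, 40, 300, 90),
    Fragment("person_b", 50, 60, 20, 90),
]
matches = scenes(fragments, "person_a", "person_b")
```

A multimedia query extension expresses the same two relations, spatial position and temporal overlap, declaratively inside the query instead of in application code.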
24. Lessons learned so far…
• The bond between data and journalism is growing stronger, and even for
independent news organisations like Greenpeace, providing context and clarity
and building relationships (and knowledge graphs) is vital.
• Algorithms are great and AI has entered the newsroom, but journalists
must preserve their authorship and role when crafting content: always
leave control in the hands of humans.
• Providing immediate added value in the UX of semantic apps like
WordLift is key to engaging journalists, not only marketers and
management.
• Tags don’t help organise content; named entities work much better.
• Linked Data is a service, NOT a technology: users want to see images,
meaningful links, recommendations and interactive widgets. They don’t
care about underlying technologies like RDF and SPARQL.
• Creating datasets as a side effect of editing content helps journalists
make an impact and connect with policy makers, businesses and other
communities.
25. JOIN.WORDLIFT.IT
Grazie!
“[SLIDES] Creating an open database of
knowledge by tagging the WHO, WHAT,
WHERE, WHEN of your contents #journalism”
mico-project.eu wordlift.it insideout.io
26. CREDITS
This presentation is the result of many inspiring ideas and amazing work from
media experts, journalists and technologists. Here is the list:
Wilfried Runde of Deutsche Welle, “In Praise of Robots and Humans”
Justin Kosslyn from Google Ideas, on thinking about how journalists'
work gets used
Luca Rosati from News to Experience
BBC News Labs, “A manifesto for structured journalism”
Any idea, graphic or meme belonging to us is available
for sharing, copying and re-mixing under
a Creative Commons 3.0 license.
This presentation and the work behind it was partially developed within the
MICO project (Media in Context - European Commission 7th Framework Programme
grant agreement no: 610480).
FIND OUT MORE ABOUT OUR PRODUCTS
Video Hosting Platform, Semantic Editor, Semantic Search