XIP Dashboard: Visual Analytics from Automated Rhetorical Parsing of Scientific Metadiscourse
ABSTRACT
A key competency that we seek to build in learners is a critical mind, i.e. the ability to engage with the ideas in the literature and to identify when significant claims are being made in articles. The ability to decode such moves in texts is essential, as is the ability to make such moves in one’s own writing. Computational techniques for extracting them are becoming available, using Natural Language Processing (NLP) tuned to recognize the rhetorical signals that authors use when making a significant scholarly move. After reviewing related NLP work, we introduce the Xerox Incremental Parser (XIP), note previous work to render its output, and then motivate the design of the XIP Dashboard, a set of visual analytics modules built on XIP output, using the LAK/EDM open dataset as a test corpus. We report preliminary user reactions to a paper prototype of the dashboard, describe the visualizations implemented to date, and present user scenarios for learners, educators and researchers. We conclude with a summary of ongoing design refinements, potential platform integrations, and questions that need to be investigated through end-user evaluations.
1. 1st International Workshop on Discourse-Centric Learning Analytics
April 8, 2013, LAK13 Conference, Leuven, Belgium
XIP Dashboard: Visual Analytics
from Automated Rhetorical Parsing
of Scientific Metadiscourse
Duygu Simsek, Simon Buckingham Shum, Anna De Liddo,
Rebecca Ferguson — The Open University, UK
Ágnes Sándor — Xerox Research Centre Europe, FR
3. Metadiscourse signals important moves
in educated/scholarly narrative
(When scholarly culture works well) this is what gets your papers accepted by reviewers, and quoted by others
Clear statements regarding the problem, the claim, the argument, the evidence, the implications…
This is what we teach students from school upwards
4. Rhetorical functions of metadiscourse identified
by the Xerox Incremental Parser (XIP)
BACKGROUND KNOWLEDGE:
Recent studies indicate …
… the previously proposed …
… is universally accepted ...

NOVELTY:
... new insights provide direct evidence ...
... we suggest a new ... approach ...
... results define a novel role ...

OPEN QUESTION:
… little is known …
… role … has been elusive
Current data is insufficient …

SUMMARIZING:
The goal of this study ...
Here, we show ...
Altogether, our results ... indicate

SIGNIFICANCE:
studies ... have provided important advances
Knowledge ... is crucial for ... understanding
valuable information ... from studies

CONTRASTING IDEAS:
… unorthodox view resolves … paradoxes …
In contrast with previous hypotheses ...
... inconsistent with past findings ...

GENERALIZING:
... emerging as a promising approach
Our understanding ... has grown exponentially ...
... growing recognition of the importance ...

SURPRISE:
We have recently observed ... surprisingly
We have identified ... unusual
The recent discovery ... suggests intriguing roles
5. Xerox Incremental Parser (XIP)
Sándor, Á. and Vorndran, A. (2010). The detection of salient messages from social science research papers and its application in document search. Workshop on Natural Language Processing Tools Applied to Discourse Analysis in Psychology, Buenos Aires, Argentina, May 10-14, 2010.
9. Initial evaluation of XIP is promising,
but methodologically complex
A striking example, but not all were like this (De Liddo et al., 2012)
Extract from annotation comparison:
            Human analyst            XIP                      Same as human annotation
Document 1  19 sentences annotated   22 sentences annotated   11 sentences
Document 2  71 sentences annotated   59 sentences annotated   42 sentences
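One possible reading of these counts, not stated on the slide, is as precision and recall of XIP measured against the human analyst. The sketch below works through that arithmetic; the function name overlap_metrics and the choice of metrics are assumptions made for illustration, not the evaluation method reported in De Liddo et al. (2012).

```python
def overlap_metrics(human: int, xip: int, shared: int) -> tuple[float, float]:
    """Return (precision, recall) of XIP relative to the human analyst.

    precision: share of XIP-annotated sentences the human also annotated.
    recall:    share of human-annotated sentences that XIP also annotated.
    """
    return shared / xip, shared / human


# Counts copied from the annotation comparison: (human, XIP, shared).
counts = {"Document 1": (19, 22, 11), "Document 2": (71, 59, 42)}

for name, (human, xip, shared) in counts.items():
    precision, recall = overlap_metrics(human, xip, shared)
    print(f"{name}: precision={precision:.2f}, recall={recall:.2f}")
# Document 1: precision=0.50, recall=0.58
# Document 2: precision=0.71, recall=0.59
```

Read this way, the two documents have similar recall but noticeably different precision, which is one reason a single example cannot stand in for the whole comparison.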
10. Xerox Incremental Parser (XIP)
XIP’s raw output is fine for NLP machines/researchers, but not learner/educator friendly
12. Xerox Incremental Parser (XIP)
5000 (or even 30) plain text files… we need overviews of XIP analyses from a corpus
13. Making XIP analytics visible:
1. annotations on the full text using the OU’s
Cohere social sensemaking app (Firefox add-on)
14. Making XIP analytics visible:
2. XIP annotations visualized in Cohere as a network
around the document
15. Making XIP analytics visible (2)
2nd phase analysis of document-concept clouds…
Connecting? Merging? Re-tagging? Summarising?
16. XIP Dashboard: towards an earlier phase
dashboard for navigating XIP output
Draw attention to patterns of potential significance to
students, educators and experienced researchers alike:
§ the occurrence of domain concepts in different metadiscourse contexts, e.g. effective tutoring dialogue in sentences classified as contrast
§ trends of the above over time, e.g. to show the development of an idea
§ trends within and differences between research communities as reflected in their publications
§ eventually, the above for one’s own writing
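The first two patterns above can be sketched as a simple count of how often a concept appears in sentences of a given rhetorical category, grouped by year. This is a minimal illustration only: the record layout (year, category, sentence), the category labels, and the toy sentences are all invented here and do not reflect XIP's actual output format or the dashboard's implementation.

```python
from collections import Counter

# Toy records invented for illustration: (publication year, category, sentence).
# This layout is an assumption, not XIP's real output format.
records = [
    (2011, "CONTRAST", "In contrast with previous work, effective tutoring dialogue ..."),
    (2012, "CONTRAST", "Effective tutoring dialogue is inconsistent with ..."),
    (2012, "SUMMARY", "Here, we show that effective tutoring dialogue ..."),
    (2012, "CONTRAST", "Unlike earlier models, student affect ..."),
]


def concept_trend(records, concept, category):
    """Count sentences per year that mention `concept` within `category`."""
    concept = concept.lower()
    counts = Counter(
        year
        for year, cat, sentence in records
        if cat == category and concept in sentence.lower()
    )
    return dict(sorted(counts.items()))


trend = concept_trend(records, "effective tutoring dialogue", "CONTRAST")
print(trend)  # {2011: 1, 2012: 1}
```

Plain substring matching is used here for brevity; a real pipeline would more likely match on the concepts that the parser itself extracts.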
18. Paper prototype to elicit initial reactions
‘Intro movie’ from researcher
Participants point + click with finger
Basic navigation seems fine
Enthusiasm for a tool that could help with literature analysis
Also for a tool to improve one’s own writing by showing trends, or inconsistencies
19. XIP Dashboard
Temporal trends per corpus
Similar patterns for LAK & EDM literatures
Summary & Contrast categories relatively higher, and rising
(Not controlled for different corpus sizes in these graphs)
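The caveat about corpus size matters because a rising raw count can simply reflect a growing corpus. One way to control for it, sketched here as an assumption rather than as the dashboard's method, is to divide each category count by the total number of sentences analysed for that year. The function name and the numbers are placeholders, not values from the LAK/EDM dataset.

```python
def normalized_share(category_counts, total_sentences):
    """Convert raw per-year category counts into shares of all sentences."""
    return {
        year: category_counts[year] / total_sentences[year]
        for year in category_counts
        if total_sentences.get(year)  # skip years with no (or zero) total
    }


# Placeholder numbers only, not taken from the LAK/EDM dataset.
contrast_counts = {2011: 30, 2012: 60}
total_sentences = {2011: 1000, 2012: 3000}

shares = normalized_share(contrast_counts, total_sentences)
for year, share in shares.items():
    print(f"{year}: {share:.1%} of sentences classified as contrast")
```

With these placeholder values the raw count doubles from one year to the next while the normalized share falls, which is the kind of reversal that uncontrolled graphs can hide.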
24. XIP Dashboard User scenarios…
Student / Educator / Researcher
Familiarization with the background material in a literature…
Comparing different writing patterns between communities, or students…
Focusing on specific concepts of interest in combination with rhetorical context
25. XIP Dashboard User Evaluations
Signal-noise ratio?
Deeper or shallower reading?
New insights, or just faster insights?
Better writing, or just gaming the system?
26. Summary
Early phases of work: a promising language technology now has visual analytics we can deploy with stakeholders
Beyond number / size / frequency of posts; ‘hottest thread’
http://www.glennsasscer.com/wordpress/wp-content/uploads/2011/10/iceberg.jpg
An important feature of educated writing is knowing how to signal substantive rhetorical moves. NLP can detect this, and we can now generate rudimentary visual analytics.
To be continued…