Research talk given at Italian National Research Council (CNR), Institute for Educational Technologies (ITD) on learning analytics in everyday online activities.
Analysing User Knowledge, Competence and Learning during Online Activities
1. Analysing User Knowledge, Competence and Learning during Online Activities
Stefan Dietze, L3S Research Center, Hannover
26.09.2017
CNR Institute for Educational Technologies, Palermo
27/09/17, Stefan Dietze
2. Research @ L3S
Web science, Information Retrieval, Semantic Web, Social Web
Analytics, Knowledge Discovery, Human Computation
Interdisciplinary application areas: digital humanities, TEL/education, Web archiving, mobility, ...
Some projects (logos on slide)
See also: http://www.l3s.de
3. Acknowledgements: Team
Pavlos Fafalios (L3S)
Besnik Fetahu (L3S)
Elena Demidova (L3S)
Ujwal Gadiraju (L3S)
Eelco Herder (L3S)
Ivana Marenzi (L3S)
Nicolas Tempelmeier (L3S)
Ran Yu (L3S)
Markus Rokicki (L3S)
Renato Joao (L3S, PUC Rio)
Mathieu d'Aquin (The Open University, UK)
Mohamed Ben Ellefi (LIRMM, France)
Davide Taibi (CNR, Italy)
Konstantin Todorov (LIRMM, France)
...
4. Learning Analytics on the Web / for online learning?
Anything can be a learning resource
The activity makes the difference (not the resource): i.e. how a resource is being used
Learning Analytics in online/non-learning environments?
o Activity streams
o Social graphs (and their evolution)
o Behavioural traces (mouse movements, keystrokes)
o ...
Research challenges:
o How to detect "learning"?
o How to detect learning-specific notions such as "competences" or "learning performance"?
5. "AFEL – Analytics for Everyday (Online) Learning"
H2020 project (since 12/2015) aimed at understanding/supporting learning in social Web environments
Examples of AFEL data sources:
• Activity streams and behavioral traces
• L3S Twitter Crawl: 6 bn tweets
• Common Crawl (2015): 2 bn documents
• Web Data Commons (2016): 44 bn quads
• "German Academic Web": 6 TB Web crawl
• Web search query logs
• Wikipedia edit history: 3 M edits/month (English)
• ...
6. Challenges/Tasks in AFEL & beyond: some examples
I. Efficient data capture
Crawling & extracting activity data
Crawling, extracting and indexing learning resources (e.g. Common Crawl)
II. Efficient data analysis
Understanding learning resources: entity extraction & clustering on large Web crawls of resources
"Search as learning": detecting learning in heterogeneous search query logs & click streams
Detecting learning activities: detection of learning patterns (e.g. competent behavior) in the absence of learning objectives & assessments (!)
o Obtaining performance indicators from behavioral traces?
o Quasi-experiments in crowdsourcing platforms to obtain training data
Gadiraju, U., Demartini, G., Kawase, R., Dietze, S.: Human beyond the Machine: Challenges and Opportunities of Microtask Crowdsourcing. IEEE Intelligent Systems, Vol. 30, Issue 4, Jul/Aug 2015.
Gadiraju, U., Kawase, R., Dietze, S., Demartini, G.: Understanding Malicious Behavior in Crowdsourcing Platforms: The Case of Online Surveys. ACM CHI Conference on Human Factors in Computing Systems (CHI 2015), April 18-23, Seoul, Korea.
7. Predicting competence in online users?
Capturing assessment data: microtasks in CrowdFlower
"Content Creation (CC)": transcription of captchas
"Information Finding (IF)": middle names of famous persons
1800 assessments: 2 tasks * 3 durations * 3 difficulty levels * 100 users (per assessment)
Find the middle name of:
Level 1: "Daniel Craig"
Level 2: "George Lucas" (profession: Archbishop)
Level 3: "Brian Smith" (profession: Ice Hockey, born: 1972)
Behavioral traces: keystroke and mouse-movement events
timeBeforeInput, timeBeforeClick
tabSwitchFreq
windowToggleFreq
openNewTabFreq
WindowFocusFrequency
totalMouseMovements
scrollUpFreq, scrollDownFreq
...
Total number of events: 893,285 (CC tasks), 736,664 (IF tasks)
Gadiraju, U., Fetahu, B., Kawase, R., Siehndel, P., Dietze, S.: Crowd Anatomy Beyond the Good and Bad: Behavioral Traces for Crowd Worker Modeling and Pre-selection. The Journal of Collaborative Computing and Work Practices (CSCW), under review.
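The raw keystroke and mouse events can be aggregated into per-task features of the kind listed above. A minimal sketch, assuming a simple hypothetical event-log schema of (timestamp in ms, event type) pairs; the actual feature definitions used in the study may differ:

```python
from collections import Counter

def extract_features(events):
    """Aggregate a worker's raw event log into per-task trace features.

    `events`: list of (timestamp_ms, event_type) pairs, where event_type is
    e.g. "mousemove", "keydown", "tabswitch", "scrollup", "scrolldown".
    This schema is illustrative, not the study's actual logging format.
    """
    counts = Counter(etype for _, etype in events)
    timestamps = [t for t, _ in events]
    start = min(timestamps)
    # First keyboard input marks the end of the "reading" phase.
    first_input = next((t for t, e in events if e == "keydown"), None)
    return {
        "totalEvents": len(events),
        "totalMouseMovements": counts["mousemove"],
        "tabSwitchFreq": counts["tabswitch"],
        "scrollUpFreq": counts["scrollup"],
        "scrollDownFreq": counts["scrolldown"],
        "timeBeforeInput": (first_input - start) if first_input is not None else -1,
        "totalTime": max(timestamps) - start,
    }

# Toy event log for a single assessment:
log = [(0, "mousemove"), (120, "mousemove"), (400, "keydown"),
       (650, "scrolldown"), (900, "tabswitch"), (1300, "keydown")]
features = extract_features(log)
```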
8. Behavioural traces to predict competence?
Training data
Manual annotation of 1800 assessments
Performance types [CHI15]:
o "Competent Worker"
o "Diligent Worker"
o "Fast Deceiver"
o "Incompetent Worker"
o "Rule Breaker"
o "Smart Deceiver"
o "Sloppy Worker"
Prediction of performance types from behavioral traces?
Predicting learner types from behavioral traces
Random Forest classifier (per task)
10-fold cross-validation
Prediction performance: Accuracy, F-measure
Results
Longer assessments yield more signals
Simpler assessments yield more conclusive signals
"Competent Workers" (CW) and "Diligent Workers" (DW): accuracy of 91% and 87%, respectively
Most significant features: "TotalTime", "TippingPoint", "MouseMovementFrequency", "WindowFocusFrequency"
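The experimental setup above (Random Forest, 10-fold cross-validation, accuracy as one metric) can be sketched with scikit-learn. The data here is a synthetic stand-in, not the study's 1800 annotated assessments, and the feature/label choices are assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Synthetic stand-in for the annotated assessments: each row is a
# behavioral-trace feature vector (e.g. TotalTime, TippingPoint,
# MouseMovementFrequency, WindowFocusFrequency), each label a binary
# stand-in for "Competent Worker" vs. other performance types.
X = rng.random((300, 4))
y = rng.integers(0, 2, size=300)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
print(f"10-fold CV accuracy: {scores.mean():.2f}")
```

On random data the accuracy hovers around chance; the 91%/87% figures above come from the real annotated traces.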
9. Other features to predict competence in learning/assessments?
"Dunning-Kruger Effect"
Incompetence in a task/domain reduces the capacity to recognise/assess one's own incompetence
Research question
Self-assessment as a feature to predict competence?
Results
Self-assessment as an (additional) reliable indicator of competence (94% accuracy), superior to mere performance measurement
The tendency to over-estimate one's own competence increases with increasing difficulty level
Performance ("Accuracy") of users classified as "competent"
David Dunning. 2011. The Dunning-Kruger Effect: On Being Ignorant of One's Own Ignorance. Advances in Experimental Social Psychology 44 (2011), 247.
Gadiraju, U., Fetahu, B., Kawase, R., Siehndel, P., Dietze, S.: Using Worker Self-Assessments for Competence-based Pre-Selection in Crowdsourcing Microtasks. ACM Transactions on Computer-Human Interaction (ACM TOCHI), Vol. 24, Issue 4, August 2017.
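The reported tendency can be expressed as the signed gap between self-assessed and measured accuracy per difficulty level; the Dunning-Kruger pattern shows this gap widening as tasks get harder. The numbers below are hypothetical illustrations, not the study's results:

```python
def overestimation(self_assessed, actual):
    """Signed gap between self-assessed and measured accuracy, both in [0, 1].

    Positive values mean the worker over-estimates their own competence.
    """
    return self_assessed - actual

# Hypothetical per-difficulty-level averages (self-assessed, actual);
# purely illustrative, not the study's measurements:
levels = {1: (0.80, 0.78), 2: (0.75, 0.60), 3: (0.70, 0.35)}
gaps = {lvl: overestimation(s, a) for lvl, (s, a) in levels.items()}
```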
11. "Search As Learning": predicting learning/knowledge in Web search
Challenges
Detecting individual search missions in large query logs
Detecting "informational" search missions (as opposed to "transactional" or "navigational" missions, see [Broder, 2002])
Predicting the knowledge state of users in the absence of assessment data
Predicting knowledge gain (or "learning") throughout search missions
Initial results
Search mission detection with an average F1 score of 75% (experiments based on AOL query logs)
Quasi-experiments to generate search mission data (queries, behavioral traces, pre- and post-tests) for 400 search missions
Ongoing: prediction of knowledge gain/state
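A naive baseline for the first challenge, splitting a time-ordered query log into missions, might combine a session time-out with term overlap between consecutive queries. This is an illustrative heuristic with assumed parameters, not the classifier behind the 75% F1 result:

```python
def split_missions(queries, gap_seconds=1800):
    """Greedily segment a time-ordered query log into search missions.

    A query continues the current mission if it arrives within
    `gap_seconds` of the previous query OR shares at least one term
    with it; otherwise it starts a new mission. Heuristic baseline only.
    """
    missions = []
    for ts, query in queries:
        terms = set(query.lower().split())
        if missions:
            last_ts, last_terms = missions[-1][-1]
            if ts - last_ts <= gap_seconds or terms & last_terms:
                missions[-1].append((ts, terms))
                continue
        missions.append([(ts, terms)])
    return missions

# Toy log: two related queries, then an unrelated one after a long pause.
log = [(0, "python learning analytics"),
       (60, "learning analytics tools"),
       (4000, "cheap flights palermo")]
missions = split_missions(log)
```

Real mission detection (e.g. on the AOL logs) would additionally use lexical similarity beyond exact term overlap and learned, rather than fixed, thresholds.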
12. Summary & outlook
Learning analytics in online & Web-based settings
o Detection of learning & learning-related notions in the absence of assessment/performance indicators
o Analysis of a range of data, including behavioral traces, activity streams, self-assessments etc.
o Actual big data (dynamics/velocity)
Positive results from initial models and classifiers
Other tasks (e.g. detection of learning during Web search)
Application of developed models and classifiers in online (learning) environments (e.g. AFEL project), such as GNOSS/Didactalia (200,000 users), "LearnWeb", "Bibsonomy" etc.
Several ongoing research initiatives, e.g. a research initiative at LUH on "Digital Higher Education in MINT (STEM) Subjects"
http://stefandietze.net