2. Free
the data
by Tom Raftery http://www.flickr.com/photos/traftery/4773457853/sizes/l
2
3. Why ?
by Tom Raftery http://www.flickr.com/photos/traftery/4773457853/sizes/l
2
4. Because we
will get new
insights
by Tom Raftery http://www.flickr.com/photos/traftery/4773457853/sizes/l
2
5. Issues and Considerations regarding Sharable Data
Sets for Recommender Systems in TEL
29.09.2010 RecSysTEL workshop at RecSys and ECTEL conference, Barcelona, Spain
Hendrik Drachsler #dataTEL
Centre for Learning Sciences and Technology
@ Open University of the Netherlands 3
6. What is dataTEL
• dataTEL is a Theme Team funded by the
STELLAR network of excellence.
• It address 2 STELLAR Grand Challenges
1. Connecting Learner
2. Contextualisation
4
9. The TEL recommender
are a bit like this...
We need to select for each application an
appropriate recsys that fits its needs.
6
10. But...
“The performance results
of different research
efforts in recommender
systems are hardly
comparable.”
(Manouselis et al., 2010)
Kaptain Kobold
http://www.flickr.com/photos/
kaptainkobold/3203311346/
7
11. But...
The TEL recommender
“The performance results
experiments lack
of different research
transparency. They need
efforts in recommender
to be repeatable to test:
systems are hardly
comparable.”
• Validity
• Verificationet al., 2010)
(Manouselis
• Compare results Kaptain Kobold
http://www.flickr.com/photos/
kaptainkobold/3203311346/
7
15. The Long Tail of Learning
Graphic Wilkins, D., (2009); Long10 concept Anderson, C. (2004)
tail
16. The Long Tail of Learning
Formal Data
Informal Data
Graphic Wilkins, D., (2009); Long10 concept Anderson, C. (2004)
tail
17. Guidelines for Data Set
Aggregation
Justin Marshall, Coded Ornament by
rootoftwo
http://www.flickr.com/photos/rootoftwo/
267285816
11
18. Guidelines for Data Set
Aggregation
1. Create a data set that
realistically reflects the
variables of the learning
setting.
Justin Marshall, Coded Ornament by
rootoftwo
http://www.flickr.com/photos/rootoftwo/
267285816
11
19. Guidelines for Data Set
Aggregation
1. Create a data set that
realistically reflects the
variables of the learning
setting.
2. Use a sufficiently large
set of user profiles
Justin Marshall, Coded Ornament by
rootoftwo
http://www.flickr.com/photos/rootoftwo/
267285816
11
20. Guidelines for Data Set
Aggregation
1. Create a data set that
realistically reflects the
variables of the learning
setting.
2. Use a sufficiently large
set of user profiles
3. Create data sets that
are comparable to others
Justin Marshall, Coded Ornament by
rootoftwo
http://www.flickr.com/photos/rootoftwo/
267285816
11
21. Guidelines for Data Set
Aggregation
1. Create a data set that
realistically reflects the
variables of the learning
setting.
2. Use a sufficiently large
set of user profiles
3. Create data sets that
are comparable to others
Justin Marshall, Coded Ornament by
rootoftwo
http://www.flickr.com/photos/rootoftwo/
267285816
11
22. dataTEL::Objectives
Standardize research on
recommender systems in TEL
Five core questions:
1.How can data sets be shared according to privacy and legal
protection rights?
2.How to development a policy to use and share data sets?
3.How to pre-process data sets to make them suitable for
other researchers?
4.How to define common evaluation criteria for TEL
recommender systems?
5.How to develop overview methods to monitor the
performance of TEL recommender systems on data sets?
12
26. 1. Protection Rights
OVERSHARING
Were the founders of PleaseRobMe.com actually
allowed to grab the data from the web and present it
in that way?
13
27. 1. Protection Rights
OVERSHARING
Were the founders of PleaseRobMe.com actually
allowed to grab the data from the web and present it
in that way?
Are we allowed to use data from social handles and
reuse it for research purposes?
13
28. 1. Protection Rights
OVERSHARING
Were the founders of PleaseRobMe.com actually
allowed to grab the data from the web and present it
in that way?
Are we allowed to use data from social handles and
reuse it for research purposes?
and there is more company protection rights!
13
36. 3. Pre-Process Data Sets
For informal data sets:
1. Collect data
2. Process data
3. Share data
For formal data sets
from LMS:
1. Data storing scripts
2. Anonymisation scripts
15
38. 4. Evaluation Criteria
1. Accuracy
2. Coverage
3. Precision
Combine approach by Kirkpatrick model by
Drachsler et al. 2008 Manouselis et al. 2010
16
39. 4. Evaluation Criteria
1. Accuracy
2. Coverage
3. Precision
1. Effectiveness of learning
2. Efficiency of learning
3. Drop out rate
4. Satisfaction
Combine approach by Kirkpatrick model by
Drachsler et al. 2008 Manouselis et al. 2010
16
40. 4. Evaluation Criteria
1. Accuracy 1. Reaction of learner
2. Coverage 2. Learning improved
3. Precision 3. Behaviour
4. Results
1. Effectiveness of learning
2. Efficiency of learning
3. Drop out rate
4. Satisfaction
Combine approach by Kirkpatrick model by
Drachsler et al. 2008 Manouselis et al. 2010
16
41. 4. Evaluation Criteria
1. Accuracy 1. Reaction of learner
2. Coverage 2. Learning improved
3. Precision 3. Behaviour
4. Results
1. Effectiveness of learning
2. Efficiency of learning
3. Drop out rate
4. Satisfaction
Combine approach by Kirkpatrick model by
Drachsler et al. 2008 Manouselis et al. 2010
16
42. 4. Evaluation Criteria
1. Accuracy 1. Reaction of learner
2. Coverage 2. Learning improved
3. Precision 3. Behaviour
4. Results
1. Effectiveness of learning
2. Efficiency of learning
3. Drop out rate
4. Satisfaction
Combine approach by Kirkpatrick model by
Drachsler et al. 2008 Manouselis et al. 2010
16
43. 5. Data Set Framework to Monitor
Performance
Datasets
Formal Informal
Data A Data B Data C
Algorithms: Algorithms: Algorithms:
Algoritmen A Algoritmen D Algoritmen B
Algoritmen B Algoritmen E Algoritmen D
Algoritmen C
Models: Models: Models:
Learner Model A Learner Model C Learner Model A
Learner Model B Learner Model E Learner Model C
Measured attributes: Measured attributes: Measured attributes:
Attribute A Attribute A Attribute A
Attribute B Attribute B Attribute B
Attribute C Attribute C Attribute C
17
44. Join us for a Coffee ...
http://www.teleurope.eu/pg/groups/9405/datatel/
18
45. Upcoming event ...
Workshop: dataTEL- Data Sets for Technology Enhanced Learning
Date: March 30th to March 31st, 2011
Location: Ski resort La Clusaz in the French Alps, Massif des
Aravis
Funding: Food and lodging for 3 nights for 10 selected participants
Submissions: For being funded, please send extended abstracts to
http://www.easychair.org/conferences/?conf=datatel2011
Deadline for submissions: October 25th, 2010
CfP: http://www.teleurope.eu/pg/pages/view/46082/
19
46. Many thanks for your interests
This silde is available at:
http://www.slideshare.com/Drachsler
Email: hendrik.drachsler@ou.nl
Skype: celstec-hendrik.drachsler
Blogging at: http://www.drachsler.de
Twittering at: http://twitter.com/HDrachsler
20