Recommending courses to students on online platforms has been studied widely. Almost all studies target closed platforms that belong to a university or some other educational provider, which makes the resulting course recommenders situation-specific. Over the last years, a demand has developed for recommender systems that suit open online platforms. Those platforms share some common characteristics, such as the lack of rich user profiles and content metadata. Instead, they log user interactions within the platform, which can be used for analysis and personalization. In this paper, we investigate how user interactions and activities tracked within open online learning platforms can be used to provide recommendations. We present a study in which we investigate the application of several state-of-the-art recommender algorithms, including a graph-based recommender approach. We use data from the OpenU open online learning platform that is in use by the Open University of the Netherlands. The results show that user-based and memory-based methods perform better than model-based and factorization methods. In particular, the graph-based recommender system proves to outperform the classical approaches on prediction accuracy of recommendations in terms of recall. We conclude that, if the algorithms are chosen wisely, recommenders can contribute to a better experience for learners in open online courses.
Soude Fazeli, Enayat Rajabi, Leonardo Lezcano, Hendrik Drachsler, Peter Sloep
Recommendations for Open Online Education: An Algorithmic Study
1. Recommendations for Open Online Education: An Algorithmic Study
Soude Fazeli1, Enayat Rajabi2, Leonardo Lezcano3, Hendrik Drachsler1, Peter Sloep1
1 Open University Netherlands, 2 Dalhousie University, 3 eBay Inc.
27.07.2016, ICALT 2016, Austin, Texas, USA
2. WhoAmI
• Hendrik Drachsler, Associate Professor, Learning Technologies
• Research topics: Personalization, Recommender Systems, Learning Analytics, Mobile devices
• Application domains: Schools, HEI, Medical education
• @HDrachsler
4. Context of the study
• Goal: Personalization of Learning (based on prior knowledge)
• Problem: Selection from a huge variety of possibilities (information overload)
• Solution: Recommender systems that point a target user to content of interest, based on her user profile
Recommendations for Open Education: An Algorithmic Study
5. Problem definition
Institutional Course RecSys vs. Open Education RecSys:
• Institutional: rich learner and course metadata
• Open Education: sparse learner and course metadata
6. Research Question
RQ: How to recommend courses to learners in open education platforms?
7. Recommender system algorithms
1. Content-based 2. Collaborative filtering ✓
Our input data consists mainly of implicit user feedback rather than explicit ratings, so collaborative filtering is the more relevant approach for us.
8. Recommender system algorithms
Drachsler, H., Verbert, K., Santos, O., and Manouselis, N. (2015). Recommender Systems for Learning. 2nd Handbook on Recommender Systems. Berlin: Springer.
9. Collaborative Filtering (CF) algorithms
• Memory-based
  • Use statistical approaches to infer similarity between users, based on the users' data stored in memory
  • k-Nearest Neighbour method (kNN, with neighbourhood size k)
  • Similarity metrics: Pearson correlation, Cosine similarity, and the Jaccard coefficient
• Model-based
  • Use probabilistic approaches to create a model of users' feedback
  • Matrix factorization and Bayesian networks
  • Faster than memory-based algorithms, but more costly in required resources and maintenance
In this study, we use both memory-based (user-based as well as item-based) and model-based algorithms to test which performs best on the OpenU platform.
10. Hypotheses
H1: Item-based outperforms user-based approaches.
H2: Model-based outperforms memory-based approaches.
11. Experiment
1. Dataset
• From the open education platform OpenU, a broad national online learning platform for lifelong learning
• Data collected: from March 2009 until September 2013
• Users: OpenU users are professionals from various domains
Dataset statistics (OpenU):
• Users: 3,462
• Learning objects: 105
• Transactions: 92,689
• Sparsity: 98.14%
12. Experiment
1. Dataset (continued)
• Figure 1: course completion in relation to the students' activity
• Each blue X: the Percentage of Online Interactions (POI) for a given student and a given course, relative to the highest online interactions of a student in that course
• Online interactions = a student's contributions to chat sessions and forum messages
The course completion rate for OpenU students goes up dramatically as students interact more with course-mates and academic staff.
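The POI measure described above can be computed from raw interaction counts. A minimal sketch with made-up counts; the real calculation is based on chat and forum contributions per student and course:

```python
def poi(counts):
    """Percentage of Online Interactions: each student's interaction
    count in a course, relative to the highest count in that course.
    Assumes at least one student has a nonzero count."""
    top = max(counts.values())
    return {student: 100.0 * n / top for student, n in counts.items()}

# Made-up interaction counts (chat + forum contributions) for one course.
course_counts = {"s1": 40, "s2": 10, "s3": 0}
print(poi(course_counts))  # {'s1': 100.0, 's2': 25.0, 's3': 0.0}
```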
13. Experiment
2. Algorithms
2.1. Memory-based
• Most CF algorithms are based on kNN methods:
  • Find like-minded users and introduce them as the target user's nearest neighbours
• The appropriate similarity measure depends on whether the input data is:
  • Explicit user feedback (e.g. 5-star ratings) or
  • Implicit user feedback (e.g. views, downloads, clicks)
• OpenU = implicit user feedback (activities) -> the Jaccard coefficient and Cosine similarity are appropriate
14. Experiment
2. Algorithms
2.2. Model-based
• Bayesian Personalized Ranking (BPR), proposed by Rendle et al.
  • They applied BPR to state-of-the-art matrix factorization models to improve the learning process in the Bayesian model used (BPRMF)
• MostPopular approach
  • Makes recommendations based on the general popularity of items
  • Items are weighted based on how often they have been seen in the past
S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian Personalized Ranking from Implicit Feedback," in UAI '09: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, 2009, pp. 452–461.
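Of the two model-based baselines, MostPopular is simple enough to sketch in a few lines. The data below is made up, and excluding items the target user has already seen is an assumption about the setup, not something stated in the slides:

```python
from collections import Counter

def most_popular_recommender(interactions, top_n, exclude=frozenset()):
    """Rank items by how often they have been seen in the past,
    optionally skipping items the target user already interacted with."""
    counts = Counter(item for _, item in interactions)
    ranked = [item for item, _ in counts.most_common() if item not in exclude]
    return ranked[:top_n]

# Toy (userID, itemID) interaction tuples.
log = [("u1", "c1"), ("u2", "c1"), ("u3", "c1"),
       ("u1", "c2"), ("u2", "c2"), ("u3", "c3")]
print(most_popular_recommender(log, top_n=2, exclude={"c1"}))  # ['c2', 'c3']
```

Because it ignores the target user entirely, MostPopular is non-personalized; it serves as the floor that the personalized algorithms have to beat.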
15. Experiment
2. Algorithms
2.3. Graph-based
• Implicit networks: a graph
  – Nodes: users; Edges: similarity relationships; Weights: similarity values
• Improves the process of finding nearest neighbours
  – By invoking graph search algorithms
  – Memory-based and user-based
  – For more information, see our EC-TEL 2014 paper:
S. Fazeli, B. Loni, H. Drachsler, and P. Sloep, "Which Recommender System Can Best Fit Social Learning Platforms?," in 9th European Conference on Technology Enhanced Learning, EC-TEL 2014, 2014, pp. 84–97.
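A simplified sketch of the graph-based idea, not the exact T-index procedure from the EC-TEL 2014 paper: users become nodes, edges carry similarity values above a threshold, and neighbours are found by walking the graph instead of scanning all users. All names, thresholds, and data are illustrative:

```python
from collections import deque

def jaccard(a, b):
    """Jaccard coefficient between two item sets."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

def build_graph(profiles, sim, threshold=0.1):
    """Build a weighted user graph, keeping only edges whose
    similarity exceeds `threshold`."""
    graph = {u: {} for u in profiles}
    users = list(profiles)
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            s = sim(profiles[u], profiles[v])
            if s > threshold:
                graph[u][v] = s
                graph[v][u] = s
    return graph

def graph_neighbours(graph, target, k, max_depth=2):
    """Breadth-first walk up to `max_depth` hops from `target`;
    reached users are scored by the product of edge weights along
    the path on which they were first discovered."""
    scores = {}
    seen = {target}
    frontier = deque([(target, 1.0, 0)])
    while frontier:
        user, weight, depth = frontier.popleft()
        if depth == max_depth:
            continue
        for nb, w in graph[user].items():
            if nb not in seen:
                seen.add(nb)
                scores[nb] = weight * w
                frontier.append((nb, weight * w, depth + 1))
    return sorted(scores, key=scores.get, reverse=True)[:k]

profiles = {"u1": {"c1", "c2"}, "u2": {"c2", "c3"}, "u3": {"c3", "c4"}}
graph = build_graph(profiles, jaccard)
print(graph_neighbours(graph, "u1", k=2))  # ['u2', 'u3']
```

The multi-hop walk is what lets a graph-based recommender reach "neighbours of neighbours", users with no directly overlapping items but connected through intermediaries.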
16. Experiment
3. Settings
• Metrics
  • Precision: the ratio of the number of relevant items recommended to the total number of recommended items
  • Recall: the probability that a relevant item is recommended
  • Both precision and recall range from 0 to 1
• The number of courses in this experiment is 105, thus
  • The number of top-n items to be recommended is 5 (approx. 5% of the courses) and 10 (approx. 10% of the courses)
• For each memory-based CF algorithm, we evaluated six neighbourhood sizes (k = {5, 10, 20, 30, 50, 100}).
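The two metrics translate directly into code. A minimal sketch, with a made-up top-n list and held-out relevant items:

```python
def precision_at_n(recommended, relevant):
    """Fraction of the recommended items that are relevant."""
    if not recommended:
        return 0.0
    hits = sum(1 for item in recommended if item in relevant)
    return hits / len(recommended)

def recall_at_n(recommended, relevant):
    """Fraction of all relevant items that were recommended."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended if item in relevant)
    return hits / len(relevant)

top5 = ["c1", "c2", "c3", "c4", "c5"]  # top-n recommendations for one user
held_out = {"c2", "c7"}                # the user's relevant test-set items
print(precision_at_n(top5, held_out))  # 0.2
print(recall_at_n(top5, held_out))     # 0.5
```

In the experiment these per-user values would be averaged over all test users, once for n = 5 and once for n = 10.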
17. Experiment
Which algorithm and parameters are best suited for the users of the OpenU learning platform?
1. Memory-based
  • User-based with Jaccard (UB1)
  • User-based with Cosine (UB2)
  • Item-based with Jaccard (IB1)
  • Item-based with Cosine (IB2)
2. Model-based
  • MostPopular (MB1)
  • Bayesian Personalized Ranking with Matrix Factorization (MB2)
3. Graph-based
  • User-based with T-index (UB3)
18. Experimental study
3. Results
Results table: values for the highest-scoring neighbourhood size are in bold; the highest values among all are underlined.
19. Discussion
H1: Item-based outperforms user-based approaches.
• Contrary to what the recommender systems literature suggests, the user-based CFs exceeded all expectations.
• Item-based results were expected to trump the user-based ones, since the number of items (courses) is much smaller than the number of users in our dataset.
• Yet the user-based algorithms performed better on the OpenU data than those that make use of similarities between items (courses).
• Therefore: we reject H1.
20. Discussion
H2: Model-based (matrix factorization) methods outperform memory-based methods.
• The user-based recommenders (UB1, UB2, UB3), which are memory-based, widely outperform the model-based ones (MB1, MB2).
• We expected the matrix factorization methods (model-based CFs) to perform better, since they often prove superior in prediction accuracy, particularly when explicit user feedback (e.g. 5-star ratings) is available.
• So we also reject H2.
21. Conclusion
• This study sought to find out how best to generate personalized recommendations from user activities within an open online course platform.
• The results show that user-based, memory-based methods perform better than item-based and model-based (factorization) methods.
• The UB1 algorithm (user-based with Jaccard) seems best suited to provide accurate recommendations to the users of our OpenU platform.
22. Ongoing and further work
1. Integrating the selected recommender algorithms into the OpenU platform to provide online recommendations.
2. Studying how the graph-based approach can help to improve the process of finding like-minded neighbours in terms of social network analysis (SNA).
3. A user study to measure the novelty and serendipity of the recommendations made for OpenU users.
Almost all studies on recommending courses target "closed" online course platforms that typically belong to a specific university.
Examples: CourseAgent of the University of Pittsburgh and CourseRank of Stanford University.
But on Coursera, edX, Udacity, MiriadaX, FutureLearn or OpenU, the need arises for recommenders that are free from curricular concerns and address the individual information needs of the learners.
Moreover, most of the existing course recommenders depend on learner or course content metadata.
In many open online course platforms, no comprehensive metadata about the learners are available.
Typically, learners provide as little data as they can, merely enough to get through the sign-up procedure.
The availability of rich content metadata for open online courses is often an issue as well.
Data on user interactions and activities, in contrast, become richer and more extensive as learners spend more time online and as more and more learners take advantage of the open online offers.
Which recommender algorithm performs best for learners in open education platforms?
To develop a recommender system, one first needs to find out what input data is available. The user activities in the OpenU dataset are mainly implicit, coming from tracking data such as signing up for a course or contributing to a forum. Therefore, Collaborative Filtering (CF) recommenders can be applied. Such methods make recommendations for a target user based on other users' opinions and interests. Content-based methods should be used when, first, rich content data is available (not the case in this study) and, second, user rating information (5-star, binary, unary) is not available. However, "even if very few ratings are available, simple rating-based predictors outperform purely metadata-based ones" [9]. This is due to the difference between the item descriptions and the items themselves: users rate how much they like an item, not its description.
Upon signing up, users receive a student ID and can register for free for any OpenU course they are interested in.
The OpenU data is implicit user feedback: (userID, itemID) tuples.
• "Item" refers to a course in the current study.
• This is also known as positive-feedback-only data.
Thus the Jaccard coefficient and Cosine similarity are appropriate, since they can work with implicit user feedback.
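The tuple format above can be grouped into the per-user item sets that the set-based similarity measures operate on. A minimal sketch with toy data; the function name is illustrative:

```python
from collections import defaultdict

def tuples_to_profiles(interactions):
    """Group positive-feedback-only (userID, itemID) tuples into
    per-user item sets, the input the similarity measures need."""
    profiles = defaultdict(set)
    for user, item in interactions:
        profiles[user].add(item)
    return dict(profiles)

log = [("u1", "c1"), ("u1", "c2"), ("u2", "c1")]
print(sorted(tuples_to_profiles(log)["u1"]))  # ['c1', 'c2']
```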
Metrics:
• Precision: the ratio of the number of relevant items recommended to the total number of recommended items.
• Recall: the probability that a relevant item is recommended, i.e. the number of relevant items recommended divided by the total number of relevant items in the test set.
• Both precision and recall range from 0 to 1.
• The number of courses in this experiment is 105; thus the number of top-n items to be recommended is 5 (approx. 5% of the courses) and 10 (approx. 10% of the courses).
Data split:
• A random 80% of the data was assigned to a training set; the rest constituted the test set.
• For each memory-based CF algorithm, we evaluated six neighbourhood sizes (k = {5, 10, 20, 30, 50, 100}).
• Model-based algorithms use latent factors (f = {3, 5, 10}).
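The random 80/20 split above can be sketched as follows; the seed and the toy interaction log are illustrative, not taken from the original experiment:

```python
import random

def train_test_split(interactions, train_fraction=0.8, seed=42):
    """Randomly assign `train_fraction` of the interaction tuples to
    the training set; the rest constitute the test set."""
    rng = random.Random(seed)
    shuffled = interactions[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Toy (userID, itemID) log: 10 users x 10 courses = 100 tuples.
log = [("u%d" % u, "c%d" % c) for u in range(10) for c in range(10)]
train, test = train_test_split(log)
print(len(train), len(test))  # 80 20
```

Fixing the seed makes the split reproducible, so every algorithm in the comparison is trained and evaluated on exactly the same partition.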