Doubly constrained offline reinforcement learning for learning path recommendation

Yue Yun, Huan Dai, Rui An, Yupei Zhang, Xuequn Shang

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

Learning path recommendation applies interactive recommendation systems to education, aiming to optimize learning outcomes while minimizing the workload of learners, teachers, and curriculum designers. Reinforcement Learning (RL) has proven effective in capturing and modeling the complex interactions among course activities, learner behaviors, and educational outcomes, which makes combining the two a promising route to personalized education. However, traditional RL algorithms require extensive interaction with the environment during the training phase, and using unverified recommendation logic in interactions with actual students can give rise to unmanageable problems and hinder effective performance in an educational setting. Training offline on historical data avoids such interaction, but extrapolating to actions absent from that data introduces substantial evaluation errors, so recommendations can deviate significantly from the actual educational requirements. To address this limitation, we propose a novel offline reinforcement learning method called Doubly Constrained deep Q-learning Network (DCQN). This method uses two generative models to fit existing historical student interaction data, which in turn constrain the original policy network to generate new actions based on past interactions, avoiding overestimated actions and reducing extrapolation errors. Empirical results demonstrate that this approach outperforms existing techniques on D4RL (Datasets for Deep Data-Driven Reinforcement Learning) and on real educational datasets.

Original language: English
Article number: 111242
Journal: Knowledge-Based Systems
Volume: 284
State: Published - 25 Jan 2024

Keywords

  • Extrapolation errors
  • Learning path recommendation
  • Offline reinforcement learning
  • Recommendation systems
  • Reinforcement learning
