Program2Tutor: Automatic Curriculum Generation with Bandits

13 Apr 2019

Reading: Program2Tutor by Tongmu Mu, Karan Goel, and Emma Brunskill

Problem: Determining topic relationships and statistical student learning modeling is a comlpex process. Prior Work: 1. Prerequisite or knowledge graph of the related matericl can be automatically constructed given an algorithmic representation of the underlying material. 2. How to automatically progress a student through a knowledge graph with minimal assumptions about a model of the underlying student learning process. Contribution: 1. Preliminary work on uniting the idea of automatically generating a knowledge graph and progressing a student while learning. 2. Novel approach for identifying the initial background knowledge of the learner. In this work, the authors unify the ideas of automatic curriculum generation from execution traces and automatic problem selection using reinforcement learning techniques. Specifically, they use multi-armed bandit algorithm for problem selection (ZPDES) as it is less reliant than other methods on the underlying student learning model which can be advantageous. Additionally, the authors build on prior work in probabilistically detecting the knowledge boundary of the student and present a novel approach to determine the initial knowledge state of the student within the curriculum. Automatic Curriculum Generation from Execution Traces Let \( n \) be any positive integer. We say that a trace \( T_{1}\) is at least as complex as trace \( T_{2} \) if every n-gram of trace \( T_{2} \) is also present in trace \( T_{1}\). Progressing Students Using Multi-Armed Bandits This algorithm at each timestep selects a problem from within the set of problem types on the boundary of the student's knowledge, or the ZPD, that it predicts will give the most most reward, which is measured in terms of student learning progress. On initialization, all problems start with an initial unnormalized weight \( w_{a} = w_{i}\). The weights are normalized to ensure a proper probability distribution: The weights of the problems \( w_{a}\) in the ZPD are normalized and denoted as \( w_{a}^{n} \) and with probability \( p_{a} = w_{a}^{n}(1 - \gamma) + \gamma \frac{1}{|ZPD|}\), where \( a \in ZPD\). Once a problem is selected, it is presented to the student and correctness of the student answer for the \( i^{th}\) time the problem is presented is recorded as \( C_{i}\). References B. Clement, D. Roy, P.-Y. Oudeyer, and M. Lopes. Multi-armed bandits for intelligent tutoring systems. JEDM-Journal of Educational Data Mining, 7(n), 2015.

Rishabh Ranawat

Program2Tutor: Automatic Curriculum Generation with Bandits

Related Posts

DataRater: Meta-Learned Dataset Curation 15 Jan 2025

XQuiz: AI-Powered Learning from Your Twitter Feed 12 Jan 2025

DataMixer: A Library for Combining Datasets 10 Jan 2025