Who Am I? History-Aware Profiles for Student Simulation in Tutoring Dialogues
Zhangqi Duan, Shuyan Huang, Alexander Scarlatos, Jaewook Lee, Simon Woodhead, Andrew Lan
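AI summary: Introduces the task of history-conditioned student simulation. A profile generator and a simulator are trained with reinforcement learning so that information from a student's history can be used to accurately predict dialogue turns. The approach significantly outperforms baselines on a dataset from a math learning platform.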
A key part of developing automated tutoring tools powered by large language models (LLMs) is student simulation, i.e., using LLMs to role-play as students, which can facilitate tutor model evaluation and training. Existing work mostly focuses on within-dialogue simulation, which lacks context on student knowledge and behavior, partly because it is not grounded in students' past question-answering or dialogue interactions. In this work, we introduce the task of history-conditioned student simulation, where the goal is to accurately predict student dialogue turns by leveraging information in the student's learning history. We propose a two-component framework in which a profile generator summarizes a student's history and a simulator predicts student turns conditioned on the resulting profile. We train both components with reinforcement learning (RL), yielding profiles optimized for faithful student simulation. We evaluate our method and baselines on a first-of-its-kind real-world dataset of student dialogues and question responses that we collect from a math learning platform. Extensive experiments show that our method significantly outperforms baselines and demonstrate the importance of history, profiles, and RL training.
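To make the two-component data flow concrete, here is a minimal sketch of inference only: a profile generator summarizes history, and a simulator predicts the next student turn from that profile plus the dialogue so far. This is an illustration, not the authors' implementation. The names `HistoryItem`, `TextModel`, `generate_profile`, and `simulate_turn`, the prompt wording, and the history fields are all assumptions; the abstract does not specify them. RL training is not shown, since the abstract does not describe the reward.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt string to generated text.
TextModel = Callable[[str], str]


@dataclass
class HistoryItem:
    """One past question response (assumed fields, for illustration only)."""
    question: str
    student_answer: str
    correct: bool


def generate_profile(history: List[HistoryItem], profile_model: TextModel) -> str:
    """Component 1: summarize a student's history into a text profile."""
    lines = [
        f"Q: {h.question} | A: {h.student_answer} | "
        f"{'correct' if h.correct else 'incorrect'}"
        for h in history
    ]
    prompt = (
        "Summarize this student's knowledge and behavior:\n" + "\n".join(lines)
    )
    return profile_model(prompt)


def simulate_turn(
    profile: str, dialogue: List[str], simulator_model: TextModel
) -> str:
    """Component 2: predict the next student turn, conditioned on the profile."""
    prompt = (
        f"Student profile:\n{profile}\n\n"
        "Dialogue so far:\n" + "\n".join(dialogue) + "\nStudent:"
    )
    return simulator_model(prompt)
```

In this sketch the simulator sees the history only through the profile text, which mirrors the abstract's description of a simulator conditioned on the generated profile. In practice both `TextModel` arguments would wrap LLM calls.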