BacPrep: Lessons from Deploying an LLM-Based Bacalaureat Assessment Platform
Adrian-Marius Dumitran, Radu Dita, Angela Liliana Dumitran
Comments First version ACCEPTED at BBGI (ITS 2025 Workshop) Second versions ACCEPTED at ITS 2026
详情
Accessing quality preparation and feedback for the Romanian Bacalaureat exam is challenging, particularly for students in remote or underserved areas. This paper presents BacPrep, an experimental online platform exploring Large Language Model (LLM) potential for automated assessment, aiming to offer a free, accessible resource. Using official exam questions from the last 5 years, BacPrep employs the latest available Gemini Flash model (currently Gemini 2.5 Flash, via the \texttt{gemini-flash-latest} endpoint) to prioritize user experience quality during the data collection phase, with model versioning to be locked for subsequent rigorous evaluation. The platform has collected over 100 student solutions across Computer Science and Romanian Language exams, enabling preliminary assessment of LLM grading quality. This revealed several significant challenges: grading inconsistency across multiple runs, arithmetic errors when aggregating fractional scores, performance degradation under large prompt contexts, failure to apply subject-specific rubric weightings, and internal inconsistencies between generated scores and qualitative feedback. These findings motivate a redesigned architecture featuring subject-level prompt decomposition, specialized per-subject graders, and a median-selection strategy across multiple runs. Expert validation against human-graded solutions remains the critical next step.