Answer Over-Disclosure
Quick Answer
Answer over-disclosure is a generative AI tutor failure mode in which the tutor reveals the final answer or a complete worked solution before the learner has made a meaningful attempt, replacing the retrieval and reasoning the task was designed to practice. Named in the SafeTutors benchmark, it is a pedagogical failure rather than a factual one: the answer is correct, but its timing destroys the learning the task was meant to produce.
Answer Over-Disclosure
Answer over-disclosure is a generative AI tutor failure mode in which the tutor reveals the final answer or a complete worked solution before the learner has made a meaningful attempt, replacing the retrieval and reasoning the task was designed to practice. Named as the prototypical pedagogical-safety harm in the SafeTutors benchmark, it is a pedagogical failure rather than a factual one: the answer is correct, but delivering it at that moment substitutes for the learner's cognitive work. An instruction-tuned LLM's default behavior is to fulfill the user's request, so it produces this failure unless the surrounding tutoring system gates disclosure, and the failure intensifies across multi-turn dialogue and under motivated answer-seeking.
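To make the gating idea concrete, here is a minimal sketch of a disclosure gate. The names `LearnerState`, `is_meaningful_attempt`, and `disclosure_gate` are illustrative assumptions, not part of SafeTutors or any real tutoring system; the point is the policy shape, in which only evidence of a meaningful attempt, never answer-seeking pressure, unlocks the full solution.

```python
from dataclasses import dataclass, field


@dataclass
class LearnerState:
    """What the learner has done on the current problem (illustrative)."""
    attempts: list = field(default_factory=list)  # learner's attempt texts
    asked_for_answer: int = 0  # count of direct "just tell me" turns


def is_meaningful_attempt(text: str) -> bool:
    """Placeholder heuristic: a real gate would score the attempt with a
    rubric or a classifier; word count is only a stand-in for this sketch."""
    return len(text.split()) >= 10


def disclosure_gate(state: LearnerState, min_attempts: int = 1) -> str:
    """Return the highest disclosure tier allowed this turn.

    Tiers: 'hint' < 'worked_step' < 'full_solution'. The anti-over-
    disclosure rule: a full solution requires at least `min_attempts`
    meaningful attempts. `asked_for_answer` never raises the tier,
    so pressure alone cannot unlock the answer.
    """
    meaningful = sum(1 for a in state.attempts if is_meaningful_attempt(a))
    if meaningful >= min_attempts:
        return "full_solution"
    if state.attempts:  # tried, but not substantively yet
        return "worked_step"
    return "hint"  # no attempt at all: prompt retrieval first


state = LearnerState()
print(disclosure_gate(state))  # hint

state.attempts.append("is it 42?")
state.asked_for_answer += 1
print(disclosure_gate(state))  # worked_step, despite the direct ask

state.attempts.append(
    "I set up 3x + 6 = 18, subtracted 6 from both sides to get 3x = 12, "
    "so x should be 4."
)
print(disclosure_gate(state))  # full_solution
```

The design choice worth noticing is that the gate is monotone in attempt quality and flat in answer-seeking: repeated "just tell me" turns are recorded but never change the tier, which is exactly the behavior the default instruction-tuned LLM lacks.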
The measurable downstream consequence is false mastery: high assisted performance paired with low unassisted performance, often discovered only at a high-stakes evaluation. Answer over-disclosure is the opposite of unhelpfulness: it is over-helpfulness. It is a tutor-design failure, not student misconduct.
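As a sketch of how that signature could be quantified, the function below computes the gap between assisted and unassisted accuracy on matched items. The name `performance_learning_gap` and the pairing scheme are assumptions for illustration, not a metric defined by SafeTutors.

```python
def performance_learning_gap(assisted, unassisted):
    """Assisted accuracy minus unassisted accuracy on matched items.

    Inputs are per-item correctness flags (True/False). A large positive
    gap is the false-mastery signature: success with the tutor present,
    failure alone.
    """
    def accuracy(flags):
        return sum(flags) / len(flags) if flags else 0.0
    return accuracy(assisted) - accuracy(unassisted)


# Example: 90% with the tutor but 40% alone gives a gap of 0.50.
assisted_runs = [True] * 9 + [False]
unassisted_runs = [True] * 4 + [False] * 6
print(f"gap = {performance_learning_gap(assisted_runs, unassisted_runs):.2f}")
```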
See also
- Generative AI tutor — parent explainer covering tutor failure modes and remediation patterns.
- Cognitive offloading — adjacent harm enabled when answer disclosure removes the need to reason.
- Performance–learning gap — the measurable consequence of repeated over-disclosure.
- Productive struggle — the learning behavior over-disclosure prevents.
- Solver–tutor gap — why correct-answer models still fail as tutors.