At Eedi, our mission is to support teachers in identifying and addressing students’ misconceptions in mathematics. That’s why we're excited by a new research paper authored by Kasia Kobalczyk, a PhD student at the University of Cambridge. Her paper, Active Task Disambiguation with LLMs, has been selected as a Spotlight at ICLR 2025 and offers a compelling strategy for improving the quality of AI-generated content and human-AI alignment. The twist? It’s not about giving better answers—it’s about asking better questions.
Imagine a student prompts an LLM-based AI chatbot with the instruction: “Solve the equation.” If the equation contains more than one unknown, or multiple valid solution methods, the task may be unclear. A human teacher would know how to ask a good follow-up to clarify the student's intent. But AI systems often don’t. Without understanding what the question setter meant, an LLM might produce a technically correct answer that misses the point entirely.
This kind of ambiguity is common in real classrooms—and even more so on digital platforms, especially when there's no teacher present to guide the interaction.
LLMs are impressive when dealing with well-specified problems. But when a prompt is vague or under-specified, they’re prone to misinterpretation. The result? A correct answer to the wrong question.
Active Task Disambiguation with LLMs introduces a method for handling vague prompts by having the AI agent ask targeted follow-up questions. Rather than guessing the user’s intent, the AI engages in a mini-dialogue—posing clarifying questions that help pin down what was actually meant.
Crucially, not all clarifying questions are equally helpful. The paper’s innovation lies in how the AI generates these questions. Drawing on the principles of Bayesian Experimental Design, the method guides LLMs to choose the questions that are expected to provide the most information about the user’s true intent. In other words, the AI learns to ask the most informative question—the one most likely to narrow down the space of possible interpretations.
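To make this concrete, here is a minimal Python sketch of information-gain-based question selection, in the spirit of Bayesian Experimental Design. It is an illustration of the general idea rather than the paper's exact algorithm: it assumes the LLM has already proposed a handful of plausible interpretations of a vague prompt, plus a few candidate clarifying questions, and scores each question by how much it is expected to reduce uncertainty about which interpretation the user actually meant.

```python
import math
from collections import Counter

# A minimal sketch of information-gain-based question selection; an
# illustration, not the paper's exact algorithm. We assume the LLM has
# already proposed:
#   - plausible interpretations (hypotheses) of the vague prompt, and
#   - candidate clarifying questions, each modelled as a function mapping a
#     hypothesis to the answer the user would give under that interpretation.

def entropy(counts):
    """Shannon entropy (in bits) of a distribution given by raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def expected_information_gain(hypotheses, answer_fn):
    """Expected drop in uncertainty about the true interpretation after
    hearing the answer, assuming a uniform prior over hypotheses."""
    prior = entropy([1] * len(hypotheses))
    # Group the hypotheses by the answer they would produce.
    groups = Counter(answer_fn(h) for h in hypotheses)
    # Average the entropy that remains within each answer-consistent group,
    # weighted by how likely that answer is under the prior.
    posterior = sum(
        (size / len(hypotheses)) * entropy([1] * size) for size in groups.values()
    )
    return prior - posterior

# Toy example: four interpretations of the prompt "Solve the equation."
hypotheses = ["solve for x", "solve for y", "factorise", "plot the graph"]

questions = {
    "Do you want a numeric value for a variable?": lambda h: "yes" if h.startswith("solve for") else "no",
    "Is x the variable you care about?": lambda h: "yes" if h == "solve for x" else "no",
}

best = max(questions, key=lambda q: expected_information_gain(hypotheses, questions[q]))
print(best)  # -> "Do you want a numeric value for a variable?"
```

In this toy case, the first question splits the four interpretations evenly and therefore carries more expected information than the second, which only singles out one of them.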
To evaluate the effectiveness of the method, the paper introduces two sets of experiments: one involving a guessing game (similar to 20 Questions) and another focused on code generation tasks. In both cases, the authors compared different strategies for generating clarifying questions. The results show that out-of-the-box LLMs still have plenty of room to improve at generating good clarifying questions. LLMs that selected questions by explicitly reasoning over multiple self-generated solutions—rather than relying on implicit forms of reasoning—were much better at handling ambiguity. In the code generation setting, for example, the AI agent generated targeted test-case-style queries that significantly improved the correctness of its final outputs.
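As a rough illustration of that code generation setting, the sketch below (with hypothetical candidate programs and test inputs, not taken from the paper) shows how an agent might pick a test-case-style query: it asks about the input whose answer would rule out the most of its own self-generated implementations.

```python
from collections import Counter

# A hypothetical sketch of the code generation setting (the candidate
# implementations and test inputs below are illustrative, not from the paper).
# The agent has sampled several plausible programs for an under-specified spec
# ("round a number") and asks about the test input whose answer would rule out
# the most candidates on average.

candidate_programs = {
    "round half up": lambda x: int(x + 0.5),
    "round half to even": lambda x: round(x),
    "truncate": lambda x: int(x),
}

candidate_test_inputs = [2.4, 2.5, 3.7, -1.5]

def expected_survivors(test_input):
    """Expected number of candidate programs still consistent with the user's
    answer, assuming each candidate is equally likely to be the intended one."""
    outputs = Counter(p(test_input) for p in candidate_programs.values())
    n = len(candidate_programs)
    return sum((count / n) * count for count in outputs.values())

# The most informative test-case question is the one that leaves the fewest
# candidates standing on average.
best_input = min(candidate_test_inputs, key=expected_survivors)
print(f"Ask: what should the function return for an input of {best_input}?")
```

Inputs on which all the candidate programs agree (like 2.4 here) tell the agent nothing, while inputs on which they disagree narrow the space of plausible implementations with a single answer.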
At Eedi, our platform is built around identifying common misconceptions in mathematics. Teachers and tutors remain central to this process—whether designing quizzes or working 1-to-1 with students. But AI, and LLMs in particular, can help amplify this work by suggesting better, more personalised content that zeroes in on what a student might be misunderstanding.
That’s where Active Task Disambiguation comes in. Before generating content, an AI tutor could ask targeted clarifying questions to pin down what a student means and which misconception might be at play, rather than guessing at their intent.
This research underscores an important shift: effective human-AI alignment isn’t just about delivering the right answers. It’s also about navigating ambiguity—something humans do intuitively and that AI must learn to do. By equipping LLMs with the ability to ask smart, targeted clarifying questions, we move closer to truly adaptive, responsive, and human-like interactions.
As Eedi continues to blend pedagogy with cutting-edge AI, research like this opens up exciting new possibilities for the future of maths education.
To read the full paper, follow this link.