Improved Human-AI Alignment by Asking Smarter Clarifying Questions

Simon Woodhead

At Eedi, our mission is to support teachers in identifying and addressing students’ misconceptions in mathematics. That’s why we're excited by a new research paper authored by Kasia Kobalczyk, a PhD student at the University of Cambridge. Her paper, Active Task Disambiguation with LLMs, has been selected as a Spotlight at ICLR 2025 and offers a compelling strategy for improving the quality of AI-generated content and human-AI alignment. The twist? It’s not about giving better answers—it’s about asking better questions.

An example ambiguous problem statement. With multiple plausible solutions, the true user intent is unclear. The AI agent can ask a clarifying question, but which one is the most informative?

The Problem: AI Struggles with Ambiguity

Imagine a student prompts an LLM-based chatbot with the instruction: “Solve the equation.” If the equation has more than one unknown, or admits multiple valid solution methods, the task is ambiguous. A human teacher would know how to ask a good follow-up to clarify the student's intent. But AI systems often don’t. Without understanding what the question setter meant, an LLM might produce a technically correct answer that misses the point entirely.
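To make this concrete with a toy example of our own (not one taken from the paper): the prompt “solve x + y = 10” admits several readings. The student might want x expressed in terms of y (x = 10 - y), a single integer solution such as (4, 6), or the full set of solutions. Each reading has a different, individually correct answer, and nothing in the prompt says which one was intended.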

This kind of ambiguity is common in real classrooms—and even more so on digital platforms, especially when there's no teacher present to guide the interaction.

LLMs are impressive when dealing with well-specified problems. But when a prompt is vague or under-specified, they’re prone to misinterpretation. The result? A correct answer to the wrong question.

The Solution: Teaching AI to Ask the Right Questions

Active Task Disambiguation with LLMs introduces a method for handling vague prompts by having the AI agent ask targeted follow-up questions. Rather than guessing the user’s intent, the AI engages in a mini-dialogue—posing clarifying questions that help pin down what was actually meant.

Crucially, not all clarifying questions are equally helpful. The paper’s innovation lies in how the AI generates these questions. Drawing on the principles of Bayesian Experimental Design, the method guides LLMs to choose the questions that are expected to provide the most information about the user’s true intent. In other words, the AI learns to ask the most informative question—the one most likely to narrow down the space of possible interpretations.
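In expected information gain (EIG) terms, and using our own notation rather than the paper's exact symbols, the agent scores each candidate question q by how much the oracle's answer a is expected to reduce its uncertainty over the solutions s compatible with the user's intent:

$$\mathrm{EIG}(q) = H\left[p(s)\right] - \mathbb{E}_{a \sim p(a \mid q)}\left[H\left[p(s \mid q, a)\right]\right]$$

The first term is the current uncertainty over plausible solutions; the second is the uncertainty expected to remain once the answer arrives. The best question is the one whose answer, on average, prunes the largest share of interpretations.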

The workflow of Active Task Disambiguation. 1. The initial problem statement is presented to the AI agent. 2. The agent reasons about the problem to infer the set of solutions compatible with the currently available requirements. To approximate the space of plausible solutions, a set of candidate solutions is sampled. 3. To discern between different solution variants, the agent generates candidate clarifying questions. 4. The question with the highest utility is selected and presented to the oracle. 5. Based on the oracle's answer, the problem statement is extended with the new specification; the process can be repeated with the extended problem statement, resulting in a reduced space of compatible solutions.
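As a rough sketch of how this loop might look in code, with stubbed callables standing in for the LLM and the oracle (all function names here are illustrative, not from the paper's implementation):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of discrete labels."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def best_question(solutions, questions, predict_answer):
    """Pick the question whose predicted answers split the sampled
    solutions most evenly -- a proxy for maximal expected information gain."""
    def score(q):
        # For each candidate solution, predict how the oracle would
        # answer q if that solution matched the true intent.
        return entropy([predict_answer(q, s) for s in solutions])
    return max(questions, key=score)

def disambiguate(problem, sample_solutions, propose_questions,
                 predict_answer, ask_oracle, rounds=3):
    """One possible shape of the Active Task Disambiguation loop."""
    for _ in range(rounds):
        solutions = sample_solutions(problem)                     # step 2
        questions = propose_questions(problem, solutions)         # step 3
        q = best_question(solutions, questions, predict_answer)   # step 4
        answer = ask_oracle(q)                                    # step 5
        # Fold the new requirement into the problem statement and repeat.
        problem += f"\nClarification: {q} -> {answer}"
    return problem
```

Scoring a question by the entropy of the split it induces over sampled solutions is one simple proxy for expected information gain; the paper compares several such question-selection strategies.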

Putting It to the Test: What the Experiments Showed

To evaluate the effectiveness of the method, the paper presents two sets of experiments: one involving a guessing game (similar to 20 Questions) and another focused on code generation tasks. In both cases, the authors compared different strategies for generating clarifying questions. The results show that out-of-the-box LLMs leave considerable room for improvement when it comes to generating good clarifying questions. LLMs that selected questions by explicitly reasoning over multiple self-generated solutions—rather than relying on implicit forms of reasoning—were much better at handling ambiguity. In the code generation setting, for example, the AI agent generated targeted test-case-style queries that significantly improved the correctness of its final outputs.

Accuracy of generated code solutions after eliciting 4 additional requirements with different querying strategies. EIG-based strategies lead to higher accuracy of the outputs given the same number of test cases queried.
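To give a flavour of what a test-case-style query can look like, here is a toy example of ours, not one drawn from the paper's benchmarks: two candidate programs both satisfy the vague spec “sort the list”, and the most informative clarifying question is an input on which they disagree.

```python
def candidate_asc(xs):
    return sorted(xs)                 # reading 1: ascending order

def candidate_desc(xs):
    return sorted(xs, reverse=True)   # reading 2: descending order

def find_discriminating_input(candidates, probe_inputs):
    """Return the first probe input on which the candidate programs
    disagree; asking the user about it separates the interpretations."""
    for x in probe_inputs:
        outputs = {tuple(c(x)) for c in candidates}
        if len(outputs) > 1:
            return x
    return None

candidates = [candidate_asc, candidate_desc]
probes = [[1], [2, 2], [3, 1, 2]]
x = find_discriminating_input(candidates, probes)
print(f"Clarifying question: what should the output be for input {x}?")
# The user's answer (e.g. [1, 2, 3]) rules out the inconsistent candidate.
```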

Why This Matters for AI-Powered Math Education

At Eedi, our platform is built around identifying common misconceptions in mathematics. Teachers and tutors remain central to this process—whether designing quizzes or working 1-to-1 with students. But AI, and LLMs in particular, can help amplify this work by suggesting better, more personalised content that zeroes in on what a student might be misunderstanding.

That’s where Active Task Disambiguation comes in. This approach could be used to:

  • Support quiz generation, by identifying which clarifying questions best differentiate between similar misconceptions.
  • Assist tutors in real-time, by proposing high-yield follow-up questions during live student-tutor sessions.
  • Improve teacher workflows, by surfacing potentially ambiguous tasks and offering suggestions to refine them.

The Bottom Line

This research underscores an important shift: effective human-AI alignment isn’t just about delivering the right answers. It’s also about navigating ambiguity—something humans do intuitively, and AI must learn. By equipping LLMs with the ability to ask smart, targeted clarifying questions, we move closer to truly adaptive, responsive, and human-like interactions.

As Eedi continues to blend pedagogy with cutting-edge AI, research like this opens up exciting new possibilities for the future of math education.

To read the full paper, follow this link.

Written by
Simon Woodhead
Data scientist and co-founder
