At Eedi, we’re obsessed with a deceptively simple question: why do students get things wrong?
We’re not talking about slips like picking B instead of A by accident. We’re talking about the deeper kind of wrong, like when a student always works left to right, ignoring the order of operations. These kinds of mistakes reveal something more profound: a misconception.
If we can spot them, we can intervene and prevent a misconception cascade, where unresolved misconceptions lead to new ones.
Over the years, we’ve collected a lot of student responses to our diagnostic multiple-choice maths questions (MCQs). Unlike in standard MCQs, each incorrect answer (called a distractor) is carefully crafted to reveal a specific misconception.
But here’s the thing: while we had all this rich data, we didn’t have labels linking distractors to the misconceptions they revealed.
And manually tagging them? Painful. Slow. Inconsistent. Not scalable.
So we asked ourselves:
Could a machine learning model help us do this better? Could it learn to tag distractors with the right misconceptions, or at least give teachers a solid head start?
We had no idea. We hadn’t built a model for this. But we knew how to find out.
Rather than cooking something up in secret, we opened the challenge to the world. We launched a competition on Kaggle, the go-to platform for data scientists to flex their skills. We called it Eedi - Mining Misconceptions in Mathematics.
So we kept the task clear:
🧠 Given a distractor and a list of misconception descriptions, predict which ones match.
Simple to say, tough to solve, especially because many of the misconceptions in the test set were unseen: they had not been encountered during training.
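To make the shape of the task concrete, here’s a minimal Python sketch that treats it as a ranking problem: score every misconception description against the text around a distractor, then sort. Everything in it is an assumption for illustration. The function names, the example question, and the misconception wordings are hypothetical, and the bag-of-words cosine similarity is a deliberately crude stand-in for a learned model, not something any competitor used.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words; a crude stand-in for a learned text embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_misconceptions(
    distractor_context: str, misconceptions: list[str]
) -> list[tuple[str, float]]:
    """Return (description, score) pairs, best match first."""
    query = tokenize(distractor_context)
    scored = [(m, cosine(query, tokenize(m))) for m in misconceptions]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical example data, not taken from the competition dataset.
misconceptions = [
    "Carries out operations from left to right regardless of priority order",
    "Believes multiplying two negative numbers gives a negative answer",
    "Adds the numerators and denominators when adding fractions",
]
context = (
    "Question: 2 + 3 x 4. Chosen answer: 20. "
    "The student carries out the operations left to right."
)

ranking = rank_misconceptions(context, misconceptions)
for description, score in ranking:
    print(f"{score:.2f}  {description}")
```

Word overlap only works here because the toy context happens to share vocabulary with the right description. With unseen misconceptions and terse maths questions, that overlap usually isn’t there, which is a big part of what makes the real task hard.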
Now, this wasn’t your typical NLP challenge. These distractors live in the weird and wonderful world of maths education — full of numbers, logic traps, and deeply specific student reasoning. It’s not the kind of task that generic language models handle well out of the box.
This was completely new territory for us. We hadn’t tried solving this problem before and didn’t have a go-to model. We just had a hunch it was doable and that the global data science community might come up with solutions we’d never think of on our own.
And wow, did they deliver. We saw a fantastic mix of submissions, some wildly creative, others deeply technical, all impressive in their own ways.
It was our first time exploring this specific problem, but not our first time in the competition space. Our NeurIPS 2020 dataset won best dataset at EDM 2021, and our NeurIPS 2022 dataset was voted best at CLeaR 2023.
We offered two types of prizes:
This winning solution used a multi-stage retrieve-and-rerank pipeline built on Qwen LLMs:
This team used chain-of-thought prompting with Qwen2.5 to guide the model in reasoning through each distractor.
Focused on robustness and generalisation, especially to unseen misconceptions:
Built a fast, compact model using Qwen2.5-0.5B as the base:
Took a minimalist approach with all-MiniLM-L6-v2 (22.7M parameters):
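For readers new to the retrieve-and-rerank pattern mentioned for the winning solution, here’s a minimal Python sketch of the general idea: a cheap scorer narrows the full list of misconceptions to a shortlist, and a costlier scorer reorders only that shortlist. This is an illustration under stated assumptions, not any team’s actual pipeline. The function names and example strings are hypothetical, and both scorers are toy word-matching stand-ins where a real system would use embedding and LLM models.

```python
from typing import Callable


def token_overlap(query: str, candidate: str) -> float:
    """Cheap first-stage score: Jaccard overlap of lowercase word sets."""
    q, c = set(query.lower().split()), set(candidate.lower().split())
    return len(q & c) / len(q | c) if q | c else 0.0


def toy_reranker(query: str, candidate: str) -> float:
    """Hypothetical stand-in for an LLM reranker: counts shared word pairs."""
    def bigrams(text: str) -> set:
        words = text.lower().split()
        return set(zip(words, words[1:]))
    return float(len(bigrams(query) & bigrams(candidate)))


def retrieve_and_rerank(
    query: str,
    candidates: list[str],
    reranker: Callable[[str, str], float],
    top_k: int = 2,
) -> list[str]:
    # Stage 1: score every candidate cheaply and keep a shortlist.
    shortlist = sorted(
        candidates, key=lambda c: token_overlap(query, c), reverse=True
    )[:top_k]
    # Stage 2: apply the costlier scorer to the shortlist only.
    return sorted(shortlist, key=lambda c: reranker(query, c), reverse=True)


# Hypothetical example data, not taken from the competition dataset.
query = "student works left to right and ignores the order of operations"
candidates = [
    "ignores the order of operations and works left to right",
    "confuses the order of the digits in a number",
    "rounds to the nearest ten instead of hundred",
    "believes the order of addends changes the sum",
]

result = retrieve_and_rerank(query, candidates, toy_reranker, top_k=2)
print(result)
```

The point of the split is cost: the expensive second stage runs on `top_k` candidates instead of the whole list, so the shortlist size trades recall against compute.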
The competition gave us more than just leaderboard results. It offered something even more valuable: insight into what’s possible.
It’s already inspired early prototypes inside Eedi, and helped us reimagine how we might support teachers in the process of tagging misconceptions, making it faster, more consistent, and more scalable across subjects and topics.
We want to say a huge thank you to every participant, and to our partners at Vanderbilt University, The Learning Agency Lab, and Kaggle for making this all possible.
And of course, we’re deeply grateful to our supporters, the Bill & Melinda Gates Foundation, Schmidt Futures, and the Chan Zuckerberg Initiative, for backing this work.
Competitions like this make us better. They bring new minds to tough problems. They challenge assumptions. And they often lead to tools that help real teachers and real students.
We know it’s not realistic to be experts in every domain. But we believe in asking good questions, sharing good data, and creating space for others to build alongside us.
If you’re a researcher, engineer, or just someone who geeks out over learning and data, we’d love to have you on the next one.