ChatGPT's take on an age-old math challenge surprises researchers

The maths challenge
The ancient puzzle of doubling a square has long fueled philosophical debate about the origins of knowledge.
In a dialogue recorded by Plato, Socrates taught an uneducated boy how to double the area of a square.
The boy initially assumed that doubling the side length would double the area. Through a series of questions, Socrates guided him to the correct solution: the new square's sides must be the same length as the original square's diagonal.
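The arithmetic behind that answer is a standard Pythagorean check rather than anything spelled out in the study: a square with side length $s$ has a diagonal of length $s\sqrt{2}$, so a new square built on that diagonal has area

$$(s\sqrt{2})^2 = 2s^2,$$

exactly double the original area of $s^2$. Doubling the side length instead, as the boy first suggested, gives $(2s)^2 = 4s^2$, four times the original area.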
Researchers Dr. Nadav Marco and Professor Andreas Stylianides put this same challenge to ChatGPT-4.
They tested ChatGPT-4's problem-solving skills by posing a series of questions in the style of Socrates, then progressively challenged the chatbot by introducing errors and new variants of the problem.
The central question was whether the chatbot would solve the problem by retrieving the answer from its vast training data or by developing its own solution.
The team noticed that ChatGPT tended to "improvise its approach" and, at one point, made a distinctly human-like error.
“When we face a new problem, our instinct is often to try things out based on our past experience. In our experiment, ChatGPT seemed to do something similar. Like a learner or scholar, it appeared to come up with its own hypotheses and solutions,” said Marco.
Geometrical solution issue
ChatGPT is generally considered weak at geometric reasoning because of its text-based training. Even so, the researchers fully expected it to recognize this well-known problem and reproduce Socrates' classical geometric solution.
“If it had only been recalling from memory, it would almost certainly have referenced the classical solution of building a new square on the original square’s diagonal straight away,” Stylianides said. “Instead, it seemed to take its own approach.”
Surprisingly, the chatbot initially opted for an algebraic method, a technique “unknown in Plato’s time.”
It also resisted attempts to steer it toward the geometrical solution.
Only when the researchers expressed their “disappointment” did the chatbot finally produce the geometric alternative.
Yet when questioned directly about Plato's work, ChatGPT demonstrated full knowledge of the classical account.
The researchers further presented two new challenges: doubling the area of a rectangle and a triangle. In both cases, ChatGPT again favored an algebraic solution, ignoring the researchers’ preference for a geometric one.
When pushed on the rectangle problem, it mistakenly claimed that no geometric solution was possible, even though geometric solutions do exist, as the sketch below shows.
The researchers believe this error was not from its knowledge base but was an improvised guess based on their prior conversation about the square’s diagonal.
However, after further prompting on the triangle problem, it eventually provided a correct geometric answer.
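For context on why the chatbot's claim was wrong, here is a standard construction, not one reported in the study: for a rectangle with sides $a$ and $b$, extending one side to twice its length already yields a figure of area $a \cdot 2b = 2ab$, double the original. If the doubled rectangle must keep the original proportions, each side can instead be scaled by $\sqrt{2}$, constructible as the diagonal of a square drawn on that side, giving area

$$(a\sqrt{2})(b\sqrt{2}) = 2ab.$$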
AI limitations
The researchers concluded that, from a user's perspective, ChatGPT's behavior blended data retrieval with "on-the-fly reasoning."
The team compares the chatbot’s behavior to the “zone of proximal development” (ZPD) educational concept. This is the space between “what a learner already knows” and what they can learn with help.
Students can turn the AI’s limitations into a learning opportunity.
The team says students should use prompts encouraging collaborative problem-solving, such as “Let’s explore this problem together,” rather than simply asking for the answer.
This approach would help students develop their own critical thinking and reasoning skills.
The study was published in the International Journal of Mathematical Education in Science and Technology.