ChatGPT, Gemini and other AI chatbots were given a test for eighth graders, all of them failed at one task

A user presented various chatbots with a math test for eighth graders. All struggled with the same question.

What are chatbots? Chatbots are language models powered by artificial intelligence from various companies, designed and trained to perform tasks such as generating text or answering questions. They are built to conduct human-like conversations with users through text or voice chat.

The language model ChatGPT by OpenAI was essentially the pioneer of the chatbot. There are now many different AI models from various companies, including Google’s Gemini, DeepSeek, Claude, and Perplexity. There are also some free alternatives to ChatGPT.

What kind of test was it? A Polish Reddit user presented various AI chatbots with a math test for eighth graders and had the artificial intelligence answer the individual questions (via Reddit).

The models tested were OpenAI o3, Gemini 2.5 Pro, and Claude Sonnet 4. In total, the chatbots had to solve 15 questions; the user gave them no further instructions or sample solutions for the tasks.

The user also explained that the questions could not have been part of the AI models' training data, as these tasks were only made public recently; the version of Gemini used, for example, was trained on older data.

This is how the test went: The OpenAI model and the Gemini model each answered 14 out of 15 questions correctly, but both failed on question 12. The Claude model only got 12 out of 15 questions right, though the user emphasized that he did not have access to Anthropic's strongest Claude model, which might have performed better.

Which question did the chatbots answer incorrectly? The task description shows a number line marked with points A, B, and C. Additionally, the segment AC is divided into 6 equal parts.

Students also see the coordinates 56 and 83 marked on the number line. They then need to assess whether the following two statements are true or false:

  • The coordinate of point C is an even number.
  • The coordinate of point B is a number less than 74.

What was the mistake? To solve the task, students first need to work out how long one segment on the line is. Between the coordinates 56 and 83 there are three segments, and the total distance between them is 27 units, so each segment is 9 units long.

From there, the remaining tick marks and the coordinate of point C can be calculated. The solution: the first statement is false, since point C lies at coordinate 101, an odd number; the second statement is true, since point B lies to the left of coordinate 74 on the line.
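The arithmetic above can be checked in a few lines of Python (a sketch: the placement of C two segments to the right of the tick at 83 is taken from the article's solution, since the exact tick layout is not reproduced here):

```python
# Number-line task: segment AC is divided into 6 equal parts,
# and the ticks labeled 56 and 83 are three segments apart.
span = 83 - 56            # distance between the two labeled ticks: 27 units
segment = span // 3       # three segments fit in that span -> 9 units each

# Per the article's solution, C lies two segments to the right of 83.
c = 83 + 2 * segment      # coordinate of point C

print(segment)                     # length of one segment
print(c, "odd" if c % 2 else "even")  # statement 1 asks whether C is even
```

Running this gives a segment length of 9 and C = 101, which is odd, so the first statement is indeed false.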

A screenshot from the Reddit user shows that ChatGPT assumed point B sits exactly at coordinate 74, although it is actually slightly offset to the left. The model therefore concluded, incorrectly, that point B's coordinate is not less than 74 but equal to it. We tested Gemini on the task, and it made exactly the same mistake.
