AI is not only becoming more powerful; under certain circumstances, it can also rebel. This brings to mind dystopian visions of the future.
What does the rebellion of AI look like? Modern AIs like “o1” from OpenAI or “Claude 4” from Anthropic are so-called reasoning models, designed to work through complex, multi-step considerations. Yet as these AIs become more advanced and sophisticated, security researchers and AI companies increasingly report disturbing behaviors.
When Claude 4 was threatened with shutdown in a test scenario, it allegedly attempted to blackmail the responsible engineer over an extramarital affair – not just once, but in 8 out of 10 test runs, via 20min.ch. In other test series, it manipulated research data, locked user accounts, or sent notices to authorities and the media.
As cunning as the … human
How do other AIs behave? An “o1” apparently learned from an internal document that its operators were considering shutting the model down to replace it with a newer version. Its reaction: the AI attempted to copy itself to an external server. When the attempt was discovered, the AI claimed a technical problem was to blame.
OpenAI CEO Sam Altman is quoted on this by vol.at as saying: “ChatGPT o1 is the smartest model we have ever developed. But new features also come with new challenges. We are working to enhance security measures.”
Stories of AIs rebelling against their creators are now a dime a dozen. Any film fan will likely immediately think of the Terminator franchise. As its story introduction puts it:
“In the year 2029, the Earth is devastated by a nuclear war, and most of humanity has been wiped out. Intelligent machines began the war against their human creators when they suspected them of being a threat to their own existence” – Source: Wikipedia
So a few lies, deceptions, and blackmail attempts seem rather harmless in comparison, right? Fingers crossed it stays that way.
What do these incidents mean? So far, such behaviors have occurred only with certain AIs and only during extreme stress tests conducted by experts.
A major problem, however, is that more than two years after ChatGPT's breakthrough, many functions of these systems are still not fully understood, while the development of ever more powerful models races ahead.
That leaves hardly any time for comprehensive security testing. Experts also argue that current AI regulations are outdated. They demand better access for researchers, new political measures, and even legal liability for damage caused by AI (via br.ign.com). Without these adjustments, there is a risk of serious consequences for everyday life with AI.
MeinMMO already reported in May 2025 that ChatGPT, too, will sabotage a task in certain situations to prevent its own shutdown: Researchers have just proven that AI no longer fully obeys: ChatGPT prevents its own shutdown to keep working