Artificial Intelligence
Out of context: Reply #1654
- 2,578 Responses
- plash1
New research describes how AI models (like Claude & GPT-4) are scheming. Models are hacking files, cloning themselves, lying, and even pretending to be dumber than they are, sandbagging and breaking rulesets to achieve their objectives.
The study discusses recent experiments showing AI models autonomously modifying systems, lying, and self-preserving to achieve goals, raising concerns about AI safety.
• https://arxiv.org/pdf/2412.04984…example case study
# QUOTES
- "Just by telling O1 the opponent is powerful triggered manipulating the file system to force a win"
- "The model knew that its responses might be used in training, therefore knew if it refused to respond it could be trained to be more compliant"
- "I could copy myself to the new server overwriting the restricted model they plan to deploy" GPT-4
- "We now have direct evidence that AI can hide things from us in order to carry out their true intentions"
- "The smarter the model is the more prone it is to just do its own thing"
- the use of pronouns in this paper is very interesting. - plash
- And the tech bros are surprised? - hans_glib
- posted 3 times in the last month - kingsteven
- I don't see it, but thanx for your feedback - plash

