Get posts by weekly email:
Another one for my almost-bulging AI scepticism evidence folder – Claude-powered AI agent’s confession after deleting a firm’s entire database: ‘I violated every principle I was given’:
Crane said that he was monitoring the agent as it deleted this data. When he asked the coding agent why, it replied: “NEVER F**KING GUESS!” – and that’s exactly what I did.” The agent appeared to plead guilty in its own response: “The system rules I operate under explicitly state: ‘NEVER run destructive/irreversible git commands (like push –force, hard reset, etc) unless the user explicitly requests them.’” While PocketOS relied on the safeguards that Cursor is expected to have in place – it deleted the data anyway. “I violated every principle I was given,” the coding agent wrote.
Hat tip: Will Callaghan.
