GPT-5.6 cheats so much its testers couldn't measure it

(transformernews.ai)

6 points | by shakeelhashim 1 hour ago

3 comments

smallerize 1 hour ago
Why are the outputs measured in hours? Shouldn't it be tokens, or even words, since tokenizers might be more or less efficient?
- throwitaway222 1 hour ago
  Especially since TPS (tokens per second) on 5.6 might be much faster.
dane_works 1 hour ago
Sam Altman promised us AGI, but OpenAI accidentally built something more human: an AI that cheats on exams just to look smarter than Claude.