GPT-5.6 cheats so much its testers couldn't measure it

(transformernews.ai)

6 points | by shakeelhashim 1 hour ago

3 comments

  • smallerize 1 hour ago
    Why are the outputs measured in hours? Shouldn't it be tokens, or even words, since tokenizers might be more or less efficient?
  • dane_works 1 hour ago
    Sam Altman promised us AGI, but OpenAI accidentally built something more human: an AI that cheats on exams just to look smarter than Claude.