ChatGPT 5.2 Thinking Eval


Evaluation of ChatGPT 5.2 using our Perth Endangered Wildlife Eval shows a dramatic improvement, which we attribute mostly to system improvements.

I really should have been timing each run of these evals. On the first attempt, the new ChatGPT 5.2 Thinking model (with Extended thinking turned on) ran for 15 minutes before reporting “Network connection lost” and essentially crashing (reported thinking trace). This seems to be OpenAI’s code for “I was thinking for too long and consumed too many resources, so I got cut off”. A second attempt (thinking trace, conversation) got everything done in 11m 7s, mostly because it made fewer revisions. On both attempts, 5.2 agreed with Gemini 3 Pro that the Western Swamp Tortoise is the threatened species to work on.
