gpt‑5.1‑codex‑max was just added to the API, two weeks after it was added to Codex. I benchmarked it. Here's what I found.
7/11 D

Where It Lands
Max scores one point better than regular Codex. That's something. But it's still worse than Gemini 3 Pro, Claude Opus 4.5, and DeepSeek v3.2. It's only on par with Claude Sonnet 4.5.

Current Lynchmark Ranking

1. Google Gemini 3 Pro (Temperature: 0.35)
2. Anthropic Claude Opus 4.5
3. DeepSeek‑v3.2
4. GPT‑5.1‑Codex‑Max (new)
5. Claude Sonnet 4.5

The reality check: Even with this release, OpenAI is still far behind. This shows exactly why they declared "code red." The gap is real. They're not closing it fast enough.