From 3d00098530e833f624554b5d1ddbd669ade1560b Mon Sep 17 00:00:00 2001
From: multipleof4
Date: Fri, 5 Dec 2025 09:41:30 -0800
Subject: [PATCH] Refactor: Remove benchmark statement, graph, and ranking

---
 newsletter/issue001.html | 50 +---------------------------------------
 1 file changed, 1 insertion(+), 49 deletions(-)

diff --git a/newsletter/issue001.html b/newsletter/issue001.html
index 84846a6..cedd983 100644
--- a/newsletter/issue001.html
+++ b/newsletter/issue001.html
@@ -217,7 +217,7 @@
-gpt‑5.1‑codex‑max just got added to the API, two weeks after being added to Codex. I benchmarked it. Here's what I found.
+gpt‑5.1‑codex‑max just got added to the API, two weeks after being added to Codex.
@@ -243,59 +243,11 @@
 7/11 D
-
-Where It Lands
-
-Gemini 3 Pro
-Claude Opus 4.5
-DeepSeek v3.2
-GPT‑5.1‑Codex‑Max
-Claude Sonnet 4.5
-GPT‑5.1‑Codex
-
 The Takeaway
 
 Max scores one point better than regular Codex. That's something. But it's still worse than Gemini 3 Pro, Claude Opus 4.5, and DeepSeek v3.2. It's only on par with Claude Sonnet 4.5.
 
-Current Lynchmark Ranking
-
-1. Google Gemini 3 Pro (Temperature: 0.35)
-2. Anthropic Claude Opus 4.5
-3. DeepSeek‑v3.2
-4. GPT‑5.1‑Codex‑Max (new)
-5. Claude Sonnet 4.5
-
-The reality check: Even with this release, OpenAI is still far behind. This shows exactly why they declared "code red." The gap is real. They're not closing it fast enough.