Commit Graph

375 Commits

Author SHA1 Message Date
d9efff93d9 Update README 2026-02-06 14:15:03 -08:00
b2320a3ba3 Fix: Sort tests numerically in script 2026-02-06 14:12:24 -08:00
6490411733 Fix: Sort tests numerically in UI 2026-02-06 14:12:18 -08:00
c6b237035e Feat: Add multi-dependency MST pipeline test 2026-02-06 14:03:52 -08:00
github-actions[bot]
45b7530b4e Docs: Update benchmark for openrouter/pony-alpha test 10 2026-02-06 21:07:48 +00:00
github-actions[bot]
3385fbc925 Refactor: Remove stale benchmark outputs 2026-02-06 21:01:23 +00:00
3a53d655cd Update cleanup-stale-outputs.yml 2026-02-06 13:00:49 -08:00
498a2a9971 Feat: Action to trigger stale output cleanup 2026-02-06 13:00:14 -08:00
091783018e Feat: Script to identify and delete stale outputs 2026-02-06 13:00:05 -08:00
github-actions[bot]
4be9446973 Docs: Update benchmark for openrouter/pony-alpha 2026-02-06 20:56:45 +00:00
76e2886475 Update README 2026-02-06 12:18:35 -08:00
69b24496f3 Update README to remove TEMP parameter (Removed 'TEMP' parameter from the anthropic/claude-opus-4.6 model entry.) 2026-02-05 11:57:05 -08:00
github-actions[bot]
19600ca84b Docs: Update benchmark for anthropic/claude-opus-4.6 TEMP:0.4 2026-02-05 19:52:20 +00:00
46fb6c2d0a Update README 2026-02-05 11:50:11 -08:00
github-actions[bot]
73a72a2b7e Docs: Update benchmark for anthropic/claude-opus-4.6 TEMP:0.7 2026-02-05 19:39:59 +00:00
b0c93b9efa Update README with new model entries 2026-02-05 11:37:52 -08:00
github-actions[bot]
5b116b55af Docs: Update benchmark for anthropic/claude-opus-4.6 2026-02-05 19:33:26 +00:00
c10e1bfae1 Update README 2026-02-05 11:31:01 -08:00
github-actions[bot]
0f6d112bfb Docs: Update benchmark for moonshotai/kimi-k2.5 2026-01-28 02:13:45 +00:00
0aa1b6e96e Update README 2026-01-27 18:00:49 -08:00
github-actions[bot]
3983c8eb1a Docs: Update benchmark for minimax/minimax-m2.1 2025-12-23 02:41:05 +00:00
4d1b1b44a4 Update README 2025-12-22 18:37:35 -08:00
64ff2a08b8 Update README 2025-12-22 18:33:29 -08:00
github-actions[bot]
401af17eb6 Docs: Update benchmark for z-ai/glm-4.7 2025-12-23 02:30:02 +00:00
4d6e0294fc Update README 2025-12-22 18:13:08 -08:00
1c8d3ac2ca Refactor: Update img alt text in issue 003 2025-12-17 12:09:57 -08:00
2d29cd6d1e Refactor: Add inline image to issue 003 2025-12-17 12:08:45 -08:00
6001e7f4d9 Add files via upload 2025-12-17 12:06:36 -08:00
4a114f8a49 Delete newsletter/issue003.jpg 2025-12-17 12:05:29 -08:00
e88422c029 Add files via upload 2025-12-17 11:58:13 -08:00
ddf67cde83 Feat: Add newsletter issue 003 2025-12-17 11:45:54 -08:00
d06d3fa1a0 Update README 2025-12-17 09:53:49 -08:00
e32ec03249 Update README 2025-12-17 09:53:40 -08:00
github-actions[bot]
073da08edc Docs: Update benchmark for google/gemini-3-flash-preview TEMP:0.35 test 7 2025-12-17 17:18:10 +00:00
github-actions[bot]
2605b249b8 Docs: Update benchmark for google/gemini-3-flash-preview TEMP:0.35 test 3 2025-12-17 17:15:50 +00:00
df97ff97b9 Fix: Add better API error handling and logging 2025-12-17 09:11:56 -08:00
1a1d50a79d Feat: Add workflow for single model+test combo 2025-12-17 09:01:42 -08:00
github-actions[bot]
3c12fd855f Docs: Update benchmark for google/gemini-3-flash-preview TEMP:0.35 2025-12-17 16:55:05 +00:00
47c178fc67 Update README 2025-12-17 08:45:38 -08:00
github-actions[bot]
0bea7a0d26 Docs: Update benchmark for google/gemini-3-flash-preview 2025-12-17 16:44:44 +00:00
36b2e04fc3 Update README 2025-12-17 08:40:29 -08:00
6d2165caf7 Refactor: Use NTFY_URL env var instead of constructing from NTFY_TOPIC 2025-12-15 19:23:22 -08:00
2b9ef5a5c4 Refactor: Use NTFY_URL env var instead of constructing from NTFY_TOPIC 2025-12-15 19:23:00 -08:00
4a6a32bc7e Update README 2025-12-13 08:03:32 -08:00
github-actions[bot]
c394909ee1 Docs: Update benchmark for openai/gpt-5.2 EFF:xhigh 2025-12-11 21:43:10 +00:00
4d3beabb98 Feat: Support EFF:xhigh syntax for reasoning effort 2025-12-11 13:06:14 -08:00
e56e9e9ad9 Update README 2025-12-11 13:03:28 -08:00
23c715aa0f Feat: Add newsletter issue 2 about GPT-5.2 vs Gemini 3 Pro 2025-12-11 11:40:24 -08:00
github-actions[bot]
901bde7c8a Docs: Update benchmark for openai/gpt-5.2 2025-12-11 18:42:54 +00:00
0e0b16be72 Update README 2025-12-11 10:39:05 -08:00