|  | da087e9ca0 | Feat: Add workflow for single model benchmarks | 2025-11-14 16:01:08 -08:00 |
|  | ab4f7671c0 | Refactor: Allow benchmark runs for a single model | 2025-11-14 16:01:04 -08:00 |
|  | 57f89cc881 | Revert: Update index.html | 2025-11-14 11:49:05 -08:00 |
|  | 3f399a20fb | Update README | 2025-11-14 11:46:08 -08:00 |
|  | 9d188647b1 | Fix: add per-model grade summary | 2025-11-13 19:49:15 -08:00 |
|  | 1b4541a603 | Feat: Add summary grade row per model | 2025-11-13 19:44:10 -08:00 |
| github-actions[bot] | 9a64997884 | Docs: Update benchmark results | 2025-11-14 03:31:28 +00:00 |
|  | 7052d4f4b5 | Refactor: remove explicit CDN hint | 2025-11-13 19:19:01 -08:00 |
|  | 1c9b2174d6 | Revert: Update index.html | 2025-11-13 18:56:21 -08:00 |
|  | 9932f76e57 | Feat: add summaries trophies + table view | 2025-11-13 18:46:02 -08:00 |
|  | eb8cff6256 | Fix: Clarify exact average calculation expectation | 2025-11-13 17:29:02 -08:00 |
|  | e6f5a570a8 | Fix: Make graph truly bidirectional as stated | 2025-11-13 17:29:00 -08:00 |
|  | 08242437db | Fix: Clarify scheduler slot expectations and make test more lenient | 2025-11-13 17:27:49 -08:00 |
|  | 09c5e5dc3c | Fix: Correct expected value format in test | 2025-11-13 17:18:29 -08:00 |
|  | f40ddc263d | Feat: Add debug page for test execution details | 2025-11-13 16:58:22 -08:00 |
|  | 021c1797a2 | Delete debug.html | 2025-11-13 16:50:45 -08:00 |
|  | d793bbc3ce | Revert: Update test.js | 2025-11-13 16:50:25 -08:00 |
|  | 4214ff9e93 | Revert: Update test.js | 2025-11-13 16:50:18 -08:00 |
|  | f29805b459 | Revert: Update test.js | 2025-11-13 16:50:09 -08:00 |
|  | 3f447fead4 | Revert: Update test.js | 2025-11-13 16:49:43 -08:00 |
|  | 9c33f7f591 | Revert: Update test.js | 2025-11-13 16:49:32 -08:00 |
|  | d33b11c8fd | Revert: Update test.js | 2025-11-13 16:49:27 -08:00 |
|  | 53833b2084 | Revert: Update test.js | 2025-11-13 16:49:19 -08:00 |
|  | f2e9b766dc | Revert: Update test.js | 2025-11-13 16:48:59 -08:00 |
|  | d6a8afab90 | Refactor: Export test case inputs for debug page | 2025-11-13 16:45:40 -08:00 |
|  | 4f27efa895 | Refactor: Export test case inputs for debug page | 2025-11-13 16:45:36 -08:00 |
|  | 106fd18aab | Refactor: Export test case inputs for debug page | 2025-11-13 16:45:32 -08:00 |
|  | a84cf7e674 | Refactor: Export test case inputs for debug page | 2025-11-13 16:45:29 -08:00 |
|  | 7ed0b15f54 | Refactor: Export test case inputs for debug page | 2025-11-13 16:45:26 -08:00 |
|  | 4b407b5f3d | Refactor: Export test case inputs for debug page | 2025-11-13 16:45:23 -08:00 |
|  | 6c5fcba939 | Refactor: Export test case inputs for debug page | 2025-11-13 16:45:20 -08:00 |
|  | 8830581edb | Refactor: Export test case inputs for debug page | 2025-11-13 16:45:18 -08:00 |
|  | f8928bd9a9 | Feat: Create debug page to show runtime outputs | 2025-11-13 16:45:13 -08:00 |
|  | f0a5c91d2b | Feat: Add debug output dashboard | 2025-11-13 16:38:36 -08:00 |
|  | 21b9af7ffd | Fix: Differentiate null results in UI | 2025-11-13 14:18:01 -08:00 |
| github-actions[bot] | f2ef5831a7 | Docs: Update benchmark results | 2025-11-13 21:50:29 +00:00 |
|  | 59752cb111 | Update README | 2025-11-13 13:39:11 -08:00 |
|  | 78780bb183 | Refactor: Improve markdown test robustness with DOM parsing | 2025-11-13 13:33:19 -08:00 |
| github-actions[bot] | a38ae2d0c5 | Docs: Update benchmark results | 2025-11-13 21:24:36 +00:00 |
|  | 63fdb538ff | Update README | 2025-11-13 13:14:36 -08:00 |
|  | b28d60e25e | Revert: Update test.js | 2025-11-13 13:05:33 -08:00 |
|  | 8d8d4ed108 | Revert: Update test.js | 2025-11-13 13:05:15 -08:00 |
|  | fc2832e202 | Revert: Update test.js | 2025-11-13 13:04:51 -08:00 |
|  | 2b3988cc93 | Revert: Update test.js | 2025-11-13 13:04:35 -08:00 |
|  | d811c29f99 | Revert: Update test.js | 2025-11-13 13:03:54 -08:00 |
|  | 268c81d873 | Revert: Update anthropic_claude-sonnet-4.5.js | 2025-11-13 13:03:47 -08:00 |
|  | ddb18f5d70 | Revert: Update test.js | 2025-11-13 13:03:31 -08:00 |
|  | 86478189e8 | Revert: Update test.js | 2025-11-13 13:03:15 -08:00 |
|  | 70c66de114 | Revert: Update openai_gpt-5-codex.js | 2025-11-13 13:03:09 -08:00 |
|  | a65d9cf612 | Revert: Update anthropic_claude-sonnet-4.5.js | 2025-11-13 13:03:03 -08:00 |