Commit Graph

192 Commits

Author SHA1 Message Date
7012269f53 Feat: Add geospatial analysis benchmark 2025-11-18 08:33:42 -08:00
ced91e61b5 Feat: Add scrypt-js benchmark test 2025-11-15 17:53:03 -08:00
github-actions[bot] 24de0a1a87 Docs: Update benchmark for openrouter/sherlock-dash-alpha 2025-11-16 00:31:49 +00:00
github-actions[bot] 9450b6f936 Docs: Update benchmark for openrouter/sherlock-think-alpha 2025-11-16 00:31:00 +00:00
fc98f13849 Update README 2025-11-15 16:24:54 -08:00
fd29f4653e Fix: add per-model grade summary 2025-11-14 16:17:25 -08:00
github-actions[bot] 0d5effb238 Docs: Update benchmark for moonshotai/kimi-k2-thinking 2025-11-15 00:13:33 +00:00
da087e9ca0 Feat: Add workflow for single model benchmarks 2025-11-14 16:01:08 -08:00
ab4f7671c0 Refactor: Allow benchmark runs for a single model 2025-11-14 16:01:04 -08:00
57f89cc881 Revert: Update index.html 2025-11-14 11:49:05 -08:00
3f399a20fb Update README 2025-11-14 11:46:08 -08:00
9d188647b1 Fix: add per-model grade summary 2025-11-13 19:49:15 -08:00
1b4541a603 Feat: Add summary grade row per model 2025-11-13 19:44:10 -08:00
github-actions[bot] 9a64997884 Docs: Update benchmark results 2025-11-14 03:31:28 +00:00
7052d4f4b5 Refactor: remove explicit CDN hint 2025-11-13 19:19:01 -08:00
1c9b2174d6 Revert: Update index.html 2025-11-13 18:56:21 -08:00
9932f76e57 Feat: add summaries trophies + table view 2025-11-13 18:46:02 -08:00
eb8cff6256 Fix: Clarify exact average calculation expectation 2025-11-13 17:29:02 -08:00
e6f5a570a8 Fix: Make graph truly bidirectional as stated 2025-11-13 17:29:00 -08:00
08242437db Fix: Clarify scheduler slot expectations and make test more lenient 2025-11-13 17:27:49 -08:00
09c5e5dc3c Fix: Correct expected value format in test 2025-11-13 17:18:29 -08:00
f40ddc263d Feat: Add debug page for test execution details 2025-11-13 16:58:22 -08:00
021c1797a2 Delete debug.html 2025-11-13 16:50:45 -08:00
d793bbc3ce Revert: Update test.js 2025-11-13 16:50:25 -08:00
4214ff9e93 Revert: Update test.js 2025-11-13 16:50:18 -08:00
f29805b459 Revert: Update test.js 2025-11-13 16:50:09 -08:00
3f447fead4 Revert: Update test.js 2025-11-13 16:49:43 -08:00
9c33f7f591 Revert: Update test.js 2025-11-13 16:49:32 -08:00
d33b11c8fd Revert: Update test.js 2025-11-13 16:49:27 -08:00
53833b2084 Revert: Update test.js 2025-11-13 16:49:19 -08:00
f2e9b766dc Revert: Update test.js 2025-11-13 16:48:59 -08:00
d6a8afab90 Refactor: Export test case inputs for debug page 2025-11-13 16:45:40 -08:00
4f27efa895 Refactor: Export test case inputs for debug page 2025-11-13 16:45:36 -08:00
106fd18aab Refactor: Export test case inputs for debug page 2025-11-13 16:45:32 -08:00
a84cf7e674 Refactor: Export test case inputs for debug page 2025-11-13 16:45:29 -08:00
7ed0b15f54 Refactor: Export test case inputs for debug page 2025-11-13 16:45:26 -08:00
4b407b5f3d Refactor: Export test case inputs for debug page 2025-11-13 16:45:23 -08:00
6c5fcba939 Refactor: Export test case inputs for debug page 2025-11-13 16:45:20 -08:00
8830581edb Refactor: Export test case inputs for debug page 2025-11-13 16:45:18 -08:00
f8928bd9a9 Feat: Create debug page to show runtime outputs 2025-11-13 16:45:13 -08:00
f0a5c91d2b Feat: Add debug output dashboard 2025-11-13 16:38:36 -08:00
21b9af7ffd Fix: Differentiate null results in UI 2025-11-13 14:18:01 -08:00
github-actions[bot] f2ef5831a7 Docs: Update benchmark results 2025-11-13 21:50:29 +00:00
59752cb111 Update README 2025-11-13 13:39:11 -08:00
78780bb183 Refactor: Improve markdown test robustness with DOM parsing 2025-11-13 13:33:19 -08:00
github-actions[bot] a38ae2d0c5 Docs: Update benchmark results 2025-11-13 21:24:36 +00:00
63fdb538ff Update README 2025-11-13 13:14:36 -08:00
b28d60e25e Revert: Update test.js 2025-11-13 13:05:33 -08:00
8d8d4ed108 Revert: Update test.js 2025-11-13 13:05:15 -08:00
fc2832e202 Revert: Update test.js 2025-11-13 13:04:51 -08:00