Set `RUN_PERCENTAGE` to the percentage of tests to run during the benchmark. A value of 100 runs all tests.

<!-- CONFIG_START -->
RUN_PERCENTAGE: 100
SHARED_PROMPT: "Provide production-ready and maintainable JavaScript code. Apply code golfing practices but don't put everything in a single line. No comments. Your code will execute in the browser."
<!-- CONFIG_END -->
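
As a rough illustration of how a script might read the block above, here is a minimal sketch. The helper names `parseConfig` and `selectTests` are hypothetical and not taken from the benchmark's code. It also assumes that a partial run takes the first tests in the list and rounds up, which the benchmark itself may do differently.

```javascript
// Hypothetical helper: collect the KEY: value pairs between the CONFIG markers.
const parseConfig = text => {
  const block = text.match(/<!-- CONFIG_START -->([\s\S]*?)<!-- CONFIG_END -->/)?.[1] ?? '';
  return Object.fromEntries(
    block
      .split('\n')
      .map(line => line.match(/^\s*([A-Z_]+):\s*(.*?)\s*$/))
      .filter(Boolean)
      .map(([, key, raw]) => [
        key,
        // Strip surrounding double quotes, convert plain integers, keep the rest as text.
        /^".*"$/.test(raw) ? raw.slice(1, -1) : /^\d+$/.test(raw) ? Number(raw) : raw
      ])
  );
};

// Hypothetical helper: keep the first `percent`% of the tests, rounding up.
// Assumption: the real benchmark may choose its subset in another way.
const selectTests = (tests, percent) =>
  tests.slice(0, Math.ceil(tests.length * percent / 100));

const exampleReadme = [
  '<!-- CONFIG_START -->',
  'RUN_PERCENTAGE: 100',
  'SHARED_PROMPT: "No comments."',
  '<!-- CONFIG_END -->'
].join('\n');

const exampleConfig = parseConfig(exampleReadme);
console.log(exampleConfig.RUN_PERCENTAGE, selectTests(['t1', 't2', 't3'], exampleConfig.RUN_PERCENTAGE));
```

Keeping the values in the README between comment markers means the same file serves as both documentation and configuration, at the cost of needing a small parser like this one.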


The following models are included in the benchmark run.

<!-- MODELS_START -->
google/gemini-3-pro-preview
anthropic/claude-sonnet-4.5 TEMP:0.7
openai/gpt-5.1-codex
moonshotai/kimi-k2-thinking
google/gemini-2.5-pro
openrouter/sherlock-think-alpha
<!-- MODELS_END -->
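
The model list can be read in a similar way. The sketch below uses a hypothetical `parseModels` helper and assumes that a suffix such as `TEMP:0.7` sets a per-model sampling temperature. That meaning is inferred from the name and is not confirmed by this README.

```javascript
// Hypothetical helper: turn the lines between the MODELS markers into objects.
const parseModels = text => {
  const block = text.match(/<!-- MODELS_START -->([\s\S]*?)<!-- MODELS_END -->/)?.[1] ?? '';
  return block
    .split('\n')
    .map(line => line.trim())
    .filter(Boolean)
    .map(line => {
      // The first token is the model id; any remaining tokens are options.
      const [id, ...options] = line.split(/\s+/);
      const temp = options.find(option => option.startsWith('TEMP:'));
      // Assumption: TEMP:<number> is a temperature override for that model.
      return temp ? { id, temperature: Number(temp.slice(5)) } : { id };
    });
};

const exampleModels = parseModels([
  '<!-- MODELS_START -->',
  'google/gemini-3-pro-preview',
  'anthropic/claude-sonnet-4.5 TEMP:0.7',
  '<!-- MODELS_END -->'
].join('\n'));

console.log(exampleModels);
```

Models without a `TEMP:` option carry no `temperature` field here, so a caller can fall back to whatever default the provider applies.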
