Set the percentage of tests to run during the benchmark. 100% runs all tests.
RUN_PERCENTAGE: 100

SHARED_PROMPT: "Provide production-ready and maintainable JavaScript code. Apply code golfing practices but don't put everything in a single line. No comments. Your code will execute in the browser."

The following models are included in the benchmark run.
google/gemini-3-pro-preview
anthropic/claude-sonnet-4.5 TEMP:0.7
openai/gpt-5.1-codex
moonshotai/kimi-k2-thinking
google/gemini-2.5-pro
openrouter/sherlock-think-alpha
google/gemini-3-pro-preview TEMP:1.1
google/gemini-3-pro-preview TEMP:1
google/gemini-3-pro-preview TEMP:0.9
google/gemini-3-pro-preview TEMP:0.8
google/gemini-3-pro-preview TEMP:0.7
google/gemini-3-pro-preview TEMP:0.6
google/gemini-3-pro-preview TEMP:0.5
google/gemini-3-pro-preview TEMP:0.4
google/gemini-3-pro-preview TEMP:0.3
google/gemini-3-pro-preview TEMP:0.2
google/gemini-3-pro-preview TEMP:0.1
google/gemini-3-pro-preview TEMP:0
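
The model entries follow a `provider/model` pattern with an optional `TEMP:<value>` suffix. As a rough illustration of how such entries might be read, here is a minimal Python sketch. It assumes one entry per line; the names `ModelEntry`, `MODEL_LINE`, and `parse_models` are hypothetical and not part of the benchmark itself, and treating a missing `TEMP` as "no explicit temperature" is also an assumption.

```python
import re
from typing import NamedTuple, Optional


class ModelEntry(NamedTuple):
    """One benchmark target (hypothetical structure, for illustration only)."""
    model: str
    # None means the line carried no TEMP suffix. What the benchmark does in
    # that case is not stated in the config, so it is left unset here.
    temperature: Optional[float]


# Matches "provider/model" with an optional " TEMP:<number>" suffix.
MODEL_LINE = re.compile(
    r"^(?P<model>[\w.-]+/[\w.-]+)(?:\s+TEMP:(?P<temp>\d+(?:\.\d+)?))?$"
)


def parse_models(text: str) -> list[ModelEntry]:
    """Collect model entries, skipping settings, prose, and blank lines."""
    entries = []
    for raw in text.splitlines():
        match = MODEL_LINE.match(raw.strip())
        if match is None:
            continue  # not a model line (e.g. RUN_PERCENTAGE or SHARED_PROMPT)
        temp = match.group("temp")
        entries.append(
            ModelEntry(match.group("model"), float(temp) if temp is not None else None)
        )
    return entries


if __name__ == "__main__":
    sample = """\
RUN_PERCENTAGE: 100
google/gemini-3-pro-preview
anthropic/claude-sonnet-4.5 TEMP:0.7
google/gemini-3-pro-preview TEMP:0
"""
    for entry in parse_models(sample):
        print(entry.model, entry.temperature)
```

Keeping the same model on several lines with different `TEMP` values, as the list does for `google/gemini-3-pro-preview`, yields one entry per temperature, so each setting can be benchmarked as a separate run.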