Set `RUN_PERCENTAGE` to the percentage of tests to run during the benchmark. A value of 100 runs all tests.

<!-- CONFIG_START -->
RUN_PERCENTAGE: 25
<!-- CONFIG_END -->
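
As a rough illustration of how a runner might consume this block, the sketch below reads the value and picks a matching share of tests. The function names (`extract_block`, `read_run_percentage`, `select_tests`), the 0 to 100 range check, the round-up rule, and the seeded random sampling are all assumptions made for the example; the actual benchmark tooling may parse and sample differently.

```python
import math
import random
import re


def extract_block(text: str, name: str) -> list[str]:
    """Return the non-empty lines between the NAME_START and NAME_END comment markers."""
    # The markers are built from `name`, so this snippet contains no literal marker.
    start = re.escape(f"<!-- {name}_START -->")
    end = re.escape(f"<!-- {name}_END -->")
    match = re.search(f"{start}(.*?){end}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError(f"no {name} block found")
    return [line.strip() for line in match.group(1).splitlines() if line.strip()]


def read_run_percentage(text: str) -> int:
    """Parse the RUN_PERCENTAGE entry from the CONFIG block."""
    for line in extract_block(text, "CONFIG"):
        key, sep, value = line.partition(":")
        if sep and key.strip() == "RUN_PERCENTAGE":
            percentage = int(value)
            # Assumption: values outside 0..100 are treated as errors.
            if not 0 <= percentage <= 100:
                raise ValueError(f"RUN_PERCENTAGE out of range: {percentage}")
            return percentage
    raise ValueError("RUN_PERCENTAGE is not set")


def select_tests(tests: list, percentage: int, seed: int = 0) -> list:
    """Pick a reproducible subset of tests, keeping their original order."""
    # Assumption: round up, so any non-zero percentage runs at least one test.
    count = math.ceil(len(tests) * percentage / 100)
    chosen = sorted(random.Random(seed).sample(range(len(tests)), count))
    return [tests[i] for i in chosen]
```

A caller would pass this file's contents to `read_run_percentage` and the full test list to `select_tests`. Fixing the seed keeps the chosen subset stable between runs, which makes results easier to compare.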


The following models are included in the benchmark run, listed one per line in `provider/model` form.

<!-- MODELS_START -->
google/gemini-2.5-pro
anthropic/claude-sonnet-4.5
openai/gpt-5-codex
<!-- MODELS_END -->
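
The model list above could be read with a simple line scan, as in the sketch below. `read_models` is a hypothetical helper, and the check that every entry has a non-empty provider and model name around the first `/` is an assumption drawn from the entries shown, not a documented rule.

```python
def read_models(text: str) -> list[str]:
    """Return the model identifiers listed between the MODELS markers."""
    # The markers are assembled here so this snippet contains no literal marker.
    start = "<!-- {}_START -->".format("MODELS")
    end = "<!-- {}_END -->".format("MODELS")

    models = []
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == start:
            inside = True
        elif line == end:
            break
        elif inside and line:
            # Assumption: identifiers look like "provider/model".
            provider, sep, model = line.partition("/")
            if not (sep and provider and model):
                raise ValueError(f"unexpected model identifier: {line!r}")
            models.append(line)
    return models
```

Blank lines inside the block are skipped, so entries can be spaced out without changing the result.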