Feat: Add run percentage config & rename file

2026-07-18 13:55:46 +00:00 · 2025-10-13 05:50:08 -07:00
parent ac7a450d31
commit 43d17e9b0f
1 changed files with 35 additions and 0 deletions
--- a/35
+++ b/35
@@ -0,0 +1,35 @@
 # LLM Algorithmic Benchmark
 This repository contains a suite of difficult algorithmic tests to benchmark the code generation capabilities of various Large Language Models.
 The tests are run automatically via GitHub Actions, and the results are updated in this README.
 ## Configuration
 Set `RUN_PERCENTAGE` to the percentage of tests to run in each benchmark run. A value of 100 runs all tests.
 <!-- CONFIG_START -->
 RUN_PERCENTAGE: 100
 <!-- CONFIG_END -->
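One way a benchmark script might read this setting is sketched below. The README does not show the runner, so `read_run_percentage` and `select_tests` are hypothetical names, and the fallback to 100, the clamping to 0..100, and the rounding up are all assumptions, not the repository's actual behavior.

```python
import math
import re


def read_run_percentage(readme_text, default=100):
    """Return RUN_PERCENTAGE from the CONFIG block, clamped to 0..100.

    Falls back to `default` when the block or the key is missing (assumed behavior).
    """
    block = re.search(
        r"<!-- CONFIG_START -->(.*?)<!-- CONFIG_END -->", readme_text, re.DOTALL
    )
    if block is None:
        return default
    match = re.search(r"RUN_PERCENTAGE:\s*(\d+)", block.group(1))
    if match is None:
        return default
    return max(0, min(100, int(match.group(1))))


def select_tests(test_names, percentage):
    """Keep the first ceil(percentage %) of the tests, in the order given."""
    count = math.ceil(len(test_names) * percentage / 100)
    return list(test_names[:count])
```

With the four tests listed in the results table, `select_tests(names, 50)` would keep `1_dijkstra` and `2_convex_hull`. Rounding up means any non-zero percentage runs at least one test.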
 ## Models Under Test
 The following models are included in the benchmark run.
 <!-- MODELS_START -->
 google/gemini-2.5-pro
 anthropic/claude-sonnet-4.5
 openai/gpt-5-codex
 <!-- MODELS_END -->
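The model list can be read from the README in the same way. This is a minimal sketch that assumes one model identifier per line between the markers; `read_models` is a hypothetical helper, not a function from the repository.

```python
import re


def read_models(readme_text):
    """Return the non-empty lines between MODELS_START and MODELS_END."""
    block = re.search(
        r"<!-- MODELS_START -->(.*?)<!-- MODELS_END -->", readme_text, re.DOTALL
    )
    if block is None:
        return []
    # Strip whitespace and skip blank lines so formatting changes do not matter.
    return [line.strip() for line in block.group(1).splitlines() if line.strip()]
```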
 ## Benchmark Results
 The table below shows the pass/fail status for each model on each test.
 <!-- RESULTS_START -->
 | Model | 1_dijkstra | 2_convex_hull | 3_lis | 4_determinant |
 | --- | --- | --- | --- | --- |
 | google/gemini-2.5-pro | ❌ Fail | ❌ Fail | ❌ Fail | ❌ Fail |
 | anthropic/claude-sonnet-4.5 | ❌ Fail | ❌ Fail | ❌ Fail | ❌ Fail |
 | openai/gpt-5-codex | ❌ Fail | ❌ Fail | ❌ Fail | ❌ Fail |
 <!-- RESULTS_END -->
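The results block could be regenerated after each run roughly as follows. This is a sketch under assumptions: `render_results_table` and `replace_block` are hypothetical helpers, results are assumed to arrive as a dict of model name to per-test booleans, and the `✅ Pass` label is a guess, since only `❌ Fail` appears in the table above.

```python
import re


def render_results_table(results, test_names):
    """Build the Markdown table; `results` maps model -> {test name: passed}."""
    header = "| Model | " + " | ".join(test_names) + " |"
    divider = "| --- |" + " --- |" * len(test_names)
    rows = []
    for model, outcomes in results.items():
        # A test with no recorded outcome is reported as a failure.
        cells = ["✅ Pass" if outcomes.get(t) else "❌ Fail" for t in test_names]
        rows.append(f"| {model} | " + " | ".join(cells) + " |")
    return "\n".join([header, divider] + rows)


def replace_block(readme_text, name, body):
    """Replace the text between <!-- NAME_START --> and <!-- NAME_END -->."""
    pattern = re.compile(
        rf"(<!-- {name}_START -->\n).*?(<!-- {name}_END -->)", re.DOTALL
    )
    # A function replacement avoids backslash-escape handling in `body`.
    return pattern.sub(
        lambda m: m.group(1) + body + "\n" + m.group(2), readme_text, count=1
    )
```

Calling `replace_block(readme, "RESULTS", render_results_table(results, tests))` leaves the markers in place, so the next run can find and overwrite the same block.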