Feat: Add run percentage config & rename file

2026-07-19 14:25:45 +00:00 · 2025-10-13 05:50:08 -07:00
parent ac7a450d31
commit 43d17e9b0f
1 changed file with 35 additions and 0 deletions
--- a/35
+++ b/35
@@ -0,0 +1,35 @@
+# LLM Algorithmic Benchmark
+
+This repository contains a suite of difficult algorithmic tests for benchmarking the code-generation capabilities of several large language models.
+
+The tests are run automatically via GitHub Actions, and the results are updated in this README.
+
+## Configuration
+
+`RUN_PERCENTAGE` sets the percentage of tests to run during the benchmark. A value of 100 runs all tests.
+
+<!-- CONFIG_START -->
+RUN_PERCENTAGE: 100
+<!-- CONFIG_END -->
+
+## Models Under Test
+
+The following models are included in the benchmark run.
+
+<!-- MODELS_START -->
+google/gemini-2.5-pro
+anthropic/claude-sonnet-4.5
+openai/gpt-5-codex
+<!-- MODELS_END -->
+
+## Benchmark Results
+
+The table below shows the pass/fail status for each model on each test.
+
+<!-- RESULTS_START -->
+| Model | 1_dijkstra | 2_convex_hull | 3_lis | 4_determinant |
+| --- | --- | --- | --- | --- |
+| google/gemini-2.5-pro | ❌ Fail | ❌ Fail | ❌ Fail | ❌ Fail |
+| anthropic/claude-sonnet-4.5 | ❌ Fail | ❌ Fail | ❌ Fail | ❌ Fail |
+| openai/gpt-5-codex | ❌ Fail | ❌ Fail | ❌ Fail | ❌ Fail |
+<!-- RESULTS_END -->
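
Reviewer note on the `CONFIG_START` / `CONFIG_END` block: the commit does not show how the workflow reads `RUN_PERCENTAGE`, so the following is only a minimal sketch of one way it could be done, assuming the README is read as plain text. The function names (`read_block`, `read_run_percentage`, `select_tests`) are hypothetical, and the rule for choosing which tests to keep (the first `ceil(n * pct / 100)` of them) is an assumption, not something the repository states.

```python
import math
import re


def read_block(readme_text: str, name: str) -> list[str]:
    """Return the non-empty lines between <!-- NAME_START --> and <!-- NAME_END -->."""
    pattern = rf"<!-- {name}_START -->(.*?)<!-- {name}_END -->"
    match = re.search(pattern, readme_text, flags=re.DOTALL)
    if match is None:
        raise ValueError(f"marker block {name!r} not found")
    return [line.strip() for line in match.group(1).splitlines() if line.strip()]


def read_run_percentage(readme_text: str) -> int:
    """Parse the 'RUN_PERCENTAGE: <int>' line from the CONFIG block."""
    for line in read_block(readme_text, "CONFIG"):
        key, _, value = line.partition(":")
        if key.strip() == "RUN_PERCENTAGE":
            pct = int(value.strip())
            if not 0 <= pct <= 100:
                raise ValueError(f"RUN_PERCENTAGE must be 0..100, got {pct}")
            return pct
    raise ValueError("RUN_PERCENTAGE not set in CONFIG block")


def select_tests(tests: list[str], pct: int) -> list[str]:
    """Assumed selection rule: keep the first ceil(n * pct / 100) tests."""
    count = math.ceil(len(tests) * pct / 100)
    return tests[:count]
```

The same `read_block` helper would also work for the `MODELS` block, since it uses the same marker convention. Rounding up means any non-zero percentage runs at least one test; a workflow that prefers random sampling could swap the slice for `random.sample`.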