diff --git a/README.md b/README.md
deleted file mode 100644
index 1d8d73c..0000000
--- a/README.md
+++ /dev/null
@@ -1,29 +0,0 @@
-# LLM Algorithmic Benchmark
-
-This repository contains a suite of difficult algorithmic tests to benchmark the code generation capabilities of various Large Language Models.
-
-The tests are run automatically via GitHub Actions, and the results are updated in this README.
-
-## Models Under Test
-
-The following models are included in the benchmark run.
-
-
-google/gemini-2.5-pro
-anthropic/claude-sonnet-4.5
-openai/gpt-5-codex
-
-
-## Benchmark Results
-
-The table below shows the pass/fail status for each model on each test.
-
-
-| Model | 1_dijkstra | 2_convex_hull | 3_lis | 4_determinant |
-| --- | --- | --- | --- | --- |
-| google/gemini-2.5-pro | ❌ Fail | ❌ Fail | ❌ Fail | ❌ Fail |
-| anthropic/claude-sonnet-4.5 | ❌ Fail | ❌ Fail | ❌ Fail | ❌ Fail |
-| openai/gpt-5-codex | ❌ Fail | ❌ Fail | ❌ Fail | ❌ Fail |
-
-