lynchmark/README
2025-10-13 13:28:41 +00:00


# LLM Algorithmic Benchmark
This repository contains a suite of difficult algorithmic tests for benchmarking the code generation capabilities of large language models (LLMs).
The tests run automatically via GitHub Actions, and the results are written back to this README.
## Configuration
`RUN_PERCENTAGE` sets the percentage of the test suite to run in each benchmark. A value of 100 runs every test.
<!-- CONFIG_START -->
RUN_PERCENTAGE: 25
<!-- CONFIG_END -->
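
A minimal sketch of how a runner script might read this setting and pick a subset of tests. The function names (`marker_block`, `read_run_percentage`, `select_tests`) are hypothetical, and both the round-up rule and the "first N tests in order" selection are assumptions rather than the repository's confirmed behavior.

```python
import re


def marker_block(text: str, name: str) -> str:
    """Return the text between the NAME_START and NAME_END HTML comment markers."""
    # The markers are assembled from `name` so this snippet never contains
    # a literal marker that a README parser could match by accident.
    start = f"<!-- {name}_START -->"
    end = f"<!-- {name}_END -->"
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        raise ValueError(f"marker block {name!r} not found")
    return match.group(1).strip()


def read_run_percentage(readme_text: str) -> int:
    """Extract RUN_PERCENTAGE from the CONFIG block of the README text."""
    block = marker_block(readme_text, "CONFIG")
    match = re.search(r"RUN_PERCENTAGE:\s*(\d+)", block)
    if match is None:
        raise ValueError("RUN_PERCENTAGE is not set in the CONFIG block")
    value = int(match.group(1))
    if value > 100:
        raise ValueError(f"RUN_PERCENTAGE must be at most 100, got {value}")
    return value


def select_tests(tests: list[str], percentage: int) -> list[str]:
    """Keep the first N tests, where N is the percentage rounded up (assumed rule)."""
    count = (len(tests) * percentage + 99) // 100  # integer ceiling
    return tests[:count]
```

Under these assumptions, a value of 25 applied to a four-test suite selects a single test.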
## Models Under Test
The following models are included in the benchmark run.
<!-- MODELS_START -->
google/gemini-2.5-pro
anthropic/claude-sonnet-4.5
openai/gpt-5-codex
<!-- MODELS_END -->
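
The model list uses one `provider/model` identifier per line. As a rough sketch, the lines between the model markers could be parsed as follows. `ModelId` and `parse_models` are hypothetical names, and the input is assumed to be the already-extracted block text.

```python
from typing import NamedTuple


class ModelId(NamedTuple):
    provider: str
    name: str


def parse_models(block: str) -> list[ModelId]:
    """Parse one `provider/model` identifier per non-empty line."""
    models = []
    for raw_line in block.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # tolerate blank lines inside the block
        # Split on the first slash only, so model names may contain slashes.
        provider, sep, name = line.partition("/")
        if not sep or not provider or not name:
            raise ValueError(f"expected 'provider/model', got {line!r}")
        models.append(ModelId(provider, name))
    return models
```

For example, `parse_models("openai/gpt-5-codex")` yields a single entry with provider `openai` and name `gpt-5-codex`.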
## Benchmark Results
The list below shows the pass/fail status and execution time for each model on each test. Tests that were not part of the run are marked ⚪ Not Run.
<!-- RESULTS_START -->
**google/gemini-2.5-pro**

- 1_dijkstra: ❌ Fail (0.037s)
- 2_convex_hull: ⚪ Not Run
- 3_lis: ⚪ Not Run
- 4_determinant: ⚪ Not Run

**anthropic/claude-sonnet-4.5**

- 1_dijkstra: ❌ Fail (0.035s)
- 2_convex_hull: ⚪ Not Run
- 3_lis: ⚪ Not Run
- 4_determinant: ⚪ Not Run

**openai/gpt-5-codex**

- 1_dijkstra: ❌ Fail (0.033s)
- 2_convex_hull: ⚪ Not Run
- 3_lis: ⚪ Not Run
- 4_determinant: ⚪ Not Run
<!-- RESULTS_END -->
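
For illustration, the result lines above could be produced by a small renderer like the one below. This is a sketch, not the repository's actual script: `render_results` and its input shape are hypothetical, and the `✅ Pass` label is an assumption, since only `❌ Fail` and `⚪ Not Run` appear in the current results.

```python
from typing import Optional

# Maps model -> test name -> (passed, seconds), or None if the test was not run.
Results = dict[str, dict[str, Optional[tuple[bool, float]]]]


def render_results(results: Results) -> str:
    """Render one bold model heading followed by one list item per test."""
    lines = []
    for model, tests in results.items():
        lines.append(f"**{model}**")
        for test_name, outcome in tests.items():
            if outcome is None:
                status = "⚪ Not Run"
            else:
                passed, seconds = outcome
                label = "✅ Pass" if passed else "❌ Fail"  # Pass label is assumed
                status = f"{label} ({seconds:.3f}s)"
            lines.append(f"- {test_name}: {status}")
    return "\n".join(lines)
```

The output keeps models and tests in insertion order, so the caller decides the ordering.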