multipleof4/lynchmark

mirror of https://github.com/multipleof4/lynchmark.git synced 2026-01-14 00:27:55 +00:00

LLM Algorithmic Benchmark

This repository contains a suite of difficult algorithmic tests to benchmark the code generation capabilities of various Large Language Models.

The tests run automatically via GitHub Actions, which stores the results in results.json and publishes them on the live results page.

Configuration

Set the percentage of tests to run during the benchmark. 100% runs all tests.

RUN_PERCENTAGE: 25
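As a rough illustration of how a setting like this could be applied, the sketch below reads `RUN_PERCENTAGE` from a block of text and keeps that share of a test list. The helper names `parseRunPercentage` and `selectTests`, the default of 100 when the setting is missing, and the "take the first N, rounded up" strategy are all assumptions made for this example; the repository's actual runner may select tests differently (for instance, at random).

```javascript
// Hypothetical helper: pull the RUN_PERCENTAGE value out of README-style text.
// Assumption: a missing setting means "run everything" (100).
function parseRunPercentage(text) {
  const match = text.match(/RUN_PERCENTAGE:\s*(\d+(?:\.\d+)?)/);
  if (!match) return 100;
  // Clamp to the 0..100 range so a typo cannot select more than all tests.
  return Math.min(100, Math.max(0, Number(match[1])));
}

// Hypothetical helper: keep the first N tests, where N is the requested
// percentage of the list, rounded up.
function selectTests(tests, percentage) {
  const count = Math.ceil((tests.length * percentage) / 100);
  return tests.slice(0, count);
}

// Placeholder test names, not the repository's real test files.
const allTests = ["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8"];
const percentage = parseRunPercentage("RUN_PERCENTAGE: 25");
const selected = selectTests(allTests, percentage);
console.log(`Running ${selected.length} of ${allTests.length} tests`);
```

With eight tests and a setting of 25, this sketch selects two tests. Rounding up means that any non-zero percentage runs at least one test when the list is non-empty.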

Models Under Test

The following models are included in the benchmark run.

google/gemini-2.5-pro
anthropic/claude-sonnet-4.5
openai/gpt-5-codex
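The model identifiers above follow a `provider/model` pattern. A minimal sketch of splitting such an identifier into its two parts is shown below; `parseModelId` is a hypothetical helper written for this example and is not taken from the repository.

```javascript
// Model identifiers as listed in this README.
const MODELS = [
  "google/gemini-2.5-pro",
  "anthropic/claude-sonnet-4.5",
  "openai/gpt-5-codex",
];

// Hypothetical helper: split "provider/model" at the first slash.
// Throws if either part is missing, so a malformed entry fails early.
function parseModelId(id) {
  const slash = id.indexOf("/");
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`Malformed model identifier: ${id}`);
  }
  return { provider: id.slice(0, slash), name: id.slice(slash + 1) };
}

for (const id of MODELS) {
  const { provider, name } = parseModelId(id);
  console.log(`${provider}: ${name}`);
}
```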

Benchmark Results

Live benchmark results, including pass/fail status and code generation time, are available on our results page.

The results are updated automatically via GitHub Actions.
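To give a sense of how pass/fail status and generation time could be summarized per model, here is a small sketch. The schema of results.json is not documented in this README, so the record shape used here (`model`, `test`, `passed`, `generationSeconds`) is purely an assumption, and the sample records are placeholders rather than real benchmark results.

```javascript
// ASSUMPTION: hypothetical record shape; the real results.json may differ.
// Placeholder data only, not actual benchmark output.
const sampleResults = [
  { model: "example/model-a", test: "test-1", passed: true, generationSeconds: 2 },
  { model: "example/model-a", test: "test-2", passed: false, generationSeconds: 4 },
  { model: "example/model-b", test: "test-1", passed: true, generationSeconds: 6 },
];

// Aggregate the pass rate and the mean generation time for each model.
function summarize(results) {
  const byModel = new Map();
  for (const r of results) {
    const s = byModel.get(r.model) ?? { passed: 0, total: 0, totalSeconds: 0 };
    s.total += 1;
    if (r.passed) s.passed += 1;
    s.totalSeconds += r.generationSeconds;
    byModel.set(r.model, s);
  }
  return [...byModel].map(([model, s]) => ({
    model,
    passRate: s.passed / s.total,
    meanSeconds: s.totalSeconds / s.total,
  }));
}

for (const row of summarize(sampleResults)) {
  console.log(`${row.model}: ${row.passRate * 100}% passed, ${row.meanSeconds}s mean`);
}
```

Keeping the raw per-test records and deriving summaries at display time means the stored file does not need to change if a different aggregate is wanted later.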