multipleof4/lynchmark: LLM Benchmark - lynchmark - Planet Renox Source

mirror of https://github.com/multipleof4/lynchmark.git synced 2026-07-19 06:15:45 +00:00

multipleof4 021c1797a2 Delete debug.html

2025-11-13 16:50:45 -08:00

.github/workflows

Fix: Update git add command in workflow

2025-11-13 12:48:50 -08:00

Revert: Update run-benchmark.js

2025-11-13 13:01:02 -08:00

Revert: Update test.js

2025-11-13 16:50:25 -08:00

.gitignore

Revert: Update .gitignore

2025-11-13 13:00:44 -08:00

index.html

Fix: Differentiate null results in UI

2025-11-13 14:18:01 -08:00

package.json

Revert: Update package.json

2025-11-13 13:00:54 -08:00

README

Update README

2025-11-13 13:39:11 -08:00

results.json

Docs: Update benchmark results

2025-11-13 21:50:29 +00:00

README

Set RUN_PERCENTAGE to the percentage of tests to run during the benchmark. A value of 100 runs all tests.

<!-- CONFIG_START -->
RUN_PERCENTAGE: 100
SHARED_PROMPT: "Provide production-ready and maintainable JavaScript code. Apply code golfing practices but don't put everything in a single line. No comments. Your code will execute in the browser."
<!-- CONFIG_END -->
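
A minimal sketch of how a script could read the block between the CONFIG_START and CONFIG_END markers. The helper name `parseConfig`, the returned field names, and the range check are assumptions made for illustration. They are not taken from run-benchmark.js, which may read the block differently.

```javascript
// Hypothetical helper: reads KEY: value pairs from the CONFIG block of a README string.
const parseConfig = readme => {
  const block = readme.match(/<!-- CONFIG_START -->([\s\S]*?)<!-- CONFIG_END -->/);
  if (!block) throw new Error('CONFIG block not found');

  const config = {};
  for (const line of block[1].split('\n')) {
    // Accept lines shaped like "SOME_KEY: value"; skip anything else.
    const m = line.match(/^\s*([A-Z_]+):\s*(.*?)\s*$/);
    if (!m) continue;
    const [, key, raw] = m;
    // Strip one pair of surrounding double quotes, if present.
    config[key] = raw.replace(/^"(.*)"$/, '$1');
  }

  // Assumption: RUN_PERCENTAGE is a number from 0 to 100.
  const runPercentage = Number(config.RUN_PERCENTAGE);
  if (!Number.isFinite(runPercentage) || runPercentage < 0 || runPercentage > 100)
    throw new Error('RUN_PERCENTAGE must be a number between 0 and 100');

  return { runPercentage, sharedPrompt: config.SHARED_PROMPT ?? '' };
};

// Usage with an abbreviated copy of the block above.
const configExample = parseConfig(`
<!-- CONFIG_START -->
RUN_PERCENTAGE: 100
SHARED_PROMPT: "Provide production-ready and maintainable JavaScript code."
<!-- CONFIG_END -->
`);
console.log(configExample.runPercentage, configExample.sharedPrompt);
```

Keeping the settings inside HTML comments means they stay invisible in the rendered README while remaining easy to locate with a single regular expression.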


The following models are included in the benchmark run.

<!-- MODELS_START -->
openai/gpt-5.1-codex
openai/gpt-5.1-chat
google/gemini-2.5-pro
anthropic/claude-sonnet-4.5 TEMP:0.7
<!-- MODELS_END -->
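
The model list can be read the same way. The sketch below assumes one model per line, with an optional `TEMP:<number>` suffix interpreted as a sampling temperature override. That interpretation, the helper name `parseModels`, and the output shape are assumptions for illustration, not the repository's confirmed behavior.

```javascript
// Hypothetical helper: turns the MODELS block of a README string into a list of entries.
const parseModels = readme => {
  const block = readme.match(/<!-- MODELS_START -->([\s\S]*?)<!-- MODELS_END -->/);
  if (!block) throw new Error('MODELS block not found');

  return block[1]
    .split('\n')
    .map(line => line.trim())
    .filter(Boolean) // drop blank lines
    .map(line => {
      // First token is the model id; remaining tokens are treated as options.
      const [id, ...options] = line.split(/\s+/);
      const temp = options.find(option => option.startsWith('TEMP:'));
      // Assumption: TEMP:<n> sets a temperature; undefined means "use the default".
      return { id, temperature: temp ? Number(temp.slice('TEMP:'.length)) : undefined };
    });
};

// Usage with the list above.
const modelsExample = parseModels(`
<!-- MODELS_START -->
openai/gpt-5.1-codex
openai/gpt-5.1-chat
google/gemini-2.5-pro
anthropic/claude-sonnet-4.5 TEMP:0.7
<!-- MODELS_END -->
`);
console.log(modelsExample);
```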