Why CDN Imports Matter in LLM Benchmarks
Most benchmarks test code in isolation. Lynchmark tests what actually matters: finding correct CDN URLs and making code work in real browsers.
The Hidden Skill Gap
When you ask an LLM to "write a function that uses lodash," most benchmarks will accept any code that looks like it uses lodash. But in the real world, that code needs to:
- Find the correct CDN URL for the library
- Use the proper import syntax for that CDN
- Handle the library's actual API (not just what the LLM thinks it should be)
- Execute successfully in a browser environment
This is where many LLMs fail spectacularly. They might generate perfect-looking algorithm code, but use non-existent CDN URLs or incorrect import patterns.
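The checklist above can be sketched as a small loader that tries candidate CDN URLs in order and confirms that the expected export really exists before using it. This is a minimal sketch under stated assumptions, not Lynchmark's harness: `loadFromCdn`, its `importer` parameter, and the shape of the returned object are hypothetical names introduced here for illustration.

```javascript
// Hypothetical helper (not part of any library): try each candidate CDN URL
// until one yields a module that actually exposes the export we need.
// `importer` defaults to native dynamic import; it is injectable so the
// fallback logic can be exercised without network access.
async function loadFromCdn(urls, exportName, importer = (url) => import(url)) {
  const failures = [];
  for (const url of urls) {
    try {
      const mod = await importer(url);
      // Some CDNs expose a CommonJS package only through its default export,
      // so look for the name in both places.
      const value = mod[exportName] ?? mod.default?.[exportName];
      if (value !== undefined) {
        return { url, value };
      }
      failures.push(`${url}: no export named "${exportName}"`);
    } catch (err) {
      // A bad URL, a 404, or a module that throws on load all end up here.
      failures.push(`${url}: ${String(err?.message ?? err)}`);
    }
  }
  throw new Error(`No working CDN URL found:\n${failures.join("\n")}`);
}
```

In a browser module script, a call might look like `const { value: scrypt } = await loadFromCdn(['https://cdn.skypack.dev/scrypt-js', 'https://cdn.jsdelivr.net/npm/scrypt-js@3.0.1/+esm'], 'scrypt')`. Collecting the failure reasons makes it easier to tell a wrong URL apart from a wrong import pattern.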
Real-World Example: The scrypt-js Test

In Test #10, models must import scrypt-js from a CDN. Here's what separates passing from failing implementations:

❌ Common Failure
import { scrypt } from 'https://cdn.skypack.dev/scrypt-js';
// Wrong: Skypack doesn't provide a named 'scrypt' export this way
✅ Correct Solution
const { scrypt } = await import('https://cdn.jsdelivr.net/npm/scrypt-js@3.0.1/+esm');
// Correct: uses jsDelivr's version-pinned +esm build and destructures the named export
The difference isn't just syntax. It comes down to knowing which CDNs work for which libraries, and how those libraries are actually packaged for browser use.
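To make the packaging point concrete, here is a minimal sketch of building the version-pinned jsDelivr URL and then calling the imported function. `jsDelivrEsmUrl` and `deriveKeyHex` are hypothetical helpers, and the cost parameters (N, r, p, dkLen) are illustrative defaults rather than anything Test #10 is known to require. The call shape assumes the scrypt-js 3.x signature `scrypt(password, salt, N, r, p, dkLen)`, which takes byte arrays and returns a promise of a `Uint8Array`.

```javascript
// Hypothetical helper: build a version-pinned jsDelivr ESM URL, e.g.
// https://cdn.jsdelivr.net/npm/scrypt-js@3.0.1/+esm
function jsDelivrEsmUrl(pkg, version) {
  return `https://cdn.jsdelivr.net/npm/${pkg}@${version}/+esm`;
}

// Hypothetical helper: derive a key and return it as a lowercase hex string.
// `scrypt` is passed in by the caller, so this function does not care which
// CDN the implementation came from.
// Assumption: scrypt(passwordBytes, saltBytes, N, r, p, dkLen) resolves to a
// Uint8Array, as in scrypt-js 3.x.
async function deriveKeyHex(scrypt, password, salt, options = {}) {
  const { N = 1024, r = 8, p = 1, dkLen = 32 } = options; // illustrative values
  const encoder = new TextEncoder();
  const key = await scrypt(encoder.encode(password), encoder.encode(salt), N, r, p, dkLen);
  return Array.from(key, (byte) => byte.toString(16).padStart(2, "0")).join("");
}

// Usage in a browser module script (requires network access):
//   const { scrypt } = await import(jsDelivrEsmUrl("scrypt-js", "3.0.1"));
//   const hex = await deriveKeyHex(scrypt, "password", "salt");
```

Pinning the version in the URL keeps the example reproducible, and passing strings through `TextEncoder` avoids the common mistake of handing raw strings to an API that expects bytes.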
What Traditional Benchmarks Miss
Standard coding benchmarks test algorithmic thinking in isolation. They don't verify that the generated code can actually run in a real environment with real dependencies.
What Lynchmark Captures
Practical deployment knowledge: which CDN hosts which libraries, how to import them correctly, and whether the resulting code executes successfully in a browser with no build step.
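One way to picture "executes successfully" is a check that runs a candidate solution, bounds it with a timeout, and compares the result to an expected value. The sketch below is a hypothetical illustration, not Lynchmark's actual implementation: `checkSolution`, its parameters, and the JSON-based comparison are all assumptions made for the example.

```javascript
// Hypothetical check: run a candidate solution and report whether it
// produced the expected value within a time limit. A failed CDN import
// inside `solution` surfaces as a rejected promise and counts as a failure.
async function checkSolution(solution, args, expected, timeoutMs = 5000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });
  try {
    // Wrapping the call in .then() turns synchronous throws into rejections.
    const run = Promise.resolve().then(() => solution(...args));
    const actual = await Promise.race([run, timeout]);
    // Simplifying assumption: results are JSON-serializable values.
    const passed = JSON.stringify(actual) === JSON.stringify(expected);
    return { passed, actual };
  } catch (err) {
    return { passed: false, error: String(err?.message ?? err) };
  } finally {
    clearTimeout(timer);
  }
}
```

Under this scheme, code that looks correct but imports from a URL that does not resolve fails the same way as code with a logic bug, which is the behavior the comparison above is meant to capture.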
The Takeaway
When evaluating LLMs for real-world coding tasks, test their ability to work with actual dependencies in real environments. A perfect algorithm implementation means nothing if the code can't import the libraries it requires.
This is why Lynchmark runs every generated solution in a real browser with real CDN imports: it's the only way to know whether the code actually works.