Feat: Document slash rejection, update normalization rules

2026-02-15 09:03:00 -08:00
parent e5b167d731
commit 606feec0ff

@@ -44,15 +44,24 @@ All queries are normalized before caching and searching:
| Multiple spaces collapsed | `orange++cat` | `orange cat` |
| Trailing slashes stripped | `orange+cat/` | `orange cat` |
| Control characters removed | `orange\x00cat` | `orangecat` |
| **Slashes rejected** | `wp-admin/setup.php` | `bad.webp` served |
| **Max length: 200 characters** | — | 400 error if exceeded |
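
The rows above can be read as a small pipeline. The following is a minimal sketch, not the project's actual implementation: the function name `normalize_query`, the two exception classes, the order of the steps, and the choice to measure length after normalization are all assumptions made for illustration. It takes the raw path segment after the leading `/` and assumes `+` and percent-escapes are decoded as in form encoding.

```python
import re
import unicodedata
from urllib.parse import unquote_plus

MAX_QUERY_LENGTH = 200  # limit stated in the rules table


class RejectedQuery(Exception):
    """Hypothetical signal: literal slash found; the service would serve bad.webp."""


class QueryTooLong(Exception):
    """Hypothetical signal: the service would answer with a 400 error."""


def normalize_query(raw_path: str) -> str:
    """Normalize a raw path segment (without the leading '/')."""
    # Trailing slashes are stripped first, so "orange+cat/" is still accepted.
    path = raw_path.rstrip("/")

    # Any remaining literal slash is rejected; an encoded %2F is not literal.
    if "/" in path:
        raise RejectedQuery(raw_path)

    # Decode "+" to a space and percent-escapes (%2F, %C3%A9, ...) to characters.
    text = unquote_plus(path)

    # Drop control characters (Unicode category Cc), e.g. "\x00".
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cc")

    # Collapse runs of spaces into a single space.
    text = re.sub(r" {2,}", " ", text)

    # Assumption: the 200-character limit applies to the normalized text.
    if len(text) > MAX_QUERY_LENGTH:
        raise QueryTooLong(f"{len(text)} characters")

    return text
```

Under these assumptions, `normalize_query("orange++cat")` and `normalize_query("orange+cat/")` both return `orange cat`, while `normalize_query("wp-admin/setup.php")` raises `RejectedQuery`.
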
### Characters that work fine
- **Letters, numbers, spaces** — standard queries
- **Hyphens** (`spider-man`), **dots** (`node.js`), **apostrophes** (`90's`) — passed through
- **Slashes** (`AC/DC`) — kept as-is in the normalized query when sent encoded as `%2F`
- **Unicode** (`café`, `日本`) — supported via URL encoding
### Slashes must be encoded
Literal slashes (`/`) in the URL path are **rejected** to prevent bot abuse (e.g. `wp-admin/setup-config.php` probes). If your query genuinely contains a slash, encode it as `%2F`:
| Query | URL |
|---|---|
| AC/DC | `/AC%2FDC` ✅ |
| AC/DC | `/AC/DC` ❌ (rejected) |
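
On the client side, one way to avoid sending a literal slash is to percent-encode the whole query before placing it in the path. This is a sketch only; the helper name `query_to_path` is hypothetical, and it assumes the service accepts standard form-style encoding (`+` for spaces, `%XX` for everything else).

```python
from urllib.parse import quote_plus


def query_to_path(query: str) -> str:
    """Build a URL path for a query, encoding '/' as %2F (hypothetical helper)."""
    # quote_plus() encodes "/" by default (its safe set is empty) and
    # turns spaces into "+", matching the examples in this document.
    return "/" + quote_plus(query)


print(query_to_path("AC/DC"))       # /AC%2FDC
print(query_to_path("orange cat"))  # /orange+cat
```

Note that `urllib.parse.quote()` leaves `/` untouched unless `safe=""` is passed, so `quote_plus()` is the less error-prone choice here.
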
### Things to know
- **Query parameters (`?...`)** are ignored — `/orange+cat?size=large` → `orange cat`