Refactor: Trim README, switch Bing to Google CSE

2026-02-13 12:52:58 -08:00
parent 38ff34a284
commit 189953b719

199
README.md

@@ -18,11 +18,11 @@ That's it. The image is searched, cached, and served.
1. A request hits `direct-img.link/<query>`
2. If cached (within 30 days) → serves the image instantly from edge
3. If not cached → searches via image API → compresses to WebP → caches in R2 → serves
3. If not cached → searches via Google Custom Search API → compresses to WebP → caches in R2 → serves
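The steps above can be sketched roughly as follows. This is a Python illustration only, not the project's code: in-memory dictionaries stand in for KV and R2, both are keyed by the query for brevity, and `search_image` and `to_webp` are hypothetical placeholders.

```python
import time

CACHE_TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days, as stated in the README

kv_cache = {}  # stand-in for KV: query -> {"t": unix timestamp}
r2_store = {}  # stand-in for R2: query -> image bytes (simplified key)


def search_image(query):
    """Hypothetical placeholder for the image search call."""
    return b"raw-image-for-" + query.encode("utf-8")


def to_webp(raw):
    """Hypothetical placeholder for WebP compression."""
    return b"webp:" + raw


def serve(query, now=None):
    """Return (image_bytes, was_cache_hit) for an already-normalized query."""
    now = time.time() if now is None else now
    entry = kv_cache.get(query)
    fresh = entry is not None and now - entry["t"] < CACHE_TTL_SECONDS
    if fresh and query in r2_store:
        return r2_store[query], True        # step 2: serve from cache
    image = to_webp(search_image(query))    # step 3: search, then compress
    r2_store[query] = image                 # store the compressed image
    kv_cache[query] = {"t": int(now)}       # record when it was cached
    return image, False
```

Passing `now` explicitly makes the expiry behaviour easy to exercise without waiting 30 days.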
## URL Format
Use `+` to separate words, just like Google:
Use `+` to separate words:
```
https://direct-img.link/orange+cat
@@ -39,8 +39,6 @@ https://direct-img.link/new+york+city
## For AI System Prompts
Add this to your system prompt:
```
When including images in your markdown responses, use https://direct-img.link/<query>
as the image URL. Use + to separate words. Example: ![orange cat](https://direct-img.link/orange+cat)
@@ -50,141 +48,166 @@ as the image URL. Use + to separate words. Example: ![orange cat](https://direct
### Global (Cloudflare WAF)
Applied to all requests before they hit any function:
| Rule | Limit | Action |
|---|---|---|
| Global rate limit | 60 requests/min per IP | Block for 1 min |
| Burst protection | 10 requests/10s per IP | Challenge |
Cache hits and new searches both count toward these limits.
### New Searches (Cache Misses)
- **10 new searches per day per IP** (resets at midnight UTC)
- **Cache hits are unlimited** (within WAF limits above)
Only fresh searches that call the image API count toward the daily limit. If your query is already cached by anyone, it's free.
- **Google API quota:** 100 free queries/day, then $5/1k
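As a rough illustration of what the quota means for running costs, the sketch below estimates the daily API bill from the number of cache misses. It assumes billing is pro-rated per query beyond the free tier rather than charged in whole blocks of 1,000; the function name is hypothetical.

```python
FREE_QUERIES_PER_DAY = 100   # free tier, per the README
USD_PER_1000_QUERIES = 5.0   # paid rate, per the README


def daily_api_cost(cache_misses):
    """Estimate the daily cost in USD; only cache misses call the API."""
    billable = max(0, cache_misses - FREE_QUERIES_PER_DAY)
    return billable * USD_PER_1000_QUERIES / 1000
```

Under that assumption, a day with 600 cache misses has 500 billable queries, or about $2.50. Cache hits never reach the API, so they do not enter the calculation.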
## Caching
- Images are cached for **30 days**
- After expiry, the next request triggers a fresh search
- This keeps time-sensitive queries (e.g. `/us+president`) reasonably current
## Support
This is a free community service. Donations help cover API and infrastructure costs, and allow us to offer higher rate limits for everyone.
Free community service. Donations help cover API and infrastructure costs.
**BTC Address:** `bc1qkqdmhk0we49qn74ua9752ysfxzd7uxqettymhv`
**BTC:** `bc1qkqdmhk0we49qn74ua9752ysfxzd7uxqettymhv`
---
## Infrastructure
## Self-Hosting
### Cloudflare Resources
### 1. Google Programmable Search Engine
1. Go to [programmablesearchengine.google.com](https://programmablesearchengine.google.com/) → **Add**
2. Toggle **Image search** to **On**
3. Under **Sites to search**, click **Add** and paste these (one per line):
```
*.wikimedia.org/*
*.wikipedia.org/*
*.reddit.com/*
*.imgur.com/*
*.flickr.com/*
*.unsplash.com/*
*.pexels.com/*
*.pixabay.com/*
*.pinimg.com/*
*.deviantart.com/*
*.artstation.com/*
*.500px.com/*
*.gettyimages.com/*
*.alamy.com/*
*.shutterstock.com/*
*.istockphoto.com/*
*.nationalgeographic.com/*
*.smithsonianmag.com/*
*.britannica.com/*
*.nasa.gov/*
*.si.edu/*
*.imdb.com/*
*.themoviedb.org/*
*.espn.com/*
*.cnn.com/*
*.bbc.co.uk/*
*.reuters.com/*
*.apnews.com/*
*.nytimes.com/*
*.theguardian.com/*
*.amazon.com/*
*.ebay.com/*
*.etsy.com/*
*.foodnetwork.com/*
*.allrecipes.com/*
*.seriouseats.com/*
*.architecturaldigest.com/*
*.nature.com/*
*.sciencephoto.com/*
*.wired.com/*
*.theverge.com/*
*.techcrunch.com/*
*.rottentomatoes.com/*
*.billboard.com/*
*.vogue.com/*
*.gq.com/*
*.webmd.com/*
*.mayoclinic.org/*
*.space.com/*
*.worldwildlife.org/*
```
4. **Save** and copy your **Search Engine ID** (`cx`)
5. You can edit this site list anytime from the control panel
### 2. Google Custom Search API Key
1. Go to [Google Cloud Console](https://console.cloud.google.com/)
2. Create a project → **APIs & Services → Library**
3. Enable **Custom Search API**
4. **APIs & Services → Credentials → Create Credentials → API Key**
### 3. Cloudflare Resources
Create in your Cloudflare dashboard:
| Resource | Name | Purpose |
|---|---|---|
| R2 Bucket | `direct-img-store` | Stores compressed WebP images |
| KV Namespace | `DIRECT_IMG_CACHE` | Query → cache existence + timestamp |
| KV Namespace | `DIRECT_IMG_RATE` | Per-IP daily new-search counter |
| KV Namespace | `DIRECT_IMG_CACHE` | Cache existence + timestamp |
| KV Namespace | `DIRECT_IMG_RATE` | Per-IP daily search counter |
### Pages Bindings
### 4. Pages Bindings
Set in **Settings → Functions → Bindings**:
**Settings → Functions → Bindings:**
| Type | Variable name | Resource |
| Type | Variable | Resource |
|---|---|---|
| R2 Bucket | `R2_IMAGES` | `direct-img-store` |
| KV Namespace | `DIRECT_IMG_CACHE` | `DIRECT_IMG_CACHE` |
| KV Namespace | `DIRECT_IMG_RATE` | `DIRECT_IMG_RATE` |
### Environment Variables / Secrets
### 5. Secrets
Set in **Settings → Environment variables**:
**Settings → Environment variables:**
| Type | Variable | Description |
|---|---|---|
| Secret | `BING_API_KEY` | Bing Image Search API subscription key |
| Variable | Description |
|---|---|
| `GOOGLE_API_KEY` | Custom Search API key |
| `GOOGLE_CSE_ID` | Search Engine ID (`cx`) |
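For orientation, here is a minimal Python sketch of how these two values could be combined into a Custom Search JSON API image request. The endpoint and the `key`, `cx`, `q`, `searchType`, and `num` parameters are part of Google's public API; the helper name and the choice of `num=1` are assumptions, not taken from this repo.

```python
from urllib.parse import urlencode

CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, cse_id, query, num=1):
    """Build an image-search request URL (no network call is made here)."""
    params = {
        "key": api_key,         # GOOGLE_API_KEY
        "cx": cse_id,           # GOOGLE_CSE_ID
        "q": query,             # normalized query, e.g. "orange cat"
        "searchType": "image",  # restrict results to images
        "num": num,             # assumed: only the top result is needed
    }
    return f"{CSE_ENDPOINT}?{urlencode(params)}"
```

`urlencode` escapes the query, so spaces and special characters are safe to pass in as-is.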
### 6. WAF Rules
**Security → WAF → Rate limiting rules:**
1. **Global** — 60 req/min per IP → Block 60s
2. **Burst** — 10 req/10s per IP → Challenge
### 7. Deploy
Fork this repo, connect to Cloudflare Pages, deploy.
---
## Infrastructure Details
### R2: `direct-img-store`
Key is derived deterministically from the query — no need to store it in KV.
**Key format:** `<sha256-of-normalized-query>.webp`
Example: `"orange cat"` → `a1b2c3d4...ef.webp`
All images stored as compressed WebP.
**Key:** `<sha256-of-normalized-query>.webp` — derived from query, no lookup needed.
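A minimal Python sketch of the key derivation, assuming the query is UTF-8 encoded before hashing and the digest is written as lowercase hex (neither detail is spelled out in the README):

```python
import hashlib


def r2_key(normalized_query):
    """Derive the R2 object key from an already-normalized query."""
    digest = hashlib.sha256(normalized_query.encode("utf-8")).hexdigest()
    return f"{digest}.webp"
```

Because the key is a pure function of the query, any request can recompute it, which is why nothing beyond the cache timestamp needs to be stored in KV.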
### KV: `DIRECT_IMG_CACHE`
Confirms a cached image exists for a query. The R2 key is derived from the same query at request time.
**Key:** normalized query (lowercase, trimmed, spaces from `+`)
```
orange cat
```
**Value:**
```json
{"t":1719000000}
```
`t` = unix timestamp when cached. Useful for debugging and cache-age headers.
**TTL:** 30 days (`expirationTtl: 2592000`) — KV auto-deletes expired keys. No cron needed.
**Size:** ~20 bytes per entry. Free tier (1 GB) supports millions of entries.
**Key:** normalized query (lowercase, trimmed) → **Value:** `{"t":1719000000}` → **TTL:** 30 days
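The normalization and the stored value might look like this in Python. This is a sketch under assumptions: it only applies the rules named here (`+` becomes a space, trim, lowercase) and does not handle percent-encoding or repeated spaces; the function names are hypothetical.

```python
import json

CACHE_TTL_SECONDS = 30 * 24 * 60 * 60  # 2592000 seconds = 30 days


def normalize_query(path):
    """Turn a request path such as '/Orange+Cat' into the KV key 'orange cat'."""
    raw = path.lstrip("/")
    return raw.replace("+", " ").strip().lower()


def cache_value(now_unix):
    """Build the compact JSON value stored under the normalized query."""
    return json.dumps({"t": int(now_unix)}, separators=(",", ":"))
```

The entry would then be written with a 30-day expiry (`CACHE_TTL_SECONDS`) so that stale keys disappear without any cleanup job.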
### KV: `DIRECT_IMG_RATE`
Tracks daily new-search count per IP.
**Key:** `<ip>:<YYYY-MM-DD>`
```
192.168.1.1:2025-01-15
```
**Value:**
```json
{"c":7}
```
`c` = count of new searches made today.
**TTL:** 48 hours (`expirationTtl: 172800`) — generous buffer past midnight, auto-cleanup.
### Cloudflare WAF Rules (Dashboard)
Set manually in **Security → WAF → Rate limiting rules**:
1. **Global rate limit**
- Match: URI Path starts with `/`
- Rate: 60 requests per 1 minute
- Per: IP
- Action: Block for 60 seconds
2. **Burst protection**
- Match: URI Path starts with `/`
- Rate: 10 requests per 10 seconds
- Per: IP
- Action: Managed Challenge
**Key:** `<ip>:<YYYY-MM-DD>` → **Value:** `{"c":7}` → **TTL:** 48 hours
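A rough Python sketch of the per-IP daily counter, with a plain dictionary standing in for the KV namespace. The function names are hypothetical, and `now` is assumed to be a timezone-aware `datetime`.

```python
from datetime import datetime, timezone

DAILY_LIMIT = 10                  # new searches per IP per day
RATE_TTL_SECONDS = 48 * 60 * 60   # 172800 seconds, the 48-hour TTL


def rate_key(ip, now):
    """Build '<ip>:<YYYY-MM-DD>' using the UTC date, so counts reset at midnight UTC."""
    return f"{ip}:{now.astimezone(timezone.utc).strftime('%Y-%m-%d')}"


def try_consume(store, ip, now):
    """Count one new search; return False once the daily limit is reached."""
    key = rate_key(ip, now)
    count = store.get(key, {"c": 0})["c"]
    if count >= DAILY_LIMIT:
        return False
    store[key] = {"c": count + 1}  # a real KV write would also set the 48h TTL
    return True
```

Since the date is part of the key, a new day starts from a fresh counter and the TTL only has to clean up old keys. A read-then-write like this is not atomic, so under concurrent requests the count may be approximate.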
---
## Stack
- **Cloudflare Pages** — hosting + edge functions
- **Cloudflare R2** — image storage (zero egress fees)
- **Cloudflare KV** — metadata cache + rate limiting
- **Cloudflare WAF** — global rate limiting + DDoS protection
- **Bing Image Search API** — image sourcing
- **Cloudflare R2** — image storage
- **Cloudflare KV** — cache + rate limiting
- **Cloudflare WAF** — rate limiting + DDoS protection
- **Google Custom Search API** — image sourcing
---