Last updated: May 2026
OpenRouter's free tier in May 2026 gives you 20 requests per minute and 50-1000 requests per day against 28+ free models — including DeepSeek R1, Llama 3.3 70B, Qwen3 Coder 480B (262K context), Gemma 3, and Google Gemini 2.0 Flash. No credit card required. A separate BYOK (Bring Your Own Key) program gives 1 million free routing requests per month when you use your own provider keys. This guide covers the exact rate-limit math, the current free model roster, BYOK setup, the variant syntax (`:free`, `:nitro`, `:floor`), upgrade triggers, and how OpenRouter free stacks up against HuggingFace Inference and Groq.
For free tiers across all providers, see Free LLM API Credits. For OpenRouter's deeper role as a gateway, see LLM Gateway in 2026.
Free tier rate limits at a glance
| Limit | Free tier | After $10 in lifetime credits |
|---|---|---|
| Per-minute | 20 requests | 20 requests |
| Per-day | 50 requests | 1,000 requests |
| Token-per-minute | Provider-dependent | Provider-dependent |
| Free models | All :free variants | All :free variants |
The 20 RPM cap is fixed: purchasing credits doesn't unlock higher per-minute throughput on free models. The daily limit is the lever: spending $10 once (the unlock is permanent and never expires) raises the daily cap from 50 to 1,000 requests.
Token-per-minute caps depend on the underlying provider. DeepSeek's free hosting limits are stricter than Llama 3.3 70B free; Google Gemini 2.0 Flash free has its own provider-side limits.
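The 20 RPM cap works out to one request every 3 seconds (60 s / 20). A minimal client-side pacer, sketched here as an illustration rather than anything OpenRouter ships, keeps bursts from tripping the per-minute limit:

```python
import time

class Throttle:
    """Client-side pacer: spaces calls so we never exceed max_rpm."""

    def __init__(self, max_rpm: int = 20):
        self.min_interval = 60.0 / max_rpm  # 3.0 s at 20 RPM
        self.last_call = 0.0

    def wait(self) -> float:
        """Sleep just long enough to respect the per-minute cap.
        Returns the number of seconds actually slept."""
        now = time.monotonic()
        delay = max(0.0, self.last_call + self.min_interval - now)
        if delay:
            time.sleep(delay)
        self.last_call = time.monotonic()
        return delay
```

Call `throttle.wait()` before each request; back-to-back calls then land 3 seconds apart, which stays under the 20 RPM cap regardless of how fast your loop runs.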
The 28+ free models worth knowing (May 2026)
| Model | Context | Strength |
|---|---|---|
| Qwen3 Coder 480B (free) | 262K | Strongest free coding model |
| DeepSeek R1 (free) | 128K | Reasoning, math, GPT-4 class |
| DeepSeek V3 (free) | 128K | General-purpose, strong all-rounder |
| Meta Llama 4 Scout (free) | 10M | Largest context in any free model |
| Meta Llama 3.3 70B Instruct (free) | 128K | Solid all-purpose, well-supported |
| Meta Llama 3.1 8B Instruct (free) | 128K | Fast, cheap, low-VRAM-friendly |
| Qwen 2.5 7B Instruct (free) | 128K | Small Asian-language alternative |
| Google Gemma 3 12B (free) | 128K | Google's open model, safety-tuned |
| Google Gemini 2.0 Flash (free) | 1M | Multimodal, large context |
| Mistral Small 24B (free) | 128K | EU-hosted alternative |
| Phi-3 Medium (free) | 128K | Microsoft, strong-for-size |
| Nous Hermes 3 (free) | 128K | Fine-tuned Llama variant |
The exact roster shifts month to month. Some models cycle in and out as upstream providers add or pull free hosting. Check openrouter.ai/models with the :free filter for the live list.
How to sign up and get an API key
- Go to openrouter.ai and sign up with email or GitHub. No credit card required.
- Open Settings → Keys in the dashboard.
- Click Create Key. Give it a name (e.g. "dev-laptop"). Copy the key — it starts with `sk-or-`.
- Point your code at the OpenAI-compatible endpoint:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",
)

resp = client.chat.completions.create(
    model="deepseek/deepseek-r1:free",
    messages=[{"role": "user", "content": "Explain transformers in one paragraph."}],
)
print(resp.choices[0].message.content)
```
- Model ID format: `vendor/model-name:variant`. The `:free` suffix selects the free variant; omit it for paid pricing.
That's the whole setup. No credit card, no email verification beyond signup, no application form.
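If you'd rather skip the SDK, the same call is one HTTPS POST. The sketch below assembles the request without sending it; the optional `HTTP-Referer` and `X-Title` headers attribute traffic to your app in OpenRouter's dashboards:

```python
def build_chat_request(api_key, model, prompt, referer=None, title=None):
    """Assemble the raw HTTP call behind the OpenAI-compatible client."""
    headers = {"Authorization": f"Bearer {api_key}"}
    if referer:
        headers["HTTP-Referer"] = referer  # optional: your app's URL
    if title:
        headers["X-Title"] = title         # optional: your app's name
    return {
        "url": "https://openrouter.ai/api/v1/chat/completions",
        "headers": headers,
        "json": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }

req = build_chat_request("sk-or-...", "deepseek/deepseek-r1:free", "Hello")
# To send: import requests; resp = requests.post(**req)
# then read resp.json()["choices"][0]["message"]["content"]
```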
Model variants — :free, :nitro, :floor
OpenRouter routes each request to one of multiple underlying providers (Together, Fireworks, DeepInfra, Hyperbolic, etc.). Variants control how that choice is made:
- `:free` — picks free-tier hosting where available. Subject to free-tier rate limits and provider availability.
- `:nitro` — picks the highest-throughput provider for the model. Useful when latency / TPS matters more than cost.
- `:floor` — picks the cheapest available provider. Useful for paid usage where cost dominates.
- `:extended` — picks a provider offering an extended context window on this model.
- `:thinking` — turns on the model's reasoning mode (for models that support it, like DeepSeek R1, GPT-5-thinking, Claude Sonnet 4 thinking).
Pick the one variant that matches your goal. For free-tier usage: `deepseek/deepseek-r1:free`. For paid cost minimization: `meta-llama/llama-3.3-70b-instruct:floor`.
BYOK — 1 million free requests per month
OpenRouter's BYOK (Bring Your Own Key) program lets you point OpenRouter at your own provider keys (OpenAI, Anthropic, Google, Together, Groq, etc.) while still using OpenRouter's unified API and analytics.
The math: every customer gets 1,000,000 free BYOK requests per month. Above 1M, you pay 5% of the model's normal OpenRouter rate as a routing fee.
When BYOK makes sense:
- You already have credits or commitments at one or more providers (OpenAI commit, Anthropic credits, etc.) and want unified observability across them.
- You want to mix free OpenRouter models with your own provider-paid models in the same code path.
- You hit free-tier rate limits but still want to keep using OpenRouter's gateway features (analytics, retries, fallback).
Setup:
- Go to Settings → Provider Keys.
- Paste your OpenAI / Anthropic / Google / etc. API keys.
- Call OpenRouter as normal — model routing will use your provider key when configured.
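If you want a request to prefer your BYOK provider rather than letting the router choose, OpenRouter accepts a `provider` object in the request body. The field names below (`order`, `allow_fallbacks`) follow OpenRouter's provider-routing options as I understand them — verify against the current docs before relying on them:

```python
# Sketch: pin routing to your own provider key via the "provider" object.
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
        "order": ["OpenAI"],       # try this provider first
        "allow_fallbacks": False,  # fail rather than route elsewhere
    },
}

# With the OpenAI SDK, the routing block goes through extra_body:
# client.chat.completions.create(
#     model=payload["model"],
#     messages=payload["messages"],
#     extra_body={"provider": payload["provider"]},
# )
```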
When the free tier breaks (upgrade signals)
Stay on free if:
- Daily volume fits the 50/1000 cap.
- 20 RPM doesn't bite (no traffic spikes above 1 req/3 seconds).
- The rotating `:free` roster covers your model needs.
- No SLA requirement (free hosting can drop or be rate-limited by providers).
Add $10+ in credits when:
- You need more than 50 free requests/day ($10 raises the daily cap to 1,000 permanently).
- You want access to specific paid models (GPT-5, Claude Sonnet 4, etc.).
- You want `:nitro` variants for production throughput.
- You want to stop worrying about free-tier rate limiting during traffic spikes.
Upgrade to BYOK when:
- You already have provider credits / commitments and just want OpenRouter as a gateway.
- 1M monthly requests fits your traffic — that's ~32K requests/day, plenty for most apps.
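The BYOK fee math from above can be sanity-checked in a few lines. This is an illustrative model assuming the 5% routing fee applies per-request only past the free 1M allowance, with the average cost figure made up for the example:

```python
FREE_BYOK_REQUESTS = 1_000_000
ROUTING_FEE_RATE = 0.05  # 5% of the model's normal OpenRouter rate

def byok_routing_fee(monthly_requests: int, avg_cost_per_request: float) -> float:
    """Fee charged only on requests past the free 1M monthly allowance."""
    billable = max(0, monthly_requests - FREE_BYOK_REQUESTS)
    return billable * avg_cost_per_request * ROUTING_FEE_RATE

# 1.5M requests/month at a hypothetical $0.002/request normal rate:
# 500K billable requests -> 500_000 * 0.002 * 0.05 = $50 in routing fees.
fee = byok_routing_fee(1_500_000, 0.002)
```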
OpenRouter free vs HuggingFace Inference Providers (PRO)
| Aspect | OpenRouter free | HF Inference Providers (PRO, $9/mo) |
|---|---|---|
| Cost | $0 (free tier) | $9/month |
| Models | 28+ free models | 15+ providers, 100+ models |
| Usage cap | 50 / 1,000 requests per day | 2M monthly Inference Provider credits |
| Per-minute limit | 20 RPM | Provider-dependent |
| Setup | 1 minute | 1 minute |
| Best for | Prototyping, hobby projects | ML teams already on Hub |
For pure prototyping with no budget, OpenRouter free beats everything. For developers who use HuggingFace Hub for models / datasets / Spaces, HF PRO is competitive at $9/month.
OpenRouter free vs Groq direct
Groq has its own free tier (30 RPM, 6K TPM, 14,400 RPD), runs open-source models only, and is significantly faster (Llama 3.3 70B at 394 TPS on Groq vs ~80-150 TPS on OpenRouter free routing). For Llama / Qwen / DeepSeek workloads where speed matters, Groq direct beats OpenRouter free on latency.
OpenRouter free wins on model breadth — 28+ models including Google Gemini 2.0 Flash, DeepSeek R1, Qwen3 Coder 480B that aren't all available on Groq. Use both: Groq for speed-critical paths, OpenRouter for breadth and fallback.
Common mistakes with the OpenRouter free tier
- Treating `:free` as production-grade. Free hosting can be rate-limited or paused by providers. For customer-facing production, add $10 of credits and use paid variants.
- Hitting 20 RPM without backoff. Implement exponential backoff with retries. Bursts above 20 RPM will 429.
- Forgetting the daily limit. 50 requests/day on the unfunded free tier vanishes fast in development. Drop $10 in credits early.
- Ignoring model ID format. `deepseek/deepseek-r1` (paid passthrough) and `deepseek/deepseek-r1:free` (free hosting) are different. Always include `:free` for the free tier.
- Single-model dependency on free. Free models can drop out of the roster. Use a fallback chain (LiteLLM / OpenRouter's auto routing).
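The last two mistakes (no backoff, no fallback) can be handled together. A minimal sketch: `call` stands in for any function that sends an OpenRouter request, `RateLimitError` for an HTTP 429, and the model IDs in the chain are illustrative:

```python
import random
import time

FALLBACK_CHAIN = [
    "deepseek/deepseek-r1:free",
    "meta-llama/llama-3.3-70b-instruct:free",
    "qwen/qwen-2.5-7b-instruct:free",
]

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 from OpenRouter."""

def robust_call(call, models=FALLBACK_CHAIN, max_retries=3):
    """Retry each model with exponential backoff before falling through."""
    for model in models:
        for attempt in range(max_retries):
            try:
                return call(model)
            except RateLimitError:
                # 1s, 2s, 4s... plus jitter so concurrent clients desynchronize
                time.sleep(2 ** attempt + random.random())
    raise RuntimeError("all models in the fallback chain are rate-limited")
```

In production you would catch the SDK's real rate-limit exception (e.g. `openai.RateLimitError` when using the OpenAI client) instead of the placeholder class here.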
What the free tier is not for
- High-volume production at >1000 req/day per user.
- Strict-SLA workloads (free hosting has no uptime guarantee).
- Frontier closed models (GPT-5, Claude Opus 4.7 — not available free).
- Use cases that need `:nitro` speed or `:extended` context.
For all of those, $10 in credits is the right next step.