OpenRouter Free Tier in 2026: All 28+ Free Models, Real Rate Limits, BYOK 1M Requests, and How to Set It Up

Last updated: May 2026

OpenRouter's free tier in May 2026 gives you 20 requests per minute and 50-1000 requests per day against 28+ free models — including DeepSeek R1, Llama 3.3 70B, Qwen3 Coder 480B (262K context), Gemma 3, and Google Gemini 2.0 Flash. No credit card required. A separate BYOK (Bring Your Own Key) program gives 1 million free routing requests per month when you use your own provider keys. This guide covers the exact rate-limit math, the current free model roster, BYOK setup, the variant syntax (:free, :nitro, :floor), upgrade triggers, and how OpenRouter free stacks against HuggingFace Inference and Groq.

For free tiers across all providers, see Free LLM API Credits. For OpenRouter's deeper role as a gateway, see LLM Gateway in 2026.

Free tier rate limits at a glance

Limit | Free tier | After $10 in lifetime credits
Per-minute | 20 requests | 20 requests
Per-day | 50 requests | 1,000 requests
Tokens per minute | Provider-dependent | Provider-dependent
Free models | All :free variants | All :free variants

The 20 RPM cap never changes: purchasing credits doesn't unlock higher per-minute throughput on free models. The daily limit is the lever: spending $10 once (the credit never expires) permanently raises the daily cap from 50 to 1,000 requests.

Token-per-minute caps depend on the underlying provider. DeepSeek R1's free hosting is throttled more tightly than Llama 3.3 70B's, and Google Gemini 2.0 Flash free has its own provider-side limits.

The 28+ free models worth knowing (May 2026)

Model | Context | Strength
Qwen3 Coder 480B (free) | 262K | Strongest free coding model
DeepSeek R1 (free) | 128K | Reasoning, math, GPT-4 class
DeepSeek V3 (free) | 128K | General-purpose, strong all-rounder
Meta Llama 4 Scout (free) | 10M | Largest context in any free model
Meta Llama 3.3 70B Instruct (free) | 128K | Solid all-purpose, well-supported
Meta Llama 3.1 8B Instruct (free) | 128K | Fast, cheap, low-VRAM-friendly
Qwen 2.5 7B Instruct (free) | 128K | Small Asian-language alternative
Google Gemma 3 12B (free) | 128K | Google's open model, safety-tuned
Google Gemini 2.0 Flash (free) | 1M | Multimodal, large context
Mistral Small 24B (free) | 128K | EU-hosted alternative
Phi-3 Medium (free) | 128K | Microsoft, strong-for-size
Nous Hermes 3 (free) | 128K | Fine-tuned Llama variant

The exact roster shifts month to month. Some models cycle in and out as upstream providers add or pull free hosting. Check openrouter.ai/models with the :free filter for the live list.
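You can also pull the live roster programmatically. A minimal sketch, assuming OpenRouter's public model-listing endpoint at /api/v1/models returns a JSON body with a data array of model objects (verify the response shape against the current docs); it just filters for IDs ending in :free:

import requests

# Public endpoint; listing models shouldn't require an API key (worth verifying).
resp = requests.get("https://openrouter.ai/api/v1/models", timeout=30)
resp.raise_for_status()

free_models = [m["id"] for m in resp.json()["data"] if m["id"].endswith(":free")]
print(f"{len(free_models)} free models currently listed:")
for model_id in sorted(free_models):
    print(" ", model_id)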

How to sign up and get an API key

  1. Go to openrouter.ai and sign up with email or GitHub. No credit card required.
  2. Open Settings → Keys in the dashboard.
  3. Click Create Key. Give it a name (e.g. "dev-laptop"). Copy the key — it starts with sk-or-.
  4. Point your code at the OpenAI-compatible endpoint:
from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible API, so the official OpenAI SDK works unchanged.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter key
)

# The :free suffix routes the request to the model's free-hosted variant.
resp = client.chat.completions.create(
    model="deepseek/deepseek-r1:free",
    messages=[{"role": "user", "content": "Explain transformers in one paragraph."}],
)
print(resp.choices[0].message.content)
  5. Model ID format: vendor/model-name:variant. The :free suffix is the free variant; omit it for paid pricing.

That's the whole setup. No credit card, no email verification beyond signup, no application form.

Model variants — :free, :nitro, :floor

OpenRouter routes each request to one of multiple underlying providers (Together, Fireworks, DeepInfra, Hyperbolic, etc.). Variants control how that choice is made:

  • :free — picks free-tier hosting where available. Subject to free-tier rate limits and provider availability.
  • :nitro — picks the highest-throughput provider for the model. Useful when latency / TPS matters more than cost.
  • :floor — picks the cheapest available provider. Useful for paid usage where cost dominates.
  • :extended — picks a provider offering an extended context window on this model.
  • :thinking — turns on the model's reasoning mode (for models that support it like DeepSeek R1, GPT-5-thinking, Claude Sonnet 4 thinking).

Append the variant that fits the job. For free-tier usage: deepseek/deepseek-r1:free. For cost-minimized paid usage: meta-llama/llama-3.3-70b-instruct:floor.
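For illustration, here is how the suffixes slot into the model parameter. Whether a given model actually offers a particular variant depends on its upstream providers, so the :nitro and :floor IDs below are examples rather than guaranteed listings:

from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="sk-or-...")

# Same request shape; only the suffix changes the routing behavior.
MODELS = {
    "free": "deepseek/deepseek-r1:free",                          # free hosting, free-tier limits
    "cheapest_paid": "meta-llama/llama-3.3-70b-instruct:floor",   # lowest-cost paid provider
    "fastest_paid": "meta-llama/llama-3.3-70b-instruct:nitro",    # highest-throughput provider
}

resp = client.chat.completions.create(
    model=MODELS["free"],
    messages=[{"role": "user", "content": "Summarize the variant system in one sentence."}],
)
print(resp.choices[0].message.content)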

BYOK — 1 million free requests per month

OpenRouter's BYOK (Bring Your Own Key) program lets you point OpenRouter at your own provider keys (OpenAI, Anthropic, Google, Together, Groq, etc.) while still using OpenRouter's unified API and analytics.

The math: every customer gets 1,000,000 free BYOK requests per month. Above 1M, you pay 5% of the model's normal OpenRouter rate as a routing fee.
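To make the 5% routing fee concrete, here is the arithmetic as a small sketch; the request volume and per-request model cost below are made-up numbers for illustration, not OpenRouter pricing:

FREE_BYOK_REQUESTS = 1_000_000   # free BYOK routing requests per month
ROUTING_FEE_RATE = 0.05          # 5% of the model's normal OpenRouter rate

def monthly_routing_fee(requests_per_month: int, avg_cost_per_request: float) -> float:
    """Fee charged on BYOK traffic beyond the free 1M monthly requests."""
    billable = max(0, requests_per_month - FREE_BYOK_REQUESTS)
    return billable * avg_cost_per_request * ROUTING_FEE_RATE

# Example: 1.5M requests/month at a hypothetical $0.002 per request
# -> 500,000 billable requests * $0.002 * 5% = $50 routing fee
print(f"${monthly_routing_fee(1_500_000, 0.002):.2f}")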

When BYOK makes sense:

  • You already have credits or commitments at one or more providers (OpenAI commit, Anthropic credits, etc.) and want unified observability across them.
  • You want to mix free OpenRouter models with your own provider-paid models in the same code path.
  • You hit free-tier rate limits but still want to keep using OpenRouter's gateway features (analytics, retries, fallback).

Setup:

  1. Go to Settings → Provider Keys.
  2. Paste your OpenAI / Anthropic / Google / etc. API keys.
  3. Call OpenRouter as normal — model routing will use your provider key when configured.

When the free tier breaks (upgrade signals)

Stay on free if:

  • Daily volume fits the 50/1000 cap.
  • 20 RPM doesn't bite (no traffic spikes above 1 req/3 seconds).
  • The rotating :free roster covers your model needs.
  • No SLA requirement (free hosting can drop or be rate-limited by providers).

Add $10+ in credits when:

  • You need more than 50 free requests/day (the one-time $10 purchase permanently raises the daily cap to 1,000).
  • You want access to specific paid models (GPT-5, Claude Sonnet 4, etc.).
  • You want :nitro variants for production throughput.
  • You want to stop worrying about free-tier rate-limiting during traffic spikes.

Upgrade to BYOK when:

  • You already have provider credits / commitments and just want OpenRouter as a gateway.
  • 1M monthly requests fits your traffic — that's ~32K requests/day, plenty for most apps.

OpenRouter free vs HuggingFace Inference Providers (PRO)

Aspect | OpenRouter free | HF Inference Providers (PRO, $9/mo)
Cost | $0 (free tier) | $9/month
Models | 28+ free models | 15+ providers, 100+ models
Daily limit | 50 / 1,000 requests | 2M monthly Inference Provider credits
Per-minute limit | 20 RPM | Provider-dependent
Setup | 1 minute | 1 minute
Best for | Prototyping, hobby projects | ML teams already on Hub

For pure prototyping with no budget, OpenRouter free beats everything. For developers who use HuggingFace Hub for models / datasets / Spaces, HF PRO is competitive at $9/month.

OpenRouter free vs Groq direct

Groq has its own free tier (30 RPM, 6K TPM, 14,400 RPD), runs open-source models only, and is significantly faster (Llama 3.3 70B at 394 TPS on Groq vs ~80-150 TPS on OpenRouter free routing). For Llama / Qwen / DeepSeek workloads where speed matters, Groq direct beats OpenRouter free on latency.

OpenRouter free wins on model breadth: 28+ models, including Google Gemini 2.0 Flash, DeepSeek R1, and Qwen3 Coder 480B, not all of which are available on Groq. Use both: Groq for speed-critical paths, OpenRouter for breadth and fallback.
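Both expose OpenAI-compatible endpoints, so splitting traffic is mostly a matter of picking the right client per call. A minimal sketch, assuming Groq's OpenAI-compatible base URL (https://api.groq.com/openai/v1) and example model IDs on each side; verify both against the current docs:

from openai import OpenAI

# Speed-critical path: Groq direct (model ID is an example; check Groq's model list).
groq = OpenAI(base_url="https://api.groq.com/openai/v1", api_key="gsk_...")

# Breadth / fallback path: OpenRouter free tier.
openrouter = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="sk-or-...")

def chat(prompt: str, speed_critical: bool = False) -> str:
    client, model = (
        (groq, "llama-3.3-70b-versatile") if speed_critical
        else (openrouter, "deepseek/deepseek-r1:free")
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content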

Common mistakes with the OpenRouter free tier

  • Treating :free as production-grade. Free hosting can be rate-limited or paused by providers. For customer-facing production, add $10 of credits and use paid variants.
  • Hitting 20 RPM without backoff. Implement exponential backoff with retries. Bursts above 20 RPM will 429.
  • Forgetting the daily limit. 50 requests/day on the unfunded free tier vanishes fast in development. Drop $10 in credits early.
  • Ignoring model ID format. deepseek/deepseek-r1 (paid passthrough) and deepseek/deepseek-r1:free (free hosting) are different. Always include :free for free tier.
  • Single-model dependency on free. Free models can drop out of the roster. Use a fallback chain (LiteLLM / OpenRouter's auto routing); a minimal sketch follows this list.
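Here is a minimal sketch combining the last two fixes: exponential backoff on 429s plus a fallback chain of free models. The model IDs, ordering, and retry counts are illustrative choices, and OpenRouter's built-in auto routing or LiteLLM can do the same job with less code:

import time
from openai import OpenAI, RateLimitError

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="sk-or-...")

# Illustrative chain of free models; adjust to whatever the live :free roster offers.
FALLBACK_CHAIN = [
    "deepseek/deepseek-r1:free",
    "meta-llama/llama-3.3-70b-instruct:free",
    "qwen/qwen-2.5-7b-instruct:free",
]

def chat_with_fallback(prompt: str, max_retries: int = 3) -> str:
    for model in FALLBACK_CHAIN:
        for attempt in range(max_retries):
            try:
                resp = client.chat.completions.create(
                    model=model,
                    messages=[{"role": "user", "content": prompt}],
                )
                return resp.choices[0].message.content
            except RateLimitError:
                # Exponential backoff: 1s, 2s, 4s, then move to the next free model.
                time.sleep(2 ** attempt)
    raise RuntimeError("All free models rate-limited; add credits or try again later.")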

What the free tier is not for

  • High-volume production at >1000 req/day per user.
  • Strict-SLA workloads (free hosting has no uptime guarantee).
  • Frontier closed models (GPT-5, Claude Opus 4.7 — not available free).
  • Use cases that need :nitro speed or :extended context.

For all of those, $10 in credits is the right next step.
