Last updated: May 2026
Decision rule in one paragraph. Pick RunPod if you want the widest GPU catalogue, per-second billing and a one-click web UI with Pods and Serverless modes. Pick Lambda Labs if you need multi-GPU SXM clusters (8×H100 / 8×B200) or care about no-egress-fees plus a predictable per-minute flat rate. Pick Vast.ai if you accept interruptions in exchange for the cheapest verified rates on the market — and your workloads can checkpoint and resume.
This article compares the three providers most indie AI developers shortlist in 2026, with verified per-hour pricing across H100 and A100 (snapshot 2026-05-20), the comparison axes that actually matter, and an honest "where none of them are right" section. For the GPU-side decision (H100 vs A100 itself), see H100 vs A100 in 2026.
The three in one sentence each
- RunPod — broad GPU catalogue (18+ classes from RTX 5090 to B200), per-second billing, web UI plus CLI plus API, Community Cloud tier (no SLA) and Secure Cloud tier (SLA), plus a serverless mode for bursty inference.
- Lambda Labs — focused AI cloud, narrow catalogue centred on H100 / H200 / B200 SXM and A100, per-minute billing with no egress fees, signature 1-Click Clusters from 16 to 2,000+ GPUs.
- Vast.ai — distributed peer-to-peer marketplace, 68+ GPU classes, market-set per-second prices from individual hosts, roughly half the price of fixed-price providers on average, but the cheap tier is interruptible.
Side-by-side comparison
| Axis | RunPod | Lambda Labs | Vast.ai |
|---|---|---|---|
| Billing granularity | per-second | per-minute | per-second |
| H100 SXM 80GB on-demand | $3.29/hr | $4.29/hr (1×) · $3.99/hr (8×, per GPU) | $2.13/hr (30-day median, range $1.33–$6.71) |
| H100 PCIe 80GB | $2.89/hr | $3.29/hr | not split out — H100 SXM/NVL only |
| A100 SXM 80GB | $1.49/hr | $2.79/hr (8×, per GPU) | $1.00/hr (median, $0.27–$2.67) |
| A100 PCIe 80GB | $1.39/hr | not listed at 80GB | $0.67/hr (median, $0.11–$1.53) |
| H200 / B200 | H200 $4.39/hr, B200 $5.89/hr | H100/B200 cluster (contact) | H200 $3.95/hr med, B200 $3.97/hr med |
| GPU catalogue breadth | 18+ classes (H100 SXM/PCIe/NVL, A100, H200, B200, L40S, RTX Pro 6000, RTX 5090, RTX 4090, L4, RTX 6000 Ada, A40, A6000, A5000…) | narrow — H100/A100/H200/B200 SXM + 1× variants | 68+ classes — full spectrum RTX 3060 → B300 SXM |
| Multi-GPU clusters | Instant Clusters (up to 8 GPUs); Reserved Clusters scale to 10,000+ | 1-Click Clusters 16–2,000+ HGX H100/B200 SXM | single-host instances — no native multi-host cluster |
| SLA / availability | Secure Cloud SLA; Community Cloud no SLA | on-demand SLA on standard tier | market — on-demand more stable, interruptible can be reclaimed |
| Egress fees | typically none | no egress fees | varies per host |
| Scale-to-zero | Serverless tier yes; Pods no | no (always on) | per-second billing acts similarly |
| Persistent storage | Network Storage $0.05–0.20/GB/mo (tiered) | filesystems available | varies per host |
| Web UI / CLI / API | full | full | full (more technical) |
| Best for | broad workload mix; fast spin-up; serverless inference | multi-GPU SXM training; ML labs | budget batch; fault-tolerant inference; experiments |
Verified pricing — 2026-05-20
The numbers above were pulled from each provider's live pricing page on 2026-05-20.
| Provider | GPU | $/hour | Source |
|---|---|---|---|
| RunPod (Pods) | H100 SXM 80GB | $3.29 | runpod.io/pricing |
| RunPod (Pods) | H100 PCIe 80GB | $2.89 | same |
| RunPod (Pods) | H100 NVL 94GB | $3.19 | same |
| RunPod (Pods) | A100 SXM 80GB | $1.49 | same |
| RunPod (Pods) | A100 PCIe 80GB | $1.39 | same |
| RunPod (Pods) | H200 141GB | $4.39 | same |
| RunPod (Pods) | B200 192GB | $5.89 | same |
| RunPod (Pods) | RTX 4090 24GB | $0.69 | same |
| RunPod (Pods) | RTX 5090 32GB | $0.99 | same |
| RunPod (Pods) | L40S 48GB | $0.86 | same |
| Lambda Labs | H100 SXM 80GB (1×) | $4.29 | lambda.ai/service/gpu-cloud |
| Lambda Labs | H100 SXM 80GB (8×, /GPU) | $3.99 | same |
| Lambda Labs | H100 PCIe 80GB | $3.29 | same |
| Lambda Labs | A100 SXM 80GB (8×, /GPU) | $2.79 | same |
| Lambda Labs | A100 SXM 40GB | $1.99 | same |
| Vast.ai (market median, 30d) | H100 SXM 80GB | $2.13 ($1.33–$6.71 observed) | vast.ai/pricing |
| Vast.ai | H100 NVL 94GB | $1.69 ($1.51–$3.47) | same |
| Vast.ai | A100 SXM4 80GB | $1.00 ($0.27–$2.67) | same |
| Vast.ai | A100 PCIe 80GB | $0.67 ($0.11–$1.53) | same |
| Vast.ai | RTX 4090 24GB | $0.37 ($0.13–$4.00) | same |
| Vast.ai | RTX 5090 32GB | $0.53 | same |
| Vast.ai | L40S 48GB | $0.53 | same |
Pricing changes constantly. Always re-verify before committing a long-running rental.
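If you keep your own snapshot for quick re-checks, a small lookup like the sketch below is enough. The figures are copied from the tables above; the `PRICES` layout and the `cheapest` helper are illustrative names, not any provider's API, and the numbers go stale as soon as the market moves.

```python
# $/hr snapshot copied from the tables above (2026-05-20).
# Lambda's H100 SXM figure is the 1x rate; its A100 SXM figure is the 8x per-GPU rate.
# Vast.ai figures are 30-day medians. None means "not listed in the table".
PRICES = {
    "H100 SXM 80GB": {"RunPod": 3.29, "Lambda Labs": 4.29, "Vast.ai": 2.13},
    "H100 PCIe 80GB": {"RunPod": 2.89, "Lambda Labs": 3.29, "Vast.ai": None},
    "A100 SXM 80GB": {"RunPod": 1.49, "Lambda Labs": 2.79, "Vast.ai": 1.00},
    "A100 PCIe 80GB": {"RunPod": 1.39, "Lambda Labs": None, "Vast.ai": 0.67},
}


def cheapest(gpu, exclude=()):
    """Return (provider, rate) with the lowest listed rate for a GPU."""
    offers = {
        provider: rate
        for provider, rate in PRICES[gpu].items()
        if rate is not None and provider not in exclude
    }
    provider = min(offers, key=offers.get)
    return provider, offers[provider]


for gpu in PRICES:
    # Second column skips the interruptible marketplace for fixed-price shoppers.
    print(gpu, cheapest(gpu), cheapest(gpu, exclude=("Vast.ai",)))
```

Passing `exclude=("Vast.ai",)` gives the cheapest fixed-price option, which is the comparison that matters when you cannot tolerate interruptions.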
When RunPod wins
Mixed-workload teams. You're running 7B inference today, training a 13B LoRA tomorrow, and prototyping on a 4090 next week. RunPod's catalogue covers all three, so there's no need to juggle accounts.
Per-second billing on short jobs. A 15-minute experimental run is billed as a quarter of an hour, not a full hour. For interactive research where runs last minutes rather than days, this matters.
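The effect of billing granularity is easy to check yourself. This is a minimal sketch, assuming runtime is rounded up to the next billing increment; the hourly case is included only as a contrast, and `billed_cost` is an illustrative helper rather than anything a provider exposes.

```python
import math


def billed_cost(runtime_s, rate_per_hr, granularity_s):
    """Cost when runtime is rounded up to the next billing increment."""
    billed_s = math.ceil(runtime_s / granularity_s) * granularity_s
    return billed_s / 3600 * rate_per_hr


RATE = 3.29          # RunPod H100 SXM, $/hr, from the pricing table
RUN_S = 15 * 60      # the 15-minute run from the paragraph above

per_second = billed_cost(RUN_S, RATE, 1)     # billed for 900 s
per_hour = billed_cost(RUN_S, RATE, 3600)    # billed for a full 3600 s
print(f"per-second: ${per_second:.2f}  hourly: ${per_hour:.2f}")
```

At the table's H100 SXM rate, the 15-minute run comes to about $0.82 under per-second billing against $3.29 if a full hour were charged.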
Serverless mode for bursty inference. If you're shipping an inference endpoint that gets traffic in spikes, the Serverless tier scales to zero when idle and bills per-second on traffic — for sub-1000-req/day workloads, often the cheapest path of any provider.
Community Cloud for experimentation. Same prices as Secure Cloud, no SLA, no minimum — fine for jobs you can restart if a host evicts you.
When Lambda Labs wins
Multi-GPU SXM clusters. Lambda's 1-Click Clusters (16 to 2,000+ HGX H100 or B200 SXM GPUs with NVLink + NVSwitch) are the indie-accessible way to rent training-scale compute. RunPod and Vast can't match this — RunPod's Instant Clusters cap at 8 GPUs on one host, and Vast is single-host. If your job needs 16+ H100s with high-bandwidth interconnect, Lambda is the answer.
No egress fees + predictable billing. Per-minute billing with no surprise egress cost makes Lambda the easier provider to forecast in a budget meeting. Hyperscalers (AWS, GCP) bury egress costs that can add 20-30% on top of compute.
The boutique service tier. Account managers and direct support exist for serious projects — useful when you need 8× H100s on a deadline and a hosting issue could cost you days.
When Vast.ai wins
Lowest verified per-hour rate on the market. A100 PCIe at $0.67/hr median (vs RunPod $1.39 and Lambda effectively unavailable at this tier). H100 SXM at $2.13/hr median (vs RunPod $3.29 and Lambda $4.29). For fault-tolerant workloads, that works out to savings of roughly 35-52% on these two GPUs.
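The percentage gap for the pairs quoted above can be recomputed from the table figures. A short sketch; `savings_pct` is an illustrative helper name, and the Vast.ai side uses 30-day medians, so an individual host may land above or below these.

```python
def savings_pct(cheaper, baseline):
    """Percentage saved by paying `cheaper` instead of `baseline`."""
    return (1 - cheaper / baseline) * 100


# (Vast.ai median, fixed-price competitor), $/hr from the pricing table
comparisons = {
    "A100 PCIe, Vast vs RunPod": (0.67, 1.39),
    "H100 SXM, Vast vs RunPod": (2.13, 3.29),
    "H100 SXM, Vast vs Lambda": (2.13, 4.29),
}

for label, (vast, other) in comparisons.items():
    print(f"{label}: {savings_pct(vast, other):.0f}% cheaper")
```

The three pairs come out at roughly 52%, 35% and 50% respectively.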
Fault-tolerant batch workloads. Pretraining a small model from scratch? Generating 100K synthetic samples? Running an evaluation suite overnight? These can all checkpoint and resume; the interruptible-tier savings on Vast compound across long runs.
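"Checkpoint and resume" can be as simple as the pattern below. This is a generic sketch, not Vast.ai tooling: the file name, the `every` interval and the squared-number "work" are placeholders, and it assumes the checkpoint path sits on storage that survives the instance being reclaimed.

```python
import json
import os
import tempfile

CKPT = "progress.json"  # hypothetical path; put it on storage that outlives the instance


def load_state(path=CKPT):
    """Resume from the last checkpoint, or start fresh if none exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_item": 0, "results": []}


def save_state(state, path=CKPT):
    """Write to a temp file, then rename, so an eviction mid-write
    cannot leave a half-written checkpoint behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(total_items, every=100, path=CKPT):
    """Process items in order, checkpointing every `every` items."""
    state = load_state(path)
    for i in range(state["next_item"], total_items):
        state["results"].append(i * i)  # stand-in for the real work
        state["next_item"] = i + 1
        if state["next_item"] % every == 0:
            save_state(state, path)
    save_state(state, path)
    return state
```

If the host disappears mid-run, starting the same script on a new instance picks up at `next_item`, so at most `every` items of work are repeated.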
RTX consumer GPUs at indie prices. RTX 4090 at $0.37/hr median ($0.13 at the cheap end). RTX 5090 at $0.53/hr median. RTX 3090 at $0.15/hr. For workloads that fit in 24GB VRAM, this is the structurally cheapest GPU access anywhere — by a wide margin.
Hyper-budget experimentation. If your monthly GPU budget is under $100, Vast is essentially the only option that gets you real GPU hours.
Where none of them are right
- Production with SLA at the lowest price. Hyperstack on-demand beats all three on the SLA tier: H100 SXM $2.40/hr, A100 SXM $1.60/hr. Not in this comparison because it's a different player class — see H100 vs A100 in 2026 for the broader provider table.
- Regulated workloads (HIPAA, FedRAMP, or SOC 2 Type II, depending on enterprise needs). AWS p5 or GCP A3 are usually mandated, despite costing 2-3× more per hour.
- Sustained 24/7 production inference at 100K+ req/day on one model. At that volume, per-token pricing on hosted-inference platforms (Groq, Together AI, Fireworks, HuggingFace Inference Providers) usually beats renting any of these. See also OpenRouter Free Tier for the gateway-based alternative.
- Workloads needing >80GB VRAM. A single 80GB card won't fit these: use an H200 (141GB) on RunPod or Vast, or rent a multi-GPU cluster from Lambda. See the workload picks below.
Workload picks (quick — full math in the H100 vs A100 article)
- Inference of Llama 3.3 70B Q4. Cheapest fixed-rate: RunPod A100 SXM at $1.49/hr. Cheapest if interruptible-tolerant: Vast.ai A100 SXM at $1.00/hr median.
- LoRA fine-tune of 13B model. RunPod H100 PCIe at $2.89/hr for a stable run; Vast.ai H100 SXM at $2.13/hr median if you can checkpoint.
- Multi-GPU pretraining (8× H100 or bigger). Lambda is the only practical indie option: $3.99/GPU/hr × 8 = $31.92/hr for an 8×H100 SXM node with NVLink/NVSwitch, with 1-Click Clusters taking over from 16 GPUs up. RunPod and Vast aren't built for this.
- Hobbyist experiments under $30/month. Vast.ai RTX 4090 at $0.37/hr median × ~80 hr = $29.60. Nobody else gets you this kind of GPU access at this price.
- 24/7 production endpoint, SLA required, smallest bill. None of these three on the budget axis. Hyperstack is the cheapest SLA-tier on-demand — see the H100 vs A100 article for that comparison.
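The arithmetic behind the cluster and hobbyist picks can be reproduced in a few lines. A sketch using rates from the pricing table; the $30 budget matches the hobbyist pick above, and both helper names are illustrative.

```python
def cluster_cost_per_hr(rate_per_gpu_hr, gpus):
    """Hourly bill for a multi-GPU node billed per GPU."""
    return rate_per_gpu_hr * gpus


def hours_for_budget(budget_usd, rate_per_hr):
    """GPU-hours a fixed monthly budget buys at a given hourly rate."""
    return budget_usd / rate_per_hr


# 8x H100 SXM on Lambda at the 8x per-GPU rate
print(f"8x H100: ${cluster_cost_per_hr(3.99, 8):.2f}/hr")

# RTX 4090 hours on a $30/month budget: Vast.ai median vs RunPod
print(f"Vast.ai: {hours_for_budget(30, 0.37):.1f} hr")
print(f"RunPod:  {hours_for_budget(30, 0.69):.1f} hr")
```

On a $30 budget the Vast.ai median rate buys about 81 RTX 4090 hours against about 43 on RunPod, which is the gap the hobbyist pick relies on.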
Common gotchas per provider
RunPod Community Cloud. Same prices as Secure Cloud, but no SLA, and hosts can evict instances. Don't run production endpoints here; use Secure Cloud when uptime matters.
RunPod Network Storage tiering. Idle volume disk is $0.20/GB/mo (vs running at $0.10) — if you're parking a big checkpoint, this adds up. Use Network Storage Standard at $0.05-0.07/GB/mo for long-term storage instead.
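To see how quickly a parked checkpoint adds up, plug your own size into the rates quoted above. A minimal sketch; the 500 GB figure is a made-up example, not a measured checkpoint size.

```python
def monthly_storage_cost(size_gb, rate_per_gb_month):
    """Monthly bill for keeping `size_gb` stored at a flat per-GB rate."""
    return size_gb * rate_per_gb_month


SIZE_GB = 500  # hypothetical checkpoint archive

idle_volume = monthly_storage_cost(SIZE_GB, 0.20)   # idle volume disk
network_low = monthly_storage_cost(SIZE_GB, 0.05)   # Network Storage, low end
network_high = monthly_storage_cost(SIZE_GB, 0.07)  # Network Storage, high end

print(f"idle volume:     ${idle_volume:.2f}/mo")
print(f"network storage: ${network_low:.2f}-${network_high:.2f}/mo")
```

For this example size, the idle volume costs $100 a month against $25 to $35 on Network Storage.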
Lambda waitlist for 1-Click Clusters. Capacity is allocated; for 8× SXM clusters there can be a wait queue. Sign up and reserve before you actually need the cluster.
Vast host reliability varies. Always check the host's reliability score and reviews before a long-running job. Use the on-demand tier (not interruptible) for non-batch work, and pick hosts with consistent uptime track records.
Vast network egress varies. Unlike RunPod and Lambda, Vast hosts can set their own data transfer policies. If your workload moves big checkpoints or datasets in and out, factor this into provider choice.
FAQ
Which is fastest to spin up? RunPod Pods and Vast instances start in under 60 seconds, both with per-second billing. Lambda's on-demand single-GPU spin-up is similar; its 1-Click Clusters can take longer depending on queue and configuration.
Can I move workloads between providers? Yes — all three support standard Docker images and run the same OS/CUDA stacks. The friction is in dataset and checkpoint movement (egress costs at AWS/GCP; minimal on RunPod and Lambda; varies per host on Vast). If you're choosing for long-term use, anchor on the provider with no egress fees (Lambda) or set up cheap object storage that all three can read from.
What's the cheapest way to get an H100 in 2026? If you accept interruptions: Vast.ai H100 SXM at ~$2.13/hr 30-day median (observed as low as $1.33). If you need guaranteed uptime: Hyperstack H100 PCIe at $1.90/hr (covered in the H100 vs A100 article, not in this three-way comparison). Among the three providers here, RunPod H100 PCIe at $2.89/hr is the cheapest fixed-price option.
Do any of these offer free credits? RunPod occasionally runs promo credits; Lambda Labs offers academic-research credits via partnerships; Vast.ai doesn't run a free-credits program. For broader free-GPU paths see Free GPU Compute and NVIDIA Inception Program.
Which is best for fine-tuning a Llama 70B? For a LoRA fine-tune at 70B (a single 80GB GPU at low rank), all three work: Vast is cheapest, RunPod most flexible. For a full fine-tune at 70B (multi-GPU needed), Lambda's 8×H100 SXM instances, or 1-Click Clusters from 16 GPUs up, are the practical answer.
Which can scale to 100+ GPUs? Lambda 1-Click Clusters scale to 2,000+ HGX H100/B200 SXM. RunPod Reserved Clusters can scale to 10,000+ via sales contracts. Vast is single-host only — multi-host parallelism is not natively supported.
Bottom line
For most indie AI developers and small teams in 2026:
- Default choice → RunPod (broad catalogue, per-second billing, serverless option, both no-SLA and SLA tiers).
- Cost-driven choice → Vast.ai (cheapest by ~35-60% at median rates if you can handle interruptions; the only realistic option for sub-$100/month GPU budgets).
- Multi-GPU training choice → Lambda Labs (the only one of these three set up for 16-2,000+ GPU clusters).
For production-with-SLA at the lowest price, look outside this 3-way — Hyperstack wins that tier (~$2.40/hr H100 SXM on-demand SLA). See the H100 vs A100 comparison for the broader provider table.
Pricing across all three providers will keep moving as B200 supply ramps and H100 prices drift toward A100. Re-check this article every quarter.
Related guides on this site
- H100 vs A100 in 2026: Real Pricing + Workload Picks — the GPU-side decision
- Free GPU Compute — Where to Get Hours Without a Credit Card
- Best GPU for AI 2026
- VPS with GPU — Where to Rent
- Cheap Dedicated Server in 2026
- $500K in Free Cloud Credits 2026: 15 Programs Compared
- NVIDIA Inception Program — Free Credits for AI Startups
- HuggingFace Inference API 2026 — free tier, Endpoints, Providers
- OpenRouter Free Tier 2026 — 28+ free models, limits, BYOK
- Free LLM API Credits — Every Route from $0 to $10K