RunPod vs Lambda Labs vs Vast.ai: GPU Rental Compare 2026

Last updated: May 2026

Decision rule in one paragraph. Pick RunPod if you want the widest GPU catalogue, per-second billing and a one-click web UI with Pods and Serverless modes. Pick Lambda Labs if you need multi-GPU SXM clusters (8×H100 / 8×B200) or care about no-egress-fees plus a predictable per-minute flat rate. Pick Vast.ai if you accept interruptions in exchange for the cheapest verified rates on the market - and your workloads can checkpoint and resume.
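As a rough illustration, the decision rule above can be written as a small function. This is only a sketch: the parameter names and the order in which the conditions are checked are assumptions made for the example, not something any provider publishes.

```python
def pick_provider(needs_multi_gpu_cluster: bool,
                  accepts_interruptions: bool,
                  can_checkpoint_and_resume: bool) -> str:
    """Encode the one-paragraph decision rule (illustrative priority order)."""
    # Multi-GPU SXM clusters are the one requirement only Lambda covers self-serve.
    if needs_multi_gpu_cluster:
        return "Lambda Labs"
    # The cheapest rates only pay off if the job survives being interrupted.
    if accepts_interruptions and can_checkpoint_and_resume:
        return "Vast.ai"
    # Otherwise fall back to the broad-catalogue, per-second-billing default.
    return "RunPod"


print(pick_provider(False, True, True))    # Vast.ai
print(pick_provider(True, False, False))   # Lambda Labs
print(pick_provider(False, True, False))   # RunPod
```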

This article compares the three providers most indie AI developers shortlist in 2026, with verified per-hour pricing across H100 and A100 (snapshot 2026-05-20), the comparison axes that actually matter, and an honest "where none of them are right" section. For the GPU-side decision (H100 vs A100 itself), see H100 vs A100 in 2026.

The three in one sentence each

RunPod - broad GPU catalogue (18+ classes from RTX 5090 to B200), per-second billing, web UI plus CLI plus API, Community Cloud tier (no SLA) and Secure Cloud tier (SLA), plus a serverless mode for bursty inference.
Lambda Labs - focused AI cloud, narrow catalogue centred on H100 / H200 / B200 SXM and A100, per-minute billing with no egress fees, signature 1-Click Clusters from 16 to 2,000+ GPUs.
Vast.ai - distributed peer-to-peer marketplace, 68+ GPU classes, market-set per-second prices from individual hosts, roughly half the price of fixed-price providers on average, but the cheap tier is interruptible.

Side-by-side comparison

Axis	RunPod	Lambda Labs	Vast.ai
Billing granularity	per-second	per-minute	per-second
H100 SXM 80GB on-demand	$3.29/hr	$4.29/hr (1×) · $3.99/hr (8×, per GPU)	$2.13/hr (30-day median, range $1.33-$6.71)
H100 PCIe 80GB	$2.89/hr	$3.29/hr	not split out - H100 SXM/NVL only
A100 SXM 80GB	$1.49/hr	$2.79/hr (8×, per GPU)	$1.00/hr (median, $0.27-$2.67)
A100 PCIe 80GB	$1.39/hr	not listed at 80GB	$0.67/hr (median, $0.11-$1.53)
H200 / B200	H200 $4.39/hr, B200 $5.89/hr	H100/B200 cluster (contact)	H200 $3.95/hr median, B200 $3.97/hr median
GPU catalogue breadth	18+ classes (H100 SXM/PCIe/NVL, A100, H200, B200, L40S, RTX Pro 6000, RTX 5090, RTX 4090, L4, RTX 6000 Ada, A40, A6000, A5000…)	narrow - H100/A100/H200/B200 SXM + 1× variants	68+ classes - full spectrum RTX 3060 → B300 SXM
Multi-GPU clusters	Instant Clusters (up to 8 GPUs); Reserved Clusters scale to 10,000+	1-Click Clusters 16-2,000+ HGX H100/B200 SXM	single-host instances - no native multi-host cluster
SLA / availability	Secure Cloud SLA; Community Cloud no SLA	on-demand SLA on standard tier	market - on-demand more stable, interruptible can be reclaimed
Egress fees	typically none	no egress fees	varies per host
Scale-to-zero	Serverless tier yes; Pods no	no (always on)	per-second billing acts similarly
Persistent storage	Network Storage $0.05-0.20/GB/mo (tiered)	filesystems available	varies per host
Web UI / CLI / API	full	full	full (more technical)
Best for	broad workload mix; fast spin-up; serverless inference	multi-GPU SXM training; ML labs	budget batch; fault-tolerant inference; experiments

Verified pricing - 2026-05-20

The numbers above were pulled from each provider's live pricing page on 2026-05-20.

Provider	GPU	$/hour	Source
RunPod (Pods)	H100 SXM 80GB	$3.29	runpod.io/pricing
RunPod (Pods)	H100 PCIe 80GB	$2.89	same
RunPod (Pods)	H100 NVL 94GB	$3.19	same
RunPod (Pods)	A100 SXM 80GB	$1.49	same
RunPod (Pods)	A100 PCIe 80GB	$1.39	same
RunPod (Pods)	H200 141GB	$4.39	same
RunPod (Pods)	B200 192GB	$5.89	same
RunPod (Pods)	RTX 4090 24GB	$0.69	same
RunPod (Pods)	RTX 5090 32GB	$0.99	same
RunPod (Pods)	L40S 48GB	$0.86	same
Lambda Labs	H100 SXM 80GB (1×)	$4.29	lambda.ai/service/gpu-cloud
Lambda Labs	H100 SXM 80GB (8×, /GPU)	$3.99	same
Lambda Labs	H100 PCIe 80GB	$3.29	same
Lambda Labs	A100 SXM 80GB (8×, /GPU)	$2.79	same
Lambda Labs	A100 SXM 40GB	$1.99	same
Vast.ai (market median, 30d)	H100 SXM 80GB	$2.13 ($1.33-$6.71 observed)	vast.ai/pricing
Vast.ai	H100 NVL 94GB	$1.69 ($1.51-$3.47)	same
Vast.ai	A100 SXM4 80GB	$1.00 ($0.27-$2.67)	same
Vast.ai	A100 PCIe 80GB	$0.67 ($0.11-$1.53)	same
Vast.ai	RTX 4090 24GB	$0.37 ($0.13-$4.00)	same
Vast.ai	RTX 5090 32GB	$0.53	same
Vast.ai	L40S 48GB	$0.53	same

Pricing changes constantly. Always re-verify before committing a long-running rental.

When RunPod wins

Mixed-workload teams. You're running 7B inference today, training a 13B LoRA tomorrow, prototyping with a 4090 next week. RunPod's catalogue has all of them; no need to juggle accounts.

Per-second billing on short jobs. A 15-minute experimental run costs 1/4 of an hour, not a full hour. For interactive research that bills in minutes rather than days, this matters.
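To make the billing-granularity point concrete, here is a minimal sketch that rounds a job's runtime up to the billing unit. The rate is RunPod's H100 PCIe price from the table above; the hourly-rounded case is a hypothetical baseline for comparison, and the function name is made up for this example.

```python
import math


def billed_cost(rate_per_hour: float, runtime_seconds: int,
                granularity_seconds: int) -> float:
    """Cost of a job when runtime is rounded up to the billing granularity."""
    billed_units = math.ceil(runtime_seconds / granularity_seconds)
    return billed_units * granularity_seconds * rate_per_hour / 3600


rate = 2.89            # $/hr, RunPod H100 PCIe (2026-05-20 snapshot)
runtime = 15 * 60      # a 15-minute experimental run, in seconds

per_second = billed_cost(rate, runtime, 1)      # per-second billing
per_minute = billed_cost(rate, runtime, 60)     # per-minute billing
per_hour = billed_cost(rate, runtime, 3600)     # hypothetical hourly rounding

print(f"per-second: ${per_second:.2f}")   # about $0.72
print(f"per-minute: ${per_minute:.2f}")   # about $0.72
print(f"per-hour:   ${per_hour:.2f}")     # $2.89
```

For a run that ends exactly on a minute boundary, per-second and per-minute billing cost the same; the gap only appears on jobs that stop mid-minute.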

Serverless mode for bursty inference. If you're shipping an inference endpoint that gets traffic in spikes, the Serverless tier scales to zero when idle and bills per-second on traffic - for sub-1000-req/day workloads, often the cheapest path of any provider.

Community Cloud for experimentation. Same prices as Secure Cloud, no SLA, no minimum - fine for jobs you can restart if a host evicts you.

When Lambda Labs wins

Multi-GPU SXM clusters. Lambda's 1-Click Clusters (16 to 2,000+ HGX H100 or B200 SXM GPUs with NVLink + NVSwitch) are the indie-accessible way to rent training-scale compute. RunPod and Vast can't match this on a self-serve basis - RunPod's Instant Clusters cap at 8 GPUs on one host (its larger Reserved Clusters require a sales contract), and Vast is single-host. If your job needs 16+ H100s with high-bandwidth interconnect, Lambda is the answer.

No egress fees + predictable billing. Per-minute billing with no surprise egress cost makes Lambda the easier provider to forecast in a budget meeting. Hyperscalers (AWS, GCP) bury egress costs that can add 20-30% on top of compute.

The boutique service tier. Account managers and direct support exist for serious projects - useful when you need 8× H100s on a deadline and a hosting issue could cost you days.

When Vast.ai wins

Lowest verified per-hour rate on the market. A100 PCIe at $0.67/hr median (vs RunPod $1.39 and Lambda effectively unavailable at this tier). H100 SXM at $2.13/hr median (vs RunPod $3.29 and Lambda $4.29). For fault-tolerant workloads, the savings are 35-60%.
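The savings range can be checked directly from the quoted rates. The snippet below is a simple worked calculation using only the prices in this article; the helper name is illustrative.

```python
def savings_pct(cheap_rate: float, baseline_rate: float) -> float:
    """Percentage saved by paying cheap_rate instead of baseline_rate."""
    return (1 - cheap_rate / baseline_rate) * 100


# (Vast.ai median, fixed-price competitor), all in $/hr from the pricing table
comparisons = {
    "A100 PCIe: Vast vs RunPod": (0.67, 1.39),
    "H100 SXM: Vast vs RunPod": (2.13, 3.29),
    "H100 SXM: Vast vs Lambda (1x)": (2.13, 4.29),
}

for label, (vast, other) in comparisons.items():
    print(f"{label}: {savings_pct(vast, other):.0f}% cheaper")
```

On these three pairs the median Vast.ai price comes out roughly 35% to 52% lower. Individual hosts at the low end of the observed range push the figure higher.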

Fault-tolerant batch workloads. Pretraining a small model from scratch? Generating 100K synthetic samples? Running an evaluation suite overnight? These can all checkpoint and resume; the interruptible-tier savings on Vast compound across long runs.
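What "checkpoint and resume" means in practice can be sketched with nothing but the Python standard library. This is a generic pattern under stated assumptions, not a Vast.ai API: the file layout, function names, and the `stop_after` parameter (used only to simulate an eviction) are all hypothetical, and a real training job would save model and optimizer state instead of a list of squares.

```python
import json
import os
import tempfile
from typing import Optional


def load_state(path: str) -> dict:
    """Resume from the last checkpoint if one exists, else start fresh."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_item": 0, "results": []}


def save_state(path: str, state: dict) -> None:
    """Write to a temp file, then rename, so a kill mid-write cannot
    leave a half-written checkpoint behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def run_batch(path: str, total_items: int, checkpoint_every: int = 10,
              stop_after: Optional[int] = None) -> dict:
    state = load_state(path)
    done_this_run = 0
    while state["next_item"] < total_items:
        if stop_after is not None and done_this_run >= stop_after:
            return state  # simulated eviction: unsaved progress is lost
        i = state["next_item"]
        state["results"].append(i * i)  # stand-in for the real work
        state["next_item"] = i + 1
        done_this_run += 1
        if state["next_item"] % checkpoint_every == 0:
            save_state(path, state)
    save_state(path, state)
    return state


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "checkpoint.json")
    run_batch(ckpt, total_items=50, stop_after=25)   # interrupted after 25 items
    print(load_state(ckpt)["next_item"])             # 20: last saved checkpoint
    final = run_batch(ckpt, total_items=50)          # resumes from item 20
    print(final["next_item"])                        # 50
```

The checkpoint interval is the trade-off to tune: an interruption costs at most `checkpoint_every` items of repeated work, while more frequent saves cost more I/O.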

RTX consumer GPUs at indie prices. RTX 4090 at $0.37/hr median ($0.13 at the cheap end). RTX 5090 at $0.53/hr median. RTX 3090 at $0.15/hr. For workloads that fit in 24GB VRAM, this is the structurally cheapest GPU access anywhere - by a wide margin.

Hyper-budget experimentation. If your monthly GPU budget is under $100, Vast is essentially the only option that gets you real GPU hours.

Where none of them are right

Production with SLA at the lowest price. Hyperstack on-demand beats all three on the SLA tier: H100 SXM $2.40/hr, A100 SXM $1.60/hr. Not in this comparison because it's a different player class - see H100 vs A100 in 2026 for the broader provider table.
Regulated workloads (HIPAA, FedRAMP, SOC 2 Type II, depending on enterprise needs). AWS p5 or GCP A3 are usually mandated, despite costing 2-3× more per hour.
Sustained 24/7 production inference at 100K+ req/day on one model. At that volume, per-token pricing on hosted-inference platforms (Groq, Together AI, Fireworks, HuggingFace Inference Providers) usually beats renting any of these. See also OpenRouter Free Tier for the gateway-based alternative.
Workloads needing >80GB VRAM on a single GPU. A single 80GB card is not enough: use an H200 (141GB) on RunPod or Vast, or rent a multi-GPU cluster from Lambda - see the workload picks below.

Workload picks (quick - full math in the H100 vs A100 article)

Inference of Llama 3.3 70B Q4. Cheapest fixed-rate: RunPod A100 SXM at $1.49/hr. Cheapest if you can tolerate interruptions: Vast.ai A100 SXM at $1.00/hr median.
LoRA fine-tune of 13B model. RunPod H100 PCIe at $2.89/hr for a stable run; Vast.ai H100 SXM at $2.13/hr median if you can checkpoint.
Multi-GPU pretraining (8× H100 or bigger). Lambda 1-Click Clusters is the only practical indie option - $3.99/GPU/hr × 8 = $31.92/hr for an 8×H100 SXM cluster with NVLink/NVSwitch. RunPod and Vast aren't built for this.
Hobbyist experiments under $30/month. Vast.ai RTX 4090 at $0.37/hr median × ~80 hr = $29.60. Nobody else gets you this kind of GPU access at this price.
24/7 production endpoint, SLA required, smallest bill. None of these three wins on the budget axis. Hyperstack is the cheapest SLA-tier on-demand option - see the H100 vs A100 article for that comparison.
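The arithmetic behind the cluster and hobbyist picks above is easy to reproduce and to re-run when prices move. A small sketch using the rates quoted in this article; the function names are illustrative.

```python
def cluster_cost_per_hour(rate_per_gpu_hour: float, gpu_count: int) -> float:
    """Hourly cost of a cluster billed per GPU."""
    return rate_per_gpu_hour * gpu_count


def hours_within_budget(monthly_budget: float, rate_per_hour: float) -> float:
    """GPU hours a monthly budget buys at a given hourly rate."""
    return monthly_budget / rate_per_hour


# Lambda 8x H100 SXM at $3.99 per GPU per hour
print(f"${cluster_cost_per_hour(3.99, 8):.2f}/hr")        # $31.92/hr

# Vast.ai RTX 4090 at the $0.37/hr median
print(f"${0.37 * 80:.2f} for 80 hours")                   # $29.60 for 80 hours
print(f"{hours_within_budget(30, 0.37):.0f} hours on a $30 budget")
```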

Common gotchas per provider

RunPod Community Cloud. Same prices as Secure Cloud, but no SLA, and hosts can evict instances. Don't run production endpoints here; use Secure Cloud when uptime matters.

RunPod Network Storage tiering. Idle volume disk is $0.20/GB/mo (vs running at $0.10) - if you're parking a big checkpoint, this adds up. Use Network Storage Standard at $0.05-0.07/GB/mo for long-term storage instead.
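To see how this adds up, the sketch below prices a parked checkpoint at the per-GB rates quoted above. The 500 GB size is a hypothetical example, not a figure from any provider.

```python
def monthly_storage_cost(size_gb: float, price_per_gb_month: float) -> float:
    """Monthly cost of storing size_gb at a flat per-GB rate."""
    return size_gb * price_per_gb_month


checkpoint_gb = 500  # hypothetical size of a parked checkpoint

idle_volume = monthly_storage_cost(checkpoint_gb, 0.20)    # idle volume disk
network_low = monthly_storage_cost(checkpoint_gb, 0.05)    # Network Storage, low end
network_high = monthly_storage_cost(checkpoint_gb, 0.07)   # Network Storage, high end

print(f"idle volume disk: ${idle_volume:.2f}/mo")
print(f"network storage:  ${network_low:.2f}-${network_high:.2f}/mo")
```

At these rates the idle volume costs $100 per month against $25 to $35 on Network Storage, so moving a parked checkpoint cuts the storage bill by roughly two thirds or more.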

Lambda waitlist for 1-Click Clusters. Capacity is allocated; for 8× SXM clusters there can be a wait queue. Sign up and reserve before you actually need the cluster.

Vast host reliability varies. Always check the host's reliability score and reviews before a long-running job. Use the on-demand tier (not interruptible) for non-batch work, and pick hosts with consistent uptime track records.

Vast network egress varies. Unlike RunPod and Lambda, Vast hosts can set their own data transfer policies. If your workload moves big checkpoints or datasets in and out, factor this into provider choice.

FAQ

Which is fastest to spin up? RunPod and Vast spin up instances in under 60 seconds with per-second billing. Lambda's on-demand single-GPU spin-up is similar; its 1-Click Clusters can take longer depending on queue and configuration.

Can I move workloads between providers? Yes - all three support standard Docker images and run the same OS/CUDA stacks. The friction is in dataset and checkpoint movement (egress costs at AWS/GCP; minimal on RunPod and Lambda; varies per host on Vast). If you're choosing for long-term use, anchor on the provider with no egress fees (Lambda) or set up cheap object storage that all three can read from.

What's the cheapest way to get an H100 in 2026? If you accept interruptions: Vast.ai H100 SXM at ~$2.13/hr 30-day median (observed range $1.33-$6.71). If you need guaranteed uptime: Hyperstack H100 PCIe at $1.90/hr (covered in the H100 vs A100 article, not in this 3-way). Among the three providers here: RunPod H100 PCIe at $2.89/hr is the cheapest fixed-price option.

Do any of these offer free credits? RunPod occasionally runs promo credits; Lambda Labs offers academic-research credits via partnerships; Vast.ai doesn't run a free-credits program. For broader free-GPU paths see Free GPU Compute and NVIDIA Inception Program.

Which is best for fine-tuning a Llama 70B? For LoRA fine-tune at 70B (single GPU 80GB at low rank), all three work - Vast is cheapest, RunPod most flexible. For full fine-tune at 70B (multi-GPU needed), Lambda 1-Click 8×H100 SXM is the practical answer.

Which can scale to 100+ GPUs? Lambda 1-Click Clusters scale to 2,000+ HGX H100/B200 SXM. RunPod Reserved Clusters can scale to 10,000+ via sales contracts. Vast is single-host only - multi-host parallelism is not natively supported.

Bottom line

For most indie AI developers and small teams in 2026:

Default choice → RunPod (broad catalogue, per-second billing, serverless option, both no-SLA and SLA tiers).
Cost-driven choice → Vast.ai (cheapest by ~35-60% if you can handle interruptions; the only realistic option for sub-$100/month GPU budgets).
Multi-GPU training choice → Lambda Labs (the only one of these three with self-serve 16-2,000+ GPU clusters; RunPod's Reserved Clusters require a sales contract).

For production-with-SLA at the lowest price, look outside this 3-way - Hyperstack wins that tier (~$2.40/hr H100 SXM on-demand SLA). See the H100 vs A100 comparison for the broader provider table.

Pricing across all three providers will keep moving as B200 supply ramps and H100 prices drift toward A100. Re-check this article every quarter.