Where to Rent H200 GPUs: Cloud Price, Availability, and Access Compared

May 28, 2026

Finding H200 GPU availability on paper is easy. Finding it on terms that actually work for your project is a different problem. H200 listings exist across 31 cloud providers in 2026, but the gap between the cheapest and most expensive is nearly 10x per GPU-hour, and many listings come with minimum commitments, bundle requirements, or availability constraints that are not obvious from the headline rate. This piece covers which platforms offer genuine on-demand H200 access, what each costs per hour, and where the access requirements differ in ways that affect whether an option is actually usable.

Why the H200 and Why It Matters Now

The H200 is NVIDIA's Hopper-generation GPU with 141GB of HBM3e memory and 4,800 GB/s of memory bandwidth. Compared to the H100's 80GB, the additional VRAM changes the math on several practical workloads.

Running LLaMA 4 Maverick (400B) on H100s requires two full 8-GPU nodes. On an 8-GPU H200 node, it fits on one.
70B-class models run with substantial headroom on a single H200, compared to tight fits or multi-GPU requirements on H100.
Production inference for large models benefits from the bandwidth increase: 4,800 GB/s versus 3,350 GB/s on the H100.

For teams whose workloads have outgrown H100 capacity, H200 access has become a practical requirement rather than a hardware preference.
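The memory arithmetic behind these points can be sketched in a few lines. This is a rough estimate, not a sizing tool: the 80GB and 141GB figures come from the text, while the 1.2x overhead factor (for KV cache and activations), the bytes-per-parameter values, and the function names are illustrative assumptions.

```python
import math

H100_GB = 80    # per-GPU memory, from the text
H200_GB = 141   # per-GPU memory, from the text
GPUS_PER_NODE = 8

def min_gpus(params_billion, gpu_gb, bytes_per_param=2.0, overhead=1.2):
    """Smallest GPU count whose combined memory covers the weights.

    params_billion * bytes_per_param gives the weight footprint in GB.
    overhead is an assumed multiplier for KV cache and activations.
    """
    needed_gb = params_billion * bytes_per_param * overhead
    return math.ceil(needed_gb / gpu_gb)

# 400B parameters at 16-bit precision (2 bytes per parameter)
print(min_gpus(400, H100_GB) <= GPUS_PER_NODE)  # False: more than one H100 node
print(min_gpus(400, H200_GB) <= GPUS_PER_NODE)  # True: fits on one H200 node

# 70B parameters, assuming 8-bit weights (1 byte per parameter)
print(min_gpus(70, H200_GB, bytes_per_param=1.0))  # 1
```

Real deployments depend on context length, batch size, and the serving framework, so treat the output as a first filter rather than a guarantee.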

The H200 Market in 2026: Price Range and Availability

As of May 2026, H200 on-demand pricing across tracked providers runs from $1.45 to $13.78 per GPU per hour, with a market median around $3.95. The spread reflects a mix of provider type, deployment model, and access requirements.

Key structural differences that affect whether a listed price is actually accessible:

Bundle minimums: Hyperscalers and some enterprise neo-clouds (including CoreWeave) require 8-GPU H200 clusters. A team that needs one or two GPUs cannot access these listings regardless of budget.
Spot vs on-demand: Spot pricing can fall significantly below on-demand rates but carries preemption risk. For training runs with checkpointing, spot can reduce cost by 40-65%. For production inference, preemption is generally not acceptable.
Commitment requirements: Some providers offer lower rates only against 1-year or 3-year reserved contracts. On-demand access at those rates is not available.
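One way to reason about the spot trade-off is to price each useful GPU-hour rather than each billed one. The sketch below is a simplified model: the discount range comes from the text, while `rework_fraction` (the share of paid hours lost to recomputing work after a preemption) is a hypothetical parameter you would need to measure for your own jobs.

```python
def effective_spot_rate(on_demand_rate, discount, rework_fraction):
    """Cost per useful GPU-hour on spot capacity.

    discount: fraction off the on-demand rate (0.40 to 0.65 per the text).
    rework_fraction: assumed share of billed hours wasted on restarts and
        on redoing work since the last checkpoint; must be in [0, 1).
    """
    if not 0 <= rework_fraction < 1:
        raise ValueError("rework_fraction must be in [0, 1)")
    spot_rate = on_demand_rate * (1 - discount)
    return spot_rate / (1 - rework_fraction)

# Market-median on-demand rate from the text, 50% discount, 10% rework
print(round(effective_spot_rate(3.95, 0.50, 0.10), 2))  # 2.19
```

Under this model spot stays cheaper than on-demand as long as the rework fraction is smaller than the discount, which is why frequent checkpointing matters.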

H200 Provider Comparison

Provider	H200 Price (on-demand)	Min. Commitment	Single GPU Available	Notes
GMI Cloud	$2.60/hr	None	Yes	On-demand, no minimum, dedicated
Theta EdgeCloud	~$2.29/hr	Varies	Limited	Marketplace-tier, availability varies
RunPod	~$2.69/hr (SXM)	None	Yes	Spot tier available; preemption possible
Jarvislabs	$3.80/hr	None	Yes	Single GPU on-demand
Lambda Labs	$4.49/hr	None	Yes	On-demand, widely available
CoreWeave	$6.31/hr	8-GPU bundle	No (except GH200)	Enterprise SLA, cluster model
AWS	$6.88+/hr	None (on-demand)	8-GPU node minimum	Hyperscaler, highest overhead
Azure	up to $13.78/hr	None (on-demand)	Varies	Highest listed rate in market

Prices are based on publicly available on-demand rates as of May 2026. Spot and reserved pricing vary. Check individual provider pages for current rates before committing.
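To see how the access requirements narrow the table, the rows can be filtered for listings that offer a single GPU with no minimum commitment. This is a sketch over the figures above: the `Listing` structure is illustrative, and treating "Limited" or "Varies" as not meeting the requirement is a conservative assumption.

```python
from dataclasses import dataclass

@dataclass
class Listing:
    provider: str
    rate: float           # USD per GPU-hour, on-demand, from the table
    single_gpu: bool      # True only where the table says "Yes"
    has_minimum: bool     # True for a bundle requirement or "Varies"

LISTINGS = [
    Listing("GMI Cloud", 2.60, True, False),
    Listing("Theta EdgeCloud", 2.29, False, True),   # "Limited" / "Varies"
    Listing("RunPod", 2.69, True, False),
    Listing("Jarvislabs", 3.80, True, False),
    Listing("Lambda Labs", 4.49, True, False),
    Listing("CoreWeave", 6.31, False, True),         # 8-GPU bundle
    Listing("AWS", 6.88, False, False),              # 8-GPU node minimum
    Listing("Azure", 13.78, False, False),           # "Varies"
]

def usable(listings):
    """Single-GPU, no-minimum listings, cheapest first."""
    matches = [l for l in listings if l.single_gpu and not l.has_minimum]
    return sorted(matches, key=lambda l: l.rate)

for listing in usable(LISTINGS):
    print(f"{listing.provider}: ${listing.rate:.2f}/hr")
# GMI Cloud: $2.60/hr
# RunPod: $2.69/hr
# Jarvislabs: $3.80/hr
# Lambda Labs: $4.49/hr
```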

GMI Cloud H200: On-Demand Access at $2.60/hr

GMI Cloud offers H200 GPU access at $2.60 per GPU-hour on-demand, with no minimum commitment, no bundle requirement, and no reservation needed to start. This is among the lowest available rates for dedicated, named-provider H200 access in the current market.

What "on-demand with no minimum" means in practice: A team can spin up a single H200 for an afternoon of experimentation, scale to a multi-GPU cluster for a training run, and scale back down without carrying committed capacity between jobs. There is no contract to exit and no idle cost between jobs.

GMI Cloud is an NVIDIA Preferred Partner and Reference Architecture Provider, running H200 (alongside H100, GB200, and Blackwell) on owned data center hardware with 99.99% platform availability. The infrastructure is not a marketplace of community-contributed GPUs. Hardware and network are operated by GMI Cloud, which affects consistency and support access compared to spot marketplace models.

For teams comparing options:

At $2.60/hr versus Lambda Labs' $4.49/hr, the gap on a single H200 running 720 hours per month is approximately $1,360/month per GPU.
At $2.60/hr versus CoreWeave's $6.31/hr on an 8-GPU cluster, the difference is $3.71 per GPU-hour, or roughly $21,400 per month with all 8 GPUs running 720 hours.
Unlike CoreWeave or hyperscaler H200 listings, GMI Cloud supports single-GPU access, removing the 8-GPU minimum barrier for teams that do not need a full node.
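The monthly comparison is simple enough to reproduce for your own usage pattern. The helper below is a minimal sketch (the function name and defaults are illustrative); 720 hours corresponds to a GPU running continuously for a 30-day month.

```python
def monthly_gap(rate_a, rate_b, hours=720, gpus=1):
    """Monthly cost difference (USD) between two hourly per-GPU rates."""
    return (rate_b - rate_a) * hours * gpus

# Single H200 at $2.60/hr versus $4.49/hr, running 720 hours
print(round(monthly_gap(2.60, 4.49), 2))  # 1360.8
```

Setting `gpus` and `hours` to match actual utilization gives a more realistic figure than assuming round-the-clock use.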

Access starts through the GMI Cloud console at console.gmicloud.ai. No sales call is required for on-demand H200 access. For teams with predictable long-term GPU requirements, reserved capacity and committed pricing are available through GMI Cloud's enterprise tier.

What to Verify Before Committing

The headline per-hour rate is one input. These variables change the effective cost and usability:

Dedicated vs. shared hardware: Shared or virtualized instances affect performance consistency, especially for inference workloads with latency requirements. Confirm whether the H200 listing is dedicated or multi-tenant.
NVLink availability: For multi-GPU workloads requiring fast GPU-to-GPU communication (distributed training, large model inference), confirm whether the cluster uses NVLink or slower PCIe interconnect.
Egress fees: Hyperscalers charge $0.08-$0.12/GB for outbound data. Most neo-clouds include bandwidth in the instance rate. For workloads moving large model checkpoints or datasets, this difference adds up.
Spot preemption policy: If using spot pricing, verify the preemption notice window and whether the provider supports checkpointing tools that reduce re-start cost.
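Egress is the easiest of these variables to fold into a single number. The sketch below spreads outbound data charges across the GPU-hours used; the per-GB fee range comes from the text, while the 5,000 GB monthly transfer volume is a hypothetical workload chosen only for illustration.

```python
def effective_hourly(rate, gpu_hours, egress_gb=0.0, egress_per_gb=0.0):
    """Per-GPU-hour cost after spreading egress charges over usage."""
    if gpu_hours <= 0:
        raise ValueError("gpu_hours must be positive")
    return rate + (egress_gb * egress_per_gb) / gpu_hours

# $6.88/hr listing, 720 GPU-hours, 5,000 GB out at $0.09/GB (assumed volume)
print(round(effective_hourly(6.88, 720, egress_gb=5000, egress_per_gb=0.09), 3))  # 7.505

# Same usage where bandwidth is included in the instance rate
print(round(effective_hourly(2.60, 720), 3))  # 2.6
```

The fewer GPU-hours a job uses relative to the data it moves, the larger the egress share of the effective rate becomes.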

The Usable Rate, Not the Listed Rate

The nearly 10x price spread across H200 providers collapses considerably once access requirements are applied. A $13/hr hyperscaler option that requires an 8-GPU minimum and charges egress fees is not competing on the same terms as a $2.60/hr on-demand single-GPU option with no commitment required.

For most teams evaluating H200 rental in 2026, the relevant comparison is among providers that offer on-demand single-GPU access with no minimum commitment. On that basis, the current market for dedicated, reliable H200 access starts at $2.60/hr on GMI Cloud, with Lambda Labs and RunPod in the $2.69-$4.49 range depending on configuration and billing model.

Pricing details and GPU availability are at gmicloud.ai/en/pricing and console.gmicloud.ai.

Colin Mo
