CoreWeave, Lambda, Nebius, and GMI: H200 GPU Provider Pricing 2026
May 28, 2026
GPU cloud pricing pages usually lead with a number. That number, the per-hour rate for a given GPU model, is real but incomplete. Two providers listing the same H200 at different hourly rates may produce nearly identical monthly bills once egress charges, storage, and commitment requirements are factored in. Or the cheaper provider may turn out to have access restrictions that make the rate unusable without a minimum cluster commitment. This piece compares how major H200 providers structure their pricing in 2026: who is cheapest per hour, who offers monthly discounts and what those require, and where fees outside the GPU rate tend to accumulate.
H200 Pricing Structures: Three Models, Different Total Costs
Before comparing rates, the billing model matters. H200 providers in 2026 generally fall into three pricing structures:
- On-demand: Pay per hour, no advance commitment, start and stop freely. Highest per-hour rate in most cases. Suitable for experimentation, variable workloads, and teams that cannot predict GPU requirements in advance.
- Reserved/Committed: Lock in capacity for one to three years in exchange for 15-40% lower rates. Requires predicting workload volume upfront. Hyperscalers favor this model.
- Spot/Preemptible: Lowest per-hour rate, typically 40-65% below on-demand. Subject to interruption when demand spikes. Practical for batch processing and fault-tolerant training with checkpoint support. Not suitable for production inference.
Most published pricing comparisons cite on-demand rates. Monthly discounts are usually available only through reserved contracts, and spot pricing carries operational constraints that need to be accounted for before treating the three models as equivalent options.
Per-Hour Rate Comparison Across Major Providers
H200 on-demand pricing across providers currently ranges from approximately $2.29 to $13.78 per GPU per hour. The spread reflects provider type, deployment model, and what is included in the base rate.
H100 and H200 On-Demand Rate Reference (May 2026)
| Provider | H100 (per GPU/hr) | H200 (per GPU/hr) | Min. Commitment | Single GPU Access |
|---|---|---|---|---|
| GMI Cloud | $2.00 | $2.60 | None | Yes |
| Lambda Labs | $2.49-$3.44 | $4.49 | None | Yes |
| RunPod | $1.99-$2.50 | $2.69-$3.59 | None | Yes (spot risk) |
| CoreWeave | $4.25-$6.16 | $6.31 | 8-GPU bundle | No single GPU |
| Spheron | $2.01-$2.50 | Varies | None | Yes |
| AWS | ~$6.88 | ~$8.00+ | None (on-demand) | 8-GPU node minimum |
| Azure | ~$12.29 | up to $13.78 | None (on-demand) | Varies |
Prices are on-demand rates based on publicly available data as of May 2026. Spot and reserved pricing vary. Verify current rates with each provider before procurement decisions.
The gap between GMI Cloud at $2.60/hr and CoreWeave at $6.31/hr on a single H200 GPU is about $2,671 per month at continuous operation (720 hours). On an 8-GPU cluster, that difference reaches roughly $21,370 per month. These are not rounding errors. They are structural differences in how each provider positions its offering in the market.
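The monthly gap can be reproduced from the H200 rates in the table. The following is a minimal sketch, assuming a 720-hour month (30 days of continuous operation); the function name and constant are illustrative and not part of any provider's tooling.

```python
HOURS_PER_MONTH = 720  # assumption: 30 days x 24 hours of continuous operation


def monthly_gap(rate_a, rate_b, gpus=1, hours=HOURS_PER_MONTH):
    """Return the monthly cost difference in dollars between two per-GPU hourly rates."""
    return round(abs(rate_a - rate_b) * gpus * hours, 2)


# H200 on-demand rates taken from the table: CoreWeave $6.31, GMI Cloud $2.60
single_gpu_gap = monthly_gap(6.31, 2.60)
cluster_gap = monthly_gap(6.31, 2.60, gpus=8)

print(f"1 GPU:  ${single_gpu_gap:,.2f} per month")
print(f"8 GPUs: ${cluster_gap:,.2f} per month")
```

Changing `hours` to reflect real utilization shrinks the gap proportionally, which is why the comparison matters most for workloads that run around the clock.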
Two observations from the table are worth noting. First, RunPod's listed rate is close to GMI Cloud's, but its spot configurations carry preemption risk. For production inference, this matters. Second, AWS and Azure provide H200 access at rates that carry hyperscaler overhead. Their pricing is not competitive for teams whose primary need is GPU compute.
Monthly Discount Structures: Who Offers Them and What They Require
Monthly or annual discounts exist at most major providers, but they are not equivalent across the board.
- Hyperscalers (AWS, GCP, Azure): Reserved instance pricing offers 30-60% discounts against on-demand rates for 1-year or 3-year commitments. The discount is meaningful, but the H200 on-demand rate is high enough that even the discounted reserved rate may not match specialized providers' on-demand pricing. Reserved instances also require predicting capacity requirements many months in advance.
- CoreWeave: Multi-year committed contracts offer 15-30% discounts. Designed for teams running large-scale training clusters with predictable long-term requirements. Minimum cluster size requirements apply.
- Lambda Labs and RunPod: Volume pricing available through direct contact for teams with high sustained GPU usage. No published standard reserved pricing structure.
- GMI Cloud: Committed pricing structures are available for enterprise teams with predictable long-term GPU needs, negotiated through the enterprise sales channel. The standard on-demand rate has no minimum commitment and no requirement to pre-purchase capacity.
For teams with stable, predictable GPU workloads over 12 or more months, reserved pricing at hyperscalers may close part of the gap with specialized providers. For teams whose volume is uncertain or variable, on-demand pricing from specialized providers is more cost-efficient because there is no idle capacity to carry.
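The reserved versus on-demand trade-off can be framed as a break-even utilization. The sketch below assumes reserved capacity is billed for every hour whether used or not, while on-demand is billed only for hours used. The $8.00 hyperscaler rate and $2.60 specialized rate come from the table above; the function names are hypothetical.

```python
def reserved_rate(on_demand_rate, discount):
    """Hourly rate after applying a reserved discount given as a fraction (0.30 = 30%)."""
    return on_demand_rate * (1.0 - discount)


def break_even_utilization(reserved, on_demand):
    """Fraction of hours that must be used before always-billed reserved capacity
    costs less than pay-per-use on-demand. A value above 1.0 means reserved
    never wins, even at full utilization."""
    if on_demand <= 0:
        raise ValueError("on_demand rate must be positive")
    return reserved / on_demand


# Hyperscaler H200 at ~$8.00/hr with the 30-60% reserved discount range,
# compared against a specialized provider's $2.60/hr on-demand rate.
for discount in (0.30, 0.60):
    reserved = reserved_rate(8.00, discount)
    ratio = break_even_utilization(reserved, 2.60)
    print(f"{discount:.0%} discount: ${reserved:.2f}/hr reserved, break-even at {ratio:.2f}")
```

Under these assumptions both ends of the discount range yield a break-even above 1.0, which is consistent with the point that hyperscaler reserved pricing closes only part of the gap.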
Hidden Fees That Don't Appear in the Headline Rate
The GPU hourly rate is the largest line item in most cloud bills, but not the only one. The fee categories below can add 20-40% to monthly hyperscaler GPU bills and are less common or absent at specialized providers.
Egress fees: Hyperscalers (AWS, GCP, Azure) charge $0.08-$0.12 per GB for data leaving their cloud. Moving large model weights, training datasets, or inference outputs out of a hyperscaler environment adds up rapidly. A team moving 10 TB of data per month incurs $800-$1,200 in egress charges alone.
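The egress figure is straightforward to check. This is a small sketch that assumes decimal terabytes (1 TB = 1,000 GB) and a flat per-GB price with no free tier or volume breaks, which real rate cards may include.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes


def egress_cost(tb_per_month, price_per_gb):
    """Monthly egress charge in dollars for a given outbound volume and flat per-GB price."""
    return tb_per_month * GB_PER_TB * price_per_gb


low = egress_cost(10, 0.08)   # lower end of the quoted $0.08-$0.12 range
high = egress_cost(10, 0.12)  # upper end of the range
print(f"10 TB/month: ${low:,.0f} to ${high:,.0f}")
```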
Storage: High-performance storage for model checkpoints, datasets, and inference caches carries separate per-GB pricing on hyperscaler platforms. For teams running large-scale training with frequent checkpoints, this is a meaningful additional cost.
Cross-zone and inter-region networking: Distributed training across multiple GPU nodes or regions incurs additional network charges on hyperscaler platforms. On dedicated specialized infrastructure, intra-cluster traffic is typically not billed separately.
Provisioning overhead: Hyperscalers abstract GPU access through virtualized layers that reduce effective compute output by 10-15% compared to dedicated bare-metal or near-bare-metal infrastructure. The published hourly rate buys less actual compute throughput than the same rate at a provider running near-bare-metal configurations.
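One way to fold virtualization overhead into a comparison is to convert the listed rate into a cost per hour of delivered compute. The sketch below assumes the overhead reduces throughput linearly, which is a simplification; the function name is illustrative.

```python
def effective_rate(hourly_rate, overhead):
    """Cost per hour of delivered compute when a fraction of throughput is lost to overhead.

    overhead is a fraction, e.g. 0.15 for a 15% loss.
    """
    if not 0.0 <= overhead < 1.0:
        raise ValueError("overhead must be in [0, 1)")
    return hourly_rate / (1.0 - overhead)


# A listed $8.00/hr rate under the quoted 10-15% overhead range
for overhead in (0.10, 0.15):
    print(f"{overhead:.0%} overhead: ${effective_rate(8.00, overhead):.2f} per effective GPU-hour")
```

Comparing effective rates rather than listed rates puts virtualized and near-bare-metal offerings on the same footing, provided the overhead estimate is measured on the actual workload.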
What GMI Cloud's Pricing Structure Covers
GMI Cloud prices H100 at $2.00 per GPU-hour and H200 at $2.60 per GPU-hour on-demand, with no minimum commitment, no reserved contract required, and no bundle minimum for access.
These are not introductory or promotional rates. They are the published on-demand rates, available to any team through the console without a sales engagement for standard configurations.
The pricing structure reflects how GMI Cloud positions its infrastructure. As an NVIDIA Preferred Partner and Reference Architecture Provider, GMI Cloud operates on owned data center hardware rather than reselling capacity through intermediaries. The Cluster Engine is built to recover the 10-15% virtualization overhead common on hyperscaler platforms, meaning each hour of GPU time produces more usable compute output relative to equivalent-rate competitors that run virtualized infrastructure.
On hidden fees, GMI Cloud does not charge standard egress fees and offers ingress fee negotiation for teams moving large datasets to the platform. For teams comparing total monthly cost rather than per-hour rate, this is a meaningful structural difference from hyperscaler pricing.
What is not included: GMI Cloud's published pricing covers GPU-hours. Enterprise SLAs, dedicated cluster configurations for large-scale distributed training, and committed capacity pricing require engagement with the enterprise sales team. For standard on-demand GPU access, the console at console.gmicloud.ai provides immediate access without a sales process.
Customer data validates the cost differential in production: Higgsfield reduced compute costs by 45% after moving workloads to GMI Cloud. LegalSign.ai found GMI Cloud 50% more cost-effective than their previous provider. These figures reflect total cost of operation, not just per-GPU-hour rates.
GMI Cloud pricing details are at gmicloud.ai/en/pricing.
Read the Full Rate Card, Not Just the GPU Line
The hourly GPU rate is the right starting point for comparing H200 providers, but it is not a complete picture. A provider at $4.50/hr with no egress fees and no bundle minimum may be materially cheaper in practice than a provider at $3.00/hr with 8-GPU minimums and $0.10/GB egress on everything that leaves the platform.
The comparison that produces an accurate cost estimate requires four inputs: per-GPU-hour rate, egress and storage charges, minimum access requirements, and whether spot or on-demand applies to the workload. Running that calculation across the providers in this article will narrow the field quickly.
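A rough version of that calculation fits in a few lines. The sketch below uses the two hypothetical providers from the example above ($4.50/hr with no minimum versus $3.00/hr with an 8-GPU minimum and $0.10/GB egress). It assumes a bundle minimum means paying for every GPU in the bundle even when fewer are needed, and that the rate passed in is whichever tier (spot or on-demand) fits the workload. The `Offer` class and its field names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    name: str
    rate_per_gpu_hour: float
    egress_per_gb: float = 0.0
    min_gpus: int = 1               # smallest number of GPUs that can be rented
    storage_per_month: float = 0.0  # flat monthly storage charge, if any


def monthly_cost(offer, gpus_needed, hours, egress_gb):
    """Estimate a monthly bill from rate, minimum access size, egress, and storage."""
    billed_gpus = max(gpus_needed, offer.min_gpus)  # assumption: the full bundle is billed
    compute = offer.rate_per_gpu_hour * billed_gpus * hours
    egress = offer.egress_per_gb * egress_gb
    return compute + egress + offer.storage_per_month


flat = Offer("no-minimum provider", rate_per_gpu_hour=4.50)
bundled = Offer("bundled provider", rate_per_gpu_hour=3.00, egress_per_gb=0.10, min_gpus=8)

# Workload: 2 GPUs running continuously for 720 hours, 10,000 GB of monthly egress
for offer in (flat, bundled):
    total = monthly_cost(offer, gpus_needed=2, hours=720, egress_gb=10_000)
    print(f"{offer.name}: ${total:,.2f}")
```

For this 2-GPU workload the higher hourly rate comes to $6,480 against $18,280 for the bundled offer. Raising `gpus_needed` to 8 reverses the ranking, which shows how strongly the result depends on the workload's shape.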
Colin Mo
