Pricing

Comprehensive solutions to architect, deploy, optimize, and scale your AI initiatives

Reserved GPUs

As low as
$2.50 / GPU-hour
Contact Sales
Model
Fixed, committed capacity
Use Case
Production workloads, training pipelines
Commitment
Multi-month / year
Benefits
Guaranteed scale, stable cost
Fixed, committed capacity for production workloads
Long-term commitment (multi-month / yearly)
Guaranteed scale with stable, predictable cost
GPU availability
NVIDIA H200
NVIDIA GB200
NVIDIA B200

On-demand GPUs

Starting at
$4.39 / GPU-hour
Contact Sales
Model
Pay-as-you-go capacity
Use Case
Fine-tuning, experimentation
Commitment
Hourly / monthly
Benefits
Burstable capacity, maximum adaptability
Pay-as-you-go for fine-tuning and experimentation
Short-term flexibility (hourly / monthly)
Burstable capacity with maximum adaptability
GPU availability
NVIDIA H200
NVIDIA GB200
NVIDIA B200
Supercharge your GPUs

Inference Engine

GMI Cloud’s inference platform for deploying and scaling LLMs with minimal latency and maximum efficiency
Contact Sales

Cluster Engine

A powerful orchestration layer for managing GPU workloads at scale
Contact Sales

Pricing

On-demand GPUs

Starting at

$4.39 / GPU-hour
Get started
Contact Sales

GPU Configuration

8 x NVIDIA H100

CPU Cores

2 x 48-core Intel CPUs

Memory

2TB

System Disk

2 x 960GB NVMe SSD

Data Disk

8 x 7.6TB NVMe SSD

GPU Compute Network

InfiniBand, 400 Gb/s per GPU

Ethernet Network

100 Gb/s

Additional features

Cluster Engine
Application Platform
Pay-as-you-go
Reserved Capacity
Volume-based Pricing

Private Cloud

As low as

$2.50 / GPU-hour
Get started
Contact Sales

GPU Configuration

8 x NVIDIA H100

CPU Cores

2 x 48-core Intel CPUs

Memory

2TB

System Disk

2 x 960GB NVMe SSD

Data Disk

8 x 7.6TB NVMe SSD

GPU Compute Network

InfiniBand, 400 Gb/s per GPU

Ethernet Network

100 Gb/s

Additional features

Cluster Engine
Application Platform
Pay-as-you-go
Reserved Capacity
Volume-based Pricing

Frequently asked questions

Get quick answers to common queries in our FAQs.

What types of GPUs do you offer?

We offer NVIDIA H100 GPUs with 80 GB of VRAM and high compute throughput for a wide range of AI and HPC workloads. See the pricing page for details.

How do you manage GPU clustering and networking for distributed training?

We use NVIDIA NVLink and InfiniBand networking to enable high-speed, low-latency GPU clustering, and we support frameworks like Horovod and NCCL for seamless distributed training. Learn more on the GPU instances page.
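At the heart of data-parallel distributed training is the all-reduce collective that NCCL (and Horovod on top of it) provides: each GPU contributes its locally computed gradients and every GPU receives the average. A minimal pure-Python sketch of that averaging semantics (illustrative only; real training would invoke NCCL through PyTorch or Horovod rather than this hypothetical helper):

```python
# Illustrative sketch of all-reduce (average) semantics, as used by NCCL
# in data-parallel training. Pure Python; not the NCCL API.

def all_reduce_mean(per_rank_grads):
    """Average gradient vectors across ranks, returning one copy per rank.

    per_rank_grads: list of equal-length gradient lists, one per rank.
    After the all-reduce, every rank holds the same averaged gradients,
    so all model replicas apply an identical update.
    """
    n_ranks = len(per_rank_grads)
    averaged = [sum(vals) / n_ranks for vals in zip(*per_rank_grads)]
    return [list(averaged) for _ in range(n_ranks)]

# Two ranks with local gradients computed on different data shards:
ranks = [[1.0, 2.0], [3.0, 4.0]]
print(all_reduce_mean(ranks))  # every rank ends up with [2.0, 3.0]
```

The high-bandwidth NVLink and InfiniBand fabric matters precisely because this exchange happens every training step, so its latency sits on the critical path.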

What software and deep learning frameworks do you support, and how customizable is the environment?

We support TensorFlow, PyTorch, Keras, Caffe, MXNet, and ONNX, with a highly customizable environment using pip and conda.

What is your GPU pricing, and do you offer cost optimization features?

Our pricing includes on-demand, reserved, and spot instances, with automatic scaling options to optimize cost and performance. See the pricing page for current rates.
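Using the rates quoted on this page ($4.39/GPU-hour on-demand, $2.50/GPU-hour reserved), a quick back-of-the-envelope helper shows how much a reserved commitment saves for a steady workload. This compares hourly rates only; commitment terms and any fees are outside the sketch:

```python
# Back-of-the-envelope GPU cost comparison using the rates on this page.
ON_DEMAND_RATE = 4.39  # $/GPU-hour, on-demand
RESERVED_RATE = 2.50   # $/GPU-hour, reserved ("as low as")

def monthly_cost(rate_per_gpu_hour, n_gpus, hours):
    """Total cost for n_gpus each running the given number of hours."""
    return rate_per_gpu_hour * n_gpus * hours

# An 8-GPU node running 24/7 for a 30-day month (720 hours):
on_demand = monthly_cost(ON_DEMAND_RATE, 8, 720)
reserved = monthly_cost(RESERVED_RATE, 8, 720)
print(f"on-demand: ${on_demand:,.2f}, reserved: ${reserved:,.2f}, "
      f"savings: ${on_demand - reserved:,.2f}")
```

For that always-on node the math favors reserved capacity (about $25,286 vs. $14,400 per month), while bursty or experimental workloads that run only a fraction of the month can come out ahead on pay-as-you-go.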