Use the Cluster Engine as your unified control plane to orchestrate AI and ML workloads across popular frameworks like PyTorch and Hugging Face—powered by robust GPU cloud infrastructure with Kubernetes and Docker integration.
Automatically scale and manage containers and GPU workloads across your entire cluster, ensuring maximum performance and GPU utilization at runtime.
Seamlessly orchestrate complex tasks with Kubernetes, optimized for AI/ML, HPC, and cloud-native applications in a GPU cloud environment.
Run AI workloads with secure, high-performance GPU-optimized containers, or bring your own configurations into our scalable container management system.
Containers are automatically deployed with minimal setup, reducing manual engineering and packaging time in GPU cloud operations.
Monitor GPU usage and system performance in real time with custom alerts, ensuring stability across clustered GPU environments.
Track every container’s performance from start to finish, with full visibility into resource usage and job health.
Grant fine-grained permissions to teams working on AI projects using GPU cloud infrastructure, managing access with IAM policies.
Easily manage GPU and cluster access per team or project through role-based user groups—essential for scaling AI deployments securely.
Isolated VPCs for each customer to ensure secure, separate network and compute resources.
Dedicated private subnets and secure messaging for end-to-end data integrity and safety.
Ensure fast and secure access to your GPU cloud platform via private connections and dedicated virtual gateways.
Get quick answers to common queries in our FAQs.
The Cluster Engine is GMI Cloud's platform for on-demand compute power. At least three types of compute services are available: CE-CaaS (containers), CE-BMaaS (bare metal), and CE-Cluster (managed Kubernetes/Slurm). By leveraging Kubernetes and OpenStack orchestration software and deploying RDMA networks, the Cluster Engine is designed to automate a range of compute workloads with fine-grained control.
The CE-CaaS service offers prebuilt, GPU-optimized containers for rapid deployment of AI application workloads. It uses native Kubernetes to ensure seamless, secure, and automated orchestration of smaller compute workloads, with the option to bring your own custom image templates.
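For illustration only, the sketch below shows the kind of GPU container workload CE-CaaS orchestrates, expressed with the official Kubernetes Python client. The pod name, namespace, and image are hypothetical placeholders, not Cluster Engine APIs.

```python
# Minimal sketch: launching a GPU container on a Kubernetes cluster.
# Illustrative only -- the pod name, namespace, and image are hypothetical,
# not part of the Cluster Engine API.
from kubernetes import client, config

config.load_kube_config()  # reads your local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="pytorch-train", namespace="default"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="pytorch/pytorch:latest",  # hypothetical prebuilt image
                command=["python", "-c", "import torch; print(torch.cuda.is_available())"],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # request one GPU
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

Requesting the nvidia.com/gpu resource limit is the standard Kubernetes way to bind a container to a GPU on the node.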
The CE-BMaaS service offers prebuilt, GPU-optimized bare-metal servers for rapid deployment of a GPU cluster serving AI training or fine-tuning workloads. Leveraging the OpenStack platform, CE-BMaaS provisions bare-metal servers with customized OS images, user-defined configuration, and post-installation setup.
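As a generic sketch of this style of provisioning (not the CE-BMaaS API itself), the openstacksdk library can create a server from a custom image with cloud-init user data for post-installation setup. The cloud entry, image, flavor, and network names below are placeholders.

```python
# Minimal sketch: provisioning a server through OpenStack, the platform
# CE-BMaaS builds on. All names here are hypothetical placeholders.
import openstack

conn = openstack.connect(cloud="my-cloud")  # hypothetical clouds.yaml entry

# Cloud-init user data for post-installation setup (e.g., driver install).
user_data = """#cloud-config
runcmd:
  - echo "install GPU drivers here"
"""

server = conn.create_server(
    name="gpu-node-01",
    image="ubuntu-22.04-gpu",   # hypothetical customized OS image
    flavor="baremetal.h100",    # hypothetical bare-metal flavor
    network="private-net",      # hypothetical tenant network
    userdata=user_data,
    wait=True,                  # block until the server is ACTIVE
)
print(server.status)
```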
The CE uses an organization-based design to isolate tenants, incorporating organizational user management with fine-grained role-based access control (RBAC). For network isolation and access control, a virtual private cloud (VPC) mechanism isolates internal networks, with elastic IPs for public access, while firewall rules further secure the public network.
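As a generic illustration of such firewall rules (again, not a Cluster Engine API), an OpenStack security group permitting only inbound SSH from a trusted range might be defined as follows; the cloud name and CIDR are placeholders.

```python
# Minimal sketch: a firewall rule expressed as an OpenStack security group
# rule. Illustrative only; names and addresses are placeholders.
import openstack

conn = openstack.connect(cloud="my-cloud")  # hypothetical clouds.yaml entry

sg = conn.network.create_security_group(
    name="ssh-only", description="Allow inbound SSH, deny everything else"
)

conn.network.create_security_group_rule(
    security_group_id=sg.id,
    direction="ingress",
    ethertype="IPv4",
    protocol="tcp",
    port_range_min=22,
    port_range_max=22,
    remote_ip_prefix="203.0.113.0/24",  # placeholder trusted CIDR
)
```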
The CE provides real-time monitoring with customizable alerts to maintain visibility over resource usage and container health. It also includes a proprietary high-performance shared filesystem, accessible from both containers and bare-metal servers, that is an ideal solution for AI training and generative AI inference workloads.
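As a generic sketch of this kind of real-time monitoring (not the CE's own stack), the snippet below polls GPU utilization with NVIDIA's NVML Python bindings (pip install nvidia-ml-py) and raises a simple alert; the threshold and polling interval are arbitrary example values.

```python
# Minimal sketch of GPU utilization monitoring with a custom alert threshold,
# using NVIDIA's NVML Python bindings. Generic illustration only, not the
# Cluster Engine's monitoring stack.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU

UTIL_ALERT_THRESHOLD = 20  # percent; arbitrary example value

try:
    while True:
        util = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"GPU util: {util}%  memory used: {mem.used / 2**30:.1f} GiB")
        if util < UTIL_ALERT_THRESHOLD:
            print("ALERT: GPU underutilized")  # hook real alerting in here
        time.sleep(10)  # arbitrary polling interval
finally:
    pynvml.nvmlShutdown()
```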