Use the Cluster Engine as your unified control plane to orchestrate AI and ML workloads across popular frameworks like PyTorch and Hugging Face—powered by robust GPU cloud infrastructure with Kubernetes and Docker integration.
Automatically scale and manage containers and GPU workloads across your entire cluster, ensuring maximum performance and GPU utilization at runtime.
Seamlessly orchestrate complex tasks with Kubernetes, optimized for AI/ML, HPC, and cloud-native applications in a GPU cloud environment.
Run AI workloads with secure, high-performance GPU-optimized containers, or bring your own configurations into our scalable container management system.
Containers are automatically deployed with minimal setup, reducing manual engineering and packaging time in GPU cloud operations.
Monitor GPU usage and system performance in real time with custom alerts, ensuring stability across clustered GPU environments.
Track every container’s performance from start to finish, with full visibility into resource usage and job health.
Grant fine-grained permissions to teams working on AI projects using GPU cloud infrastructure, managing access with IAM policies.
Easily manage GPU and cluster access per team or project through role-based user groups—essential for scaling AI deployments securely.
Isolated VPCs for each customer to ensure secure, separate network and compute resources.
Dedicated private subnets and secure messaging for end-to-end data integrity and safety.
Ensure fast and secure access to your GPU cloud platform via private connections and dedicated virtual gateways.
Get quick answers to common queries in our FAQs.
The Cluster Engine is GMI Cloud's platform for on-demand compute power. At least three types of compute services are available: CE-CaaS (containers), CE-BMaaS (bare metal), and CE-Cluster (managed Kubernetes/Slurm). By leveraging Kubernetes and OpenStack orchestration software and deploying RDMA networks, the Cluster Engine is designed to automate a range of compute workloads with fine-grained control.
The CE-CaaS service offers prebuilt, GPU-optimized containers for rapid deployment of AI application workloads. It uses native Kubernetes to ensure seamless, secure, and automated orchestration of smaller compute workloads, with the option to bring your own custom image templates.
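For illustration only, the sketch below shows the kind of GPU container workload CE-CaaS orchestrates, expressed with the official Kubernetes Python client. The pod name, namespace, and image are hypothetical placeholders, not Cluster Engine APIs.

```python
# Minimal sketch: launching a GPU container on a Kubernetes cluster.
# Illustrative only -- the pod name, namespace, and image are hypothetical,
# not part of the Cluster Engine API.
from kubernetes import client, config

config.load_kube_config()  # reads your local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="pytorch-train", namespace="default"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="pytorch/pytorch:latest",  # hypothetical prebuilt image
                command=["python", "-c", "import torch; print(torch.cuda.is_available())"],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # request one GPU
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

Requesting the nvidia.com/gpu resource limit is the standard Kubernetes way to bind a container to a GPU on the node.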
The CE-BMaaS service offers prebuilt, GPU-optimized bare-metal servers for rapid deployment of a GPU cluster serving AI training or fine-tuning workloads. Leveraging the OpenStack platform, CE-BMaaS provisions bare-metal servers with customized OS images, user-defined configuration, and post-installation setup.
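As a generic sketch of this style of provisioning (not the CE-BMaaS API itself), the openstacksdk library can create a server from a custom image with cloud-init user data for post-installation setup. The cloud entry, image, flavor, and network names below are placeholders.

```python
# Minimal sketch: provisioning a server through OpenStack, the platform
# CE-BMaaS builds on. All names here are hypothetical placeholders.
import openstack

conn = openstack.connect(cloud="my-cloud")  # hypothetical clouds.yaml entry

# Cloud-init user data for post-installation setup (e.g., driver install).
user_data = """#cloud-config
runcmd:
  - echo "install GPU drivers here"
"""

server = conn.create_server(
    name="gpu-node-01",
    image="ubuntu-22.04-gpu",   # hypothetical customized OS image
    flavor="baremetal.h100",    # hypothetical bare-metal flavor
    network="private-net",      # hypothetical tenant network
    userdata=user_data,
    wait=True,                  # block until the server is ACTIVE
)
print(server.status)
```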
The CE uses an organization-based design to isolate tenants, incorporating organizational user management with fine-grained role-based access control (RBAC). For network isolation and access control, a virtual private cloud (VPC) mechanism isolates internal networks, with elastic IPs for public access, while firewall rules further secure the public network.
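As a generic illustration of such firewall rules (again, not a Cluster Engine API), an OpenStack security group permitting only inbound SSH from a trusted range might be defined as follows; the cloud name and CIDR are placeholders.

```python
# Minimal sketch: a firewall rule expressed as an OpenStack security group
# rule. Illustrative only; names and addresses are placeholders.
import openstack

conn = openstack.connect(cloud="my-cloud")  # hypothetical clouds.yaml entry

sg = conn.network.create_security_group(
    name="ssh-only", description="Allow inbound SSH, deny everything else"
)

conn.network.create_security_group_rule(
    security_group_id=sg.id,
    direction="ingress",
    ethertype="IPv4",
    protocol="tcp",
    port_range_min=22,
    port_range_max=22,
    remote_ip_prefix="203.0.113.0/24",  # placeholder trusted CIDR
)
```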
The CE provides real-time monitoring with customizable alerts to maintain visibility over resource usage and container health. It also includes a proprietary high-performance shared filesystem, accessible from both containers and bare-metal servers, that is an ideal solution for AI training and generative AI inference workloads.
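As a generic sketch of this kind of real-time monitoring (not the CE's own stack), the snippet below polls GPU utilization with NVIDIA's NVML Python bindings (pip install nvidia-ml-py) and raises a simple alert; the threshold and polling interval are arbitrary example values.

```python
# Minimal sketch of GPU utilization monitoring with a custom alert threshold,
# using NVIDIA's NVML Python bindings. Generic illustration only, not the
# Cluster Engine's monitoring stack.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU

UTIL_ALERT_THRESHOLD = 20  # percent; arbitrary example value

try:
    while True:
        util = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"GPU util: {util}%  memory used: {mem.used / 2**30:.1f} GiB")
        if util < UTIL_ALERT_THRESHOLD:
            print("ALERT: GPU underutilized")  # hook real alerting in here
        time.sleep(10)  # arbitrary polling interval
finally:
    pynvml.nvmlShutdown()
```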