Now Available: Optimized DeepSeek-R1 On GMI Cloud

GMI Cloud is excited to announce that we are now hosting DeepSeek-R1 and its distilled models on a dedicated inference endpoint, running on optimized, US-based hardware.

What's DeepSeek-R1? Read our initial takeaways here.

Technical details:

  • Model Provider: DeepSeek
  • Type: Chat
  • Parameters: 685B
  • Deployment: Serverless (MaaS) or Dedicated Endpoint
  • Quantization: FP16
  • Context Length: 128K (the model can attend to up to 128,000 tokens within a single session)

Additionally, we are offering the following distilled models:

  • DeepSeek-R1-Distill-Llama-70B
  • DeepSeek-R1-Distill-Qwen-32B
  • DeepSeek-R1-Distill-Qwen-14B
  • DeepSeek-R1-Distill-Llama-8B
  • DeepSeek-R1-Distill-Qwen-7B
  • DeepSeek-R1-Distill-Qwen-1.5B
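As a rough sketch of what calling a hosted chat model like this looks like, the snippet below assembles a request in the widely used OpenAI-compatible chat-completions shape. The endpoint URL, model identifier, and API-key environment variable are illustrative assumptions, not confirmed specifics — consult GMI Cloud's documentation for the real values.

```python
import json
import os
import urllib.request

# Hypothetical values -- replace with the real endpoint and model id from GMI Cloud's docs.
API_URL = "https://api.gmi-cloud.example/v1/chat/completions"
MODEL = "deepseek-ai/DeepSeek-R1"

def build_chat_request(prompt: str, max_tokens: int = 512) -> urllib.request.Request:
    """Assemble an OpenAI-style chat-completions request (not yet sent)."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Key is read from the environment so it never lands in source control.
            "Authorization": f"Bearer {os.environ.get('GMI_API_KEY', '')}",
        },
        method="POST",
    )

req = build_chat_request("Summarize the DeepSeek-R1 distilled model family.")
# Sending it is then a one-liner: urllib.request.urlopen(req)
```

The same request shape works for the distilled variants — only the `model` string changes (e.g. a Llama-70B or Qwen-32B distill identifier).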

Try our token-free service with unlimited usage!

Reach out for access to our dedicated endpoint here.

Build AI Without Limits
GMI Cloud helps you architect, deploy, optimize, and scale your AI strategies
Get Started Now
