DeepSeek R1 Distill Qwen 32B

A 32-billion-parameter distillation of DeepSeek-R1 onto the Qwen 32B base, tuned for deliberate multi-step reasoning, coding agents, and retrieval workflows. Compared with Qwen 32B Instruct, it delivers higher step-by-step accuracy and more reliable multilingual output at a lower serving cost.
Model Info

Provider

DeepSeek

Model Type

LLM

Context Length

131K

Capability

Text-to-Text

Serverless

Available

Pricing

$0.5 / $0.9 per 1M input/output tokens
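At the listed serverless rates, per-request cost is simple arithmetic. A minimal sketch (the token counts in the example are illustrative, not benchmarks):

```python
# Estimate the cost of one request at the listed serverless rates:
# $0.50 per 1M input tokens, $0.90 per 1M output tokens.
INPUT_RATE = 0.50 / 1_000_000   # USD per input token
OUTPUT_RATE = 0.90 / 1_000_000  # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single chat completion request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 4,000-token prompt with a 1,000-token reasoning-heavy reply
# costs 4,000 * $0.0000005 + 1,000 * $0.0000009 = $0.0029.
print(f"${request_cost(4_000, 1_000):.4f}")
```

Note that reasoning models emit long chains of thought, so output tokens (billed at the higher rate) often dominate the total.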

GMI Cloud Features

Serverless

Access your chosen AI model instantly through GMI Cloud’s flexible pay-as-you-go serverless platform. Integrate easily using our Python SDK, REST interface, or any OpenAI-compatible client.
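Because the endpoint is OpenAI-compatible, any standard HTTP client works. A minimal stdlib-only sketch, assuming a `/v1/chat/completions` route; the base URL and model ID below are placeholders, so confirm the exact values in the GMI Cloud console:

```python
import json
import urllib.request

# Placeholder values -- check the GMI Cloud console for the exact endpoint and model ID.
API_URL = "https://api.gmi-serving.com/v1/chat/completions"
MODEL_ID = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

# Sending the request requires a valid API key and network access:
# with urllib.request.urlopen(build_request("Explain KV caching.", "YOUR_KEY")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request shape works through the official OpenAI Python SDK by pointing its `base_url` at the GMI Cloud endpoint.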

State-of-the-Art Model Serving

Experience unmatched inference speed and efficiency with GMI Cloud’s advanced serving architecture. Our platform dynamically scales resources in real time, maintaining peak performance under any workload while optimizing cost and capacity.

Dedicated Deployments

Run your chosen AI model on dedicated GPUs reserved exclusively for you. GMI Cloud’s infrastructure provides consistent performance, high availability, and flexible auto-scaling to match your workloads.
Try DeepSeek R1 Distill Qwen 32B now.

Ready to build?

Explore powerful AI models and launch your project in just a few clicks.
Get Started