Announcing Qwen 3, now on GMI Cloud

Qwen 3's 235B-A22B and 32B models are now available!

April 28, 2025


Today we’re excited to announce that Qwen 3 32B and Qwen 3 235B are now available on GMI Cloud’s US-based inference clusters, with global deployment support through our datacenters around the world.

Built by Alibaba’s Qwen team and open-sourced under the permissive Apache 2.0 license, Qwen 3 models represent a new leap forward in open LLM performance, flexibility, and multilingual accessibility. And now, for the first time, developers can deploy these models instantly on high-availability, low-latency infrastructure in the USA, backed by GMI Cloud’s purpose-built AI stack.

Why Qwen 3 Matters

The flagship Qwen 3 235B-A22B model boasts 235 billion total parameters (22B activated), and rivals the performance of models like Gemini 2.5 Pro and Grok-3 in STEM, coding, long-context tasks, and multilingual reasoning.

Meanwhile, the smaller Qwen 3 32B model offers elite performance at a lighter footprint and lower latency—ideal for production inference at scale.

Key innovations include:

  • Hybrid Thinking Modes — Switch between "thinking" (step-by-step reasoning) and "non-thinking" (rapid-response) modes dynamically, depending on task complexity and budget constraints.

  • Massive Context Windows — With up to 128K tokens, Qwen 3 models can handle longer documents, more detailed instructions, and sustained multi-turn conversations.

  • Multilingual Mastery — With support for 119 languages and dialects, Qwen 3 is among the most globally accessible models available today.

  • Agentic-Ready — Optimized for tool use, code execution, and compatibility with emerging agent standards like MCP (Model Context Protocol).
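To make the hybrid thinking switch concrete, here is a minimal Python sketch that builds an OpenAI-compatible request payload with the mode toggle. The `chat_template_kwargs` / `enable_thinking` field names follow the vLLM serving convention for Qwen 3; treat them as an assumption and check your serving framework's documentation for the exact parameter.

```python
def build_chat_request(prompt: str, thinking: bool) -> dict:
    """Build an OpenAI-compatible chat-completions payload for Qwen 3.

    The enable_thinking flag is passed through chat_template_kwargs,
    the vLLM-style convention for Qwen 3's mode switch; the exact field
    name may differ in other serving frameworks.
    """
    return {
        "model": "Qwen/Qwen3-32B",
        "messages": [{"role": "user", "content": prompt}],
        # True -> step-by-step reasoning; False -> rapid response
        "chat_template_kwargs": {"enable_thinking": thinking},
        # A larger token budget usually pairs with thinking mode
        "max_tokens": 4096 if thinking else 512,
    }

deep = build_chat_request("Prove that sqrt(2) is irrational.", thinking=True)
fast = build_chat_request("What is the capital of France?", thinking=False)
```

The same prompt can be routed either way at request time, so an application can reserve the expensive reasoning path for queries that actually need it.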

What This Unlocks for Developers

Qwen 3's hybrid thinking, massive context length, and multilingual fluency create new opportunities for AI developers that simply weren't practical before:

  • Dynamic cost-quality tradeoffs: Decide per request whether "thinking" is needed, balancing speed, depth, and cost against the task at hand.
  • International deployment: Build multilingual applications that seamlessly serve users in over 100 languages with native fluency, without needing external translation layers.
  • Long-form reasoning: Handle inputs like technical documents, legal contracts, or research papers in a single pass, maintaining nuanced understanding across 128K-token sequences.
  • Tool-augmented agents: Build agents that can reason, plan, and interact with APIs and services intelligently, natively supporting tool-calling workflows through MCP integrations.

Real-world use cases now within reach:

  • Launch a multilingual support agent that reasons through complex product manuals without needing separate translation pipelines.
  • Deploy a global customer service assistant that switches between fast-response mode and deep reasoning depending on user queries.
  • Build AI research copilots that analyze full research papers and technical documents in a single session, using full 128K-token context windows.
  • Create tool-augmented agents that dynamically interact with APIs, databases, and workflows, powered by native MCP support.
  • Develop adaptive agents that toggle between fast interaction and deep thinking modes depending on system load or user preference.

Amplifying what you can do with Qwen

  • Customize deployments using our Inference Engine—adjust latency, throughput, and scaling parameters easily to meet specific application needs.
  • Optimize resource usage with Cluster Engine—balance GPU allocation dynamically for maximum efficiency and predictable costs.
  • Deploy globally with our multi-region infrastructure—giving you the ability to serve users close to their geographic location and fully leverage Qwen 3's multilingual capabilities.
  • Scale flexibly by distributing workloads across multiple GPUs—perfect for high-volume, low-latency, or long-context AI applications.

Before Qwen 3, delivering scalable multilingual agents, reasoning engines, or cost-optimized AI applications meant stitching together multiple models or relying on proprietary platforms. Now it’s open source and production-ready on GMI Cloud.

Why GMI Cloud

GMI Cloud is purpose-built for the AI workloads of today and tomorrow:

  • Inference-Optimized Clusters — Tuned for high-throughput, low-latency large model serving.

  • Transparent Pricing — Simple, predictable billing without hidden fees.

  • Instant API Access — Launch OpenAI-compatible APIs through frameworks like vLLM and SGLang with minimal setup.

  • Enterprise-Grade Reliability — High availability, secure deployments, and scalable capacity as your needs grow.
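As a sketch of what "OpenAI-compatible" means in practice, the snippet below constructs (without sending) a standard chat-completions HTTP request using only the Python standard library. The base URL shown is a placeholder; substitute the endpoint and API key from your GMI Cloud console.

```python
import json
import urllib.request


def qwen_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Construct (but do not send) an OpenAI-compatible chat request."""
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = qwen_request(
    "https://inference.example-gmi-endpoint/v1",  # placeholder; use your console URL
    "YOUR_API_KEY",
    {
        "model": "Qwen/Qwen3-235B-A22B",
        "messages": [{"role": "user", "content": "Summarize this contract."}],
    },
)
# urllib.request.urlopen(req) would then send it and return the response.
```

Because the wire format is the standard chat-completions schema, existing OpenAI SDK clients can also be pointed at the endpoint by overriding their base URL.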

Whether you're running autonomous agents, building a multilingual co-pilot, or researching new AI behaviors, Qwen 3 is now just a few clicks away.

Get Started

Ready to build agents, copilots, or next-gen AI products?

Spin up Qwen 3 32B and 235B today on GMI Cloud’s Inference Engine—with flexible scaling, API simplicity, and no surprises.

Read Qwen's blog announcement.

Build faster, think deeper—with Qwen 3 on GMI Cloud.

Get started today

Give GMI Cloud a try and see for yourself if it's a good fit for your AI needs.

  • 14-day trial
  • No long-term commitments
  • No setup needed
On-demand GPUs

Starting at $4.39/GPU-hour

Private Cloud

As low as $2.50/GPU-hour