Alibaba Unveils Qwen3: 5 Reasons Their ‘Hybrid’ AI Models Stand Out

Mirror Review

April 29th, 2025

Alibaba’s Qwen team has officially released Qwen3, the latest generation in their series of large language models, offering a comprehensive suite of both dense and Mixture-of-Experts (MoE) models. The release aims to push the boundaries of AI capabilities and marks a significant step in the evolution of Alibaba’s AI offerings.

Qwen3 Details and Performance Highlights

Qwen3 introduces several models with varying parameter sizes, catering to different needs and computational resources. The models range from 0.6 billion to a massive 235 billion parameters.

Qwen3 Models Released:

  • MoE Models (Open-weighted):
      • Qwen3-235B-A22B: 235 billion total parameters, 22 billion activated parameters.
      • Qwen3-30B-A3B: 30 billion total parameters, 3 billion activated parameters.
  • Dense Models (Open-weighted under the Apache 2.0 license):
      • Qwen3-32B
      • Qwen3-14B
      • Qwen3-8B
      • Qwen3-4B
      • Qwen3-1.7B
      • Qwen3-0.6B

Alibaba claims that Qwen3 models are competitive with, and in some cases outperform, top models from other leading AI labs such as Google and OpenAI. The flagship Qwen3-235B-A22B model shows strong results in benchmark evaluations for coding, math, and general capabilities when compared to models like DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro.

For instance, on the Codeforces competitive programming benchmark, Qwen3-235B-A22B slightly surpasses OpenAI’s o3-mini and Google’s Gemini 2.5 Pro. It also outperforms o3-mini on AIME, a challenging math competition benchmark, and on BFCL, a benchmark for function and tool calling.

While Qwen3-235B-A22B is not yet publicly available, the largest public model, Qwen3-32B, is competitive with models like DeepSeek’s R1 and outperforms OpenAI’s o1 on several tests, including the LiveCodeBench coding benchmark.

The Qwen team highlighted a key feature, stating, “We have seamlessly integrated thinking and non-thinking modes, offering users the flexibility to control the thinking budget,” and added, “This design enables users to configure task-specific budgets with greater ease”.

Tuhin Srivastava, co-founder and CEO of AI cloud host Baseten, commented on the significance of Qwen3’s release, noting, “The U.S. is doubling down on restricting sales of chips to China and purchases from China, but models like Qwen 3 that are state-of-the-art and open … will undoubtedly be used domestically,” and that it “reflects the reality that businesses are both building their own tools [as well as] buying off the shelf via closed-model companies like Anthropic and OpenAI”.

5 Things That Make Qwen3 Stand Out

Alright, let’s dive into what makes this new Qwen3 model family from Alibaba so interesting. Beyond the impressive performance numbers, there are some genuinely useful features that caught my eye, the kind of details that really matter once you think about how you’d actually use these models.

  1. Hybrid Thinking Modes:

One of the coolest things is what they call “Hybrid Thinking Modes”. Imagine you have a really complex problem that needs deep thought, but then you also have simple questions that need quick answers. Qwen3 can handle both. It has a “Thinking Mode” for step-by-step reasoning on tough tasks and a “Non-Thinking Mode” for fast responses to easy ones. This flexibility is great because it lets you control how much “thinking” the model does, depending on what you need, which can help balance performance and cost.
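To make that concrete, here is a minimal sketch of toggling the mode with Hugging Face transformers, using the enable_thinking chat-template flag the Qwen team documents in the model cards; the model size, prompt, and generation settings below are just illustrative:

```python
# Minimal sketch: toggling Qwen3's thinking mode via the chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"  # smallest dense model, easy to test locally
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many prime numbers are below 30?"}]

# enable_thinking=True makes the model reason step by step before answering;
# set it to False for fast, direct replies on simple queries.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```

The team also describes soft switches for multi-turn chats: appending /think or /no_think to a prompt flips the behavior turn by turn without touching the template flag.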

  2. Extensive Multilingual Support:

Another major plus is the extensive multilingual capability. Qwen3 understands and works with an incredible 119 languages and dialects. This is huge for making AI tools accessible and useful for people all over the world. The training data for Qwen3 was significantly expanded to nearly double that of its predecessor, Qwen2.5, now including approximately 36 trillion tokens covering all these languages.

  3. Improved Agentic and Tool-Calling Capabilities:

The Qwen3 models also come with improved agentic capabilities and better tool calling, which means they can interact with their environment more effectively and use external tools to complete tasks. The team recommends pairing the models with their Qwen-Agent framework, which handles much of the tool-calling plumbing for you.
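For a taste of what that looks like, here is a hedged sketch following the Qwen-Agent usage pattern; the model name, local server URL, and prompt are assumptions for illustration, and code_interpreter is one of Qwen-Agent’s built-in tools:

```python
# Hedged sketch: letting Qwen3 call a tool through Qwen-Agent
# (pip install qwen-agent). Endpoint and model name are assumptions.
from qwen_agent.agents import Assistant

llm_cfg = {
    "model": "Qwen3-30B-A3B",
    "model_server": "http://localhost:8000/v1",  # any OpenAI-compatible server
    "api_key": "EMPTY",
}

# The agent decides on its own when to invoke the code interpreter tool.
bot = Assistant(llm=llm_cfg, function_list=["code_interpreter"])

messages = [{"role": "user", "content": "Plot y = x**2 for x from -5 to 5."}]
responses = []
for responses in bot.run(messages=messages):  # streams intermediate steps
    pass
print(responses)  # final message list, including tool calls and results
```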

  4. Massive and Diverse Training Data:

As noted above, the training corpus for Qwen3 weighs in at approximately 36 trillion tokens, nearly twice the size of Qwen2.5’s. The team gathered data from various sources, including web content and even PDF-like documents, and used their previous models to generate synthetic data for math and coding tasks to boost those capabilities. This scale and diversity contribute significantly to the models’ performance.

  5. Accessibility and Open-weighting:

A significant portion of the Qwen3 model family, including both MoE and dense models, has been open-weighted under the Apache 2.0 license. This makes these powerful models accessible to researchers, developers, and organizations worldwide. They are available on popular platforms like Hugging Face, ModelScope, and Kaggle.

End Note

The release of Qwen3 is seen by the team as a key step towards achieving Artificial General Intelligence (AGI) and Artificial Superintelligence (ASI). They plan to continue improving the models by scaling up data, increasing model size, extending context length, broadening modalities, and advancing reinforcement learning.

It’s clear they believe the future is moving towards training AI agents that can have a more meaningful impact on our work and lives. Qwen3’s debut is certainly one to watch in the evolving landscape of AI.

Those interested in trying out Qwen3 can access it through the Qwen Chat web interface or the mobile app. For developers, recommended frameworks for deployment include SGLang and vLLM, while local usage is supported by tools like Ollama, LMStudio, MLX, llama.cpp, and KTransformers. These options make it straightforward to integrate Qwen3 into a wide range of projects and environments.
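As a quick illustration of the developer path: once a model is served with vLLM or SGLang, both expose an OpenAI-compatible API, so querying it takes only a few lines of Python. The port and model name below are assumptions; a command along the lines of vllm serve Qwen/Qwen3-8B would start such a server:

```python
# Sketch: querying a locally served Qwen3 model through the
# OpenAI-compatible API that vLLM and SGLang both expose.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="Qwen/Qwen3-8B",  # must match the name the server was started with
    messages=[
        {"role": "user", "content": "In one sentence, what is a Mixture-of-Experts model?"}
    ],
)
print(response.choices[0].message.content)
```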

Maria Isabel Rodrigues
