← Back to Blog

Tencent Hy-MT2: Fast-Thinking Multilingual Translation Models

by CurateClick Team

Tencent Open-Sources Hy-MT2 Series: Three Models Redefine Translation

"Fast-thinking" multilingual translation models are here.


TL;DR

On May 21, 2026, Tencent Hunyuan officially open-sourced the Hy-MT2 family of multilingual translation models with three sizes:

  • Hy-MT2-1.8B — Lightweight, fits in a phone at 440MB quantized
  • Hy-MT2-7B — Mid-range, runs on a single GPU
  • Hy-MT2-30B-A3B — MoE architecture, 30B total params, 3B active

All three support 33 languages including Mandarin, Cantonese, English, French, Japanese, Korean, Arabic, Russian, Tibetan, and Uyghur.

GitHub: 213 stars. HuggingFace and ModelScope available now.


What is "Fast-Thinking" Translation?

Traditional LLM translation uses "slow thinking" — understand the full semantics first, then generate. Tencent introduced a "fast-thinking" paradigm here: react like a professional human translator, cutting unnecessary reasoning overhead.

Results: 7B and 30B-A3B in fast-thinking mode outperform DeepSeek-V4-Pro and Kimi K2.6 on translation tasks. And the lightweight 1.8B model beats mainstream commercial APIs like Microsoft and Doubao overall.

That's remarkable — a 1.8B on-device model outperforming commercial cloud services.


Choosing the Right Model

Hy-MT2-1.8B: The Mobile Powerhouse

  • Parameters: 1.8B
  • Quantized size: 440MB (1.25-bit extreme quantization)
  • Inference speed: 1.5x faster

Target: on-device deployment. Phones, tablets, embedded devices. Tencent's AngelSlim quantization compresses 1.8B to 440MB while actually speeding up inference.

Available on HuggingFace in FP8, GGUF, and even 2-bit / 1.25-bit extreme quantization variants.

Hy-MT2-7B: The Sweet Spot

  • Parameters: 7B
  • Recommended: Single A100 or RTX 4090
  • Quantization: FP8 / GGUF

7B is the most popular open-source model size. Tencent provides four inference solutions: transformers, vLLM, SGLang, and llama.cpp — covering research to production.

Ideal for server-side deployment where you need high quality without operating a massive model.

Hy-MT2-30B-A3B: MoE Brutality

  • Architecture: Mixture of Experts (MoE)
  • Total params: 30B
  • Active params: 3B per forward pass

MoE logic: 30B knowledge, 3B compute cost. Only 3B parameters activate per inference, but theoretically taps 30B's knowledge capacity.

Best for highest translation quality demands: legal, medical, or technical documentation.


Supported Languages (33)

Mandarin, Cantonese, English, French, Spanish, Japanese, Korean, Russian, Arabic, Thai, Vietnamese, Hindi, Traditional Chinese, Tibetan, Uyghur, and more.


Instruction-Following Capabilities

Hy-MT2 isn't just a "translator." It follows complex translation instructions:

  • Terminology consistency: Provide reference translations, model keeps terminology unified
  • Style control: Specify formal/casual/literary tone
  • Delimiter preservation: Special characters in code/templates stay intact
  • Structured data translation: JSON keys don't translate, only user-visible text
  • Personalized preferences: e.g., "translate with a Northern Chinese dialect feel"
  • Context integration: Provide background, model translates with context

These capabilities are evaluated via IFMTBench, which Tencent also open-sourced.


Deployment Options

  • Quick prototype / research: transformers (HuggingFace)
  • Production / high throughput: vLLM / SGLang
  • Local lightweight: llama.cpp (GGUF)
  • Mobile / on-device: AngelSlim 1.25-bit quantization

llama.cpp inference relies on Tencent's open-source STQ kernel (llama.cpp PR #22836) — requires building from source.


Open Source Ecosystem

Tencent also partnered with WMT26 to sponsor a video subtitle translation task.


Summary

Hy-MT2's core strengths:

  1. Three sizes covering phones to servers
  2. 33 languages + 5 dialects, true multilingual
  3. "Fast-thinking" paradigm for efficient inference
  4. Strong instruction following beyond plain MT
  5. Fully open source: quantization tools, inference scripts, training pipeline

If you're building multilingual products, translation tools, or need high-quality localization, Hy-MT2 is worth trying. A 1.8B model that fits in a phone is interesting enough on its own.