Tencent Open-Sources Hy-MT2 Series: Three Models Redefine Translation

"Fast-thinking" multilingual translation models are here.

TL;DR

On May 21, 2026, Tencent Hunyuan officially open-sourced the Hy-MT2 family of multilingual translation models with three sizes:

Hy-MT2-1.8B — Lightweight, fits in a phone at 440MB quantized
Hy-MT2-7B — Mid-range, runs on a single GPU
Hy-MT2-30B-A3B — MoE architecture, 30B total params, 3B active

All three support 33 languages including Mandarin, Cantonese, English, French, Japanese, Korean, Arabic, Russian, Tibetan, and Uyghur.

GitHub: 213 stars. HuggingFace and ModelScope available now.

What is "Fast-Thinking" Translation?

Traditional LLM translation uses "slow thinking" — understand the full semantics first, then generate. Tencent introduced a "fast-thinking" paradigm here: react like a professional human translator, cutting unnecessary reasoning overhead.

Results: 7B and 30B-A3B in fast-thinking mode outperform DeepSeek-V4-Pro and Kimi K2.6 on translation tasks. And the lightweight 1.8B model beats mainstream commercial APIs like Microsoft and Doubao overall.

That's remarkable — a 1.8B on-device model outperforming commercial cloud services.

Choosing the Right Model

Hy-MT2-1.8B: The Mobile Powerhouse

Parameters: 1.8B
Quantized size: 440MB (1.25-bit extreme quantization)
Inference speed: 1.5x faster

Target: on-device deployment. Phones, tablets, embedded devices. Tencent's AngelSlim quantization compresses 1.8B to 440MB while actually speeding up inference.

Available on HuggingFace in FP8, GGUF, and even 2-bit / 1.25-bit extreme quantization variants.

Hy-MT2-7B: The Sweet Spot

Parameters: 7B
Recommended: Single A100 or RTX 4090
Quantization: FP8 / GGUF

7B is the most popular open-source model size. Tencent provides four inference solutions: transformers, vLLM, SGLang, and llama.cpp — covering research to production.

Ideal for server-side deployment where you need high quality without operating a massive model.

Hy-MT2-30B-A3B: MoE Brutality

Architecture: Mixture of Experts (MoE)
Total params: 30B
Active params: 3B per forward pass

MoE logic: 30B knowledge, 3B compute cost. Only 3B parameters activate per inference, but theoretically taps 30B's knowledge capacity.

Best for highest translation quality demands: legal, medical, or technical documentation.

Supported Languages (33)

Mandarin, Cantonese, English, French, Spanish, Japanese, Korean, Russian, Arabic, Thai, Vietnamese, Hindi, Traditional Chinese, Tibetan, Uyghur, and more.

Instruction-Following Capabilities

Hy-MT2 isn't just a "translator." It follows complex translation instructions:

Terminology consistency: Provide reference translations, model keeps terminology unified
Style control: Specify formal/casual/literary tone
Delimiter preservation: Special characters in code/templates stay intact
Structured data translation: JSON keys don't translate, only user-visible text
Personalized preferences: e.g., "translate with a Northern Chinese dialect feel"
Context integration: Provide background, model translates with context

These capabilities are evaluated via IFMTBench, which Tencent also open-sourced.

Deployment Options

Quick prototype / research: transformers (HuggingFace)
Production / high throughput: vLLM / SGLang
Local lightweight: llama.cpp (GGUF)
Mobile / on-device: AngelSlim 1.25-bit quantization

llama.cpp inference relies on Tencent's open-source STQ kernel (llama.cpp PR #22836) — requires building from source.

Open Source Ecosystem

HuggingFace: https://huggingface.co/collections/tencent/hy-mt2
ModelScope: https://modelscope.cn/collections/Tencent-Hunyuan/Hy-MT2
GitHub: https://github.com/Tencent-Hunyuan/Hy-MT2
AngelSlim: https://github.com/tencent/AngelSlim

Tencent also partnered with WMT26 to sponsor a video subtitle translation task.

Summary

Hy-MT2's core strengths:

Three sizes covering phones to servers
33 languages + 5 dialects, true multilingual
"Fast-thinking" paradigm for efficient inference
Strong instruction following beyond plain MT
Fully open source: quantization tools, inference scripts, training pipeline

If you're building multilingual products, translation tools, or need high-quality localization, Hy-MT2 is worth trying. A 1.8B model that fits in a phone is interesting enough on its own.

Gemini CLI

Tencent Hy-MT2: Fast-Thinking Multilingual Translation Models