How to use knoveleng/polyglot-lion-1.7b-v1.5-mlx-bf16 with MLX:
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir polyglot-lion-1.7b-v1.5-mlx-bf16 knoveleng/polyglot-lion-1.7b-v1.5-mlx-bf16
CHANGE LOG: This version was retrained on the same dataset without punctuation removal to improve the model’s ability to recognize pauses and sentence boundaries in speech.
About
Polyglot-Lion-1.7B was developed by Quy-Anh Dang and Chris Ngo at Knovel Engineering and presented in the report "Polyglot-Lion: Efficient Multilingual ASR for Singapore via Balanced Fine-Tuning of Qwen3-ASR".
The model is obtained by fine-tuning Qwen3-ASR-1.7B exclusively on publicly available speech corpora covering Singapore's four official languages. It utilizes a balanced sampling strategy that equalizes the number of training utterances per language and deliberately omits language-tag conditioning, allowing the model to learn to identify languages implicitly from audio.
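The balanced sampling idea can be illustrated with a short Python sketch. It is not the authors' training code: the function name `balanced_sample`, the `(language, item)` input layout, and the choice to downsample to the smallest language pool (or oversample with replacement when a larger target is requested) are all assumptions made for illustration. The language label is used only to balance the pools and would not be passed to the model, in line with the omission of language-tag conditioning.

```python
import random
from collections import defaultdict


def balanced_sample(utterances, per_language=None, seed=0):
    """Return a shuffled list with the same number of utterances per language.

    utterances: iterable of (language, item) pairs (hypothetical layout).
    per_language: target count per language; defaults to the size of the
        smallest language pool (an assumption, not the paper's stated rule).
    """
    rng = random.Random(seed)

    # Group items by language so each pool can be sampled independently.
    pools = defaultdict(list)
    for language, item in utterances:
        pools[language].append(item)
    if not pools:
        return []

    target = per_language
    if target is None:
        target = min(len(pool) for pool in pools.values())

    balanced = []
    for language in sorted(pools):
        pool = pools[language]
        if len(pool) >= target:
            # Enough data: draw without replacement.
            picked = rng.sample(pool, target)
        else:
            # Too little data: oversample with replacement.
            picked = rng.choices(pool, k=target)
        balanced.extend((language, item) for item in picked)

    # Shuffle so that batches mix languages instead of arriving in blocks.
    rng.shuffle(balanced)
    return balanced
```

With pools of very different sizes, for example 50 English and 5 Tamil utterances, the default call returns 5 of each, so no single language dominates a training epoch.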
Polyglot-Lion-1.7B achieves an average error rate of 14.85, competitive with MERaLiON-2-10B-ASR (14.32), while being roughly 6× smaller and 20× faster at inference.
- Parameters: 1.7B
- Languages: English, Mandarin, Tamil, Malay
- Training cost: $81 on a single NVIDIA RTX PRO 6000 (48 h)
- Inference speed: ~0.10 s/sample on RTX PRO 4500
Results
| Model | Params | English (LS) | English (NSC) | Mandarin (CV) | Mandarin (AISH1) | Mandarin (AISH3) | Mandarin (Fleurs) | Tamil (CV) | Tamil (SLR65) | Tamil (SLR127) | Tamil (Fleurs) | Malay (Meso.) | Malay (Fleurs) | Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Whisper-large-v3-turbo | 0.8B | 3.04 | 32.02 | 17.91 | 9.64 | 16.81 | 10.63 | 74.50 | 58.13 | 69.56 | 66.90 | 28.47 | 8.88 | 33.04 |
| SeaLLMs-Audio-7B | 7B | 94.74 | 9.53 | 8.68 | 9.65 | 9.76 | 37.09 | 126.70 | 127.24 | 138.65 | 105.31 | 71.34 | 26.25 | 63.75 |
| Qwen2.5-Omni-3B | 3B | 29.21 | 34.79 | 46.36 | 28.25 | 44.55 | 54.74 | 318.36 | 465.58 | 448.82 | 311.67 | 211.90 | 74.69 | 172.37 |
| Qwen2.5-Omni-7B | 7B | 13.80 | 22.96 | 14.49 | 7.33 | 22.58 | 16.68 | 252.06 | 239.15 | 303.96 | 326.43 | 158.06 | 43.92 | 118.45 |
| Qwen3-ASR-0.6B | 0.6B | 2.74 | 7.64 | 10.06 | 2.08 | 2.59 | 9.75 | 121.10 | 127.00 | 129.12 | 130.09 | 47.29 | 18.71 | 50.68 |
| Qwen3-ASR-1.7B | 1.7B | 2.31 | 6.22 | 7.50 | 1.52 | 2.08 | 9.33 | 139.96 | 134.63 | 144.49 | 147.23 | 39.00 | 10.87 | 53.76 |
| MERaLiON-2-10B-ASR | 10B | 2.54 | 4.62 | 8.83 | 3.09 | 4.07 | 11.99 | 31.78 | 19.29 | 22.42 | 28.68 | 25.90 | 8.55 | 14.32 |
| Polyglot-Lion-0.6B | 0.6B | 2.67 | 6.09 | 6.16 | 1.93 | 2.32 | 9.19 | 42.16 | 23.07 | 28.14 | 37.68 | 24.33 | 14.45 | 16.52 |
| Polyglot-Lion-1.7B | 1.7B | 2.10 | 5.28 | 4.91 | 1.45 | 1.86 | 8.00 | 39.19 | 19.75 | 26.83 | 37.28 | 21.51 | 9.98 | 14.85 |
WER (%) for English, Tamil, and Malay; CER (%) for Mandarin. Lower is better. Bold = best overall.
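For readers unfamiliar with these metrics, the following Python sketch shows how WER and CER are commonly computed from an edit distance, and how an average over benchmarks can be checked. It is a minimal illustration, not the evaluation script behind the table: the function names are hypothetical, and the text normalization applied before scoring is not specified here. The Avg column appears consistent with an unweighted mean of the twelve benchmark columns; that is an inference from the numbers rather than something stated in the card.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution (free on a match)
            ))
        prev = cur
    return prev[-1]


def error_rate(reference, hypothesis, unit="word"):
    """Error rate in percent: WER for unit='word', CER for unit='char'."""
    if unit == "word":
        ref, hyp = reference.split(), hypothesis.split()
    else:
        # Character level: spaces are dropped (an assumption for this sketch).
        ref = list(reference.replace(" ", ""))
        hyp = list(hypothesis.replace(" ", ""))
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def macro_average(scores):
    """Unweighted mean over benchmark scores."""
    return sum(scores) / len(scores)


# Polyglot-Lion-1.7B row from the table above (12 benchmark columns).
POLYGLOT_LION_1_7B = [2.10, 5.28, 4.91, 1.45, 1.86, 8.00,
                      39.19, 19.75, 26.83, 37.28, 21.51, 9.98]

if __name__ == "__main__":
    print(error_rate("the cat sat", "the cat sat"))
    print(error_rate("hello", "hello there big world"))
    print(macro_average(POLYGLOT_LION_1_7B))
```

Because insertions count as errors and the denominator is the reference length, a hypothesis much longer than the reference can score above 100%, which is how some Tamil and Malay entries in the table exceed that value.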
Quick Start
See mlx-audio for inference.
Citation
@misc{dang2026polyglotlion,
title={Polyglot-Lion: Efficient Multilingual ASR for Singapore via Balanced Fine-Tuning of Qwen3-ASR},
author={Quy-Anh Dang and Chris Ngo},
year={2026},
eprint={2603.16184},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2603.16184},
}