How to use mlx-community/OmniVoice-8bit with MLX:
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir OmniVoice-8bit mlx-community/OmniVoice-8bit
Part of the OmniVoice MLX collection.
Apple MLX weights for k2-fsa/OmniVoice, a massively multilingual zero-shot TTS model. This is a community MLX conversion for Apple Silicon; the upstream model card, license, and repository remain authoritative for non-MLX usage.
| Field | Value |
|---|---|
| Variant | 8-bit |
| Best for | smaller checkpoint with moderate compression |
| Runtime | ailuntx/OmniVoice-MLX |
| Official code | k2-fsa/OmniVoice |
| Format | MLX safetensors + tokenizer/config assets |
| Hardware | Apple Silicon recommended; HF Spaces Linux CPU fallback can be slow |
hf download mlx-community/OmniVoice-8bit --local-dir ./models/OmniVoice-8bit
git clone https://github.com/ailuntx/OmniVoice-MLX.git
cd OmniVoice-MLX
python -m venv .venv
.venv/bin/pip install -e .
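The download step above can also be scripted from Python. The following is a minimal sketch, assuming `huggingface_hub` is installed in the active environment; the helper names `default_local_dir` and `download_checkpoint` are illustrative and not part of OmniVoice-MLX.

```python
from pathlib import Path

REPO_ID = "mlx-community/OmniVoice-8bit"  # repository named in this card


def default_local_dir(repo_id: str, root: str = "./models") -> Path:
    """Map 'org/name' to '<root>/name', mirroring the --local-dir used above."""
    return Path(root) / repo_id.split("/")[-1]


def download_checkpoint(repo_id: str = REPO_ID, root: str = "./models") -> Path:
    """Download every file of the repository into a local directory."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = default_local_dir(repo_id, root)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `download_checkpoint()` should place the files under `./models/OmniVoice-8bit`, the same path the inference command expects for `--model`.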
Run a minimal generation from the MLX helper repository:
.venv/bin/python scripts/infer_mlx.py \
--model ./models/OmniVoice-8bit \
--text "Hello from OmniVoice MLX." \
--language en \
--output output.wav
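For batch use, the same command can be driven from Python. This is a sketch only: it shells out to `scripts/infer_mlx.py` with the four flags shown above and assumes nothing about the script's internals. The function names and the `repo_dir` parameter are hypothetical, and the `.venv/bin/python` path assumes the virtual environment created in the install step.

```python
import subprocess
from pathlib import Path


def build_infer_command(model_dir, text, language="en",
                        output="output.wav", python=".venv/bin/python"):
    """Return the argv list for the inference command shown above."""
    return [
        str(python), "scripts/infer_mlx.py",
        "--model", str(model_dir),
        "--text", text,
        "--language", language,
        "--output", str(output),
    ]


def run_inference(repo_dir, model_dir, text, language="en", output="output.wav"):
    """Run the script inside the cloned OmniVoice-MLX checkout (repo_dir)."""
    repo = Path(repo_dir).resolve()
    # Use absolute paths so the call does not depend on the caller's cwd.
    python = repo / ".venv" / "bin" / "python"
    cmd = build_infer_command(Path(model_dir).resolve(), text, language,
                              Path(output).resolve(), python=python)
    subprocess.run(cmd, cwd=repo, check=True)  # raises CalledProcessError on failure
```

Passing the arguments as a list avoids shell quoting problems when the text contains quotes or non-ASCII characters.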
| Variant | Best for |
|---|---|
| OmniVoice | default entry |
| OmniVoice-fp32 | high-precision baseline |
| OmniVoice-bfloat16 | high-quality Apple Silicon use |
| OmniVoice-8bit | smaller local checkpoint |
| OmniVoice-4bit | smallest checkpoint and Space default |
OmniVoice-8bit/
├── config.json
├── model.safetensors / shards
├── tokenizer files
├── audio_tokenizer/
└── mlx_manifest.json
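Before running inference, it can help to confirm that a download matches the layout above. The check below is a rough sketch: it looks only for the entries the tree names explicitly and treats any `*.safetensors` file as the weights, since the exact shard and tokenizer file names are not listed here. The function name `missing_assets` is illustrative.

```python
from pathlib import Path

REQUIRED_FILES = ("config.json", "mlx_manifest.json")
REQUIRED_DIRS = ("audio_tokenizer",)


def missing_assets(model_dir) -> list[str]:
    """Return the expected entries that are absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    # Weights may be a single file or several shards; accept either (assumption).
    if not any(root.glob("*.safetensors")):
        missing.append("*.safetensors")
    return missing
```

An empty list means the directory has the config, the manifest, the audio tokenizer folder, and at least one weights file; it does not verify that the files themselves are complete.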
| Component | Source | MLX handling |
|---|---|---|
| main model | k2-fsa/OmniVoice | converted to MLX weights |
| tokenizer/config | official checkpoint | copied for runtime compatibility |
| audio tokenizer | official OmniVoice assets | included as a required subcomponent |
Local MLX smoke tests were used during conversion. For voice cloning checks, use a full audio tokenizer; slim tokenizer assets can decode audio but do not provide a reliable speaker-encoding path.
License follows the upstream OmniVoice release.
@misc{omnivoice-mlx,
title = {OmniVoice-MLX: Apple MLX port of OmniVoice},
author = {ailuntx},
year = {2026},
url = {https://github.com/ailuntx/OmniVoice-MLX},
}
@article{zhu2026omnivoice,
title = {OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models},
author = {Zhu, Han and Ye, Lingxuan and Kang, Wei and Yao, Zengwei and Guo, Liyong and Kuang, Fangjun and Han, Zhifeng and Zhuang, Weiji and Lin, Long and Povey, Daniel},
journal = {arXiv preprint arXiv:2604.00688},
year = {2026},
}