# Diffutron: A Masked Diffusion Language Model for Turkish Language
| 🤗 Models | 📊 Pre-training Dataset | 📄 Paper |
## Overview
Diffutron is a lightweight, non-autoregressive Masked Diffusion Language Model (MDLM) specifically optimized for the Turkish language. By utilizing a discrete diffusion process, Diffutron generates text through iterative refinement, allowing for bi-directional context awareness and high parameter efficiency.
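The iterative refinement described above can be sketched as follows. This is a minimal toy illustration of MDLM-style sampling, not Diffutron's actual inference code: the denoiser, vocabulary, and unmasking schedule are stand-ins, and a real model would predict tokens from logits conditioned on the full bi-directional context.

```python
import random

MASK = "[MASK]"

def toy_denoiser(tokens):
    """Stand-in for the trained encoder backbone: proposes a token for
    every masked position. A real MDLM would rank candidates by logits;
    here we sample from a tiny placeholder vocabulary."""
    vocab = ["merhaba", "dünya", "model", "dil", "Türkçe"]
    return [random.choice(vocab) if t == MASK else t for t in tokens]

def mdlm_sample(length=8, steps=4):
    """Start from a fully masked sequence and, at each step, commit a
    share of the remaining masked positions to the denoiser's
    predictions, re-conditioning on the growing context."""
    seq = [MASK] * length
    for step in range(steps):
        preds = toy_denoiser(seq)
        masked = [i for i, t in enumerate(seq) if t == MASK]
        # unmask an equal share of what is still masked at each step
        k = max(1, len(masked) // (steps - step))
        for i in random.sample(masked, k):
            seq[i] = preds[i]
    return seq

print(" ".join(mdlm_sample()))
```

Because every position attends to every other at each step, earlier commitments constrain later ones in both directions, which is what distinguishes this from left-to-right autoregressive decoding.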
## Core Features
- Architecture: Discrete Masked Diffusion (MDLM) using a 307M-parameter encoder backbone.
- Efficiency: Achieves competitive performance against 2B+ parameter autoregressive models on Turkish benchmarks.
- Adaptation: LoRA-based (r=256) continual pre-training on a 2M-sequence Turkish corpus.
- Instruction Tuning: Progressive strategy using the LlamaTurk and InstrucTurca datasets for enhanced instruction following.
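The LoRA adaptation above can be summarized by its core update rule: the frozen base weight W is augmented with a learned low-rank delta, ΔW = (α/r)·B·A. The sketch below illustrates this in plain Python under assumed shapes (the rank is reduced from the r=256 quoted above so the example stays tiny); it is not Diffutron's training code.

```python
import random

def matmul(A, B):
    """Plain-Python matrix multiply, so the sketch needs no dependencies."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_delta(d_out, d_in, r, alpha):
    """LoRA learns ΔW = (alpha / r) * B @ A with B ∈ R^{d_out×r} and
    A ∈ R^{r×d_in}, so only r·(d_out + d_in) parameters train instead
    of d_out·d_in. Values here are random stand-ins."""
    A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
    B = [[0.0] * r for _ in range(d_out)]  # B init to zero => ΔW = 0 at start
    delta = matmul(B, A)
    return [[(alpha / r) * x for x in row] for row in delta]

# Toy shapes; continual pre-training at r=256 applies the same rule
# to the backbone's projection matrices.
dW = lora_delta(d_out=4, d_in=6, r=2, alpha=4)
```

Initializing B to zero means the adapted model starts exactly at the base model, which is why continual pre-training with LoRA can't degrade the checkpoint before any updates are taken.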
## Benchmarks
Diffutron achieves a substantial reduction in Turkish perplexity after continual pre-training, and despite its 0.3B parameter size scores competitively across the CETVEL benchmark suite:
| Benchmark | Diffutron-1st-Stage (0.3B) | Diffutron-2nd-Stage (0.3B) | TURNA (1.1B) | Kumru (2B) | Kanarya (2B) | Llama-3.2 (3B) | Trendyol (7B) | Aya-101 (13B) |
|---|---|---|---|---|---|---|---|---|
| Belebele_TR | 22.22 | 27.00 | 22.56 | 29.00 | 28.11 | 55.78 | 36.22 | 22.89 |
| EXAMS_TR | 25.95 | 27.74 | 23.66 | 30.03 | 30.03 | 26.21 | 28.50 | 22.90 |
| IronyTR | 50.67 | 52.00 | 48.33 | 51.00 | 50.00 | 50.17 | 50.00 | 52.17 |
| News_Cat | 23.20 | 32.40 | 32.80 | 26.40 | 66.80 | 64.00 | 81.20 | 20.00 |
| MNLI_TR | 33.29 | 32.81 | 34.94 | 36.42 | 33.40 | 34.76 | 35.19 | 27.90 |
| STS_TR | 17.77 | 18.78 | 14.21 | 11.75 | 12.91 | 12.91 | 15.52 | 16.97 |
| XCOPA_TR | 53.80 | 52.00 | 55.80 | 54.00 | 64.20 | 54.60 | 61.00 | 59.60 |
| Average | 32.41 | 34.68 | 33.19 | 34.09 | 40.78 | 42.63 | 43.95 | 31.78 |
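The Average column is the unweighted mean of the seven task scores. For example, for Diffutron-2nd-Stage:

```python
# Diffutron-2nd-Stage scores in table order:
# Belebele_TR, EXAMS_TR, IronyTR, News_Cat, MNLI_TR, STS_TR, XCOPA_TR
scores = [27.00, 27.74, 52.00, 32.40, 32.81, 18.78, 52.00]
average = round(sum(scores) / len(scores), 2)
print(average)  # 34.68, matching the Average column
```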
## Citation

```bibtex
@article{kocabay2026diffutron,
  title={Diffutron: A Masked Diffusion Language Model for Turkish Language},
  author={Kocabay, Şuayp Talha and Akkuş, Talha Rüzgar},
  journal={arXiv: [cs.CL]},
  year={2026}
}
```