gemma-3-1b-ua-safe-summarization

A full fine-tune of unsloth/gemma-3-1b-it for Ukrainian news summarization.
Trained with Unsloth and TRL SFT on the csebuetnlp/xlsum dataset (the Ukrainian split of XL-Sum, with safety filtering).

Training details

Base model: unsloth/gemma-3-1b-it
Dataset: csebuetnlp/xlsum
Fine-tuning: Full SFT (no LoRA)
Framework: Unsloth + TRL
Epochs: ~1.48 (checkpoint-4000, best validation ROUGE-L)
Max sequence length: 3072
Batch size: 8 per device
Learning rate: 2e-5 (cosine decay, 3% warmup)
Precision: bfloat16
Optimizer: adamw_8bit
Best eval ROUGE-L: 22.23
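
The learning-rate entry above (peak 2e-5, cosine decay, 3% warmup) can be illustrated with a small sketch. It assumes linear warmup from zero followed by cosine decay to zero, which is a common form of this schedule; the card does not state the total number of scheduled steps, so `TOTAL_STEPS` below is a hypothetical value, and `lr_at_step` is an illustrative helper, not part of any library.

```python
import math


def lr_at_step(step, total_steps, peak_lr=2e-5, warmup_ratio=0.03):
    """Linear warmup to peak_lr, then cosine decay to zero (assumed shape)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: progress runs from 0.0 (end of warmup) to 1.0 (last step).
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical total; the real value depends on dataset size, batch size,
# and the planned number of epochs, none of which are fully given here.
TOTAL_STEPS = 8000
for step in (0, 100, 240, 4000, 8000):
    print(step, f"{lr_at_step(step, TOTAL_STEPS):.2e}")
```

Under these assumptions the rate peaks right after warmup and falls smoothly afterwards, so a checkpoint taken partway through training (such as checkpoint-4000) was saved while the rate was still decaying.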

Response-only masking was applied: the loss is computed only on the tokens of the model turn, not on the user prompt.
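
As a rough illustration of response-only masking, the sketch below builds a label sequence in which every prompt token is replaced by -100, the index that PyTorch's cross-entropy loss ignores by default. The helper `mask_prompt_labels` and the token ids are hypothetical; the card does not say which utility performed the masking during training.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_prompt_labels(input_ids, response_start):
    """Copy input_ids into labels, hiding every token before response_start."""
    return [
        IGNORE_INDEX if i < response_start else tok
        for i, tok in enumerate(input_ids)
    ]


# Hypothetical token ids: the first four belong to the user turn,
# the last three to the model turn.
input_ids = [2, 105, 2364, 107, 9001, 9002, 106]
labels = mask_prompt_labels(input_ids, response_start=4)
print(labels)
```

Only positions whose label is not -100 contribute to the loss, so the gradient comes from the summary tokens rather than from reproducing the article.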

Usage

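import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nuinashco/gemma-3-1b-it-xlsum-ua-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Placeholder article ("Your news text here...")
article = "Ваш текст новини тут..."
# The instruction is in Ukrainian: "Write a short summary of the following text:"
prompt = [
    {"role": "user", "content": f"Зроби короткий переказ наступного тексту:\n{article}"}
]
# return_dict=True returns the attention mask together with input_ids
inputs = tokenizer.apply_chat_template(
    prompt, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

# Greedy decoding, at most 128 new tokens
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))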
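
To compare a generated summary with a reference, one option is ROUGE-L, the metric this card reports. The sketch below is a minimal pure-Python version based on the longest common subsequence of tokens. It assumes lowercase whitespace tokenization and returns an F1 score between 0 and 1; the card does not describe its tokenization or scaling (the reported 22.23 is presumably on a 0-100 scale), so scores from this sketch are not directly comparable with it. Both function names and the example sentences are illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 on lowercase whitespace tokens (an assumed tokenization)."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Illustrative strings, not taken from the dataset.
generated = "уряд ухвалив новий бюджет"
reference = "уряд ухвалив бюджет на наступний рік"
print(f"ROUGE-L F1: {rouge_l_f1(generated, reference):.3f}")
```

For published comparisons, an established ROUGE implementation with documented tokenization is the safer choice; this version is only meant to show what the metric measures.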

Limitations

  • Trained for Ukrainian only; performance on other languages is undefined.
  • Inherits any biases present in the base model and training corpus.
  • Summaries may occasionally be factually inaccurate; always verify against the source.