xelm-gemma-4b-austronesian-layer-reg
Layer-range L2-SP regularization: the middle layers receive a larger L2 penalty against the base Gemma-3-4B weights than the first and last layers. This acts as a soft equivalent of layer freezing.
- Base model: google/gemma-3-4b-pt
- Strategy: layer-reg
- Language family: Austronesian
- Code: https://github.com/sanchit-ahuja/scaling-multilingual-experts
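The layer-dependent penalty described above can be illustrated with a minimal pure-Python sketch. It is not the implementation from the code repo: the layer range, the two coefficients, and the names `MIDDLE_RANGE`, `LAMBDA_MIDDLE`, `LAMBDA_OUTER`, `layer_coefficient`, and `l2sp_penalty` are all hypothetical, and real training would operate on model tensors rather than lists of floats.

```python
from typing import Dict, List, Tuple

# Hypothetical values for illustration only; the actual layer range and
# coefficients are defined by the training config, not by this sketch.
MIDDLE_RANGE: Tuple[int, int] = (2, 4)  # start inclusive, end exclusive
LAMBDA_MIDDLE = 1.0    # stronger pull toward the base weights
LAMBDA_OUTER = 0.01    # weaker pull for the first/last layers


def layer_coefficient(layer_idx: int) -> float:
    """Return the L2-SP coefficient for a layer based on its position."""
    start, end = MIDDLE_RANGE
    return LAMBDA_MIDDLE if start <= layer_idx < end else LAMBDA_OUTER


def l2sp_penalty(
    weights: Dict[int, List[float]],
    base_weights: Dict[int, List[float]],
) -> float:
    """Sum over layers of coefficient * squared distance to the base weights."""
    total = 0.0
    for layer_idx, w in weights.items():
        w0 = base_weights[layer_idx]
        sq_dist = sum((a - b) ** 2 for a, b in zip(w, w0))
        total += layer_coefficient(layer_idx) * sq_dist
    return total


# Toy example: layer 0 is an outer layer, layer 2 is a middle layer.
current = {0: [1.0], 2: [1.0]}
base = {0: [0.0], 2: [0.0]}
penalty = l2sp_penalty(current, base)
```

In this toy setup both layers drift by the same amount, but the middle layer contributes far more to the penalty, so the optimizer is discouraged from moving it without the layer being frozen outright. In training, such a term would be added to the language-modeling loss.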
Loading
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("sanchitahuja205/xelm-gemma-4b-austronesian-layer-reg")
tokenizer = AutoTokenizer.from_pretrained("sanchitahuja205/xelm-gemma-4b-austronesian-layer-reg")
Training recipe
The exact training recipe lives in configs/yaml/train_gemma_layer_range.yaml in the code repo. The resolved config used for this specific run is also included in this model repo as training_config.yaml; it can be loaded with pyrallis to reproduce the run bit-for-bit. To launch training with the recipe from the code repo:
python train.py --config_path configs/yaml/train_gemma_layer_range.yaml
Citation
@misc{ahuja2026parameteralignmentmitigatescatastrophic,
title={Parameter Alignment Mitigates Catastrophic Forgetting in Multilingual Expert Language Models},
author={Sanchit Ahuja and Terra Blevins},
year={2026},
eprint={2606.00284},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2606.00284},
}