---
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
- mumps
- m-language
- medical
- healthcare
- ehr
- vista
- code-generation
- peft
- lora
- qwen2.5
language:
- en
license: mit
datasets:
- YanivWeiss123/mumps-mllm-dataset
metrics:
- perplexity
library_name: peft
pipeline_tag: text-generation
---
# Qwen2.5-Coder-7B-Instruct-MUMPS
A fine-tuned version of [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) specialized for **MUMPS (M language)** code generation, explanation, and assistance.
## Model Description
This model is a MUMPS programming specialist trained to:
- Generate MUMPS code from natural language descriptions
- Explain existing MUMPS routines and code patterns
- Answer questions about MUMPS syntax, functions, and best practices
- Refactor and optimize MUMPS code
- Handle healthcare/EHR-specific MUMPS scenarios (VistA, Epic, etc.)
**Training Method**: QLoRA (4-bit quantized Low-Rank Adaptation)
**Adapter Size**: ~33MB (LoRA weights only)
**Base Model Size**: 7B parameters
## Training Details
### Dataset
Trained on [YanivWeiss123/mumps-mllm-dataset](https://huggingface.co/datasets/YanivWeiss123/mumps-mllm-dataset):
- **30,298 total examples** of MUMPS code completion pairs
- **298 carefully crafted base examples** covering:
- Basic MUMPS syntax (SET, WRITE, FOR, IF, etc.)
- Global variables and data storage
- String and numeric operations
- Clinical/EHR scenarios
- Advanced features (transactions, indirection, etc.)
- **30,000 automatically generated variations** of the base examples for robustness
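For a quick look at the data, the dataset can be pulled straight from the Hub. A minimal sketch; the split and column names are assumptions and may differ:
```python
from datasets import load_dataset

# Download the MUMPS training dataset from the Hugging Face Hub
ds = load_dataset("YanivWeiss123/mumps-mllm-dataset")

# Inspect the splits and one sample record
print(ds)              # shows available splits and row counts
print(ds["train"][0])  # assumes a "train" split; adjust if named differently
```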
### Training Configuration
```yaml
Base Model: Qwen/Qwen2.5-Coder-7B-Instruct
Method: Supervised Fine-Tuning (SFT) with QLoRA
Quantization: 4-bit NF4 with double quantization
LoRA Configuration:
Rank: 16
Alpha: 32
Target Modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
Dropout: 0.05
Training Hyperparameters:
Epochs: 2
Batch Size: 16 (2 per device × 8 gradient accumulation)
Learning Rate: 2e-4
LR Scheduler: Cosine with 10% warmup
Optimizer: paged_adamw_8bit
Max Sequence Length: 1024 tokens
Gradient Checkpointing: True
Mixed Precision: bfloat16
Hardware: a10g-large GPU
Training Time: ~4 hours
```
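This is not the project's actual training script, but a minimal sketch of how the settings above map onto the transformers/PEFT/TRL APIs (argument names shift between TRL releases, e.g. `max_seq_length` vs. `max_length`, so treat this as an approximation):
```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# 4-bit NF4 quantization with double quantization, matching the config above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA settings matching the config above
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="qwen2.5-coder-7b-mumps",  # hypothetical output path
    num_train_epochs=2,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch size 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="paged_adamw_8bit",
    max_seq_length=1024,            # renamed to max_length in newer TRL releases
    gradient_checkpointing=True,
    bf16=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=load_dataset("YanivWeiss123/mumps-mllm-dataset", split="train"),
    peft_config=peft_config,
)
trainer.train()
```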
### MUMPS Language Coverage
**Commands** (20+): WRITE, READ, SET, KILL, NEW, QUIT, DO, IF, ELSE, FOR, XECUTE, JOB, HANG, HALT, LOCK, MERGE, TSTART, TCOMMIT, TROLLBACK, OPEN, USE, CLOSE
**Functions** (30+): $ORDER, $PIECE, $LENGTH, $EXTRACT, $DATA, $GET, $JUSTIFY, $QUERY, $NAME, $ASCII, $CHAR, $REVERSE, $STACK, $STORAGE, $INCREMENT, $HOROLOG, $TEST, and more
**Advanced Features**:
- Transaction processing (TSTART/TCOMMIT/TROLLBACK)
- 5 types of indirection (argument, atomic, entryref, pattern, name)
- Structured system variables (^$ROUTINE, ^$JOB, ^$LOCK, ^$GLOBAL)
- Pattern matching with alternation
- Device I/O with parameters
- Clinical/EHR workflows
## Usage
### Loading the Model
This is a **PEFT adapter** that must be loaded on top of the base model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
"Qwen/Qwen2.5-Coder-7B-Instruct",
device_map="auto",
trust_remote_code=True,
torch_dtype=torch.bfloat16
)
# Load MUMPS adapter
model = PeftModel.from_pretrained(
base_model,
"YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS"
)
tokenizer = AutoTokenizer.from_pretrained(
"YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS",
trust_remote_code=True
)
```
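Alternatively, PEFT can resolve the base model automatically from the adapter's `adapter_config.json` and load both in one call:
```python
import torch
from peft import AutoPeftModelForCausalLM

# Reads the base model reference from the adapter config, loads it,
# then attaches the LoRA weights on top
model = AutoPeftModelForCausalLM.from_pretrained(
    "YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
```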
### Inference Example
```python
# Generate MUMPS code
prompt = "Write a MUMPS routine to register a new patient with name, DOB, and SSN"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
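Since the base model is instruction-tuned, responses are usually cleaner when the prompt is wrapped in the chat template rather than passed as raw text:
```python
messages = [
    {"role": "user", "content": "Write a MUMPS routine to register a new patient with name, DOB, and SSN"},
]

# Apply the Qwen chat template and append the assistant turn marker
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=200, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, not the prompt
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```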
### Example Prompts
```
"Write a MUMPS routine to register a new patient"
"Create a MUMPS function to calculate age from date of birth"
"How do I use $ORDER to iterate through a global array?"
"Explain what this MUMPS code does: SET x=$PIECE(data,'^',3)"
"Write MUMPS code to retrieve all medications for a patient"
"Refactor this MUMPS routine for better error handling"
```
## Capabilities
### What the Model Can Do
βœ… Generate MUMPS code from natural language descriptions
βœ… Explain MUMPS syntax, functions, and commands
βœ… Answer questions about MUMPS programming patterns
βœ… Handle healthcare/EHR-specific scenarios (patient records, medications, etc.)
βœ… Work with MUMPS globals, functions, and special variables
βœ… Generate complete routines with proper structure
βœ… Provide code examples with comments
### Limitations
❌ Not trained on specific VistA/Epic/Cerner codebase internals
❌ May not know organization-specific global naming conventions
❌ Limited knowledge of proprietary MUMPS extensions
❌ General LLM limitations (hallucination, context length, etc.)
❌ Should not be used for production medical systems without review
## Use Cases
- **Learning MUMPS**: Educational tool for developers learning MUMPS
- **Legacy System Maintenance**: Understanding and documenting existing MUMPS code
- **Code Generation**: Rapid prototyping of MUMPS routines
- **Healthcare IT**: VistA, Epic, and other EHR system development
- **Code Review**: Getting explanations of complex MUMPS patterns
- **Migration Planning**: Understanding MUMPS code before modernization
## Model Architecture
**Base**: Qwen2.5-Coder-7B-Instruct
- Decoder-only transformer
- 7 billion parameters
- Context length: 32,768 tokens
- Optimized for code generation
**Adapter**: LoRA (Low-Rank Adaptation)
- Trainable parameters: ~19M (0.27% of base model)
- Adapter size: ~33MB
- Can be merged with the base model for deployment, as sketched below
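A minimal merge sketch, reusing `model` (the `PeftModel` from the loading example above; the base must be loaded in full precision, not 4-bit, for merging) with an arbitrary output path:
```python
# Fold the LoRA weights into the base model and drop the adapter wrappers
merged_model = model.merge_and_unload()

# Save a standalone checkpoint that no longer needs PEFT at load time
merged_model.save_pretrained("qwen2.5-coder-7b-mumps-merged")
tokenizer.save_pretrained("qwen2.5-coder-7b-mumps-merged")
```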
## Training Infrastructure
- **Platform**: Hugging Face Jobs
- **GPU**: a10g-large (24GB VRAM)
- **Framework**: TRL (Transformer Reinforcement Learning)
- **Monitoring**: Trackio for real-time metrics
## Ethical Considerations
⚠️ **Medical Disclaimer**: This model is for educational and development purposes only. It should NOT be used to generate production medical software without proper testing, validation, and regulatory compliance.
⚠️ **Code Review Required**: All generated MUMPS code should be reviewed by experienced developers before deployment, especially in healthcare settings.
⚠️ **Privacy**: Do not input real patient data or PHI when using this model.
## Citation
If you use this model in your research or project, please cite:
```bibtex
@misc{qwen25-coder-mumps-2024,
title={Qwen2.5-Coder-7B-Instruct-MUMPS: A Fine-tuned Model for MUMPS Code Generation},
author={YanivWeiss123},
year={2024},
publisher={Hugging Face},
howpublished={\url{https://huggingface.co/YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS}}
}
```
## License
MIT License. Free to use for commercial and research purposes.
Use of this adapter is also subject to the licenses of the underlying assets:
- Base model: [Qwen2.5-Coder License](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
- Dataset: [MIT License](https://huggingface.co/datasets/YanivWeiss123/mumps-mllm-dataset)
## Links
- **Base Model**: [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
- **Training Dataset**: [YanivWeiss123/mumps-mllm-dataset](https://huggingface.co/datasets/YanivWeiss123/mumps-mllm-dataset)
- **Model Repository**: [YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS](https://huggingface.co/YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS)
## Acknowledgments
- **Qwen Team** for the excellent Qwen2.5-Coder base model
- **Hugging Face** for the training infrastructure (Jobs, TRL, PEFT)
- **MUMPS Community** for documentation and resources
---
**Model Type**: LoRA Adapter
**Created**: December 2024
**Last Updated**: December 10, 2024
**Status**: Experimental - Use with caution in production