---
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
- mumps
- m-language
- medical
- healthcare
- ehr
- vista
- code-generation
- peft
- lora
- qwen2.5
language:
- en
license: mit
datasets:
- YanivWeiss123/mumps-mllm-dataset
metrics:
- perplexity
library_name: peft
pipeline_tag: text-generation
---

# Qwen2.5-Coder-7B-Instruct-MUMPS

A fine-tuned version of [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) specialized for **MUMPS (M language)** code generation, explanation, and assistance.

## Model Description

This model is a MUMPS programming specialist trained to:

- Generate MUMPS code from natural language descriptions
- Explain existing MUMPS routines and code patterns
- Answer questions about MUMPS syntax, functions, and best practices
- Refactor and optimize MUMPS code
- Handle healthcare/EHR-specific MUMPS scenarios (VistA, Epic, etc.)

**Training Method**: QLoRA (4-bit quantized Low-Rank Adaptation)

**Adapter Size**: ~33MB (LoRA weights only)

**Base Model Size**: 7B parameters

## Training Details

### Dataset

Trained on [YanivWeiss123/mumps-mllm-dataset](https://huggingface.co/datasets/YanivWeiss123/mumps-mllm-dataset):

- **30,298 total examples** of MUMPS code completion pairs
- **298 carefully crafted base examples** covering:
  - Basic MUMPS syntax (SET, WRITE, FOR, IF, etc.)
  - Global variables and data storage
  - String and numeric operations
  - Clinical/EHR scenarios
  - Advanced features (transactions, indirection, etc.)
- **30,000 intelligent variations** for robustness

### Training Configuration

```yaml
Base Model: Qwen/Qwen2.5-Coder-7B-Instruct
Method: Supervised Fine-Tuning (SFT) with QLoRA
Quantization: 4-bit NF4 with double quantization

LoRA Configuration:
  Rank: 16
  Alpha: 32
  Target Modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
  Dropout: 0.05

Training Hyperparameters:
  Epochs: 2
  Batch Size: 16 (2 per device × 8 gradient accumulation)
  Learning Rate: 2e-4
  LR Scheduler: Cosine with 10% warmup
  Optimizer: paged_adamw_8bit
  Max Sequence Length: 1024 tokens
  Gradient Checkpointing: True
  Mixed Precision: bfloat16

Hardware: a10g-large GPU
Training Time: ~4 hours
```
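
For reference, a minimal sketch of how this configuration could be expressed with `transformers` and `peft`. This is an illustrative reconstruction, not the exact training script; names such as `bnb_config`, `lora_config`, and `training_args` are placeholders:

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization with double quantization (QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter settings matching the table above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Hyperparameters corresponding to the block above
training_args = TrainingArguments(
    output_dir="qwen2.5-coder-7b-mumps",
    num_train_epochs=2,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,   # effective batch size of 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="paged_adamw_8bit",
    bf16=True,
    gradient_checkpointing=True,
)
```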

### MUMPS Language Coverage

**Commands** (20+): WRITE, READ, SET, KILL, NEW, QUIT, DO, IF, ELSE, FOR, XECUTE, JOB, HANG, HALT, LOCK, MERGE, TSTART, TCOMMIT, TROLLBACK, OPEN, USE, CLOSE

**Functions and Special Variables** (30+): $ORDER, $PIECE, $LENGTH, $EXTRACT, $DATA, $GET, $JUSTIFY, $QUERY, $NAME, $ASCII, $CHAR, $REVERSE, $STACK, $STORAGE, $INCREMENT, $HOROLOG, $TEST, and more

**Advanced Features**:
- Transaction processing (TSTART/TCOMMIT/TROLLBACK)
- 5 types of indirection (argument, atomic, entryref, pattern, name)
- Structured system variables (^$ROUTINE, ^$JOB, ^$LOCK, ^$GLOBAL)
- Pattern matching with alternation
- Device I/O with parameters
- Clinical/EHR workflows

## Usage

### Loading the Model

This is a **PEFT adapter** that must be loaded on top of the base model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16
)

# Load MUMPS adapter
model = PeftModel.from_pretrained(
    base_model,
    "YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS"
)

tokenizer = AutoTokenizer.from_pretrained(
    "YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS",
    trust_remote_code=True
)
```

### Inference Example

```python
# Generate MUMPS code
prompt = "Write a MUMPS routine to register a new patient with name, DOB, and SSN"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(response)
```
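
Because the base model is instruction-tuned, prompts generally behave better when passed through the tokenizer's chat template rather than as raw text. A minimal sketch, assuming `model` and `tokenizer` are already loaded as above and the system prompt is your own choice:

```python
messages = [
    {"role": "system", "content": "You are a MUMPS (M language) programming assistant."},
    {"role": "user", "content": "How do I use $ORDER to iterate through a global array?"},
]

# Render the conversation with the Qwen chat template and generate a reply
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=300, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the prompt
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```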

### Example Prompts

```
"Write a MUMPS routine to register a new patient"
"Create a MUMPS function to calculate age from date of birth"
"How do I use $ORDER to iterate through a global array?"
"Explain what this MUMPS code does: SET x=$PIECE(data,"^",3)"
"Write MUMPS code to retrieve all medications for a patient"
"Refactor this MUMPS routine for better error handling"
```

## Capabilities

### What the Model Can Do

- ✅ Generate MUMPS code from natural language descriptions
- ✅ Explain MUMPS syntax, functions, and commands
- ✅ Answer questions about MUMPS programming patterns
- ✅ Handle healthcare/EHR-specific scenarios (patient records, medications, etc.)
- ✅ Work with MUMPS globals, functions, and special variables
- ✅ Generate complete routines with proper structure
- ✅ Provide code examples with comments

### Limitations

- ❌ Not trained on specific VistA/Epic/Cerner codebase internals
- ❌ May not know organization-specific global naming conventions
- ❌ Limited knowledge of proprietary MUMPS extensions
- ❌ General LLM limitations (hallucination, context length, etc.)
- ❌ Should not be used for production medical systems without review

## Use Cases

- **Learning MUMPS**: Educational tool for developers learning MUMPS
- **Legacy System Maintenance**: Understanding and documenting existing MUMPS code
- **Code Generation**: Rapid prototyping of MUMPS routines
- **Healthcare IT**: VistA, Epic, and other EHR system development
- **Code Review**: Getting explanations of complex MUMPS patterns
- **Migration Planning**: Understanding MUMPS code before modernization

## Model Architecture

**Base**: Qwen2.5-Coder-7B-Instruct
- Decoder-only transformer
- 7 billion parameters
- Context length: 32,768 tokens
- Optimized for code generation

**Adapter**: LoRA (Low-Rank Adaptation)
- Trainable parameters: ~19M (0.27% of base model)
- Adapter size: ~33MB
- Can be merged with the base model for deployment (see the sketch below)
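
If you prefer to ship a single standalone checkpoint instead of loading the adapter at runtime, the LoRA weights can be folded into the base model with PEFT's `merge_and_unload`. A minimal sketch; the output directory `qwen2.5-coder-7b-mumps-merged` is an arbitrary example path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model in bf16 (merging requires unquantized weights)
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    torch_dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(base, "YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS")

# Fold the LoRA weights into the base weights and drop the adapter wrappers
merged = model.merge_and_unload()

# Save a plain transformers checkpoint that no longer needs peft at inference time
merged.save_pretrained("qwen2.5-coder-7b-mumps-merged")
AutoTokenizer.from_pretrained(
    "YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS"
).save_pretrained("qwen2.5-coder-7b-mumps-merged")
```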

## Training Infrastructure

- **Platform**: Hugging Face Jobs
- **GPU**: a10g-large (24GB VRAM)
- **Framework**: TRL (Transformer Reinforcement Learning)
- **Monitoring**: Trackio for real-time metrics
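
For orientation, a hedged sketch of how these pieces might be wired together with TRL's `SFTTrainer`. It assumes the `base_model`, `lora_config`, and `training_args` objects from the sketches above; depending on your TRL version, `args` may need to be a `trl.SFTConfig` rather than a plain `TrainingArguments`, and the dataset columns may need explicit formatting. This is an illustration, not the exact training script:

```python
from datasets import load_dataset
from trl import SFTTrainer

# MUMPS instruction/completion pairs (see Training Details)
dataset = load_dataset("YanivWeiss123/mumps-mllm-dataset", split="train")

trainer = SFTTrainer(
    model=base_model,        # quantized base model loaded with bnb_config
    train_dataset=dataset,
    peft_config=lora_config, # LoRA settings from the Training Configuration sketch
    args=training_args,      # hyperparameters from the Training Configuration sketch
)
trainer.train()
trainer.save_model("qwen2.5-coder-7b-mumps-adapter")
```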

## Ethical Considerations

⚠️ **Medical Disclaimer**: This model is for educational and development purposes only. It should NOT be used to generate production medical software without proper testing, validation, and regulatory compliance.

⚠️ **Code Review Required**: All generated MUMPS code should be reviewed by experienced developers before deployment, especially in healthcare settings.

⚠️ **Privacy**: Do not input real patient data or PHI when using this model.

## Citation

If you use this model in your research or project, please cite:

```bibtex
@misc{qwen25-coder-mumps-2024,
  title={Qwen2.5-Coder-7B-Instruct-MUMPS: A Fine-tuned Model for MUMPS Code Generation},
  author={YanivWeiss123},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS}}
}
```

## License

MIT License - Free to use for commercial and research purposes.

Inherits license compatibility from:
- Base model: [Qwen2.5-Coder License](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
- Dataset: [MIT License](https://huggingface.co/datasets/YanivWeiss123/mumps-mllm-dataset)

## Links

- **Base Model**: [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
- **Training Dataset**: [YanivWeiss123/mumps-mllm-dataset](https://huggingface.co/datasets/YanivWeiss123/mumps-mllm-dataset)
- **Model Repository**: [YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS](https://huggingface.co/YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS)

## Acknowledgments

- **Qwen Team** for the excellent Qwen2.5-Coder base model
- **Hugging Face** for the training infrastructure (Jobs, TRL, PEFT)
- **MUMPS Community** for documentation and resources

---

**Model Type**: LoRA Adapter

**Created**: December 2024

**Last Updated**: December 10, 2024

**Status**: Experimental - Use with caution in production