---
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
  - mumps
  - m-language
  - medical
  - healthcare
  - ehr
  - vista
  - code-generation
  - peft
  - lora
  - qwen2.5
language:
  - en
license: mit
datasets:
  - YanivWeiss123/mumps-mllm-dataset
metrics:
  - perplexity
library_name: peft
pipeline_tag: text-generation
---

Qwen2.5-Coder-7B-Instruct-MUMPS

A fine-tuned version of Qwen/Qwen2.5-Coder-7B-Instruct specialized for MUMPS (M language) code generation, explanation, and assistance.

Model Description

This model is a MUMPS programming specialist trained to:

  • Generate MUMPS code from natural language descriptions
  • Explain existing MUMPS routines and code patterns
  • Answer questions about MUMPS syntax, functions, and best practices
  • Refactor and optimize MUMPS code
  • Handle healthcare/EHR-specific MUMPS scenarios (VistA, Epic, etc.)

Training Method: QLoRA (4-bit quantized Low-Rank Adaptation)
Adapter Size: ~33MB (LoRA weights only)
Base Model Size: 7B parameters

Training Details

Dataset

Trained on YanivWeiss123/mumps-mllm-dataset:

  • 30,298 total examples of MUMPS code completion pairs
  • 298 carefully crafted base examples covering:
    • Basic MUMPS syntax (SET, WRITE, FOR, IF, etc.)
    • Global variables and data storage
    • String and numeric operations
    • Clinical/EHR scenarios
    • Advanced features (transactions, indirection, etc.)
  • 30,000 intelligent variations for robustness

Training Configuration

Base Model: Qwen/Qwen2.5-Coder-7B-Instruct
Method: Supervised Fine-Tuning (SFT) with QLoRA
Quantization: 4-bit NF4 with double quantization

LoRA Configuration:
  Rank: 16
  Alpha: 32
  Target Modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
  Dropout: 0.05

Training Hyperparameters:
  Epochs: 2
  Batch Size: 16 (2 per device × 8 gradient accumulation)
  Learning Rate: 2e-4
  LR Scheduler: Cosine with 10% warmup
  Optimizer: paged_adamw_8bit
  Max Sequence Length: 1024 tokens
  Gradient Checkpointing: True
  Mixed Precision: bfloat16

Hardware: a10g-large GPU
Training Time: ~4 hours
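
The run can be reproduced approximately with TRL and PEFT. The sketch below is an assumed reconstruction from the configuration above, not the published training script; exact argument names (e.g. max_seq_length) vary slightly across TRL versions.

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# 4-bit NF4 quantization with double quantization, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA settings from the configuration block above
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

dataset = load_dataset("YanivWeiss123/mumps-mllm-dataset", split="train")

args = SFTConfig(
    output_dir="qwen2.5-coder-7b-mumps",
    num_train_epochs=2,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch size 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="paged_adamw_8bit",
    max_seq_length=1024,
    gradient_checkpointing=True,
    bf16=True,
)

trainer = SFTTrainer(model=model, args=args, train_dataset=dataset, peft_config=peft_config)
trainer.train()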

MUMPS Language Coverage

Commands (20+): WRITE, READ, SET, KILL, NEW, QUIT, DO, IF, ELSE, FOR, XECUTE, JOB, HANG, HALT, LOCK, MERGE, TSTART, TCOMMIT, TROLLBACK, OPEN, USE, CLOSE

Functions (30+): $ORDER, $PIECE, $LENGTH, $EXTRACT, $DATA, $GET, $JUSTIFY, $QUERY, $NAME, $ASCII, $CHAR, $REVERSE, $STACK, $STORAGE, $INCREMENT, $HOROLOG, $TEST, and more

Advanced Features:

  • Transaction processing (TSTART/TCOMMIT/TROLLBACK)
  • 5 types of indirection (argument, atomic, entryref, pattern, name)
  • Structured system variables (^$ROUTINE, ^$JOB, ^$LOCK, ^$GLOBAL)
  • Pattern matching with alternation
  • Device I/O with parameters
  • Clinical/EHR workflows

Usage

Loading the Model

This is a PEFT adapter that must be loaded on top of the base model:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16
)

# Load MUMPS adapter
model = PeftModel.from_pretrained(
    base_model,
    "YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS"
)

tokenizer = AutoTokenizer.from_pretrained(
    "YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS",
    trust_remote_code=True
)

Inference Example

# Generate MUMPS code
prompt = "Write a MUMPS routine to register a new patient with name, DOB, and SSN"

# Qwen2.5-Coder-7B-Instruct is a chat model, so wrap the prompt in its chat template
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)

print(response)

Example Prompts

"Write a MUMPS routine to register a new patient"
"Create a MUMPS function to calculate age from date of birth"
"How do I use $ORDER to iterate through a global array?"
"Explain what this MUMPS code does: SET x=$PIECE(data,'^',3)"
"Write MUMPS code to retrieve all medications for a patient"
"Refactor this MUMPS routine for better error handling"

Capabilities

What the Model Can Do

✅ Generate MUMPS code from natural language descriptions
✅ Explain MUMPS syntax, functions, and commands
✅ Answer questions about MUMPS programming patterns
✅ Handle healthcare/EHR-specific scenarios (patient records, medications, etc.)
✅ Work with MUMPS globals, functions, and special variables
✅ Generate complete routines with proper structure
✅ Provide code examples with comments

Limitations

❌ Not trained on specific VistA/Epic/Cerner codebase internals
❌ May not know organization-specific global naming conventions
❌ Limited knowledge of proprietary MUMPS extensions
❌ General LLM limitations (hallucination, context length, etc.)
❌ Should not be used for production medical systems without review

Use Cases

  • Learning MUMPS: Educational tool for developers learning MUMPS
  • Legacy System Maintenance: Understanding and documenting existing MUMPS code
  • Code Generation: Rapid prototyping of MUMPS routines
  • Healthcare IT: VistA, Epic, and other EHR system development
  • Code Review: Getting explanations of complex MUMPS patterns
  • Migration Planning: Understanding MUMPS code before modernization

Model Architecture

Base: Qwen2.5-Coder-7B-Instruct

  • Decoder-only transformer
  • 7 billion parameters
  • Context length: 32,768 tokens
  • Optimized for code generation

Adapter: LoRA (Low-Rank Adaptation)

  • Trainable parameters: ~19M (0.27% of base model)
  • Adapter size: ~33MB
  • Can be merged with the base model for deployment (see the sketch below)
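
A minimal merge sketch, using PEFT's standard merge_and_unload call (the output directory name is illustrative):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    torch_dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(base_model, "YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS")

# Fold the LoRA weights into the base model so it can be served without the peft dependency
merged = model.merge_and_unload()
merged.save_pretrained("qwen2.5-coder-7b-mumps-merged")

tokenizer = AutoTokenizer.from_pretrained("YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS")
tokenizer.save_pretrained("qwen2.5-coder-7b-mumps-merged")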

Training Infrastructure

  • Platform: Hugging Face Jobs
  • GPU: a10g-large (24GB VRAM)
  • Framework: TRL (Transformer Reinforcement Learning)
  • Monitoring: Trackio for real-time metrics

Ethical Considerations

⚠️ Medical Disclaimer: This model is for educational and development purposes only. It should NOT be used to generate production medical software without proper testing, validation, and regulatory compliance.

⚠️ Code Review Required: All generated MUMPS code should be reviewed by experienced developers before deployment, especially in healthcare settings.

⚠️ Privacy: Do not input real patient data or PHI when using this model.

Citation

If you use this model in your research or project, please cite:

@misc{qwen25-coder-mumps-2024,
  title={Qwen2.5-Coder-7B-Instruct-MUMPS: A Fine-tuned Model for MUMPS Code Generation},
  author={YanivWeiss123},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/YanivWeiss123/Qwen2.5-Coder-7B-Instruct-MUMPS}}
}

License

MIT License - Free to use for commercial and research purposes.

Inherits license compatibility from the base model, Qwen/Qwen2.5-Coder-7B-Instruct (Apache 2.0).


Acknowledgments

  • Qwen Team for the excellent Qwen2.5-Coder base model
  • Hugging Face for the training infrastructure (Jobs, TRL, PEFT)
  • MUMPS Community for documentation and resources

Model Type: LoRA Adapter
Created: December 2024
Last Updated: December 10, 2024
Status: Experimental - Use with caution in production