Mistral-Nemo-12B-Text-to-SQL
Model Overview
This is the full-precision (BF16), merged version of a Mistral-Nemo-12B model fine-tuned with a parameter-efficient method (QLoRA) for Text-to-SQL generation. The LoRA adapters, trained with a two-phase curriculum learning strategy, have been merged back into the base weights.
It is intended as the reference checkpoint ("Source of Truth") for further optimizations such as AWQ or GGUF conversion, and it reflects the accuracy of the training pipeline before any quantization-related drift.
- Base Model: mistralai/Mistral-Nemo-Base-2407
- Primary Task: Natural Language to SQL generation with DDL context.
- Output Format: Standalone SQL queries compatible with standard SQL engines.
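Merging LoRA adapters into the base weights amounts to adding the scaled low-rank update to each adapted matrix: W' = W + (alpha / r) * B A. The following is a minimal sketch of that arithmetic on toy nested-list matrices. The function name `merge_lora` and the toy values are illustrative assumptions, not part of the actual pipeline. With real checkpoints this step is commonly done with PEFT's `merge_and_unload()`, although this card does not state which tool was used.

```python
def merge_lora(w, a, b, alpha, r):
    """Return W + (alpha / r) * (B @ A) for small nested-list matrices.

    w: base weight, shape (out, in)
    b: LoRA "up" matrix, shape (out, rank)
    a: LoRA "down" matrix, shape (rank, in)
    """
    scale = alpha / r
    rank = len(a)
    return [
        [
            w[i][j] + scale * sum(b[i][k] * a[k][j] for k in range(rank))
            for j in range(len(w[0]))
        ]
        for i in range(len(w))
    ]


# Toy rank-1 example (hypothetical values); alpha / r = 2,
# the same ratio as Alpha 32 / Rank 16.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.25]]
merged = merge_lora(W, A, B, alpha=2, r=1)
print(merged)  # [[2.0, 0.5], [2.0, 2.0]]
```

After merging, the adapter matrices are no longer needed at inference time, which is why the merged checkpoint loads like an ordinary causal language model.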
Training Methodology
The model was trained through an MLOps pipeline running on two NVIDIA T4 GPUs on Kaggle.
1. Curriculum Learning Strategy
The model underwent a two-stage training process:
- Phase 1 (Syntactic Alignment): Focused on SQL syntax, basic keywords, and simple schema mapping.
- Phase 2 (Logical Alignment): Introduced complex reasoning tasks including multiple JOIN operations, nested subqueries, and set operations (UNION, INTERSECT).
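One way to split a dataset into the two phases is to bucket each training example by the SQL constructs in its target query. The sketch below is a hypothetical heuristic, not the author's actual data split: the function name `curriculum_phase` and the thresholds are assumptions chosen to mirror the Phase 2 description (multiple JOINs, nested subqueries, set operations).

```python
import re

JOIN = re.compile(r"\bJOIN\b", re.IGNORECASE)
SELECT = re.compile(r"\bSELECT\b", re.IGNORECASE)
SET_OP = re.compile(r"\b(UNION|INTERSECT|EXCEPT)\b", re.IGNORECASE)


def curriculum_phase(sql: str) -> int:
    """Return 2 if the query uses a Phase 2 construct, otherwise 1."""
    multi_join = len(JOIN.findall(sql)) >= 2
    # A second SELECT means a nested subquery or a set-operation operand.
    multi_select = len(SELECT.findall(sql)) >= 2
    set_op = bool(SET_OP.search(sql))
    return 2 if (multi_join or multi_select or set_op) else 1


examples = [
    "SELECT name FROM singer WHERE age > 30",
    "SELECT s.name FROM singer s JOIN concert c ON s.id = c.singer_id",
    "SELECT name FROM singer WHERE age > (SELECT AVG(age) FROM singer)",
    "SELECT name FROM a UNION SELECT name FROM b",
]
phases = [curriculum_phase(q) for q in examples]
print(phases)  # [1, 1, 2, 2]
```

A keyword count like this is crude (it would misread keywords inside string literals), so a real pipeline would more likely rely on a SQL parser or on difficulty labels shipped with the dataset.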
2. Fine-Tuning Details
- Technique: QLoRA (Rank 16, Alpha 32)
- Quantization (during training): 4-bit NF4
- Optimizer: Paged AdamW 8-bit
- Hardware: 2x NVIDIA T4 (Kaggle).
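The settings above map fairly directly onto the Hugging Face `transformers` and `peft` configuration objects. The following sketch shows one plausible mapping. The target modules, dropout value, and FP16 compute dtype are assumptions (the card does not list them; FP16 is chosen here because T4 GPUs lack native BF16 support), and the helper name `build_configs` is hypothetical.

```python
# Values taken from the list above.
QLORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 32,
    "quant_type": "nf4",
    "optim": "paged_adamw_8bit",  # would be passed as TrainingArguments(optim=...)
}


def build_configs():
    """Build quantization and adapter configs (needs torch, transformers, peft)."""
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type=QLORA_SETTINGS["quant_type"],
        # Assumption: FP16 compute, since T4 GPUs have no native BF16.
        bnb_4bit_compute_dtype=torch.float16,
    )
    lora_config = LoraConfig(
        r=QLORA_SETTINGS["r"],
        lora_alpha=QLORA_SETTINGS["lora_alpha"],
        # Assumption: target modules and dropout are not stated in this card.
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    return bnb_config, lora_config


# Scaling factor applied to the low-rank update: alpha / r.
lora_scale = QLORA_SETTINGS["lora_alpha"] / QLORA_SETTINGS["r"]
print(lora_scale)  # 2.0
```

The heavy imports live inside `build_configs()` so the settings can be inspected on a machine without a GPU stack installed.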
Evaluation Results
Evaluated on the Spider validation set:
- Execution Accuracy (EX): 69.5%
- Exact Match (EM): 61.2%
- Max Context Length: 2048 tokens
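The gap between the two metrics is expected: a query can return the right rows while being written differently from the reference. The sketch below illustrates the distinction with the standard-library `sqlite3` module. It is a simplified approximation, not the official Spider evaluator (which compares parsed SQL components rather than raw strings); the function names and the toy table are assumptions.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """True if the predicted query runs and returns the same rows as the gold query."""
    try:
        pred_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Compare as order-insensitive multisets of rows.
    return sorted(pred_rows, key=repr) == sorted(gold_rows, key=repr)


def exact_match(predicted_sql, gold_sql):
    """Simplified string match after case and whitespace normalisation."""
    def normalise(sql):
        return " ".join(sql.strip().rstrip(";").lower().split())
    return normalise(predicted_sql) == normalise(gold_sql)


# Hypothetical toy database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE singer (name TEXT, age INTEGER);"
    "INSERT INTO singer VALUES ('Ana', 41), ('Bo', 25);"
)
gold = "SELECT name FROM singer WHERE age > 30"
pred = "SELECT name FROM singer WHERE NOT age <= 30"

ex = execution_match(conn, pred, gold)
em = exact_match(pred, gold)
print(ex, em)  # True False
```

Here the prediction counts toward execution accuracy but not toward exact match, which is the kind of case that makes EX higher than EM.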
Architecture Specs
The merged weights utilize the standard Mistral-Nemo 12B architecture:
- Parameters: 12.2B
- Layers: 40
- Attention: Grouped Query Attention (GQA) with 8 KV heads.
- Vocabulary Size: 128k (Tekken Tokenizer)
- VRAM Requirements: ~24GB for inference in BF16/FP16.
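The VRAM figure follows from the parameter count: at 2 bytes per parameter, the weights alone take about 24.4 GB. The sketch below works through that arithmetic and adds a rough KV-cache estimate for the 2048-token context. The head dimension of 128 is an assumption not stated in this card, and activation memory and framework overhead are ignored.

```python
PARAMS = 12.2e9        # parameter count from the list above
BYTES_PER_VALUE = 2    # BF16 / FP16
LAYERS = 40
KV_HEADS = 8
HEAD_DIM = 128         # assumption: not stated in this card
CONTEXT = 2048         # max context length from the evaluation section

# Weights: one 2-byte value per parameter.
weights_gb = PARAMS * BYTES_PER_VALUE / 1e9

# KV cache: keys and values (x2) for every layer and KV head, per token.
kv_bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
kv_cache_mib = kv_bytes_per_token * CONTEXT / 2**20

print(f"weights: {weights_gb:.1f} GB")                    # weights: 24.4 GB
print(f"KV cache at full context: {kv_cache_mib:.0f} MiB")  # KV cache at full context: 320 MiB
```

Under these assumptions the KV cache is small next to the weights, largely because GQA keeps only 8 KV heads per layer, so the weights dominate the memory budget for single-sequence inference.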
Template used during training
prompt = """Context: {DDL}
Question: {NL_QUERY}
Answer:"""
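Because the model was trained on this template, inference prompts should presumably follow the same layout rather than a chat format. The sketch below fills the template and runs greedy generation with Transformers. The helper names `build_prompt` and `generate_sql`, the single-newline separators, and the generation settings are assumptions; `device_map="auto"` additionally requires the `accelerate` package.

```python
PROMPT_TEMPLATE = "Context: {DDL}\nQuestion: {NL_QUERY}\nAnswer:"


def build_prompt(ddl: str, question: str) -> str:
    """Fill the training template with a schema (DDL) and a question."""
    return PROMPT_TEMPLATE.format(DDL=ddl.strip(), NL_QUERY=question.strip())


def generate_sql(ddl, question,
                 model_id="NBAmine/mistral-nemo-text-to-sql",
                 max_new_tokens=128):
    """Generate SQL for a question (needs torch, transformers, and a large GPU)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(build_prompt(ddl, question), return_tensors="pt").to(model.device)
    # Greedy decoding; the settings here are illustrative, not from the card.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    completion = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(completion, skip_special_tokens=True).strip()


example_prompt = build_prompt(
    "CREATE TABLE singer (name TEXT, age INTEGER);",
    "How many singers are older than 30?",
)
print(example_prompt)
```

Calling `generate_sql(...)` with the same two arguments would download the checkpoint and return the text generated after `Answer:`.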