Instructions to use MiniMaxAI/MiniMax-M1-80k with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use MiniMaxAI/MiniMax-M1-80k with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="MiniMaxAI/MiniMax-M1-80k", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("MiniMaxAI/MiniMax-M1-80k", trust_remote_code=True, dtype="auto")
```
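For finer control over generation than the pipeline gives, the standard Transformers chat-template workflow also works. A minimal sketch, assuming a GPU-capable setup; `max_new_tokens=256` is an illustrative value, not a recommendation from the model card:

```python
# Sketch: chat-template generation (illustrative settings, not official recommendations)
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MiniMaxAI/MiniMax-M1-80k"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Who are you?"}]
# Render the chat template and tokenize
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
# Generate a reply; max_new_tokens is an illustrative value
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```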
- Inference
- HuggingChat
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use MiniMaxAI/MiniMax-M1-80k with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "MiniMaxAI/MiniMax-M1-80k"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "MiniMaxAI/MiniMax-M1-80k",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker
docker model run hf.co/MiniMaxAI/MiniMax-M1-80k
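Because the vLLM server speaks the OpenAI-compatible API, it can also be queried from Python. A minimal sketch using the `openai` client; the base URL assumes the default local server above, and the API key is a placeholder since a default local vLLM server does not check it:

```python
# Sketch: query the local vLLM server via its OpenAI-compatible API.
# Assumes the server started above is listening on localhost:8000;
# "EMPTY" is a placeholder key for a local server without auth.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="MiniMaxAI/MiniMax-M1-80k",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```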
- SGLang
How to use MiniMaxAI/MiniMax-M1-80k with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "MiniMaxAI/MiniMax-M1-80k" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "MiniMaxAI/MiniMax-M1-80k",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "MiniMaxAI/MiniMax-M1-80k" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "MiniMaxAI/MiniMax-M1-80k",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
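The SGLang endpoint is likewise OpenAI-compatible, so any HTTP client works. A minimal sketch with `requests`, assuming the server above is reachable on localhost:30000:

```python
# Sketch: call the local SGLang server's OpenAI-compatible endpoint.
# Assumes the server launched above is listening on localhost:30000.
import requests

resp = requests.post(
    "http://localhost:30000/v1/chat/completions",
    json={
        "model": "MiniMaxAI/MiniMax-M1-80k",
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```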
- Docker Model Runner
How to use MiniMaxAI/MiniMax-M1-80k with Docker Model Runner:
docker model run hf.co/MiniMaxAI/MiniMax-M1-80k
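Once the model is running, Docker Model Runner also exposes an OpenAI-compatible HTTP API. A hedged sketch: port 12434 and the `/engines/v1` path assume Docker Model Runner's default host-side TCP setting, which may differ or be disabled in your setup:

```python
# Hedged sketch: call Docker Model Runner's OpenAI-compatible API from Python.
# Port 12434 and the /engines/v1 path are assumptions based on the default
# host-side TCP setting; adjust them to match your configuration.
import requests

resp = requests.post(
    "http://localhost:12434/engines/v1/chat/completions",
    json={
        "model": "hf.co/MiniMaxAI/MiniMax-M1-80k",
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```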
Community discussions:
- #21: Update README.md (opened 4 months ago by cherry0328)
- #19: Errors reported when deploying the model with Transformers (1 reply; opened 10 months ago by chuyuelin1)
- #18: Minimum requirements (3 replies; opened 10 months ago by Julen10)
- #16: update functioncall (opened 11 months ago by kamuy-shennai)
- #15: HF Compatible Weights (1 reply; opened 11 months ago by geetu040)
- #14: How to get structured output? (opened 11 months ago by Sheffchenko)
- #13: Deployed with vLLM and compared the performance of DeepSeek-R1 and MiniMax-M1-80k; the gap is large. What is the reason? (1 reply; opened 11 months ago by dingyuansheng)
- #12: Quantization Support (opened 11 months ago by Wongibaek)
- #10: Was the 7.5T Token Continual Pre-Training Performed on the Instruction-Tuned Model or the Base PLM? (3 replies; opened 11 months ago by Jinhwan)
- #9: I hope you guys can provide a 32B dense model (👍 2; opened 11 months ago by zletpm)
- #8: MLX Convert Error (4 replies; opened 11 months ago by baggaindia)
- #7: main (opened 11 months ago by zwb19820615)
- #5: Where's the knowledge? (🧠❤️ 8; 6 replies; opened 11 months ago by phil111)
- #3: Can we expect a 20b~32b parameter minimax model to fit into a single 4090? (🚀🔥 35; opened 11 months ago by win10)
- #2: WHAT a benchmarks graph (👍 13; 1 reply; opened 11 months ago by CyborgPaloma)
- #1: gguf weights for llama.cpp? (👍🧠 24; 1 reply; opened 11 months ago by segmond)