Update README.md
README.md
CHANGED
@@ -39,7 +39,7 @@ otherwise the expert tensors wouldn’t be evenly sharded across GPU devices.</i
 ```
 CONTEXT_LENGTH=32768
 vllm serve \
-
+QuantTrio/MiniMax-M2-REAP-162B-A10B-AWQ \
 --served-model-name MY_MODEL \
 --enable-auto-tool-choice \
 --tool-call-parser minimax_m2 \
@@ -69,8 +69,8 @@ vllm serve \
 
 ### 【Model Download】
 ```python
-from
-snapshot_download('
+from huggingface_hub import snapshot_download
+snapshot_download('QuantTrio/MiniMax-M2-REAP-162B-A10B-AWQ', cache_dir="your_local_path")
 ```
 
 ### 【Overview】
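
Review note: the first hunk restores the positional model argument that `vllm serve` expects directly after the subcommand. The sketch below shows one way to assemble that command so the model ID cannot be left out. It is illustrative only: the helper name `build_vllm_serve_command` is hypothetical and not part of the README, and it includes only the flags visible in the hunk, since the rest of the command is cut off in the diff.

```python
import shlex


def build_vllm_serve_command(
    model_id,
    served_model_name="MY_MODEL",
    tool_call_parser="minimax_m2",
):
    """Return the argv list for `vllm serve`, with the model ID as the first positional argument.

    Hypothetical helper for illustration; only the flags shown in the diff are included.
    """
    if not model_id:
        # An empty model slot is exactly what the old README line left behind.
        raise ValueError("model_id must be a non-empty repository ID or local path")
    return [
        "vllm",
        "serve",
        model_id,
        "--served-model-name",
        served_model_name,
        "--enable-auto-tool-choice",
        "--tool-call-parser",
        tool_call_parser,
    ]


if __name__ == "__main__":
    cmd = build_vllm_serve_command("QuantTrio/MiniMax-M2-REAP-162B-A10B-AWQ")
    # Print a shell-safe, copy-pasteable form of the command.
    print(shlex.join(cmd))
```

Building the command as a list and failing on an empty model ID would catch this kind of omission before the server is launched.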