Update README.md
README.md CHANGED

base_model:
  - Qwen/Qwen3-Coder-480B-A35B-Instruct
base_model_relation: quantized
---

# Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix

Base model [Qwen/Qwen3-Coder-480B-A35B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct)

### 【vLLM Launch Command for 8 GPUs (Single Node)】

<i>Note: When launching with 8 GPUs, `--enable-expert-parallel` must be specified; otherwise, the expert tensors cannot be evenly split across tensor parallel ranks. This option is not required for 4-GPU setups.</i>

```
CONTEXT_LENGTH=32768 # up to 262144 is supported

vllm serve \
    QuantTrio/Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix \
    --served-model-name Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix \
    --enable-expert-parallel \
    --swap-space 16 \
    ... \
    --port 8000
```
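
Once the server above is running, it exposes an OpenAI-compatible API on the chosen port. The snippet below is a minimal sketch (not part of the original card) that queries it with the `openai` Python client; the base URL, placeholder API key, and prompt are assumptions, while the model name matches `--served-model-name` above.

```python
# Minimal sketch: query the vLLM OpenAI-compatible endpoint started above.
# Assumes the server runs locally on port 8000; the API key and prompt are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix",  # value of --served-model-name
    messages=[{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```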

### 【Dependencies】

```
vllm>=0.9.2
```
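
As a quick sanity check (an addition, not part of the original card), you can confirm the installed vLLM meets this requirement before launching the server:

```python
# Verify the installed vLLM satisfies vllm>=0.9.2 before serving.
import vllm

print(vllm.__version__)  # should print 0.9.2 or newer
```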

### 【Model Update History】

```
2025-07-24
1. fast commit
```

### 【Model Files】

| File Size | Last Updated |
|-----------|--------------|
| `261GB`   | `2025-07-24` |

### 【Model Download】

```python
from huggingface_hub import snapshot_download
snapshot_download('QuantTrio/Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix', cache_dir="your_local_path")
```
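
A small follow-up sketch (assumed usage, not from the original card): `snapshot_download` returns the local directory it populated, and that path can be passed to `vllm serve` in place of the repo ID, which avoids re-resolving the files at launch time.

```python
# snapshot_download returns the directory containing the downloaded files;
# "your_local_path" is the same placeholder cache location used above.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    "QuantTrio/Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix",
    cache_dir="your_local_path",
)
print(local_dir)  # pass this path to `vllm serve` instead of the repo ID
```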

### 【Description】

# Qwen3-Coder-480B-A35B-Instruct
<a href="https://chat.qwen.ai/" target="_blank" style="margin: 2px;">