Run Claude Code, OpenCode & Frontier Coding Models on Your Own AI Infrastructure with DEH

Community Article Published June 6, 2026

A practical guide to running Anthropic's Claude Code and the open-source OpenCode terminal agents with frontier open-weight models served on-premises from Dell Enterprise Hub — with steps presented side by side so you can pick one or run both.

TL;DR

  • Build AI on premises with Dell Enterprise Hub (DEH) from Dell Technologies + Hugging Face.
  • Claude Code (Anthropic) and OpenCode (open source) both point straight at that DEH endpoint — just set a base URL and a model name. No translation layer or proxy is needed.
  • The payoff: a fully air-gapped, data-sovereign coding agent running a frontier open-weight model on your own Dell PowerEdge platforms, with no tokens leaving your data center.

How to use this guide: Every command lives in its own code block so you can copy it straight to your terminal. Replace INFERENCE_HOST with the hostname or IP address of the server where your model is deployed.

Why

Pairing a best-in-class agent CLI with open-weight frontier models on your own Dell hardware gives you:

  • Data sovereignty — inference happens on local GPUs, behind your firewall.
  • Model control — pin a version, fine-tune it, audit it.
  • Capacity-based cost — pay for GPUs, not per token.
  • Predictable latency — no third-party rate limits.

DEH serves models with vLLM / SGLang, exposing the standard OpenAI REST API (/v1/chat/completions, /v1/models). It also publishes "Goodput" scenarios with per-GPU SLOs so you can size a deployment:

| Scenario | Optimizes for | Good when… |
| --- | --- | --- |
| Balanced | Context vs. concurrency | General day-to-day agentic coding |
| High concurrency | Many parallel requests | A shared team endpoint |
| Long context | Very large windows | Whole-repo / large-file reasoning |

Agentic coding pushes large contexts, so Long context or Balanced on H200/B300 is a strong default for both agents.


Frontier open-source coding models on Dell Enterprise Hub

Here are the frontier open models best suited to coding / agentic software engineering.

Purpose-built / agentic coding models

| Model | Params (total / active) | License | Why it's relevant |
| --- | --- | --- | --- |
| Qwen3-Coder-Next | 80B (sparse) | Apache 2.0 | Coding-focused; strong agentic reasoning + tool use; long-context; built for IDE/CLI. |
| GLM 5.1 | 754B MoE | MIT / NVIDIA | Z.ai flagship MoE for agentic engineering, long-horizon coding, repository generation, terminal tasks. |
| Kimi K2.6 | 1T MoE / 32B active | Modified MIT | Native-multimodal MoE for long-horizon coding, agentic workflows, autonomous orchestration. |
| MiniMax M2.7 | ~229–230B MoE | Other / NVIDIA | For complex software engineering, agentic tool use, long-context reasoning. |
| DeepSeek V4 Pro | 1.6T MoE / 49B active, 1M ctx | MIT | Frontier-scale long-context agentic reasoning. |
| DeepSeek V4 Flash | 284B MoE / 13B active, 1M ctx | MIT | Efficient long-context variant for agentic tasks. |

Strong general models with excellent coding ability

| Model | Params | License | Notes |
| --- | --- | --- | --- |
| GPT-OSS-120B | 117B | Apache 2.0 | OpenAI open model: production reasoning + agentic tasks; broad platform support. |
| GPT-OSS-20B | 21B | Apache 2.0 | Low-latency/local variant; runs on Dell Pro Max GB10. |
| Trinity Large Thinking | 398B MoE / ~13B active | Apache 2.0 | Reasoning-optimized, native long CoT, strong tool-calling. |
| Mistral Large 3 | 675B MoE / 41B active | Apache 2.0 | SOTA general-purpose multimodal MoE. |
| Qwen3.5 family | 9B / 27B / 397B-A17B | Apache 2.0 | Strong reasoning + multilingual; 27B fits GB10. |
| NVIDIA Nemotron 3 | 30B–550B (Nano 30B / Super 120B / Ultra 550B) | NVIDIA Open Model | Agentic + reasoning; Nano 30B runs on GB10. |

Quantization: Models ship in BF16 / FP8 / NVFP4. NVFP4 (4-bit, for Blackwell B300/GB10) shrinks memory and boosts throughput with minimal quality loss, which makes it ideal for fitting big coding MoEs on fewer GPUs.
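
As a rough illustration of why precision drives GPU count, weight memory scales with parameter count times bits per parameter. The sketch below uses a hypothetical helper, weight_gb, which is not part of DEH. It counts weights only and ignores KV cache, activations, and the scale factors that quantized formats store alongside the weights, so real footprints will be higher.

```shell
# Hypothetical helper: approximate weight memory in GB (weights only).
# Usage: weight_gb <params_in_billions> <bits_per_param>
# 1 billion params at 8 bits is roughly 1 GB, so GB = params_B * bits / 8.
# Uses integer arithmetic, so fractional results are rounded down.
weight_gb() {
  echo $(( $1 * $2 / 8 ))
}

# Qwen3-Coder-Next (80B total parameters) at each shipped precision:
echo "BF16:  $(weight_gb 80 16) GB"   # prints "BF16:  160 GB"
echo "FP8:   $(weight_gb 80 8) GB"    # prints "FP8:   80 GB"
echo "NVFP4: $(weight_gb 80 4) GB"    # prints "NVFP4: 40 GB"
```

Treat these numbers as a lower bound when sizing; the Goodput scenarios published by DEH remain the better guide for a real deployment.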

How to pick

  • Single workstation (Dell Pro Max GB10): GPT-OSS-20B, Qwen3.5-27B, or quantized Qwen3-Coder-Next.
  • Team server (XE9680 / H100 / H200): Qwen3-Coder-Next (80B) — the coding-tuned, Apache-2.0 sweet spot.
  • Max capability (XE9780 / B300): GLM 5.1, Kimi K2.6, or DeepSeek V4 Pro for long-horizon, whole-repo work.

Step-by-step

Both agents follow the same six steps: deploy a model → install the CLI → point it at the endpoint → add credentials (OpenCode only) → optionally add project conventions → run. The only difference is the config file each agent uses.

Step 1 — Deploy a model on Dell Enterprise Hub (shared)

  1. Log in to https://dell.hf.co.
  2. In the Model Catalog, filter by your Dell Platform and pick a coding model — e.g. Qwen3-Coder-Next.
  3. Click Deploy, choose the platform and a Goodput scenario (Balanced or Long context).
  4. Run the generated container command on your Dell PowerEdge server. It serves an OpenAI-compatible API on port 8000.

Check that the endpoint is live:

curl http://INFERENCE_HOST:8000/v1/models

Both Claude Code and OpenCode will talk to this same endpoint. No proxy, no gateway, no translation layer.
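
To see exactly which model names the endpoint serves (the names your agent config must use), you can pull the IDs out of the /v1/models response. This is a minimal sketch that relies only on grep and sed. The helper name model_ids is hypothetical, and it assumes an OpenAI-style response in which each model carries an "id" field. It also drops IDs that start with modelperm-, on the assumption that your server embeds permission entries with that prefix. If jq is installed, `jq -r '.data[].id'` is the more robust choice.

```shell
# Hypothetical helper: print model IDs from a /v1/models JSON response on stdin.
# Text-based extraction, not a real JSON parser.
model_ids() {
  grep -o '"id" *: *"[^"]*"' \
    | sed 's/.*: *"\([^"]*\)"$/\1/' \
    | grep -v '^modelperm-'
}

# Usage (replace INFERENCE_HOST first):
#   curl -s http://INFERENCE_HOST:8000/v1/models | model_ids
```

Whatever this prints is the string to use as the model name in the agent configuration later in this guide.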

Step 2 — Install the agent

Claude Code (native installer, no Node.js needed):

curl -fsSL https://claude.ai/install.sh | bash
claude --version

OpenCode (install script):

curl -fsSL https://opencode.ai/install | bash
opencode --version
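
If `opencode --version` reports "command not found", the install directory is probably missing from PATH. The sketch below adds a directory to PATH only when it is not already there. It assumes the installer placed the binary in $HOME/.opencode/bin (the location named in the troubleshooting table), and add_to_path is a hypothetical helper name.

```shell
# Hypothetical helper: prepend a directory to PATH unless it is already present.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                          # already on PATH, nothing to do
    *) PATH="$1:$PATH"; export PATH ;;
  esac
}

# Assumed OpenCode install location; adjust if your installer reported another path.
add_to_path "$HOME/.opencode/bin"
```

To make the change permanent, place these lines in your shell startup file (for example ~/.bashrc or ~/.zshrc).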

Step 3 — Point the agent at your DEH model

Pick the block for your agent and paste it. Each one writes a single config file. Replace INFERENCE_HOST and the model name.

Claude Code (~/.claude/settings.json):

mkdir -p ~/.claude && cat > ~/.claude/settings.json <<'EOF'
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://INFERENCE_HOST:8000",
    "ANTHROPIC_API_KEY": "dummy",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "NVIDIA-Nemotron-3-Super-120B-A12B-BF16",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "qwen3-coder-next"
  },
  "theme": "dark"
}
EOF

  • ANTHROPIC_BASE_URL → your DEH endpoint instead of api.anthropic.com.
  • ANTHROPIC_API_KEY → dummy (a local endpoint ignores it; use a real key for a secured endpoint).
  • ANTHROPIC_DEFAULT_*_MODEL → maps Claude Code's three tiers (Opus, Sonnet, Haiku) onto your DEH models. The example points Opus and Sonnet at larger Nemotron 3 models and Haiku at the faster Qwen3-Coder-Next; set all three to the same name if you deploy a single model.
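
For a quick trial that does not overwrite an existing settings.json, the same variables can be exported in the current shell instead. This is a sketch under two assumptions: that Claude Code picks these variables up from the process environment as well as from settings.json, and that settings.json holds no conflicting values. The helper name deh_claude_env is hypothetical, and it maps all three tiers to one model for simplicity.

```shell
# Hypothetical helper: print export lines that point Claude Code at a DEH endpoint.
# Usage: deh_claude_env <host> <model-name> [port]   (port defaults to 8000)
deh_claude_env() {
  host="$1"; model="$2"; port="${3:-8000}"
  printf 'export ANTHROPIC_BASE_URL=%s\n' "http://${host}:${port}"
  printf 'export ANTHROPIC_API_KEY=%s\n' "dummy"
  # Map every Claude Code tier onto the same DEH model.
  for tier in OPUS SONNET HAIKU; do
    printf 'export ANTHROPIC_DEFAULT_%s_MODEL=%s\n' "$tier" "$model"
  done
}

# Usage (applies to the current shell session only):
#   eval "$(deh_claude_env INFERENCE_HOST qwen3-coder-next)"
```

Printing the lines rather than exporting directly lets you inspect them before applying them with eval.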

OpenCode (~/.config/opencode/opencode.json):

mkdir -p ~/.config/opencode && cat > ~/.config/opencode/opencode.json <<'EOF'
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "vllm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Dell Enterprise Hub (local)",
      "options": { "baseURL": "http://INFERENCE_HOST:8000/v1" },
      "models": { "qwen3-coder-next": { "name": "Qwen3 Coder Next" } }
    }
  },
  "model": "vllm/qwen3-coder-next"
}
EOF

  • options.baseURL → the DEH endpoint, including /v1.
  • model → the active model as provider/model.
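
If you switch hosts or models often, the same file can be generated from two arguments instead of edited by hand. A minimal sketch: render_opencode_config is a hypothetical helper that reproduces the config above with the host and model filled in. It assumes port 8000 and reuses the model ID as the display name.

```shell
# Hypothetical helper: print an opencode.json for a given DEH host and model.
# Usage: render_opencode_config <host> <model-name>
render_opencode_config() {
  host="$1"; model="$2"
  # Unquoted heredoc so ${host} and ${model} expand; \$schema stays literal.
  cat <<EOF
{
  "\$schema": "https://opencode.ai/config.json",
  "provider": {
    "vllm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Dell Enterprise Hub (local)",
      "options": { "baseURL": "http://${host}:8000/v1" },
      "models": { "${model}": { "name": "${model}" } }
    }
  },
  "model": "vllm/${model}"
}
EOF
}

# Usage (overwrites the existing user config):
#   render_opencode_config INFERENCE_HOST qwen3-coder-next > ~/.config/opencode/opencode.json
```

The model argument should be the exact name reported by /v1/models, since both the provider entry and the active model are derived from it.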

Step 4 — Add credentials (OpenCode only)

Claude Code already has its key in settings.json. OpenCode stores credentials separately:

mkdir -p ~/.local/share/opencode && cat > ~/.local/share/opencode/auth.json <<'EOF'
{ "vllm": { "type": "api", "key": "dummy" } }
EOF

Use a real key for a secured endpoint, and never commit auth.json.
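
Once auth.json holds a real key, it is worth restricting the file to your own user. The sketch below is one way to do that on Linux or macOS; lock_down is a hypothetical helper name, and the path in the usage comment is the one used in this step.

```shell
# Hypothetical helper: make a credentials file readable and writable by its owner only.
lock_down() {
  f="$1"
  [ -f "$f" ] || { echo "missing: $f" >&2; return 1; }
  chmod 600 "$f"
}

# Usage:
#   lock_down ~/.local/share/opencode/auth.json
```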

Step 5 — (Optional) Add project conventions

Each agent reads a guidance file from your project root each session — CLAUDE.md for Claude Code, AGENTS.md for OpenCode. Create both at once:

cat > CLAUDE.md <<'EOF'
# Project conventions

## Python environment
This project uses **uv**. Do NOT use `pip` directly.
- Run a script: `uv run python <script>`
- Run tests:    `uv run pytest`
- Add a dep:    `uv add <package>`
- Sync env:     `uv sync`
EOF
cp CLAUDE.md AGENTS.md

Tip: in OpenCode you can run /init to auto-generate AGENTS.md from your repo.
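
Copying the file means the two copies can drift apart as conventions change. One alternative is a symlink, so both agents read the same content. This is a sketch that assumes both agents follow symlinks when reading their guidance file and that your platform and version control setup handle symlinks well; link_guidance is a hypothetical helper name.

```shell
# Hypothetical helper: make AGENTS.md a symlink to CLAUDE.md in the current directory.
# -f replaces an existing AGENTS.md, so back it up first if it has unique content.
link_guidance() {
  [ -f CLAUDE.md ] || { echo "CLAUDE.md not found in $(pwd)" >&2; return 1; }
  ln -sf CLAUDE.md AGENTS.md
}

# Usage (run from the project root):
#   link_guidance
```

If the two agents need different instructions, keep separate files instead.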

Step 6 — Run it

cd ~/my-project
claude        # Claude Code
# or
opencode      # OpenCode

Every request now flows straight to your DEH model on your own hardware.


Config & file-location cheat sheet

| Concern | Claude Code | OpenCode |
| --- | --- | --- |
| User config | ~/.claude/settings.json | ~/.config/opencode/opencode.json |
| Project (shared) config | <project>/.claude/settings.json | <project>/opencode.json |
| Credentials | env in settings (ANTHROPIC_API_KEY) | ~/.local/share/opencode/auth.json (❌ never commit) |
| Project guidance | CLAUDE.md | AGENTS.md |

Validation & troubleshooting

Quick endpoint sanity check (bypasses both agents):

curl http://INFERENCE_HOST:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3-coder-next","messages":[{"role":"user","content":"Reverse a linked list in Rust."}]}'

| Symptom | Agent | Likely cause | Fix |
| --- | --- | --- | --- |
| Connection errors / claude doctor fails | Claude Code | Wrong base URL or host unreachable | Confirm ANTHROPIC_BASE_URL and curl the endpoint. |
| command not found: opencode | OpenCode | Install dir not on PATH | Add $HOME/.opencode/bin to PATH. |
| ProviderInitError | OpenCode | Invalid config | Re-check opencode.json; re-auth via /connect. |
| Copy/paste broken in TUI | OpenCode | No clipboard utility | Install xclip / xsel / wl-clipboard. |
| Model "not found" | both | Name mismatch | Model name must match what /v1/models reports. |
| Truncated long outputs | both | Context/output SLO too small | Redeploy with the Long context Goodput scenario. |
| Tool calls failing | both | Weak tool-use model | Use a strong tool-use model (Qwen3-Coder-Next, GLM 5.1, Trinity Large Thinking). |
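
When an agent fails to connect, the HTTP status returned by the endpoint often narrows the cause faster than the agent's own error message. The sketch below maps a status code to a likely cause from the table above. The helper name explain_status is hypothetical, and the mapping is an assumption about typical OpenAI-compatible servers, not documented DEH behavior. curl reports 000 when it cannot connect at all.

```shell
# Hypothetical helper: turn an HTTP status code into a likely cause.
explain_status() {
  case "$1" in
    200)     echo "ok" ;;
    000)     echo "unreachable: check host, port, and firewall" ;;
    401|403) echo "auth rejected: the endpoint expects a real API key" ;;
    404)     echo "not found: check the URL path and that the model name matches /v1/models" ;;
    *)       echo "unexpected HTTP status $1" ;;
  esac
}

# Usage (replace INFERENCE_HOST first):
#   explain_status "$(curl -s -o /dev/null -w '%{http_code}' http://INFERENCE_HOST:8000/v1/models)"
```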

Production hardening checklist (both)

  • Pin the model version in DEH for reproducibility.
  • Right-size with Goodput SLOs to match context + concurrency.
  • Tier models: large for primary (GLM 5.1 / Kimi K2.6), fast for auxiliary (GPT-OSS-20B).
  • Secure shared endpoints with TLS + real keys; never commit secrets.
  • Prefer NVFP4/FP8 on Blackwell (B300/GB10) to fit bigger coding MoEs per GPU.
  • Keep CLAUDE.md / AGENTS.md rich — explicit build/test/lint commands improve open-model reliability.

Which should you choose?

| Aspect | Claude Code | OpenCode |
| --- | --- | --- |
| License | Proprietary (Anthropic) | Open source |
| Connects to DEH | Direct (ANTHROPIC_BASE_URL) | Direct (baseURL) |
| Config | env vars in settings.json | provider block in opencode.json |
| Credentials | env (ANTHROPIC_API_KEY) | auth.json |
| Project guidance file | CLAUDE.md | AGENTS.md |
| Model tiers | Opus/Sonnet/Haiku env mapping | model + small_model |

  • Want the most polished, batteries-included agent UX? Claude Code.
  • Want a 100% open stack? OpenCode.
  • They use entirely separate config directories, so you can run both side by side in the same repo and have each cross-check the other's work.

Conclusion

Whether you choose Anthropic's Claude Code, the open-source OpenCode, or both, the destination is the same: a frontier open-weight coding model running on your own Dell PowerEdge platforms, with your source code never leaving the data center.

You can start with Qwen3-Coder-Next or gemma-4-31B-it, then scale up to GLM 5.1 / Kimi K2.6 / MiniMax M2.7 / DeepSeek V4 / Nemotron 3 Ultra on H200/B300 for the most demanding work. Either harness, any recommended frontier coding model, all on-prem.
