Run Claude Code, OpenCode & Frontier Coding Models on Your Own AI Infrastructure with DEH

Community Article Published June 6, 2026

A practical guide to running Anthropic's Claude Code and the open-source OpenCode terminal agents with frontier open-weight models served on-premises from Dell Enterprise Hub — with steps presented side by side so you can pick one or run both.

TL;DR

Build AI on premises with Dell Enterprise Hub (DEH) from Dell Technologies + Hugging Face.
Claude Code (Anthropic) and OpenCode (open source) both point straight at that DEH endpoint — just set a base URL and a model name. No translation layer or proxy is needed.
The payoff: a data-sovereign coding agent that can run fully air-gapped on a frontier open-weight model, hosted on your own Dell PowerEdge platforms, so no tokens leave your data center.

How to use this guide: Every command lives in its own code block so you can copy it straight to your terminal. Replace INFERENCE_HOST with the hostname or IP address of the server where your model is deployed.

Why

Pairing a best-in-class agent CLI with open-weight frontier models on your own Dell hardware gives you:

Data sovereignty — inference happens on local GPUs, behind your firewall.
Model control — pin a version, fine-tune it, audit it.
Capacity-based cost — pay for GPUs, not per token.
Predictable latency — no third-party rate limits.

DEH serves models with vLLM / SGLang, exposing an OpenAI-compatible REST API (/v1/chat/completions, /v1/models). It also publishes "Goodput" scenarios with per-GPU SLOs so you can size a deployment:

Scenario	Optimizes for	Good when…
Balanced	Context vs. concurrency	General day-to-day agentic coding
High concurrency	Many parallel requests	A shared team endpoint
Long context	Very large windows	Whole-repo / large-file reasoning

Agentic coding pushes large contexts, so Long context or Balanced on H200/B300 is a strong default for both agents.

Frontier open-source coding models on Dell Enterprise Hub

Here are the frontier open models best suited to coding / agentic software engineering.

Purpose-built / agentic coding models

Model	Params (total / active)	License	Why it's relevant
Qwen3-Coder-Next	80B (sparse)	Apache 2.0	Coding-focused; strong agentic reasoning + tool use; long-context; built for IDE/CLI.
GLM 5.1	754B MoE	MIT / NVIDIA	Z.ai flagship MoE for agentic engineering, long-horizon coding, repository generation, terminal tasks.
Kimi K2.6	1T MoE / 32B active	Modified MIT	Native-multimodal MoE for long-horizon coding, agentic workflows, autonomous orchestration.
MiniMax M2.7	~229–230B MoE	Other / NVIDIA	For complex software engineering, agentic tool use, long-context reasoning.
DeepSeek V4 Pro	1.6T MoE / 49B active, 1M ctx	MIT	Frontier-scale long-context agentic reasoning.
DeepSeek V4 Flash	284B MoE / 13B active, 1M ctx	MIT	Efficient long-context variant for agentic tasks.

Strong general models with excellent coding ability

Model	Params	License	Notes
GPT-OSS-120B	117B	Apache 2.0	OpenAI open model: production reasoning + agentic tasks; broad platform support.
GPT-OSS-20B	21B	Apache 2.0	Low-latency/local variant; runs on Dell Pro Max GB10.
Trinity Large Thinking	398B MoE / ~13B active	Apache 2.0	Reasoning-optimized, native long CoT, strong tool-calling.
Mistral Large 3	675B MoE / 41B active	Apache 2.0	SOTA general-purpose multimodal MoE.
Qwen3.5 family	9B / 27B / 397B-A17B	Apache 2.0	Strong reasoning + multilingual; 27B fits GB10.
NVIDIA Nemotron 3	30B–550B (Nano 30B / Super 120B / Ultra 550B)	NVIDIA Open Model	Agentic + reasoning; Nano 30B runs on GB10.

Quantization: Models ship in BF16 / FP8 / NVFP4. NVFP4 (4-bit, for Blackwell B300/GB10) shrinks memory and boosts throughput at minimal quality loss — ideal for fitting big coding MoEs on fewer GPUs.
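
As a rough sizing aid, weight memory scales with parameter count times bits per weight. The sketch below uses a hypothetical helper, weight_gb, and counts weights only: it ignores KV cache, activations, and the per-block scale factors that NVFP4 stores, so treat the results as lower bounds and not as DEH sizing guidance.

```shell
# Rough weight-only memory estimate in GB (assumption: 1 GB = 10^9 bytes).
# Usage: weight_gb <params_in_billions> <bits_per_weight>
weight_gb() {
  # params (billions) * bits / 8 bits-per-byte = gigabytes of weights
  echo $(( $1 * $2 / 8 ))
}

# Qwen3-Coder-Next (80B total parameters) at each precision:
echo "BF16:  $(weight_gb 80 16) GB"
echo "FP8:   $(weight_gb 80 8) GB"
echo "NVFP4: $(weight_gb 80 4) GB"
```

For an 80B model this works out to about 160 GB in BF16, 80 GB in FP8, and 40 GB in NVFP4, before any memory for context.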

How to pick

Single workstation (Dell Pro Max GB10): GPT-OSS-20B, Qwen3.5-27B, or quantized Qwen3-Coder-Next.
Team server (XE9680 / H100 / H200): Qwen3-Coder-Next (80B) — the coding-tuned, Apache-2.0 sweet spot.
Max capability (XE9780 / B300): GLM 5.1, Kimi K2.6, or DeepSeek V4 Pro for long-horizon, whole-repo work.

Step-by-step

Both agents follow the same six steps: deploy a model → install the CLI → point it at the endpoint → add credentials (OpenCode only) → optionally add project conventions → run. The only differences are the config and credential files each agent uses.

Step 1 — Deploy a model on Dell Enterprise Hub (shared)

Log in to https://dell.hf.co.
In the Model Catalog, filter by your Dell Platform and pick a coding model — e.g. Qwen3-Coder-Next.
Click Deploy, choose the platform and a Goodput scenario (Balanced or Long context).
Run the generated container command on your Dell PowerEdge server. It serves an OpenAI-compatible API on port 8000.

Check that the endpoint is live:

curl http://INFERENCE_HOST:8000/v1/models
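
To see just the model names the endpoint serves, you can pull the id fields out of that response. This is a minimal sketch using only grep and sed. model_ids is a hypothetical helper, and the modelperm- filter assumes a vLLM-style response that nests permission ids next to model ids. If jq is installed, jq -r '.data[].id' is the more robust choice.

```shell
# Print one model id per line from an OpenAI-style /v1/models JSON body on stdin.
model_ids() {
  grep -o '"id": *"[^"]*"' |        # keep every "id": "..." pair
    sed 's/.*"\([^"]*\)"$/\1/' |    # strip each pair down to its value
    grep -v '^modelperm-'           # drop nested permission ids (assumption: vLLM-style)
}

# Against a live endpoint (replace INFERENCE_HOST):
#   curl -fsS http://INFERENCE_HOST:8000/v1/models | model_ids
```

The names this prints are the ones to use in the agent configs later in the guide.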

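Both Claude Code and OpenCode will talk to this same endpoint, with no proxy, gateway, or translation layer in between. OpenCode uses the OpenAI-compatible routes under /v1. Claude Code sends Anthropic Messages API requests to /v1/messages, so the serving engine must expose that route as well; recent vLLM releases include it, but confirm this for the version your container runs.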

Step 2 — Install the agent

Claude Code (native installer, no Node.js needed):

curl -fsSL https://claude.ai/install.sh | bash
claude --version

OpenCode (install script):

curl -fsSL https://opencode.ai/install | bash
opencode --version

Step 3 — Point the agent at your DEH model

Pick the block for your agent and paste it. Each one writes a single config file. Replace INFERENCE_HOST and the model name.

Claude Code — ~/.claude/settings.json:

mkdir -p ~/.claude && cat > ~/.claude/settings.json <<'EOF'
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://INFERENCE_HOST:8000",
    "ANTHROPIC_API_KEY": "dummy",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "NVIDIA-Nemotron-3-Super-120B-A12B-BF16",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "qwen3-coder-next"
  },
  "theme": "dark"
}
EOF

ANTHROPIC_BASE_URL → your DEH endpoint instead of api.anthropic.com.
ANTHROPIC_API_KEY → dummy (a local endpoint ignores it; use a real key for a secured endpoint).
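ANTHROPIC_DEFAULT_*_MODEL → maps Claude Code's three tiers (Opus, Sonnet, Haiku) onto models served by your endpoint. Each value must match an id that /v1/models reports. The example assumes all three models are reachable through the same base URL, so if your endpoint serves only one model, set all three variables to that name. Use different DEH models if you want a big model for Opus and a fast one for Haiku.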
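
If you would rather not touch settings.json, Claude Code also reads these variables from the shell environment, which is handy for a one-off session. The sketch below assumes a single-model endpoint and maps all three tiers to the same name. deh_claude_env is a hypothetical helper, not part of Claude Code.

```shell
# Print export lines that point Claude Code at a DEH endpoint for this shell only.
# Usage: deh_claude_env <host> <model-name>
deh_claude_env() {
  host=$1
  model=$2
  printf 'export ANTHROPIC_BASE_URL="http://%s:8000"\n' "$host"
  printf 'export ANTHROPIC_API_KEY="dummy"\n'
  # Map every Claude tier onto the one model this endpoint serves (assumption).
  for tier in OPUS SONNET HAIKU; do
    printf 'export ANTHROPIC_DEFAULT_%s_MODEL="%s"\n' "$tier" "$model"
  done
}

# Apply to the current shell, then launch:
#   eval "$(deh_claude_env INFERENCE_HOST qwen3-coder-next)"
#   claude
```

Values already present in settings.json may take precedence over the shell, so it is safest to use one mechanism or the other.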

OpenCode — ~/.config/opencode/opencode.json:

mkdir -p ~/.config/opencode && cat > ~/.config/opencode/opencode.json <<'EOF'
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "vllm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Dell Enterprise Hub (local)",
      "options": { "baseURL": "http://INFERENCE_HOST:8000/v1" },
      "models": { "qwen3-coder-next": { "name": "Qwen3 Coder Next" } }
    }
  },
  "model": "vllm/qwen3-coder-next"
}
EOF

options.baseURL → the DEH endpoint, including /v1.
model → the active model as provider/model.
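
When you manage several hosts or models, it can help to generate this file instead of editing it by hand. A sketch under that assumption: write_opencode_config is a hypothetical helper that fills the same structure from three arguments and reuses the model id as its display name.

```shell
# Write an OpenCode config for one OpenAI-compatible endpoint.
# Usage: write_opencode_config <host> <model-id> <output-file>
write_opencode_config() {
  host=$1
  model=$2
  out=$3
  mkdir -p "$(dirname "$out")"
  # Unquoted heredoc so $host and $model expand; \$schema stays literal.
  cat > "$out" <<EOF
{
  "\$schema": "https://opencode.ai/config.json",
  "provider": {
    "vllm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Dell Enterprise Hub (local)",
      "options": { "baseURL": "http://$host:8000/v1" },
      "models": { "$model": { "name": "$model" } }
    }
  },
  "model": "vllm/$model"
}
EOF
}

# Example:
#   write_opencode_config INFERENCE_HOST qwen3-coder-next ~/.config/opencode/opencode.json
```

The helper overwrites the target file, so back up any existing config first.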

Step 4 — Add credentials (OpenCode only)

Claude Code already has its key in settings.json. OpenCode stores credentials separately:

mkdir -p ~/.local/share/opencode && cat > ~/.local/share/opencode/auth.json <<'EOF'
{ "vllm": { "type": "api", "key": "dummy" } }
EOF

Use a real key for a secured endpoint, and never commit auth.json.
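
Because auth.json holds the key in plain text, it is worth restricting the file to your user once the key is real. A minimal sketch, assuming a POSIX filesystem. lock_down is a hypothetical helper name.

```shell
# Restrict a credentials file (and its directory) to the owning user.
# Usage: lock_down <file>
lock_down() {
  chmod 600 "$1" && chmod 700 "$(dirname "$1")"
}

# Example:
#   lock_down ~/.local/share/opencode/auth.json
```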

Step 5 — (Optional) Add project conventions

Each agent reads a guidance file from your project root each session — CLAUDE.md for Claude Code, AGENTS.md for OpenCode. Create both at once:

cat > CLAUDE.md <<'EOF'
# Project conventions

## Python environment
This project uses **uv**. Do NOT use `pip` directly.
- Run a script: `uv run python <script>`
- Run tests:    `uv run pytest`
- Add a dep:    `uv add <package>`
- Sync env:     `uv sync`
EOF
cp CLAUDE.md AGENTS.md

Tip: in OpenCode you can run /init to auto-generate AGENTS.md from your repo.
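
Copying the file means the two copies can drift apart as you edit one of them. One alternative, sketched here on the assumption that your filesystem supports symlinks and that both agents follow them, is to make AGENTS.md a link to CLAUDE.md. link_guidance is a hypothetical helper.

```shell
# Make AGENTS.md a symlink to CLAUDE.md inside a project directory,
# so both agents read the same guidance file.
# Usage: link_guidance <project-dir>
link_guidance() {
  # Subshell keeps the caller's working directory unchanged.
  ( cd "$1" && [ -f CLAUDE.md ] && ln -sf CLAUDE.md AGENTS.md )
}

# Example:
#   link_guidance ~/my-project
```

Note that ln -sf replaces an existing AGENTS.md, so merge any agent-specific notes into CLAUDE.md first.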

Step 6 — Run it

cd ~/my-project
claude        # Claude Code
# or
opencode      # OpenCode

Every request now flows straight to your DEH model on your own hardware.

Config & file-location cheat sheet

Concern	Claude Code	OpenCode
User config	`~/.claude/settings.json`	`~/.config/opencode/opencode.json`
Project (shared) config	`<project>/.claude/settings.json`	`<project>/opencode.json`
Credentials	env in settings (`ANTHROPIC_API_KEY`)	`~/.local/share/opencode/auth.json` (❌ never commit)
Project guidance	`CLAUDE.md`	`AGENTS.md`

Validation & troubleshooting

Quick endpoint sanity check (bypasses both agents):

curl http://INFERENCE_HOST:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3-coder-next","messages":[{"role":"user","content":"Reverse a linked list in Rust."}]}'
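
The check above exercises the OpenAI-style route that OpenCode uses. Claude Code instead posts Anthropic Messages API requests to /v1/messages, so a second probe can confirm that path. This is a sketch that assumes your serving engine exposes /v1/messages. messages_payload is a hypothetical helper, and it does no JSON escaping, so keep the prompt free of quotes and backslashes.

```shell
# Build a minimal Anthropic Messages API request body.
# Usage: messages_payload <model-id> <prompt-without-quotes>
messages_payload() {
  printf '{"model":"%s","max_tokens":256,"messages":[{"role":"user","content":"%s"}]}\n' "$1" "$2"
}

# Send it to the endpoint (replace INFERENCE_HOST):
#   curl -fsS http://INFERENCE_HOST:8000/v1/messages \
#     -H "Content-Type: application/json" \
#     -H "x-api-key: dummy" \
#     -H "anthropic-version: 2023-06-01" \
#     -d "$(messages_payload qwen3-coder-next 'Reverse a linked list in Rust.')"
```

A 404 from this request alongside a working /v1/chat/completions suggests the server only speaks the OpenAI-style API.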

Symptom	Agent	Likely cause	Fix
Connection errors / `claude doctor` fails	Claude Code	Wrong base URL or host unreachable	Confirm `ANTHROPIC_BASE_URL` and `curl` the endpoint.
`command not found: opencode`	OpenCode	Install dir not on `PATH`	Add `$HOME/.opencode/bin` to `PATH`.
`ProviderInitError`	OpenCode	Invalid config	Re-check `opencode.json`; re-auth via `/connect`.
Copy/paste broken in TUI	OpenCode	No clipboard utility	Install `xclip` / `xsel` / `wl-clipboard`.
Model "not found"	both	Name mismatch	Model name must match what `/v1/models` reports.
Truncated long outputs	both	Context/output SLO too small	Redeploy with the Long context Goodput scenario.
Tool calls failing	both	Weak tool-use model	Use a strong tool-use model (Qwen3-Coder-Next, GLM 5.1, Trinity Large Thinking).
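
A model name mismatch is easy to catch before launching an agent. A sketch with a hypothetical helper, model_is_served, that reads one model id per line on stdin (however you extract them from /v1/models) and checks for an exact, case-sensitive match.

```shell
# Succeed only if <name> exactly matches one of the ids given on stdin.
# Usage: <ids, one per line> | model_is_served <name>
model_is_served() {
  grep -Fxq -- "$1"
}

# Example with jq installed (replace INFERENCE_HOST):
#   curl -fsS http://INFERENCE_HOST:8000/v1/models | jq -r '.data[].id' \
#     | model_is_served qwen3-coder-next && echo "name OK" || echo "name mismatch"
```

Because the match is exact, Qwen3-Coder-Next and qwen3-coder-next count as different names.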

Production hardening checklist (both)

Pin the model version in DEH for reproducibility.
Right-size with Goodput SLOs to match context + concurrency.
Tier models: large for primary (GLM 5.1 / Kimi K2.6), fast for auxiliary (GPT-OSS-20B).
Secure shared endpoints with TLS + real keys; never commit secrets.
Prefer NVFP4/FP8 on Blackwell (B300/GB10) to fit bigger coding MoEs per GPU.
Keep CLAUDE.md / AGENTS.md rich — explicit build/test/lint commands improve open-model reliability.

Which should you choose?

Aspect	Claude Code	OpenCode
License	Proprietary (Anthropic)	Open source
Connects to DEH	Direct (`ANTHROPIC_BASE_URL`)	Direct (`baseURL`)
Config	env vars in `settings.json`	`provider` block in `opencode.json`
Credentials	env (`ANTHROPIC_API_KEY`)	`auth.json`
Project guidance file	`CLAUDE.md`	`AGENTS.md`
Model tiers	Opus/Sonnet/Haiku env mapping	`model` + `small_model`

Want the most polished, batteries-included agent UX? Claude Code.
Want a 100% open stack? OpenCode.
They use entirely separate config directories, so you can run both side by side in the same repo and cross-check each other's work.

Conclusion

Whether you choose Anthropic's Claude Code, the open-source OpenCode, or both, the destination is the same: a frontier open-weight coding model running on your own Dell PowerEdge platforms, with your source code never leaving the data center.

You can start with Qwen3-Coder-Next or gemma-4-31B-it, then scale up to GLM 5.1 / Kimi K2.6 / MiniMax M2.7 / DeepSeek V4 / Nemotron 3 Ultra on H200/B300 for the most demanding work. Either harness, any recommended frontier coding model, all on-premises.

References

Dell Enterprise Hub - https://dell.hf.co (Model Catalog, App Catalog, Docs, Optimized Deployments, Security, Goodput Scenarios)
Anthropic Claude Code - Build, debug, and ship with natural language - https://claude.com/product/claude-code
OpenCode - The open source AI coding agent - https://opencode.ai/
