Instructions for using maldv/winter-garden-7b-delta with libraries and local apps.

Libraries

How to use maldv/winter-garden-7b-delta with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="maldv/winter-garden-7b-delta")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the generated text is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("maldv/winter-garden-7b-delta")
model = AutoModelForCausalLM.from_pretrained("maldv/winter-garden-7b-delta")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use maldv/winter-garden-7b-delta with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "maldv/winter-garden-7b-delta"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "maldv/winter-garden-7b-delta",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
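
The same request can be sent from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server started above is listening on localhost:8000 and returns OpenAI-style responses. The helper names `build_chat_request` and `ask` are hypothetical, chosen for illustration.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000",
                       model="maldv/winter-garden-7b-delta"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: the reply follows the OpenAI chat completion layout,
    # with the text under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("What is the capital of France?"))` should print the model's reply. For the SGLang server shown further down, passing `base_url="http://localhost:30000"` should work the same way.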

Use Docker

docker model run hf.co/maldv/winter-garden-7b-delta

SGLang

How to use maldv/winter-garden-7b-delta with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "maldv/winter-garden-7b-delta" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "maldv/winter-garden-7b-delta",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "maldv/winter-garden-7b-delta" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use maldv/winter-garden-7b-delta with Docker Model Runner:
```
docker model run hf.co/maldv/winter-garden-7b-delta
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Winter Garden 7B - δ - "Charming"

It was mentioned that we are in the open AI dark winter, so I thought I would make myself a nice winter garden.

An experiment

I performed the same type of merge as in the previous model, but with a different set of models. I started with

Mistral-7B-v0.1

and merged in

KuNoichi-DPO-v2-7B
Datura_7B
AlphaMonarch-7B
LemonadeRP-4.5.3
Prima-LelantaclesV6-7b
FuseChat-7B-VaRM
Capricorn-7B-DPO
eros-7b-test
NeuralMarcoro14-7B
StrangeMerges_6-7B-dare_ties
Multi-Verse-RP-7B
WestLake-7B-v2-laser-truthy-dpo
Noromaid-7B-0.4-DPO
Thespis-Balanced-7b-v1
InfinityRP-v1-7B
winter-garden-7b-gamma

in an iterative DARE-TIES tree merge, ordering the merges by tensor-relative cosine similarity until the merge branches resolve to a single value.
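
The card does not include the merge script, so the following is only a rough sketch of what a similarity-ordered tree merge could look like for one tensor. Everything in it is an assumption: the rule of merging the most similar pair first, the flat-list tensor representation, and the names `cosine`, `merge_pair`, and `tree_merge`. In particular, `merge_pair` is a plain average standing in for the actual DARE-TIES step, which is not implemented here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two flat tensors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def merge_pair(a, b):
    """Placeholder pairwise merge: an element-wise average, NOT DARE-TIES."""
    return [(x + y) / 2 for x, y in zip(a, b)]


def tree_merge(tensors):
    """Repeatedly merge the most similar pair until one tensor remains."""
    pool = [list(t) for t in tensors]
    while len(pool) > 1:
        # Assumed ordering rule: pick the pair with the highest cosine similarity.
        i, j = max(
            ((i, j) for i in range(len(pool)) for j in range(i + 1, len(pool))),
            key=lambda pair: cosine(pool[pair[0]], pool[pair[1]]),
        )
        merged = merge_pair(pool[i], pool[j])
        # Replace the two merged branches with their result.
        pool = [t for k, t in enumerate(pool) if k not in (i, j)] + [merged]
    return pool[0]
```

Each pass removes two branches and adds one, so a list of n tensors resolves to a single tensor after n - 1 merges. The function expects at least one tensor and would be applied once per named tensor across the candidate models.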

Chat Template

These models were selected because they follow my chat template, which uses '</s>'-ended turns. Many models follow this template by default because they were trained with end padding, so it is a natural choice for chat and should be highly compatible with ST.

Tom: Hello, how are you?</s>
Jane: I am fine, thank you.</s>
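
A prompt in this format can be assembled with a few lines of Python. This is a minimal sketch: the helper `format_turns` is hypothetical, and two details are assumptions rather than something the card states, namely that turns are separated by a newline and that generation is cued by ending the prompt with the next speaker's name and a colon.

```python
EOS = "</s>"  # end-of-turn marker shown in the example above


def format_turns(turns, next_speaker=None):
    """Render (name, text) pairs as 'Name: text</s>' lines.

    If next_speaker is given, append an open 'Name:' line so the model
    continues as that speaker (an assumed convention, not from the card).
    """
    lines = [f"{name}: {text}{EOS}" for name, text in turns]
    if next_speaker is not None:
        lines.append(f"{next_speaker}:")
    return "\n".join(lines)


prompt = format_turns(
    [("Tom", "Hello, how are you?"), ("Jane", "I am fine, thank you.")],
    next_speaker="Tom",
)
print(prompt)
```

When generating from such a prompt, stopping on the '</s>' token keeps the model from writing the other speaker's turn as well.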

Why?

The purpose of all of these models is to act as a base for me to train on. This one so far has the best multi-turn conversational ability, and should get really good at following long-form conversations after a bit of tweaking.

Scores

| Metric | Score |
|---|---|
| Average | 64.93 |
| ARC | 64.16 |
| HellaSwag | 84.37 |
| MMLU | 60.38 |
| TruthfulQA | 67.95 |
| Winogrande | 76.72 |
| GSM8K | 36.01 |
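
The Average row appears to be the unweighted mean of the six benchmark scores. A quick check, assuming that is how it was computed:

```python
# Benchmark scores copied from the table above.
scores = {
    "ARC": 64.16,
    "HellaSwag": 84.37,
    "MMLU": 60.38,
    "TruthfulQA": 67.95,
    "Winogrande": 76.72,
    "GSM8K": 36.01,
}

# Unweighted mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 64.93
```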

Details

Downloads last month: 70
Format: Safetensors
Model size: 7B params
Tensor type: BF16
Quantizations: 1 model