Instructions for using Open4bits/nexora-vector-v0.1-mlx-4Bit with libraries, inference servers, and local apps.

Libraries

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Open4bits/nexora-vector-v0.1-mlx-4Bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Open4bits/nexora-vector-v0.1-mlx-4Bit")
model = AutoModelForCausalLM.from_pretrained("Open4bits/nexora-vector-v0.1-mlx-4Bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

MLX

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("Open4bits/nexora-vector-v0.1-mlx-4Bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

vLLM

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Open4bits/nexora-vector-v0.1-mlx-4Bit"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Open4bits/nexora-vector-v0.1-mlx-4Bit",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/Open4bits/nexora-vector-v0.1-mlx-4Bit

SGLang

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Open4bits/nexora-vector-v0.1-mlx-4Bit" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Open4bits/nexora-vector-v0.1-mlx-4Bit",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Open4bits/nexora-vector-v0.1-mlx-4Bit" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Open4bits/nexora-vector-v0.1-mlx-4Bit",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Pi

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Open4bits/nexora-vector-v0.1-mlx-4Bit"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Open4bits/nexora-vector-v0.1-mlx-4Bit"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Open4bits/nexora-vector-v0.1-mlx-4Bit"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Open4bits/nexora-vector-v0.1-mlx-4Bit

Run Hermes

hermes

MLX LM

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "Open4bits/nexora-vector-v0.1-mlx-4Bit"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "Open4bits/nexora-vector-v0.1-mlx-4Bit"
# Call the server using curl (mlx_lm.server listens on port 8080 by default)
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "Open4bits/nexora-vector-v0.1-mlx-4Bit",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
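
The same request can be issued from Python. The sketch below uses only the standard library and assumes the server is listening on port 8080, the `mlx_lm.server` default and the port the Pi and Hermes configurations above point at. The helper names `build_chat_request` and `ask` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

MODEL_ID = "Open4bits/nexora-vector-v0.1-mlx-4Bit"
# Assumption: mlx_lm.server was started with its default host and port.
BASE_URL = "http://localhost:8080/v1"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("Draw a red circle as SVG"))` would print the model's reply. Building the request separately from sending it keeps the payload easy to inspect without a live server.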

Docker Model Runner
How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with Docker Model Runner:
```
docker model run hf.co/Open4bits/nexora-vector-v0.1-mlx-4Bit
```

Nexora-Vector

Nexora-Vector-v0.1 · MLX 4-Bit

Nexora-Vector-v0.1 MLX 4-Bit is the official Apple MLX 4-bit quantized release of Nexora-Vector-v0.1, published by Open4bits — an official quantization project under ArkAiLabs. Nexora-Vector is an experimental text-to-vector model that generates structured SVG graphics from natural language prompts. This variant is optimized for efficient local inference on Apple Silicon hardware via the MLX framework.

Overview
Model Details
Capabilities
Limitations
Intended Use
Architecture & Quantization
Usage Recommendations
Original Model
Evaluation
Risks & Considerations
Future Work
Community & Support
License
Acknowledgements

Overview

This release packages Nexora-Vector-v0.1 as 4-bit quantized weights for Apple's MLX framework. The base model is a supervised fine-tuned variant of Qwen3-4B, adapted specifically to generate structured vector graphics in SVG format from natural language instructions.

This release is in beta and is intended for research, experimentation, and early-stage design tooling on Apple Silicon machines. All outputs should be validated before use in any downstream pipeline.

Model Details

Property	Details
Model Type	MLX 4-Bit Quantized
Base Model	Nexora-Vector-v0.1
Original Base	Qwen3-4B
Fine-tuning Method	Supervised Fine-Tuning (SFT)
Quantization	4-Bit (MLX)
Target Hardware	Apple Silicon (M1/M2/M3/M4 series)
Framework	MLX
Output Format	SVG
License	Apache 2.0

Capabilities

Nexora-Vector-v0.1 is designed to translate textual instructions into structured SVG code. This MLX version targets the same tasks as the original model, with the minor 4-bit quality trade-off noted under Limitations, while enabling fast, memory-efficient inference on Apple Silicon. The model is best suited for:

Generating SVG markup for simple vector graphics
Producing geometric shapes and basic illustrations
Creating lightweight icons and minimal design assets
Supporting rapid prototyping in vector-based design workflows on macOS

Tip: The model performs best with concise, clearly scoped prompts focused on simple visual compositions.

Limitations

This is an early-stage beta release. Users should be aware of the following constraints:

High hallucination rate — outputs may be invalid or non-renderable SVG
Limited generalization — the small training dataset (~1,500 samples) affects output consistency
Weak complex scene handling — highly detailed or multi-element prompts may produce poor results
Manual correction required — outputs should be validated and post-processed before use
Not production-ready — not suitable for safety-critical or automated pipelines
4-bit quality trade-off — minor quality degradation is expected compared to the full-precision original model
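
Because outputs may be non-renderable, one way to cope is to regenerate until the result at least parses. The following is a minimal sketch, not part of the model's tooling: `generate_fn` is a hypothetical callable that wraps whatever backend you use (for example a call to `mlx_lm.generate`) and returns the raw SVG string.

```python
import xml.etree.ElementTree as ET


def is_well_formed_svg(text):
    """Return True if text parses as XML and its root element is <svg>."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False
    # The tag is "{namespace}svg" when an xmlns is declared, else plain "svg".
    return root.tag == "svg" or root.tag.endswith("}svg")


def generate_until_valid(generate_fn, prompt, max_attempts=3):
    """Call generate_fn(prompt) until it returns well-formed SVG.

    Returns (svg_text, attempts_used), or (None, max_attempts) on failure.
    """
    for attempt in range(1, max_attempts + 1):
        candidate = generate_fn(prompt)
        if is_well_formed_svg(candidate):
            return candidate, attempt
    return None, max_attempts
```

Retrying only helps if generation is sampled (nonzero temperature) or the prompt is varied between attempts; with deterministic decoding every attempt returns the same text.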

Intended Use

✅ Supported Use Cases

Academic and applied research in text-to-vector generation on Apple Silicon
Experimental AI-assisted design systems running locally on macOS
Educational exploration of structured output generation
Lightweight SVG prototyping and ideation with low memory overhead

❌ Out-of-Scope Use Cases

Production-grade or commercial vector asset pipelines
High-precision design deliverables without human validation
Automated systems where SVG correctness is required without manual review
Non-Apple-Silicon hardware (use the GGUF version instead)

Architecture & Quantization

This model is a 4-bit MLX quantization of the original Nexora-Vector-v0.1 weights, which are themselves a supervised fine-tune of Qwen3-4B.

Quantization Details

Parameter	Details
Quantization Method	MLX 4-Bit
Source Model	ArkAiLab-Adl/nexora-vector-v0.1
Framework	Apple MLX
Memory Reduction	~75% vs. fp16 (16-bit) weights
Target Platform	macOS with Apple Silicon
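
The ~75% figure follows from the bit widths: 4 bits per weight against 16 for fp16. The back-of-the-envelope check below rests on two assumptions that this card does not state: roughly 4 billion weights (taken from the Qwen3-4B name), and a group-wise scheme that stores one fp16 scale and one fp16 bias per group of 64 weights. Treat the numbers as estimates, not measurements of this checkpoint.

```python
def bits_per_weight(bits=4, group_size=64, overhead_bits=32):
    """Effective storage per weight for group-wise quantization.

    overhead_bits is the per-group metadata: here one fp16 scale plus
    one fp16 bias (16 + 16 bits). Both defaults are assumptions.
    """
    return bits + overhead_bits / group_size


def weights_gb(num_params, bits):
    """Approximate weight storage in gigabytes (10**9 bytes)."""
    return num_params * bits / 8 / 1e9


NUM_PARAMS = 4e9  # assumption: roughly 4 billion weights, per the Qwen3-4B name

fp16_gb = weights_gb(NUM_PARAMS, 16)
ideal_gb = weights_gb(NUM_PARAMS, 4)
grouped_gb = weights_gb(NUM_PARAMS, bits_per_weight())

print(f"fp16:            {fp16_gb:.2f} GB")
print(f"4-bit (ideal):   {ideal_gb:.2f} GB, {1 - ideal_gb / fp16_gb:.0%} smaller")
print(f"4-bit (grouped): {grouped_gb:.2f} GB, {1 - grouped_gb / fp16_gb:.0%} smaller")
```

Under these assumptions the per-group metadata adds half a bit per weight, so the practical saving lands slightly below the ideal 75%. Any tensors left unquantized would reduce it further.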

Original Training Configuration

Parameter	Details
Fine-tuning Method	Supervised Fine-Tuning (SFT)
Dataset Composition	Curated prompt–SVG pairs
Dataset Size	~1,500 samples
Training Objective	Structured output generation for SVG formats

Note: The relatively small dataset size may result in instability and limited generalization across diverse prompts. Improved dataset coverage is planned for future versions.

Usage Recommendations

To get the best results from this model:

Keep prompts simple and specific — avoid multi-scene or highly complex compositions
Validate all SVG outputs before rendering or integrating into any pipeline
Post-process outputs to correct syntax or structural issues
Use iterative prompting — refining prompts across multiple turns often yields better results
Expect imperfections — this is a beta model; treat outputs as drafts, not finals
Run on Apple Silicon — this MLX build is optimized for M1/M2/M3/M4 series chips
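
The validation and post-processing steps above can be sketched as follows. This is an illustrative helper, not tooling shipped with the model: it assumes the raw output may contain extra text around the markup, pulls out the first `<svg>...</svg>` span, and checks that it is well-formed XML. The names `extract_svg` and `parse_svg` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Non-greedy match from the first "<svg" to the first "</svg>" that follows it.
SVG_BLOCK = re.compile(r"<svg\b.*?</svg>", re.DOTALL | re.IGNORECASE)


def extract_svg(raw_output):
    """Return the first <svg>...</svg> span in raw_output, or None."""
    match = SVG_BLOCK.search(raw_output)
    return match.group(0) if match else None


def parse_svg(svg_text):
    """Return the parsed root element, or None if svg_text is not well-formed XML."""
    try:
        return ET.fromstring(svg_text)
    except ET.ParseError:
        return None
```

A typical flow is `svg = extract_svg(text)` followed by `parse_svg(svg)`, discarding or repairing the output when either returns `None`. The non-greedy pattern stops at the first closing tag, so it would truncate nested `<svg>` elements; for the simple graphics this model targets that is usually acceptable.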

Original Model

Version	Link
Original (full precision)	ArkAiLab-Adl/nexora-vector-v0.1
GGUF Quantized	Open4bits/nexora-vector-v0.1-GGUF
MLX 4-Bit (this model)	Open4bits/nexora-vector-v0.1-mlx-4Bit

Evaluation

Nexora-Vector-v0.1 has not yet undergone formal benchmark evaluation. Current assessment is qualitative, based on manual testing of SVG generation tasks.

Planned evaluation metrics for future releases include:

Metric	Description
SVG Validity Rate	Percentage of outputs that are parseable, valid SVG
Structural Correctness	Adherence to SVG schema and element hierarchy
Prompt Adherence	Alignment between user intent and generated output
Visual Consistency	Stability of outputs across similar prompts

Risks & Considerations

Developers integrating this model should account for the following risks:

Generation of malformed or non-functional SVG code
Inconsistent instruction following across prompt variations
Unpredictable outputs due to limited training data coverage
Minor quality reduction inherent to 4-bit quantization

Recommendation: Implement downstream validation layers and SVG syntax checking before any rendering or integration.
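
A downstream validation layer could look like the sketch below, which goes beyond well-formedness by checking the root element, the SVG namespace, and an allowlist of element names. The allowlist is a hypothetical choice for simple graphics and is not derived from the model's training data; adjust it to your pipeline.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Hypothetical allowlist for simple graphics; extend as needed.
ALLOWED_TAGS = {
    "svg", "g", "defs", "title", "desc", "rect", "circle", "ellipse",
    "line", "polyline", "polygon", "path", "text",
    "linearGradient", "radialGradient", "stop",
}


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def validate_svg(svg_text):
    """Return a list of problems found; an empty list means all checks passed."""
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if local_name(root.tag) != "svg":
        problems.append(f"root element is <{local_name(root.tag)}>, expected <svg>")
    if not root.tag.startswith("{" + SVG_NS + "}"):
        problems.append("missing SVG namespace declaration on root")
    for element in root.iter():
        name = local_name(element.tag)
        if name not in ALLOWED_TAGS:
            problems.append(f"unexpected element <{name}>")
    return problems
```

Returning a list of problems rather than a boolean lets a caller log why an output was rejected. The namespace check matters mainly for standalone `.svg` files, which need the `xmlns` declaration to render in browsers.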

Future Work

The following improvements are planned for upcoming versions of the Nexora Vector series:

Expanded and more diverse training dataset
Improved SVG syntax correctness and validity rates
Reduced hallucination rates
Enhanced natural language understanding for complex prompts
Support for richer vector compositions and multi-element scenes
Formal benchmark evaluation suite
Updated MLX quantized releases aligned with future model versions

Community & Support

Join the community for updates and discussion:

💬 Join our Discord Server

License

This model is released under the Apache License 2.0.

You may use, modify, and distribute this model in accordance with the terms of the Apache 2.0 license. See the LICENSE file for full details, or refer to the official Apache 2.0 license text.

Acknowledgements

This is an official ArkAiLabs release, published under the Open4bits project — ArkAiLabs' dedicated initiative for quantized model releases. The MLX 4-bit weights are derived from Nexora-Vector-v0.1, which is itself built upon Qwen3-4B by the Qwen team. We thank the MLX team at Apple and the open-source AI community for their continued contributions that make projects like this possible.

About Nexora & Open4bits

Nexora is an experimental AI initiative under ArkAiLabs, focused on building lightweight, practical, and creative AI systems for real-world applications. The Nexora Vector series represents our exploration into AI-assisted vector graphics generation.

Open4bits is ArkAiLabs' official project for quantized model releases, providing optimized variants of Nexora models for efficient local inference across different hardware platforms.

Safetensors

Model size: 0.6B params
Tensor type: F16, U32

Model tree for Open4bits/nexora-vector-v0.1-mlx-4Bit

Base model: Qwen/Qwen3-4B-Base
Finetuned: Qwen/Qwen3-4B