Edge Command Model – EVE-OS & Linux Terminal Assistant
A fine-tuned Qwen3-0.6B model trained to act as a lightweight command assistant for EVE-OS edge devices and Linux systems. It accepts natural language requests and responds exclusively with structured JSON tool calls.
Intended Use
This model runs on edge hardware (ARM or x86 CPU, no GPU required) and serves as an on-device command assistant for operators managing EVE-OS edge nodes. It is designed for offline, air-gapped, or bandwidth-constrained environments where cloud-based LLMs are not available.
Example interaction:
User: Show memory usage
Model: {"tool": "terminal", "command": "free -h"}
User: What is zedagent?
Model: {"tool": "explain", "text": "zedagent is the main EVE-OS orchestration agent. It processes configurations from ZedCloud, manages application deployment, handles device attestation, and coordinates all other EVE services."}
Output Format
The model always responds with a single JSON object in one of two formats:
Terminal commands (for actions to execute):
{"tool": "terminal", "command": "<shell command>"}
Explanations (for informational queries):
{"tool": "explain", "text": "<explanation>"}
Training Details
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen3-0.6B |
| Method | QLoRA (4-bit quantization during training) |
| LoRA rank (r) | 32 |
| LoRA alpha | 64 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Learning rate | 1e-4 |
| Scheduler | Cosine |
| Epochs | 15 |
| Max sequence length | 512 |
| Training examples | 1,715 |
| Training hardware | Single consumer GPU |
Performance
Evaluated against 100 prompts randomly sampled from the training set; these figures therefore measure in-distribution performance rather than generalization to unseen phrasings.
| Metric | Value |
|---|---|
| JSON validity rate | 99.3% |
| Tool routing accuracy | 98.6% |
| Exact match accuracy | 20.0% |
| Fuzzy match accuracy | 27.6% |
| Average inference time | 0.692 s per query |
| Peak memory usage | 736.0 MB |
Training Data
The model was trained on 1,715 instruction-output pairs covering:
- ~350 unique commands with 4-5 phrasing variations each
- Linux commands: file operations, text processing, networking, process management, disk/storage, kernel modules, containers (containerd/runc), ZFS, LVM, security, namespaces, cgroups
- EVE-OS commands and concepts: all pillar microservices (zedagent, nim, domainmgr, zedrouter, volumemgr, etc.), device filesystem paths (/persist, /config, /run), ZedCloud connectivity, EdgeView, TPM management, containerd operations
- Explanations: EVE-OS architecture, Linux subsystems, file paths, configuration files
All training data was human-curated and reviewed for accuracy.
Quantization
The model is provided in GGUF format quantized to Q4_K_M for efficient CPU-only inference.
| Format | File Size | RAM Required | Use Case |
|---|---|---|---|
| Q4_K_M (recommended) | ~450 MB | 2-4 GB | Edge deployment, CPU inference |
| Q8_0 | ~700 MB | 4-6 GB | Higher accuracy, more RAM available |
| F16 | ~1.2 GB | 6-8 GB | Maximum accuracy, development/testing |
Hardware Requirements
Minimum:
- CPU: Any modern ARM or x86 processor
- RAM: 2 GB
- Storage: 500 MB
- GPU: Not required
Recommended:
- CPU: ARM Cortex-A72 or better / x86-64
- RAM: 4 GB
- Storage: 1 GB
Tested on Raspberry Pi 4 (4GB) and x86 edge gateways.
How to Use
With llama.cpp
```bash
./llama-cli -m edge-command-model-Q4_K_M.gguf \
  --temp 0.1 \
  --top-p 0.9 \
  -p "<|im_start|>system
You are an edge device command assistant. You respond ONLY with valid JSON tool calls. Never respond with plain text. Available tools: terminal, explain.
<|im_end|>
<|im_start|>user
Show disk space
<|im_end|>
<|im_start|>assistant
"
```
With Ollama
Create a Modelfile:
```
FROM ./edge-command-model-Q4_K_M.gguf
SYSTEM "You are an edge device command assistant. You respond ONLY with valid JSON tool calls. Never respond with plain text. Available tools: terminal, explain."
PARAMETER temperature 0.1
PARAMETER num_ctx 512
```
Then:
```bash
ollama create edge-cmd -f Modelfile
ollama run edge-cmd "Show memory usage"
```
With llama-cpp-python
```python
from llama_cpp import Llama

model = Llama(model_path="edge-command-model-Q4_K_M.gguf", n_ctx=512, n_threads=4)

prompt = """<|im_start|>system
You are an edge device command assistant. You respond ONLY with valid JSON tool calls. Never respond with plain text. Available tools: terminal, explain.
<|im_end|>
<|im_start|>user
Show memory usage
<|im_end|>
<|im_start|>assistant
"""

output = model(prompt, max_tokens=128, temperature=0.1, stop=["<|im_end|>"])
print(output["choices"][0]["text"])
```
Coverage
The model covers commands and concepts across these categories:
Linux: File operations, text processing (grep, sed, awk), networking (ip, ss, tcpdump, iptables), process management, disk/storage (lsblk, fdisk, ZFS, LVM), kernel modules, containers (containerd, runc), security (namespaces, cgroups, capabilities), compression, certificates (openssl), WireGuard
EVE-OS: All pillar microservices (zedagent, nim, domainmgr, zedrouter, volumemgr, baseosmgr, tpmmgr, vaultmgr, loguploader, ledmanager, nodeagent, and more), device filesystem layout (/persist, /config, /run), ZedCloud communication, EdgeView remote diagnostics, containerd operations on EVE, ZFS pool management, device identity and certificates
Limitations
- The model is trained on a fixed set of ~350 commands. It may hallucinate plausible but incorrect commands for requests outside its training distribution.
- Explain responses are generated, not memorized. Factual accuracy of explanations should be verified for critical operations.
- The model does not support multi-turn conversation. Each request is independent.
- Complex compound commands (multi-pipe chains) may be less accurate than single commands.
- The model was trained for EVE-OS specifically and may not generalize well to other edge operating systems.
Safety
This model is intended to be used behind an agent harness that:
- Requires user confirmation (y/n) before executing any terminal command
- Blocks dangerous commands (rm -rf /, mkfs on mounted volumes, fork bombs)
- Enforces timeouts on command execution
- Limits output capture size
Never execute model outputs directly without human review.
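A minimal harness along these lines can be sketched as follows. The blocklist patterns, timeout value, and helper names here are illustrative assumptions; a production harness should use a stricter policy (e.g. an allowlist of known-safe commands):

```python
import re
import subprocess

# Patterns for obviously destructive commands; illustrative, not exhaustive.
BLOCKLIST = [
    r"\brm\s+-[a-z]*r[a-z]*f[a-z]*\s+/\s*$",   # rm -rf /
    r"\bmkfs(\.\w+)?\b",                       # filesystem creation
    r":\(\)\s*{\s*:\|:&\s*};\s*:",             # classic fork bomb
]

def is_blocked(command: str) -> bool:
    """True if the command matches any known-dangerous pattern."""
    return any(re.search(p, command) for p in BLOCKLIST)

def run_terminal_call(command: str, confirm=input, timeout=30, max_output=64_000):
    """Execute a model-proposed command with confirmation, timeout, and output cap."""
    if is_blocked(command):
        return "refused: command matches blocklist"
    if confirm(f"run {command!r}? [y/N] ").strip().lower() != "y":
        return "cancelled by user"
    result = subprocess.run(command, shell=True, capture_output=True,
                            text=True, timeout=timeout)
    return result.stdout[:max_output]  # cap captured output size
```

Passing a different `confirm` callable makes the harness testable and lets a UI substitute its own prompt.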
License
MIT