---
tags:
- mlx
- apple-silicon
- text-generation
- gemma3
library_name: mlx-lm
pipeline_tag: text-generation
base_model: Daizee/Gemma3-Callous-Calla-4B
---

# Gemma3-Callous-Calla-4B — **MLX** builds (Apple Silicon)

This repo hosts **MLX-converted** variants of **Daizee/Gemma3-Callous-Calla-4B** for fast, local inference on Apple Silicon (M-series).
Tokenizer/config are included at the repo root. MLX weight folders live under `mlx/`.
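
If you only want one variant locally, a sketch along these lines pulls a single `mlx/` subfolder plus the root tokenizer/config with `huggingface_hub`. The repo id is taken from the quickstart below; the patterns and `local_dir` are illustrative, not prescribed by this repo:

```python
# Illustrative download sketch: fetch one MLX variant folder plus the root
# tokenizer/config files. Adjust repo_id, patterns, and local_dir as needed.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Daizee/Gemma3-Callous-Calla-4B-mlx",
    allow_patterns=["mlx/g64/*", "*.json", "tokenizer*"],
    local_dir="Gemma3-Callous-Calla-4B-mlx",
)
```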
> **Note on vocab padding:** For MLX compatibility, the tokenizer/embeddings were padded to the next multiple of 64 tokens.
> In this build: **262,208 tokens** (added 64 placeholder tokens named `<pad_ex_*>`).
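
The arithmetic behind that note, as a quick illustrative sketch (this is not the conversion script; the pre-padding count is inferred from the figures above):

```python
# Illustrative only: pad a vocabulary size up to a multiple of 64 and name the
# filler tokens in the <pad_ex_*> style used by this build.
PAD_MULTIPLE = 64
base_vocab = 262_208 - 64                             # pre-padding size implied by the note

n_extra = PAD_MULTIPLE - (base_vocab % PAD_MULTIPLE)  # pads a full block when already aligned
placeholders = [f"<pad_ex_{i}>" for i in range(n_extra)]

padded_vocab = base_vocab + n_extra
assert padded_vocab % PAD_MULTIPLE == 0
print(padded_vocab, len(placeholders))                # 262208 64
```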
## Variants

| Path        | Bits | Group Size | Notes                            |
|-------------|------|------------|----------------------------------|
| `mlx/g128/` | int4 | 128        | Smallest & fastest               |
| `mlx/g64/`  | int4 | 64         | Slightly larger, often steadier  |
| `mlx/int8/` | int8 | —          | Closest to fp16 quality (slower) |
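
The exact conversion settings for these folders aren't restated here, but variants like these are typically produced with `mlx_lm`'s convert API. A rough sketch, with values mirroring the `mlx/g64` row (paths and parameters are illustrative):

```python
# Illustrative sketch of producing an int4 / group-size-64 variant with
# mlx-lm's convert API; not necessarily the exact settings used for this repo.
from mlx_lm import convert

convert(
    "Daizee/Gemma3-Callous-Calla-4B",  # base model from the header
    mlx_path="mlx/g64",                # output folder named like the table row above
    quantize=True,
    q_bits=4,
    q_group_size=64,
)
```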
## Quickstart (MLX-LM)

### Run from Hugging Face (no cloning needed)
```bash
python -m mlx_lm.generate \
  --model hf://Daizee/Gemma3-Callous-Calla-4B-mlx/mlx/g64 \
  --prompt "Summarize the Bill of Rights for 7th graders in 4 bullet points." \
  --max-tokens 180 --temp 0.3 --top-p 0.92
```
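
The same generation from Python, as a minimal sketch using the `mlx_lm` API. The local path is a placeholder for wherever you put a variant folder together with the tokenizer/config:

```python
# Minimal mlx-lm Python API sketch; "path/to/g64" is a placeholder for a local
# copy of one variant folder plus the root tokenizer/config files.
from mlx_lm import load, generate

model, tokenizer = load("path/to/g64")

prompt = "Summarize the Bill of Rights for 7th graders in 4 bullet points."
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

text = generate(model, tokenizer, prompt=prompt, max_tokens=180, verbose=True)
```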