
mlx-community/dbrx-instruct-4bit

This model was converted to MLX format from databricks/dbrx-instruct using mlx-lm version b80adbc after DBRX support was added by Awni Hannun.

Refer to the original model card for more details on the model.

Conversion

Conversion was done with:

python -m mlx_lm.convert --hf-path databricks/dbrx-instruct -q --upload-repo mlx-community/dbrx-instruct-4bit

Use with mlx

Make sure you first upgrade mlx-lm and mlx to the latest versions.

pip install mlx --upgrade
pip install mlx-lm --upgrade

python -m mlx_lm.generate --model mlx-community/dbrx-instruct-4bit --prompt "Hello" --trust-remote-code --use-default-chat-template --max-tokens 500
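To confirm that the upgrade took effect before running generation, a small check like the following may help. It is only a sketch: `installed_version` is a hypothetical helper name, and it relies solely on the standard-library `importlib.metadata` module rather than on anything from mlx itself.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed.

    Hypothetical helper for illustration; not part of mlx or mlx-lm.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The two distributions this model card asks you to upgrade.
    for name in ("mlx", "mlx-lm"):
        found = installed_version(name)
        print(f"{name}: {found if found is not None else 'not installed'}")
```

If either package is reported as not installed, or the version is older than expected, rerun the pip commands above.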

Remember, this is an Instruct model, so you need to use the instruct prompt template by appending --use-default-chat-template to the command.

Example:

python -m mlx_lm.generate --model dbrx-instruct-4bit --prompt "What's the difference between PCA vs UMAP vs t-SNE?" --trust-remote-code --use-default-chat-template  --max-tokens 1000

Output:

[Image: screenshot of the generated output]

On my MacBook Pro M2 with 96GB of unified memory, DBRX Instruct in 4-bit uses 70.2GB of RAM for the above prompt.
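A back-of-the-envelope calculation suggests why the footprint lands in this range. The sketch below rests on assumptions that are not stated in this card: that DBRX has about 132 billion total parameters, that every weight is quantized to 4 bits, and that each group of 64 weights stores one 16-bit scale and one 16-bit bias. The function name `quantized_size_bytes` is hypothetical.

```python
def quantized_size_bytes(
    n_params: float,
    bits: int = 4,
    group_size: int = 64,
    scale_bits: int = 16,
    bias_bits: int = 16,
) -> float:
    """Estimate the storage for group-wise quantized weights, in bytes.

    Each weight costs `bits`, plus its share of the per-group scale and bias.
    """
    bits_per_weight = bits + (scale_bits + bias_bits) / group_size
    return n_params * bits_per_weight / 8


# Assumption: roughly 132 billion total parameters for DBRX.
DBRX_TOTAL_PARAMS = 132e9

if __name__ == "__main__":
    size = quantized_size_bytes(DBRX_TOTAL_PARAMS)
    print(f"{size / 1e9:.1f} GB ({size / 2**30:.1f} GiB)")
```

Under these assumptions the weights alone come to roughly 74 GB (about 69 GiB), the same ballpark as the figure above. The real number depends on which layers are actually quantized and on the activations and KV cache, so treat this as an order-of-magnitude check only.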

If the mlx-lm package has been updated, it can also be installed from pip:

pip install mlx-lm

To use it from Python you can do the following:

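from mlx_lm import load, generate

# Load the quantized weights and the tokenizer. The tokenizer uses custom code,
# so trust_remote_code must be enabled.
model, tokenizer = load(
    "mlx-community/dbrx-instruct-4bit",
    tokenizer_config={"trust_remote_code": True},
)

chat = [
    {"role": "user", "content": "What's the difference between PCA vs UMAP vs t-SNE?"},
    # We need to add the Assistant role as well, otherwise mlx_lm will error on generation.
    {"role": "assistant", "content": "The "},
]

# Render the chat messages into a single prompt string using the model's chat template.
prompt = tokenizer.apply_chat_template(chat, add_generation_prompt=True, tokenize=False)

# Generate up to 1500 tokens at temperature 0.6; verbose=True prints the output.
response = generate(model, tokenizer, prompt=prompt, verbose=True, temp=0.6, max_tokens=1500)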
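To make the role of the chat template more concrete, here is a minimal pure-Python sketch of what a template does with a list of messages. It assumes a ChatML-style layout with `<|im_start|>` and `<|im_end|>` markers. The real template ships with the tokenizer and may differ, for example by inserting a default system prompt. `render_chatml` is a hypothetical helper, not part of mlx-lm or the tokenizer.

```python
from typing import Dict, List


def render_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render chat messages in an assumed ChatML-style layout.

    Illustrative only: the actual template is defined by the model's tokenizer.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped in start/end markers, with the role on the first line.
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues as the assistant.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    chat = [{"role": "user", "content": "Hello"}]
    print(render_chatml(chat))
```

In practice, prefer `tokenizer.apply_chat_template` as shown above, since it uses the template that the model was actually trained with.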

Converted and uploaded by eek
