"Not all quantized models perform well." Serving frameworks: Ollama uses an NVIDIA GPU; llama.cpp uses the CPU with AVX & AMX.
v1k
xbruce22
Recent Activity
liked a model 2 days ago: deepseek-ai/DeepSeek-V3.2
liked a model 10 days ago: deepseek-ai/DeepSeek-V3.2-Speciale
liked a model 22 days ago: deepseek-ai/deepseek-vl2-tiny