Paper: Bitnet.cpp: Efficient Edge Inference for Ternary LLMs (arXiv 2502.11880, published Feb 17, 2025)
Reply: Great news. Serving with llama.cpp using HF-hosted models, including unsloth's, on AMD Strix Halo and OpenCode.
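As a minimal sketch of the workflow mentioned above, llama.cpp's `llama-server` can pull a GGUF model directly from the Hugging Face Hub with the `-hf` flag. The repo name below is an assumption for illustration; substitute any GGUF repository (for example, one of unsloth's quantizations).

```shell
# Hypothetical example repo: replace with the GGUF repo you actually want to serve.
# -hf downloads (and caches) the model from the Hugging Face Hub,
# then llama-server exposes an OpenAI-compatible API on the given port.
llama-server -hf unsloth/Llama-3.2-1B-Instruct-GGUF --port 8080
```

Once running, any OpenAI-compatible client can point at `http://localhost:8080` for local inference.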
Article: GGML and llama.cpp join HF to ensure the long-term progress of Local AI, by ggerganov, ngxson, allozaur, lysandre, victor, julien-c and 4 others (Feb 20)