MemReader-4B-f32-GGUF
MemReader-4B is a 4-billion-parameter causal language model fine-tuned from Qwen3-4B for memory operations in the MemOS system. It extracts high-quality memories from conversations and documents in both English and Chinese, with a 32,768-token context window. Compared with larger models such as Qwen3-14B, it delivers faster, more accurate memory extraction while reducing resource consumption by over 70%, and it outperforms GPT-4o-mini on memory-extraction tasks. MemReader-4B supports local-only deployment for restricted environments and is well suited to summarizing and integrating memory information from chats and documents. It can be used through MemOS configurations or directly via Hugging Face.
Model Files
| File name | Size | Quant type |
|---|---|---|
| MemReader-4B.F32.gguf | 16.1 GB | F32 |
| MemReader-4B.BF16.gguf | 8.05 GB | BF16 |
| MemReader-4B.F16.gguf | 8.05 GB | F16 |
| MemReader-4B.Q8_0.gguf | 4.28 GB | Q8_0 |
| MemReader-4B.Q6_K.gguf | 3.31 GB | Q6_K |
| MemReader-4B.Q5_K_M.gguf | 2.89 GB | Q5_K_M |
| MemReader-4B.Q5_K_S.gguf | 2.82 GB | Q5_K_S |
| MemReader-4B.Q4_K_M.gguf | 2.5 GB | Q4_K_M |
| MemReader-4B.Q4_K_S.gguf | 2.38 GB | Q4_K_S |
| MemReader-4B.Q3_K_L.gguf | 2.24 GB | Q3_K_L |
| MemReader-4B.Q3_K_M.gguf | 2.08 GB | Q3_K_M |
| MemReader-4B.Q3_K_S.gguf | 1.89 GB | Q3_K_S |
| MemReader-4B.Q2_K.gguf | 1.67 GB | Q2_K |
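A quick way to compare the quant files above is by average bits per weight. The sketch below back-computes the parameter count from the 16.1 GB F32 file (4 bytes per weight, sizes read as decimal gigabytes) and estimates the storage density of a few quants; the exact figures are approximations, not official numbers.

```python
# Rough bits-per-weight estimate from the GGUF file sizes listed above.
# Assumption: parameter count inferred from the F32 file (4 bytes/weight),
# and "GB" in the table means decimal gigabytes (1e9 bytes).

PARAMS = 16.1e9 / 4  # ~4.02e9 weights, back-computed from the F32 file


def bits_per_weight(size_gb: float) -> float:
    """Average bits stored per model weight for a quant file of size_gb GB."""
    return size_gb * 1e9 * 8 / PARAMS


for name, gb in [("F32", 16.1), ("Q8_0", 4.28), ("Q4_K_M", 2.5), ("Q2_K", 1.67)]:
    print(f"{name}: {bits_per_weight(gb):.1f} bits/weight")
```

By this estimate Q4_K_M lands near 5 bits per weight; note that actual runtime memory also needs room for the KV cache on top of the file size.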
Quants Usage
(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)
Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):
