MemReader-4B-f32-GGUF
MemReader-4B is a 4-billion-parameter causal language model fine-tuned from Qwen3-4B for memory operations in the MemOS system. It extracts high-quality memories from conversations and documents in both English and Chinese, with a 32,768-token context window. Compared with larger models such as Qwen3-14B, it delivers faster, more accurate memory extraction while reducing resource consumption by over 70%, and it outperforms GPT-4o-mini on memory-extraction tasks. MemReader-4B supports local-only deployment for restricted environments and is well suited to summarizing and integrating memory information from chats and documents. It can be used through MemOS configurations or directly via Hugging Face.
Model Files
| File name | Size | Quant type |
|---|---|---|
| MemReader-4B.F32.gguf | 16.1 GB | F32 |
| MemReader-4B.BF16.gguf | 8.05 GB | BF16 |
| MemReader-4B.F16.gguf | 8.05 GB | F16 |
| MemReader-4B.Q8_0.gguf | 4.28 GB | Q8_0 |
| MemReader-4B.Q6_K.gguf | 3.31 GB | Q6_K |
| MemReader-4B.Q5_K_M.gguf | 2.89 GB | Q5_K_M |
| MemReader-4B.Q5_K_S.gguf | 2.82 GB | Q5_K_S |
| MemReader-4B.Q4_K_M.gguf | 2.5 GB | Q4_K_M |
| MemReader-4B.Q4_K_S.gguf | 2.38 GB | Q4_K_S |
| MemReader-4B.Q3_K_L.gguf | 2.24 GB | Q3_K_L |
| MemReader-4B.Q3_K_M.gguf | 2.08 GB | Q3_K_M |
| MemReader-4B.Q3_K_S.gguf | 1.89 GB | Q3_K_S |
| MemReader-4B.Q2_K.gguf | 1.67 GB | Q2_K |
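A quick way to compare the quant files above is by average bits per weight. The sketch below back-computes the parameter count from the 16.1 GB F32 file (4 bytes per weight, sizes read as decimal gigabytes) and estimates the storage density of a few quants; the exact figures are approximations, not official numbers.

```python
# Rough bits-per-weight estimate from the GGUF file sizes listed above.
# Assumption: parameter count inferred from the F32 file (4 bytes/weight),
# and "GB" in the table means decimal gigabytes (1e9 bytes).

PARAMS = 16.1e9 / 4  # ~4.02e9 weights, back-computed from the F32 file


def bits_per_weight(size_gb: float) -> float:
    """Average bits stored per model weight for a quant file of size_gb GB."""
    return size_gb * 1e9 * 8 / PARAMS


for name, gb in [("F32", 16.1), ("Q8_0", 4.28), ("Q4_K_M", 2.5), ("Q2_K", 1.67)]:
    print(f"{name}: {bits_per_weight(gb):.1f} bits/weight")
```

By this estimate Q4_K_M lands near 5 bits per weight; note that actual runtime memory also needs room for the KV cache on top of the file size.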
Quants Usage
(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)
Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):
