The MHA2MLA-VLM model, published in the paper "MHA2MLA-VLM: Enabling DeepSeek's Economical Multi-Head Latent Attention across Vision-Language Models"
Xiaoran Fan
cnxup
AI & ML interests
NLP, CV, LLM
Recent Activity
upvoted a collection about 17 hours ago: MHA2MLA-VLM
updated a collection about 17 hours ago: MHA2MLA-VLM
Organizations
None yet