MHL

Zipper112

·

zipper112

AI & ML interests

None yet

Recent Activity

liked a model about 21 hours ago

tencent/HY-World-2.0

liked a model 1 day ago

z-lab/Qwen3.6-35B-A3B-DFlash

reacted to SeanLee97's post with 🔥 1 day ago

Our lab recently released a paper introducing ShadowPEFT, a new Parameter-Efficient Fine-Tuning (PEFT) paradigm tailored for edge computing scenarios. Traditional approaches such as LoRA and its variants inject trainable parameters directly into the Transformer's weights, which requires tight coupling with the backbone. ShadowPEFT instead enhances the frozen large base model by adding a lightweight, centralized, pretrainable, and detachable Shadow network. This shadow network operates in parallel with the base model, delivering learned corrections to each decoder layer. Because the shadow module is architecturally decoupled from the backbone, it can be independently trained, stored, and deployed, which benefits edge computing scenarios and edge-cloud collaborative computing.
- HF Paper: https://huggingface.co/papers/2604.19254
- GitHub: https://github.com/ShadowLLM/shadow-peft
- HF Collection: https://huggingface.co/collections/shadow-llm/shadow-peft-models
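
The idea described in the post (a frozen backbone plus a single detachable module that adds a correction after each decoder layer) can be illustrated with a toy sketch. This is not the ShadowPEFT implementation: the names `frozen_layer`, `ShadowNetwork`, and `forward` are hypothetical, the "layers" are simple element-wise scalings, and the per-layer correction is a fixed offset chosen only to keep the example small.

```python
from typing import Callable, List, Optional

Vector = List[float]


def frozen_layer(scale: float) -> Callable[[Vector], Vector]:
    """Hypothetical stand-in for one frozen decoder layer (element-wise scaling)."""
    def layer(h: Vector) -> Vector:
        return [scale * x for x in h]
    return layer


class ShadowNetwork:
    """Toy detachable module (an assumption, not the paper's architecture).

    One centralized object holds a correction for every decoder layer,
    so it can be stored and swapped independently of the backbone.
    """

    def __init__(self, num_layers: int, hidden: int) -> None:
        # Zero-initialised: attaching an untrained shadow leaves outputs unchanged.
        self.offsets = [[0.0] * hidden for _ in range(num_layers)]

    def correction(self, layer_idx: int, h: Vector) -> Vector:
        # A real module would compute this from h; here it is a stored offset.
        return self.offsets[layer_idx]


def forward(layers: List[Callable[[Vector], Vector]],
            h: Vector,
            shadow: Optional[ShadowNetwork] = None) -> Vector:
    """Run the frozen layers, adding the shadow's correction after each one."""
    for i, layer in enumerate(layers):
        h = layer(h)  # backbone weights are never modified
        if shadow is not None:
            h = [x + d for x, d in zip(h, shadow.correction(i, h))]
    return h


layers = [frozen_layer(2.0), frozen_layer(3.0)]
shadow = ShadowNetwork(num_layers=2, hidden=2)
shadow.offsets = [[1.0, 0.0], [0.0, 1.0]]  # pretend these were learned

base_out = forward(layers, [1.0, 1.0])             # shadow detached
shadow_out = forward(layers, [1.0, 1.0], shadow)   # shadow attached
```

Passing `shadow=None` recovers the unmodified backbone, which is the sense in which the module is detachable in this sketch.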

Organizations

None yet

upvoted a collection 4 months ago

SWE-bench

SWE-bench is a benchmark for evaluating Language Models and AI Systems on their ability to resolve real-world GitHub Issues. • 4 items • Updated Mar 8, 2025 • 9

upvoted a collection 7 months ago

Qwen3-VL

37 items • Updated Dec 31, 2025 • 701

upvoted a collection 10 months ago

Qwen3

84 items • Updated Dec 31, 2025 • 1.75k

upvoted an article over 1 year ago

Article

Mixture of Experts Explained

+4

Dec 11, 2023 • 1.12k