Rayen (Lissanro)
AI & ML interests: None yet
Organizations: None yet
#8 · 128K context does not work (possibly because YaRN meta information is missing?) · opened 2 months ago by Lissanro · 2 replies
#1 · MXFP4_MOE · opened 5 months ago by marcelone · 11 replies
#8 · Incorrect Model Uploaded · opened 4 months ago by noteventhrice · 5 replies
#17 · Context length: is it 128K (as mentioned in the model card) or 160K (as specified in config.json)? · opened 4 months ago by Lissanro · 1 reply
#1 · Please consider creating ik_llama.cpp compatible quants (without llama.cpp-specific MLA tensors) · opened 8 months ago by Lissanro · 1 reply
#1 · chat_template.json is missing · opened 10 months ago by Lissanro · 2 replies
#5 · Tell me how you feel about this model without telling me how you feel about this model · opened 10 months ago by MrDevolver · 4 replies
#28 · Is this model native 128K context length, or YaRN extended? · opened 10 months ago by danielhanchen · 7 replies
#25 · Doesn't Generate `<think>` tags · opened 10 months ago by bingw5 · 3 replies
#3 · Works great on oobabooga, but always ends with assistant · opened over 1 year ago by Noodlz · 1 reply
#5 · This could likely be dewokefied and possibly even improved using mergekit's new 'Model Stock' method! · opened over 1 year ago by jukofyork · 31 replies