Rayen (Lissanro)
AI & ML interests: None yet
Organizations: None yet
#8 · 128K context does not work (possibly because YaRN meta information is missing?) · opened 2 months ago by Lissanro · 2 replies
#1 · MXFP4_MOE · opened 5 months ago by marcelone · 11 replies
#8 · Incorrect Model Uploaded · opened 4 months ago by noteventhrice · 5 replies
#17 · Context length: is it 128K (as mentioned in the model card) or 160K (as specified in config.json)? · opened 4 months ago by Lissanro · 1 reply
#1 · Please consider creating ik_llama.cpp compatible quants (without llama.cpp-specific MLA tensors) · opened 8 months ago by Lissanro · 1 reply
#1 · chat_template.json is missing · opened 10 months ago by Lissanro · 2 replies
#5 · Tell me how you feel about this model without telling me how you feel about this model · opened 10 months ago by MrDevolver · 4 replies
#28 · Is this model native 128K context length, or YaRN extended? · opened 10 months ago by danielhanchen · 7 replies
#25 · Doesn't Generate `<think>` tags · opened 10 months ago by bingw5 · 3 replies
#3 · Works great on oobabooga, but always ends with assistant · opened over 1 year ago by Noodlz · 1 reply
#5 · This could likely be dewokefied and possibly even improved using mergekit's new 'Model Stock' method! · opened over 1 year ago by jukofyork · 31 replies