A wild idea / suggestion...
Hello,
this probably won't be very popular, but I had a wild idea/suggestion. There are new custom Qwen 3.5-based finetunes in 13B and 21B sizes, which are dense models that roughly fill the gap between the small 9B and the bigger 27B official models. They were made by DavidAU, and you can find them in his collection here: https://huggingface.co/collections/DavidAU/qwen-35-08-2-4-9-27-35b-regular-uncensored
OmniCoder 9B turned out to be a very good model for its size, but I feel like if it had a little bit more room to breathe (bigger base model), it could be even smarter thanks to your dataset, because whatever you did to this model is working for it really well.
I know there was already a request for the 35B MoE, so I was thinking maybe you could take a look at those unofficial models too and see if you could use them as a base for a mid-size variant of your OmniCoder model? I think an OmniCoder 13B and/or OmniCoder 21B would really shine here.
The right answer is likely to target the 27B dense model with a similar finetune. If I were Tesslate I would not waste time fiddling with those shrunken models using opaque methods.
The biggest challenge with the 27B is that Qwen has not released the 27B base weights. They released base models for all but the 27B, 122B, and 397B (so far). This doesn't prevent the possibility of a 27B fine tune, but it would not be from a similar clean slate. Would be nice if at least the 27B base would be released.
I'm not sure that's the biggest challenge with 27B models. For someone with very powerful hardware who has no trouble running even the biggest and strongest variants, it may be hard to understand why someone would want a middle-tier model of around 14B, i.e. smaller than 27B, simply because of their hardware limitations.
Exactly that size is missing from the current Qwen generation lineup, despite it being very popular in previous generations. My point was therefore to explore the possibility of middle-size models that would be more advanced than the smaller 9B model, yet still small enough for users with lower-performance hardware to enjoy.
That doesn't mean I'm strictly against the idea of 27B variants; in fact I'd prefer options of all sizes. But at the same time I realize we're not in a shopping center, so while we may wish for something, chances are we're still going to leave empty-handed. Still, if I got to choose one, I'd prefer the middle size that is the most optimal for my own hardware.