We are thrilled to announce the launch of SKT-OMNI-CORPUS-146T-V1, a massive-scale, high-quality dataset designed to power from-scratch training of the next generation of foundation models (LLMs). Developed at SKT AI LABS, this corpus is not just a collection of data; it's a mission to decentralize high-grade AI training for regional languages and global knowledge.
💎 Key Highlights:
• Massive Scale: A multi-terabyte corpus targeting the 146-trillion-token (146T) mark.
• Pure Quality: Curated from 500+ elite sources.
• Structured for MoE: Sharded into standardized 3.5 GB units (the SKT-𝕻 series) for seamless distributed training.
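As a sketch of what that sharding enables: each data-parallel rank can stream a disjoint slice of the corpus without staging the full dataset locally. The repo id and the `text` column below are illustrative assumptions, not the published schema.

```python
# Minimal sketch, assuming the shards are published as a streamable
# Hugging Face dataset. Repo id and column name are placeholders.
import os

from datasets import load_dataset
from datasets.distributed import split_dataset_by_node

rank = int(os.environ.get("RANK", "0"))
world_size = int(os.environ.get("WORLD_SIZE", "1"))

# Hypothetical repo id; substitute the actual path once published.
stream = load_dataset(
    "SKT-AI-LABS/SKT-OMNI-CORPUS-146T-V1", split="train", streaming=True
)

# Give each data-parallel rank a disjoint subset of the 3.5 GB shards.
stream = split_dataset_by_node(stream, rank=rank, world_size=world_size)

for example in stream:
    text = example["text"]  # assumed column name
    # ... tokenize and feed the MoE trainer here ...
    break  # remove this in a real training loop
```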
🤝 Open for Collaboration!
We are looking for AI researchers, CUDA engineers, and data scientists to join us in this journey of building Project Surya and the ST-X Series models. Whether it's optimization, custom tokenization, or architecture design—let’s build the future together.
Following the strong response to the Google Code Archive nyuuzyou/google-code-archive (thanks!), this release preserves another major historical repository: the Microsoft CodePlex Archive.
CodePlex served as Microsoft's primary open-source hosting platform from 2006 to 2017. This dataset captures the distinct .NET- and Windows-centric development ecosystem that flourished before the industry standardized on GitHub.
Key Stats:
- 5,043,730 files from 38,087 repositories
- 3.6 GB compressed Parquet
- 91 programming languages (heavily featuring C#, ASP.NET, and C++)
- Cleaned of binaries, build artifacts, and vendor directories (node_modules, packages)
- Includes platform-specific license metadata (Ms-PL, Ms-RL)
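For anyone who wants to poke at the dump, here is a minimal exploration sketch. The repo id (guessed from the google-code-archive naming) and the `language`/`license` column names are assumptions, not a confirmed schema:

```python
# Minimal exploration sketch; repo id and column names are assumptions.
from datasets import load_dataset

ds = load_dataset("nyuuzyou/codeplex-archive", split="train", streaming=True)

# Count Ms-PL-licensed C# files in a small sample of the archive.
csharp_mspl = 0
for i, row in enumerate(ds):
    if i >= 10_000:  # sample only; the full dump holds 5,043,730 files
        break
    if row.get("language") == "C#" and row.get("license") == "Ms-PL":
        csharp_mspl += 1

print(f"Ms-PL-licensed C# files in the sample: {csharp_mspl}")
```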
Do you know what I was planning to do this time last week?
I was preparing to write a report declaring that Jan-nano was a failed project because the benchmark results didn't meet expectations.
But I thought: it can't be. When I loaded the model into the app, the performance clearly felt better. So why were the benchmark results worse?
That's when I reviewed the entire benchmark codebase and realized something fundamental: agentic versus workflow-based evaluation setups introduce a huge gap and a lot of variance in benchmark results. Jan-nano was trained in an agentic setup, so it simply can't be benchmarked fairly with a rigid workflow-based method.
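To make the mismatch concrete, here is an illustrative sketch, not Jan-nano's actual harness; `chat`, `run_tool`, and `extract_answer` stand in for a hypothetical model client, tool executor, and answer parser:

```python
# Illustrative contrast between the two evaluation styles.

def workflow_benchmark(question, chat, extract_answer):
    """Rigid workflow: one fixed prompt, one completion, score it.
    This penalizes a model trained to gather evidence with tools first."""
    reply = chat([{"role": "user", "content": question}])
    return extract_answer(reply)

def agentic_benchmark(question, chat, run_tool, extract_answer, max_turns=8):
    """Agentic loop: the model may call tools (e.g. search) over several
    turns before committing to an answer, matching how it was trained."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_turns):
        reply = chat(messages)
        if reply.get("tool_call"):  # the model asked to use a tool
            messages.append(reply)
            messages.append(run_tool(reply["tool_call"]))
            continue
        return extract_answer(reply)  # the model gave a final answer
    return ""  # give up after max_turns
```

Scoring only the first completion of a tool-trained model, as the workflow harness does, is exactly the kind of rigid method that underreported Jan-nano's performance.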
I made the necessary changes, and the model ended up performing even better than it had before the issue appeared. It turns out the previous benchmarking method conflicted with the way the model was trained.
What if I had given up? That would’ve meant 1.5 months of training and a huge amount of company resources wasted.
But now this is officially the biggest and most successful release for the whole team, all thanks to Jan-nano.