AlphaSue (AlphaSue)

liked 3 models 11 months ago

liked a Space about 1 year ago

TxT360: Trillion Extracted Text

📖

133

Explore and download the TxT360 LLM pre‑training dataset

liked a model about 1 year ago

jinaai/ReaderLM-v2

Text Generation • 2B • Updated Mar 4, 2025 • 11.3k • 769

liked a Space about 1 year ago

The Ultra-Scale Playbook

🌌

3.74k

The ultimate guide to training LLMs on large GPU clusters

liked a dataset about 1 year ago

microsoft/RedStone

Updated Dec 5, 2024 • 11 • 35

liked a model about 1 year ago

open-web-math/filtering-models

Updated Nov 2, 2023 • 9

liked a dataset about 1 year ago

m-a-p/FineFineWeb

Viewer • Updated Dec 19, 2024 • 4.89B • 1.39M • 107

liked 2 models over 1 year ago

nvidia/quality-classifier-deberta

Updated Sep 22, 2025 • 3.52k • 75

oliverguhr/fullstop-punctuation-multilang-large

Token Classification • Updated Nov 16, 2023 • 393k • 174

liked a dataset over 1 year ago

teknium/OpenHermes-2.5

Viewer • Updated Apr 15, 2024 • 1M • 8.6k • 800

liked a model over 1 year ago

Snowflake/snowflake-arctic-embed-m

liked a Space almost 2 years ago

FineWeb: decanting the web for the finest text data at scale

🍷

1.31k

Generate a curated web‑text dataset for LLM training

liked 4 datasets almost 2 years ago

liwu/MNBVC

Updated Dec 3, 2025 • 147k • 590

togethercomputer/RedPajama-Data-1T

Viewer • Updated Jun 17, 2024 • 1.73M • 2.35k • 1.14k

allenai/dolma

Updated Apr 17, 2024 • 7.21k • 997

HuggingFaceFW/fineweb

Viewer • Updated Jul 11, 2025 • 52.5B • 157k • 2.7k

liked a Space over 2 years ago

ControlNet V1.1

📉

1.18k

Generate images from sketches, edges, or poses

liked a model over 2 years ago

TheBloke/Llama-2-7B-Chat-GGML

Text Generation • Updated Sep 27, 2023 • 537 • 872

AlphaSue

AI & ML interests

Organizations

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

deepseek-ai/DeepSeek-R1

gair-prox/web-chunk-refining-lm
