The Smol Training Playbook 📚 • The secrets to building world-class LLMs • 3.18k
The Ultra-Scale Playbook 🌌 • The ultimate guide to training LLMs on large GPU clusters • 3.84k
Article: KV Caching Explained: Optimizing Transformer Inference Efficiency • not-lain • Jan 30, 2025 • 329
Paper: Training Dynamics Impact Post-Training Quantization Robustness • arXiv:2510.06213 • Published Oct 7, 2025 • 3
Article: Prefill and Decode for Concurrent Requests – Optimizing LLM Performance • tngtech • Apr 16, 2025 • 78