MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22, 2024 • 134
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published Feb 10, 2025 • 153
KV Caching Explained: Optimizing Transformer Inference Efficiency Article • Published Jan 30, 2025 • 219
jiogenes/Llama-2-7b-hf-finetuned-open-korean-instructions Text Generation • 7B • Updated Jan 16, 2024 • 1
jiogenes/gpt2-medium-finetuned-open-korean-instructions Text Generation • 0.4B • Updated Jan 11, 2024 • 3