Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2512.19535

Guided Self-Evolving LLMs with Minimal Human Supervision

Paper • 2512.02472 • Published Dec 2, 2025 • 51
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29, 2025 • 141
Video Reasoning without Training

Paper • 2510.17045 • Published Oct 19, 2025 • 7
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 271

CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Paper • 2512.19535 • Published 19 days ago • 11

CASA: Cross-Attention as Self-Attention for Efficient Vision-Language Fusion on long context streaming inputs

Running

1

CASA Gallery

🏠

1

Video Gallery for CASA: Cross-Attention via Self-Attention
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Paper • 2512.19535 • Published 19 days ago • 11
kyutai/CASA-Helium1-VL-2B

Image-Text-to-Text • 3B • Updated 18 days ago • 199 • 5
kyutai/CASA-Qwen2_5-VL-3B

Image-Text-to-Text • 4B • Updated 18 days ago • 246 • 1

Vision Language Action models

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective

Paper • 2507.01925 • Published Jul 2, 2025 • 39
Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning

Paper • 2507.16746 • Published Jul 22, 2025 • 35
MolmoAct: Action Reasoning Models that can Reason in Space

Paper • 2508.07917 • Published Aug 11, 2025 • 44
Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies

Paper • 2508.20072 • Published Aug 27, 2025 • 31

YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection

Paper • 2512.23273 • Published 12 days ago • 13
A 58-Addition, Rank-23 Scheme for General 3x3 Matrix Multiplication

Paper • 2512.21980 • Published 15 days ago • 2
Step-DeepResearch Technical Report

Paper • 2512.20491 • Published 18 days ago • 82
SAM Audio: Segment Anything in Audio

Paper • 2512.18099 • Published 22 days ago • 21

Video generation

SemanticGen: Video Generation in Semantic Space

Paper • 2512.20619 • Published 18 days ago • 90
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Paper • 2512.19535 • Published 19 days ago • 11

Agentic Learner with Grow-and-Refine Multimodal Semantic Memory

Paper • 2511.21678 • Published Nov 26, 2025 • 12
QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation

Paper • 2512.19134 • Published 19 days ago • 31
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

Paper • 2512.16969 • Published 23 days ago • 111
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Paper • 2512.19535 • Published 19 days ago • 11

Guided Self-Evolving LLMs with Minimal Human Supervision

Paper • 2512.02472 • Published Dec 2, 2025 • 51
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29, 2025 • 141
Video Reasoning without Training

Paper • 2510.17045 • Published Oct 19, 2025 • 7
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 271

YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection

Paper • 2512.23273 • Published 12 days ago • 13
A 58-Addition, Rank-23 Scheme for General 3x3 Matrix Multiplication

Paper • 2512.21980 • Published 15 days ago • 2
Step-DeepResearch Technical Report

Paper • 2512.20491 • Published 18 days ago • 82
SAM Audio: Segment Anything in Audio

Paper • 2512.18099 • Published 22 days ago • 21

CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Paper • 2512.19535 • Published 19 days ago • 11

Video generation

SemanticGen: Video Generation in Semantic Space

Paper • 2512.20619 • Published 18 days ago • 90
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Paper • 2512.19535 • Published 19 days ago • 11

CASA: Cross-Attention as Self-Attention for Efficient Vision-Language Fusion on long context streaming inputs

Running

1

CASA Gallery

🏠

1

Video Gallery for CASA: Cross-Attention via Self-Attention
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Paper • 2512.19535 • Published 19 days ago • 11
kyutai/CASA-Helium1-VL-2B

Image-Text-to-Text • 3B • Updated 18 days ago • 199 • 5
kyutai/CASA-Qwen2_5-VL-3B

Image-Text-to-Text • 4B • Updated 18 days ago • 246 • 1

Agentic Learner with Grow-and-Refine Multimodal Semantic Memory

Paper • 2511.21678 • Published Nov 26, 2025 • 12
QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation

Paper • 2512.19134 • Published 19 days ago • 31
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

Paper • 2512.16969 • Published 23 days ago • 111
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Paper • 2512.19535 • Published 19 days ago • 11

Vision Language Action models

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective

Paper • 2507.01925 • Published Jul 2, 2025 • 39
Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning

Paper • 2507.16746 • Published Jul 22, 2025 • 35
MolmoAct: Action Reasoning Models that can Reason in Space

Paper • 2508.07917 • Published Aug 11, 2025 • 44
Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies

Paper • 2508.20072 • Published Aug 27, 2025 • 31

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs