InteractiveOmni: A Unified Omni-modal Model for Audio-Visual Multi-turn Dialogue Paper • 2510.13747 • Published Oct 15, 2025 • 32
Article Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective LinkedIn • Jan 27 • 75
Uni-ViGU: Towards Unified Video Generation and Understanding via A Diffusion-Based Video Generator Paper • 2604.08121 • Published Apr 9 • 43
Harvey Collection A legal reasoning model specialized in Salvadoran jurisprudence • 4 items • Updated Apr 12 • 1
Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning Paper • 2604.04746 • Published Apr 8 • 72
Article Training Design for Text-to-Image Models: Lessons from Ablations Photoroom • Feb 3 • 73
Aquiles-Studio Collection High-performance image and video generation models for Aquiles-Image. Faster inference, lower costs • 9 items • Updated Mar 2 • 2
Article Diffusers welcomes FLUX-2 YiYiXu, dg845, sayakpaul, OzzyGT, dn6, ariG23498, linoyts, multimodalart • Nov 25, 2025 • 191
Article We’re open-sourcing our text-to-image model and the process behind it Photoroom • Nov 12, 2025 • 99