Orr Zohar PRO

orrzohar

https://ai.stanford.edu/~orrzohar/

AI & ML interests

Large Multi-Modal Models, Foundation Models, Video Understanding

Recent Activity

updated a Space 7 days ago

orrzohar/demo2

updated a Space 11 days ago

orrzohar/demo1

published a Space 11 days ago

orrzohar/demo2


Organizations

upvoted 2 papers about 1 month ago

OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

Paper • 2601.15369 • Published Jan 21 • 21

Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model

Paper • 2601.15892 • Published Jan 22 • 53

upvoted 2 papers 3 months ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26, 2025 • 125

Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models

Paper • 2511.17487 • Published Nov 21, 2025 • 12

upvoted 5 papers 4 months ago

upvoted a paper 5 months ago

SciVideoBench: Benchmarking Scientific Video Reasoning in Large Multimodal Models

Paper • 2510.08559 • Published Oct 9, 2025 • 9

upvoted a paper 7 months ago

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8, 2025 • 206

upvoted an article 7 months ago

TimeScope: How Long Can Your Video Large Multimodal Model Go?

Article • Jul 23, 2025

upvoted 3 papers 9 months ago

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2, 2025 • 151

UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning

Paper • 2505.14231 • Published May 20, 2025 • 53

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20, 2025 • 76

upvoted 5 papers 10 months ago

Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math

Paper • 2504.21233 • Published Apr 30, 2025 • 49

RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation

Paper • 2504.17502 • Published Apr 24, 2025 • 55

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published Apr 22, 2025 • 64

FlowReasoner: Reinforcing Query-Level Meta-Agents

Paper • 2504.15257 • Published Apr 21, 2025 • 47

Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21, 2025 • 88
