Yuseung "Phillip" Lee

phillipinseoul

https://phillipinseoul.github.io/

AI & ML interests

Computer Vision

Recent Activity

liked a dataset 2 days ago

FlagEval/EmbSpatial-Bench

upvoted a paper 3 days ago

K-EXAONE Technical Report

liked a model 4 days ago

nvidia/Cosmos-Reason2-8B

upvoted a paper 3 days ago

K-EXAONE Technical Report

Paper • 2601.01739 • Published 4 days ago • 75

upvoted a paper 4 days ago

SenseNova-MARS: Empowering Multimodal Agentic Reasoning and Search via Reinforcement Learning

Paper • 2512.24330 • Published 10 days ago • 33

upvoted a paper 7 days ago

Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems

Paper • 2512.24385 • Published 10 days ago • 7

upvoted 4 papers 10 days ago

An Information Theoretic Perspective on Agentic System Design

Paper • 2512.21720 • Published 15 days ago • 7

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Paper • 2512.22615 • Published 13 days ago • 43

SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents

Paper • 2512.22322 • Published 14 days ago • 38

Yume-1.5: A Text-Controlled Interactive World Generation Model

Paper • 2512.22096 • Published 14 days ago • 57

upvoted a paper 14 days ago

Latent Implicit Visual Reasoning

Paper • 2512.21218 • Published 16 days ago • 66

upvoted a paper 15 days ago

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Paper • 2512.20557 • Published 17 days ago • 49

upvoted 2 papers 16 days ago

Reinforcement Learning for Self-Improving Agent with Skill Library

Paper • 2512.17102 • Published 22 days ago • 32

SpatialTree: How Spatial Abilities Branch Out in MLLMs

Paper • 2512.20617 • Published 17 days ago • 42

upvoted a paper 17 days ago

WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion

Paper • 2512.19678 • Published 18 days ago • 29

upvoted 5 papers 18 days ago

MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

Paper • 2512.10863 • Published 29 days ago • 21

Multimodal RewardBench 2: Evaluating Omni Reward Models for Interleaved Text and Image

Paper • 2512.16899 • Published 22 days ago • 12

PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence

Paper • 2512.16793 • Published 22 days ago • 72

Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs

Paper • 2512.17008 • Published 22 days ago • 10

When Reasoning Meets Its Laws

Paper • 2512.17901 • Published 21 days ago • 56

upvoted 3 papers 21 days ago

AdaTooler-V: Adaptive Tool-Use for Images and Videos

Paper • 2512.16918 • Published 22 days ago • 12

Next-Embedding Prediction Makes Strong Vision Learners

Paper • 2512.16922 • Published 22 days ago • 83

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Paper • 2512.16561 • Published 22 days ago • 19