FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance Paper • 2305.05176 • Published May 9, 2023 • 6
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024 • 259
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction Paper • 2404.02905 • Published Apr 3, 2024 • 74
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward Paper • 2404.01258 • Published Apr 1, 2024 • 12
Improving Text-to-Image Consistency via Automatic Prompt Optimization Paper • 2403.17804 • Published Mar 26, 2024 • 20
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation Paper • 2403.16990 • Published Mar 25, 2024 • 25
WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs Paper • 2403.07944 • Published Mar 10, 2024 • 1
Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition Paper • 2403.14148 • Published Mar 21, 2024 • 21
AnimateDiff-Lightning: Cross-Model Diffusion Distillation Paper • 2403.12706 • Published Mar 19, 2024 • 18
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation Paper • 2403.12015 • Published Mar 18, 2024 • 70
VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models Paper • 2403.05438 • Published Mar 8, 2024 • 20