---
title: CineGen
emoji: π
colorFrom: pink
colorTo: purple
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false
short_description: automate the process of short movie creation
tags:
  - mcp-in-action-track-creative
---

**CineGen AI Director** is an AI agent that automates short movie creation. It turns a simple text or image idea into a finished video production by handling scriptwriting, storyboard generation, character design, and video synthesis with a multi-model approach.

- **Sponsor Platforms**: Uses Google Gemini (story and character prompts) and the Hugging Face Inference Client with fal.ai hosting for Wan 2.2 TI2V video renders.
- **Autonomous Agent Flow**: The StoryGenerator → CharacterDesigner → VideoDirector pipeline runs sequentially inside a single Gradio Blocks app. The three classes (`StoryGenerator`, `CharacterDesigner`, `VideoDirector`) are MCP-friendly abstractions designed for tool-call orchestration.
- **Evaluation Notes**: Covers reasoning (Gemini JSON storyboard spec), planning (scene and character tables that feed downstream steps), and execution (queued video renders with serialized HF jobs).
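
The sequential flow described above can be pictured with a minimal sketch. Only the three class names come from this README; the `Production` state object, the `run` method, and the placeholder data are assumptions made for illustration, and the stubs stand in for the real Gemini and Hugging Face calls.

```python
from dataclasses import dataclass, field


@dataclass
class Production:
    """Hypothetical shared state handed from one stage to the next."""
    idea: str
    scenes: list = field(default_factory=list)      # storyboard rows
    characters: dict = field(default_factory=dict)  # name -> reference image path
    clips: list = field(default_factory=list)       # rendered clip paths
    log: list = field(default_factory=list)         # stage names, in run order


class StoryGenerator:
    def run(self, prod: Production) -> Production:
        # Placeholder for the Gemini storyboard call.
        prod.scenes = [
            {"id": 1, "prompt": f"Opening shot: {prod.idea}", "characters": ["Hero"]},
            {"id": 2, "prompt": f"Closing shot: {prod.idea}", "characters": ["Hero"]},
        ]
        prod.log.append("story")
        return prod


class CharacterDesigner:
    def run(self, prod: Production) -> Production:
        # Placeholder for reference-image generation: one sheet per unique name.
        names = {name for scene in prod.scenes for name in scene["characters"]}
        prod.characters = {name: f"refs/{name.lower()}.png" for name in sorted(names)}
        prod.log.append("characters")
        return prod


class VideoDirector:
    def run(self, prod: Production) -> Production:
        # Placeholder for video rendering: one clip per scene, in scene order.
        prod.clips = [f"clips/scene_{scene['id']:02d}.mp4" for scene in prod.scenes]
        prod.log.append("video")
        return prod


def run_pipeline(idea: str) -> Production:
    """Run the three stages one after another on the same state object."""
    prod = Production(idea=idea)
    for stage in (StoryGenerator(), CharacterDesigner(), VideoDirector()):
        prod = stage.run(prod)
    return prod


if __name__ == "__main__":
    result = run_pipeline("a lighthouse keeper befriends a storm")
    print(result.log, result.clips)
```

Giving every stage the same `run(state) -> state` shape is one way to keep each step callable on its own, which is what a tool-call orchestrator would need.
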
## Artifacts for Reviewers
- **Social Media Proof**: Replace `<SOCIAL_LINK_HERE>` with your live tweet/thread/LinkedIn post so judges can verify community sharing.
- **Video Recording**: Upload a walkthrough of the Gradio agent (screen + narration) and swap `<DEMO_VIDEO_LINK>` with the shareable link.
## Key Features
* **End-to-End Automation**: Converts a single-sentence idea into a complete short film (approximately 30 to 60 seconds of runtime).
* **Intelligent Storyboarding**: Breaks down concepts into scene-by-scene visual prompts and narrative descriptions.
* **Character Consistency System**:
    * Automatically identifies main characters.
    * Generates visual reference sheets (Character Anchors).
    * Lets users "tag" specific characters in specific scenes so that their descriptions are carried into the video generation prompt.
* **Multi-Model Video Generation**: Supports multiple state-of-the-art open-source video models via Hugging Face.
* **Robust Fallback System**: If the selected video model fails (e.g., server overload), the system automatically tries alternative models until generation succeeds.
* **Interactive Editing**:
    * Edit visual prompts manually.
    * Add, insert, or delete scenes during production.
    * Regenerate specific clips or character looks.
* **Client-Side Video Merging**: Combines the individually generated clips into a single continuous movie file directly in the browser, with no backend video processing server required.
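
As an illustration of the character tagging feature, the sketch below shows one way tagged characters could be folded into a scene's video prompt. The function name `build_scene_prompt`, the `anchors` mapping, and the prompt layout are assumptions; the README does not specify how the app composes its prompts.

```python
def build_scene_prompt(visual_prompt: str, tagged: list[str], anchors: dict[str, str]) -> str:
    """Append the anchor description of each tagged character to a scene prompt.

    `anchors` maps a character name to its textual description (hypothetical
    structure). Tags without a matching anchor are ignored.
    """
    notes = [f"{name}: {anchors[name]}" for name in tagged if name in anchors]
    if not notes:
        return visual_prompt
    # Drop trailing punctuation so the joined prompt does not end up with "..".
    base = visual_prompt.rstrip(". ")
    return f"{base}. Characters: " + "; ".join(notes) + "."


if __name__ == "__main__":
    anchors = {"Mara": "tall woman, red raincoat, short grey hair"}
    print(build_scene_prompt("Mara walks along the pier.", ["Mara"], anchors))
```

Repeating the same description text in every scene a character appears in is a simple, model-agnostic way to nudge a text-to-video model toward a consistent look.
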
## AI Models & API Usage
The application orchestrates two primary AI services:
### 1. Google Gemini API (`@google/genai`)
Used for the "Brain" and "Art Department" of the application.
* **Logic & Scripting**: `gemini-2.5-flash`
    * **Role**: Analyzes the user's idea, generates the title, creates character profiles, and writes the JSON-structured storyboard with visual prompts.
    * **Technique**: Uses Structured Output (JSON Schema) so the app can parse the story data reliably.
* **Character Design**: `gemini-2.5-flash-image`
    * **Role**: Generates static reference images for characters based on the script's descriptions.
    * **Purpose**: The images act as a visual anchor, letting the user verify character appearance before video generation.
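
To show why structured output helps, here is a minimal sketch of parsing and checking a storyboard response on the application side. The field names (`title`, `characters`, `scenes`, `id`, `visual_prompt`, `narrative`) are assumptions, since the README does not list the actual schema, and the Gemini call itself is left out.

```python
import json

# Hypothetical required fields and their expected Python types.
REQUIRED_TOP = {"title": str, "characters": list, "scenes": list}
REQUIRED_SCENE = {"id": int, "visual_prompt": str, "narrative": str}


def _check(obj, required: dict, where: str) -> None:
    """Raise ValueError if obj is not a dict carrying every required field."""
    if not isinstance(obj, dict):
        raise ValueError(f"{where}: expected a JSON object")
    for key, expected in required.items():
        if not isinstance(obj.get(key), expected):
            raise ValueError(f"{where}: '{key}' must be of type {expected.__name__}")


def parse_storyboard(raw: str) -> dict:
    """Parse the model's JSON text and verify the fields later steps rely on."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    _check(data, REQUIRED_TOP, "storyboard")
    for i, scene in enumerate(data["scenes"]):
        _check(scene, REQUIRED_SCENE, f"scenes[{i}]")
    return data


if __name__ == "__main__":
    raw = json.dumps({
        "title": "Storm Light",
        "characters": [{"name": "Mara"}],
        "scenes": [{"id": 1, "visual_prompt": "A pier at dusk", "narrative": "Mara waits."}],
    })
    print(parse_storyboard(raw)["title"])
```

Even when the model is constrained by a schema, a cheap check like this at the boundary keeps a malformed response from surfacing later as a confusing error in the character or video steps.
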
### 2. Hugging Face Inference API (`@huggingface/inference`)
Used for the "Production/Camera" department.
* **Video Generation Models**:
    * **Wan 2.1 (Wan-AI)**: `Wan-AI/Wan2.1-T2V-14B` (Primary/Default)
    * **LTX Video (Lightricks)**: `Lightricks/LTX-Video-0.9.7-distilled`
    * **Hunyuan Video 1.5**: `tencent/HunyuanVideo-1.5`
    * **CogVideoX**: `THUDM/CogVideoX-5b`
* **Provider**: Defaults to `fal-ai` via Hugging Face Inference for high-performance GPU access.
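
The fallback behavior mentioned under the key features could look roughly like the sketch below, which walks the model list above in order. The function name `generate_with_fallback` and the `render` callable are assumptions; `render` stands in for whatever call the app makes to the Hugging Face Inference API. Unlike the "until generation succeeds" wording, this sketch tries each model once and then gives up.

```python
from typing import Callable

# Model IDs as listed in this README; the first entry is the default.
MODELS = [
    "Wan-AI/Wan2.1-T2V-14B",
    "Lightricks/LTX-Video-0.9.7-distilled",
    "tencent/HunyuanVideo-1.5",
    "THUDM/CogVideoX-5b",
]


def generate_with_fallback(
    prompt: str,
    render: Callable[[str, str], bytes],
    preferred: str = MODELS[0],
    models: list[str] = MODELS,
) -> tuple[str, bytes]:
    """Try the preferred model first, then the remaining models in list order.

    Returns (model_id, video_bytes) from the first model that succeeds and
    raises RuntimeError if every model fails.
    """
    order = [preferred] + [m for m in models if m != preferred]
    errors = {}
    for model in order:
        try:
            return model, render(model, prompt)
        except Exception as exc:  # e.g. provider overload or timeout
            errors[model] = str(exc)
    raise RuntimeError(f"all video models failed: {errors}")


if __name__ == "__main__":
    def flaky_render(model: str, prompt: str) -> bytes:
        # Simulated backend: only one model is currently available.
        if model != "tencent/HunyuanVideo-1.5":
            raise RuntimeError("server overloaded")
        return b"fake-video-bytes"

    used, clip = generate_with_fallback("A pier at dusk", flaky_render)
    print(used, len(clip))
```

Passing `render` in as a parameter keeps the retry logic testable without network access, and recording each failure makes the final error message useful when every model is down.
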