---
title: Wan2.2 Video Generation
emoji: 🎥
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 5.49.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
- video-generation
- text-to-video
- image-to-video
- diffusers
- wan
- ai-video
- zero-gpu
python_version: "3.10"
---
# Wan2.2 Video Generation 🎥
Generate high-quality videos from text prompts or images using the powerful **Wan2.2-TI2V-5B** model!
This Space provides an easy-to-use interface for creating videos with state-of-the-art AI technology.
## Features ✨
- **Text-to-Video**: Generate videos from descriptive text prompts
- **Image-to-Video**: Animate your images by adding an input image
- **High Quality**: 720P resolution at 24fps
- **Customizable**: Adjust resolution, number of frames, guidance scale, and more
- **Reproducible**: Use seeds to recreate your favorite generations
## Model Information 🤗
**Wan2.2-TI2V-5B** is a unified text-to-video and image-to-video generation model with:
- **5 billion parameters** optimized for consumer-grade GPUs
- **720P resolution** support (1280x704 default)
- **24 fps** smooth video output
- **Optimized duration**: Default 3 seconds (optimized for Zero GPU limits)
While the larger Wan2.2 models use a Mixture-of-Experts (MoE) architecture, TI2V-5B is the series' compact dense model; its high-compression VAE is what makes 720P generation practical on consumer-grade GPUs, and the authors report quality competitive with leading commercial models.
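For reference, here is a minimal loading sketch with 🤗 Diffusers (class names follow the Diffusers Wan examples; the float32 VAE alongside a bfloat16 transformer is the commonly recommended setup, but check your Diffusers version):

```python
# Minimal loading sketch for Wan2.2-TI2V-5B with diffusers.
# Assumes a recent diffusers release with Wan2.2 support.
import torch
from diffusers import AutoencoderKLWan, WanPipeline

model_id = "Wan-AI/Wan2.2-TI2V-5B-Diffusers"

# The Wan VAE is usually kept in float32 for numerical stability,
# while the transformer runs in bfloat16.
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanPipeline.from_pretrained(model_id, vae=vae, torch_dtype=torch.bfloat16)
pipe.to("cuda")
```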
## How to Use 🚀
### Text-to-Video Generation
1. Enter your prompt describing the video you want to create
2. Adjust settings in "Advanced Settings" if desired
3. Click "Generate Video"
4. Wait for generation (typically 2-3 minutes on Zero GPU with default settings)
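
In code, the same flow looks roughly like this (a sketch reusing the `pipe` object from the loading snippet above, with this Space's default settings):

```python
# Text-to-video with this Space's defaults: 73 frames ≈ 3 s at 24 fps,
# 35 steps, guidance 5.0. `pipe` is the WanPipeline loaded earlier.
from diffusers.utils import export_to_video

video = pipe(
    prompt="Two anthropomorphic cats in comfy boxing gear fight on stage",
    height=704,
    width=1280,
    num_frames=73,
    num_inference_steps=35,
    guidance_scale=5.0,
).frames[0]

export_to_video(video, "output.mp4", fps=24)
```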
### Image-to-Video Generation
1. Upload an input image
2. Enter a prompt describing how the image should animate
3. Click "Generate Video"
4. The output will maintain the aspect ratio of your input image
5. Generation takes 2-3 minutes with optimized settings
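
A hedged image-to-video sketch is below. Whether `WanImageToVideoPipeline` accepts this exact checkpoint depends on your Diffusers version, and the snap-to-multiple-of-32 resize is an assumption standing in for the pipeline's real sizing constraints; treat it as an illustration of preserving the input aspect ratio:

```python
# Image-to-video sketch (assumes diffusers' WanImageToVideoPipeline
# supports this checkpoint; verify against your diffusers version).
import math
import torch
from diffusers import AutoencoderKLWan, WanImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

model_id = "Wan-AI/Wan2.2-TI2V-5B-Diffusers"
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanImageToVideoPipeline.from_pretrained(model_id, vae=vae, torch_dtype=torch.bfloat16)
pipe.to("cuda")

image = load_image("input.png")  # hypothetical local file

# Keep the input aspect ratio while targeting roughly 1280x704 pixels;
# snapping to a multiple of 32 is an assumption -- the real constraint
# comes from the VAE and transformer patch sizes.
max_area = 1280 * 704
ratio = image.height / image.width
height = int(math.sqrt(max_area * ratio)) // 32 * 32
width = int(math.sqrt(max_area / ratio)) // 32 * 32
image = image.resize((width, height))

video = pipe(
    image=image,
    prompt="The scene slowly comes to life with gentle camera motion",
    height=height,
    width=width,
    num_frames=73,
    num_inference_steps=35,
    guidance_scale=5.0,
).frames[0]

export_to_video(video, "output_i2v.mp4", fps=24)
```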
## Advanced Settings ⚙️
- **Width/Height**: Video resolution (default: 1280x704)
- **Number of Frames**: Controls the video length at 24 fps (default: 73 frames ≈ 3 seconds; max: 145 frames ≈ 6 seconds)
- **Inference Steps**: More steps = better quality but slower (default: 35, optimized for speed)
- **Guidance Scale**: How closely to follow the prompt (default: 5.0)
- **Seed**: Set a specific seed for reproducible results
**Note**: Settings are optimized to complete within Zero GPU's 3-minute time limit for Pro users.
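
On Zero GPU hardware, the GPU window is requested per call via the `spaces` package. Here is a sketch of how a handler might declare the 3-minute budget (the `generate` function and its defaults are illustrative, not this Space's actual code):

```python
# Sketch of a ZeroGPU-decorated handler. The `spaces` package is available
# on Hugging Face Spaces; `duration` is the per-call GPU budget in seconds.
# Assumes `pipe` from the loading sketch above.
import torch
import spaces

@spaces.GPU(duration=180)  # matches the 3-minute Pro limit noted above
def generate(prompt, width=1280, height=704, num_frames=73, steps=35, cfg=5.0, seed=None):
    # A fixed seed makes the generation reproducible.
    generator = None
    if seed is not None:
        generator = torch.Generator(device="cuda").manual_seed(int(seed))
    return pipe(
        prompt=prompt,
        width=width,
        height=height,
        num_frames=num_frames,
        num_inference_steps=steps,
        guidance_scale=cfg,
        generator=generator,
    ).frames[0]
```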
## Tips for Best Results 💡
1. **Detailed Prompts**: Be specific about what you want to see
- Good: "Two anthropomorphic cats in comfy boxing gear fight on stage with dramatic lighting"
- Basic: "cats fighting"
2. **Image-to-Video**: Use clear, high-quality input images that match your prompt
3. **Quality vs Speed** (optimized for Zero GPU limits):
- Fast: 25-30 steps (~2 minutes)
- Balanced: 35 steps (default, ~2-3 minutes)
- Higher Quality: 40-50 steps (~3+ minutes, may timeout)
4. **Experiment**: Try different guidance scales:
- Lower (3-4): More creative, less literal
- Default (5): Good balance
- Higher (7-10): Strictly follows prompt
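
To compare guidance scales fairly, fix the seed so runs differ only in that one knob (a sketch reusing `pipe` from above):

```python
# Sweep guidance scales with a fixed seed so only the scale changes.
import torch
from diffusers.utils import export_to_video

prompt = "A dragon flying over a medieval castle at sunset"
for cfg in (3.5, 5.0, 7.5):
    generator = torch.Generator(device="cuda").manual_seed(42)
    video = pipe(
        prompt=prompt,
        num_frames=73,
        num_inference_steps=35,
        guidance_scale=cfg,
        generator=generator,
    ).frames[0]
    export_to_video(video, f"dragon_cfg{cfg}.mp4", fps=24)
```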
## Example Prompts 📝
- "Two anthropomorphic cats in comfy boxing gear fight on stage"
- "A serene underwater scene with colorful coral reefs and tropical fish swimming gracefully"
- "A bustling futuristic city at night with neon lights and flying cars"
- "A peaceful mountain landscape with snow-capped peaks and a flowing river"
- "An astronaut riding a horse through a nebula in deep space"
- "A dragon flying over a medieval castle at sunset"
## Technical Details 🔧
- **Model**: Wan-AI/Wan2.2-TI2V-5B-Diffusers
- **Framework**: Hugging Face Diffusers
- **Backend**: PyTorch with bfloat16 precision
- **GPU**: Hugging Face Zero GPU (H200 with 70GB VRAM, automatically allocated)
- **GPU Duration**: 180 seconds (3 minutes) for Pro users
- **Generation Time**: ~2-3 minutes with optimized settings (73 frames, 35 steps)
## Limitations ⚠️
- Generation requires compute time (2-3 minutes with default settings)
- Zero GPU allocation is time-limited (3 minutes for Pro, 60 seconds for Free)
- Videos longer than 6 seconds (145 frames) may time out
- Higher quality settings (50+ steps) may time out on Zero GPU
- Complex scenes with many objects may be challenging
## Credits 🙏
- **Model**: [Wan-AI](https://huggingface.co/Wan-AI)
- **Original Repository**: [Wan2.2](https://github.com/Wan-Video/Wan2.2)
- **Framework**: [Hugging Face Diffusers](https://github.com/huggingface/diffusers)
## License 📄
This Space uses the Wan2.2 model, which is released under the Apache 2.0 license.
## Related Links 🔗
- [Wan-AI on Hugging Face](https://huggingface.co/Wan-AI)
- [Original Model Card](https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B-Diffusers)
- [Diffusers Documentation](https://huggingface.co/docs/diffusers)
---
**Note**: This is a community-created Space for easy access to Wan2.2 video generation. Generation times may vary based on current GPU availability.