
Deployment Guide for Wan2.2 on Hugging Face Spaces

This guide explains how to deploy the Wan2.2 video generation model to Hugging Face Spaces with Zero GPU support.

Prerequisites

  1. A Hugging Face account (create one at https://huggingface.co/join)
  2. Git installed on your local machine
  3. Git LFS (Large File Storage) installed

Deployment Steps

Option 1: Deploy via Hugging Face Web Interface

  1. Create a New Space

    • Go to https://huggingface.co/new-space
    • Choose a name for your Space (e.g., "wan2-video-gen")
    • Select "Gradio" as the SDK
    • Choose "Public" or "Private" visibility
    • Click "Create Space"
  2. Upload Files

    • Use the web interface to upload files:
      • app.py
      • requirements.txt
      • README.md
      • .gitignore
  3. Enable Zero GPU

    • In your Space settings, enable "Zero GPU"
    • This provides automatic GPU allocation during inference
  4. Wait for Build

    • Hugging Face will automatically build your Space
    • This may take 10-15 minutes for the first build
    • Check the build logs for any errors

Option 2: Deploy via Git (Recommended)

  1. Clone Your Space

    git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
    cd YOUR_SPACE_NAME
    
  2. Copy Files

    # Copy all files (including dotfiles such as .gitignore) from the huggingface-wan2.2 directory
    cp -r /path/to/huggingface-wan2.2/. .
    
  3. Commit and Push

    git add .
    git commit -m "Initial deployment of Wan2.2 video generation"
    git push
    
  4. Enable Zero GPU

    • Go to your Space settings on Hugging Face
    • Navigate to "Settings" → "Zero GPU"
    • Enable Zero GPU support

Option 3: Deploy from This Repository

If you've already cloned this repository:

cd /home/user/Kakka/huggingface-wan2.2

# Initialize git if not already done
git init

# Add Hugging Face Space as remote
git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME

# Commit files
git add .
git commit -m "Initial deployment of Wan2.2 video generation"

# Push to Hugging Face
git push hf main

Configuration

Zero GPU Settings

The app is configured to use Zero GPU with the following settings:

  • Duration: 180 seconds (3 minutes) per generation
  • Allocation: Automatic (triggered by generation request)
  • Optimized defaults: Reduced frames (73) and steps (35) to fit within time limit

This is configured in app.py with the decorator:

@spaces.GPU(duration=180)  # 3 minutes max for Pro accounts

Important: Even with a Pro subscription, the maximum GPU duration is limited to 180 seconds (3 minutes). The default settings have been optimized to complete generation within this time:

  • Default frames: 73 (3 seconds of video at 24fps)
  • Default inference steps: 35 (balanced speed/quality)
  • Maximum frames slider: 145 (6 seconds)
  • Maximum inference steps: 60
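
As a quick sanity check on the defaults above, the frame count maps to clip length by simple division at the 24 fps output rate. The video_seconds helper below is an illustrative sketch, not part of app.py:

```python
# Clip duration from frame count, assuming 24 fps output as described above.
# video_seconds is a hypothetical helper for illustration.

def video_seconds(num_frames: int, fps: int = 24) -> float:
    """Return the clip duration in seconds."""
    return num_frames / fps

print(video_seconds(73))   # default: ~3 seconds
print(video_seconds(145))  # slider maximum: ~6 seconds
```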

Memory Requirements

The Wan2.2-TI2V-5B model requires:

  • Minimum: 24GB VRAM
  • Recommended: 40GB+ VRAM for Zero GPU

Zero GPU on Hugging Face Spaces provides sufficient VRAM for this model (H200 GPU with 70GB).
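
A rough back-of-envelope calculation shows why the weights alone account for a large share of the minimum: at bf16/fp16 precision (2 bytes per parameter, an assumption here), 5B parameters need about 10 GB before activations, the VAE, and latent buffers are counted. The weight_gb helper is illustrative only:

```python
# Back-of-envelope estimate of weight memory alone, assuming bf16/fp16
# precision (2 bytes per parameter). Real usage is higher: activations,
# the VAE, and latent buffers all add to the working set.

def weight_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for model weights, in GB."""
    return num_params * bytes_per_param / 1e9

print(weight_gb(5e9))  # the 5B model's weights alone: ~10 GB
```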

Testing Your Deployment

  1. Wait for Build to Complete

    • Check the build logs in your Space
    • Wait for "Running" status
  2. Test Basic Generation

    • Try the default example: "Two anthropomorphic cats in comfy boxing gear fight on stage"
    • Generation should take roughly 2-3 minutes with the optimized default settings
  3. Test Image-to-Video

    • Upload a test image
    • Add a descriptive prompt
    • Verify video generation works

Troubleshooting

Critical: Import Order Issue

Issue: RuntimeError: CUDA has been initialized before importing the 'spaces' package

Solution: The spaces package MUST be imported BEFORE any CUDA-related packages (torch, diffusers, etc.)

Correct import order in app.py:

# IMPORTANT: spaces must be imported first
import spaces

# Standard library imports
import os

# Third-party imports (non-CUDA)
import numpy as np
from PIL import Image
import gradio as gr

# CUDA-related imports (must come after spaces)
import torch
from diffusers import WanPipeline, AutoencoderKLWan

Why this matters: Hugging Face Zero GPU needs to manage CUDA initialization. If torch or other CUDA libraries initialize CUDA before spaces is imported, Zero GPU cannot properly manage GPU allocation.

Build Fails

Issue: Requirements installation fails

  • Solution: Check requirements.txt for compatibility issues
  • Ensure PyTorch version is compatible with CUDA on Zero GPU
  • Make sure you are using a recent Gradio version (5.49.0+) for the latest security fixes

Issue: Out of memory during build

  • Solution: Zero GPU should have enough memory; check model loading code

Issue: "Can't initialize NVML" warnings

  • Solution: These warnings are normal in the Zero GPU environment at build time
  • They should not affect runtime when GPU is allocated

Runtime Errors

Issue: "CUDA out of memory"

  • Solution: Reduce num_frames or image resolution
  • Check if Zero GPU is properly enabled in settings

Issue: "Model not found"

  • Solution: Verify internet connection for model download
  • Check Hugging Face Hub status

Issue: Generation timeout

  • Solution: Reduce inference steps or video length
  • Increase the GPU duration in @spaces.GPU(duration=XX), up to your tier's maximum (180 seconds for Pro)

Issue: Gradio security vulnerability warning

  • Solution: Update to Gradio 5.49.0 or later in requirements.txt
  • Check README.md YAML front matter has correct sdk_version: 5.49.0
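
For reference, a minimal Spaces front matter consistent with the version above might look like this (title and emoji are placeholder values):

```yaml
title: Wan2 Video Generation
emoji: 🎬
sdk: gradio
sdk_version: 5.49.0
app_file: app.py
```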

Issue: "ZeroGPU illegal duration! The requested GPU duration (Xs) is larger than the maximum allowed"

  • Solution: Reduce the duration parameter in @spaces.GPU(duration=XX)
  • For Pro accounts, use 180 seconds or less: @spaces.GPU(duration=180)
  • Free tier typically limited to 60 seconds
  • Optimize your default settings to complete within the time limit:
    • Reduce num_frames (e.g., 73 for 3 seconds instead of 121 for 5 seconds)
    • Reduce num_inference_steps (e.g., 35 instead of 50)
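
One way to reason about whether a settings combination fits the window is a simple time budget. The numbers below are assumptions for illustration: seconds_per_step (~4.5 s) is inferred from the "2-3 minutes for 35 steps" figure quoted later in this guide, and overhead_s is a guess for setup and VAE decode; profile your own Space to calibrate both:

```python
# Rough check that a settings combination fits the Zero GPU time limit.
# seconds_per_step and overhead_s are assumed ballpark values, not
# measured numbers; calibrate them against your Space's logs.

GPU_LIMIT_S = 180  # Pro-tier maximum, matching @spaces.GPU(duration=180)

def fits_budget(num_steps: int, seconds_per_step: float = 4.5,
                overhead_s: float = 15.0) -> bool:
    """Return True if estimated generation time fits the GPU window."""
    return num_steps * seconds_per_step + overhead_s <= GPU_LIMIT_S

print(fits_budget(35))  # optimized default: fits under these assumptions
print(fits_budget(50))  # likely exceeds the window under these assumptions
```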

Slow Generation

Issue: Generation takes too long

  • Solution: This is expected; video generation is compute-intensive
  • Typical time: 2-3 minutes for 3-second video with optimized settings (73 frames, 35 steps)
  • Consider reducing num_inference_steps to 25-30 for faster (but lower quality) results
  • Note: Must complete within 180 seconds (3 minutes) for Pro, 60 seconds for Free tier

Optimization Tips

  1. Current Optimized Settings

    • Already optimized: num_frames=73 (3 seconds) and num_inference_steps=35
    • These settings are designed to complete within 180-second Zero GPU limit
    • For even faster testing, reduce steps to 25-30
  2. Add Caching (Optional)

    • Enable example caching with cache_examples=True to pre-generate examples
    • Note: This increases build time and storage requirements
    • Current setting: cache_examples=False for faster builds
  3. Queue Management

    • Current setting: demo.queue(max_size=20)
    • Adjust based on expected traffic
    • Larger queue = more concurrent users but more resource usage
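
As a rough illustration of this trade-off, worst-case wait grows linearly with queue length. The worst_case_wait_min helper is hypothetical and assumes each generation uses the full 180-second window:

```python
# Worst-case queue wait: position in a full queue × average generation time.
# Assumes every generation takes the full 180-second GPU window.

def worst_case_wait_min(queue_len: int, gen_seconds: float = 180.0) -> float:
    """Minutes a user at the back of a full queue might wait."""
    return queue_len * gen_seconds / 60.0

print(worst_case_wait_min(20))  # max_size=20 → up to ~60 minutes
```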

Customization

Change Default Model

To use a different Wan2.2 variant, modify app.py:

# For larger model with better quality
MODEL_ID = "Wan-AI/Wan2.2-T2V-A14B-Diffusers"

# For image-to-video focused
MODEL_ID = "Wan-AI/Wan2.2-I2V-A14B-Diffusers"

Adjust UI

Modify the Gradio interface in app.py:

  • Change default values in sliders
  • Add more examples
  • Customize theme and styling

Add Features

Consider adding:

  • Video upscaling
  • Multiple video outputs
  • Batch generation
  • Download history
  • Custom aspect ratios

Monitoring

Check Space Status

  • Visit your Space URL
  • Check "Settings" → "Logs" for runtime logs
  • Monitor usage in "Settings" → "Analytics"

Usage Limits

Zero GPU on Hugging Face has:

  • Time limits per session
  • Concurrent user limits
  • Monthly compute quotas (check your tier)

Support

If you encounter issues:

  1. Check Logs: Space logs often contain error details
  2. Hugging Face Forums: https://discuss.huggingface.co/
  3. Model Issues: Report at Wan-AI's GitHub or model card
  4. Space Settings: Verify Zero GPU is enabled and quota is available

License

This deployment uses:

  • Wan2.2 model (Apache 2.0)
  • Gradio (Apache 2.0)
  • Diffusers (Apache 2.0)

Ensure compliance with all licenses when deploying.


Happy Deploying! πŸš€