Deployment Guide for Wan2.2 on Hugging Face Spaces
This guide explains how to deploy the Wan2.2 video generation model to Hugging Face Spaces with Zero GPU support.
Prerequisites
- A Hugging Face account (create one at https://huggingface.co/join)
- Git installed on your local machine
- Git LFS (Large File Storage) installed
Deployment Steps
Option 1: Deploy via Hugging Face Web Interface
Create a New Space
- Go to https://huggingface.co/new-space
- Choose a name for your Space (e.g., "wan2-video-gen")
- Select "Gradio" as the SDK
- Choose "Public" or "Private" visibility
- Click "Create Space"
Upload Files
- Use the web interface to upload the following files:
  - app.py
  - requirements.txt
  - README.md
  - .gitignore
Enable Zero GPU
- In your Space settings, enable "Zero GPU"
- This provides automatic GPU allocation during inference
Wait for Build
- Hugging Face will automatically build your Space
- This may take 10-15 minutes for the first build
- Check the build logs for any errors
Option 2: Deploy via Git (Recommended)
Clone Your Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME
Copy Files
# Copy all files from huggingface-wan2.2 directory
cp /path/to/huggingface-wan2.2/* .
Commit and Push
git add .
git commit -m "Initial deployment of Wan2.2 video generation"
git push
Enable Zero GPU
- Go to your Space settings on Hugging Face
- Navigate to "Settings" → "Zero GPU"
- Enable Zero GPU support
Option 3: Deploy from This Repository
If you've already cloned this repository:
cd /home/user/Kakka/huggingface-wan2.2
# Initialize git if not already done
git init
# Add Hugging Face Space as remote
git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
# Commit files
git add .
git commit -m "Initial deployment of Wan2.2 video generation"
# Push to Hugging Face
git push hf main
Configuration
Zero GPU Settings
The app is configured to use Zero GPU with the following settings:
- Duration: 180 seconds (3 minutes) per generation
- Allocation: Automatic (triggered by generation request)
- Optimized defaults: Reduced frames (73) and steps (35) to fit within time limit
This is configured in app.py with the decorator:
@spaces.GPU(duration=180) # 3 minutes max for Pro accounts
Important: Even with a Pro subscription, the maximum GPU duration is 180 seconds (3 minutes). The default settings have been optimized to complete generation within this time:
- Default frames: 73 (3 seconds of video at 24fps)
- Default inference steps: 35 (balanced speed/quality)
- Maximum frames slider: 145 (6 seconds)
- Maximum inference steps: 60
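The frame defaults above follow the common fps × seconds + 1 pattern (73 frames for 3 seconds, 145 for 6 seconds at 24 fps). A minimal sketch for picking your own slider values (the helper name is ours, not part of the app):

```python
def frames_for_seconds(seconds: float, fps: int = 24) -> int:
    """Frame count for a clip of the given length.

    Wan2.2's defaults follow the common diffusion-video pattern of
    fps * seconds + 1: 73 frames for 3 s, 145 for 6 s at 24 fps.
    """
    return int(fps * seconds) + 1

print(frames_for_seconds(3))  # 73  (the default)
print(frames_for_seconds(6))  # 145 (the slider maximum)
```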
Memory Requirements
The Wan2.2-TI2V-5B model requires:
- Minimum: 24GB VRAM
- Recommended: 40GB+ VRAM for Zero GPU
Zero GPU on Hugging Face Spaces provides sufficient VRAM for this model (H200 GPU with 70GB).
Testing Your Deployment
Wait for Build to Complete
- Check the build logs in your Space
- Wait for "Running" status
Test Basic Generation
- Try the default example: "Two anthropomorphic cats in comfy boxing gear fight on stage"
- Generation should take roughly 2-3 minutes with the optimized defaults (73 frames, 35 steps)
Test Image-to-Video
- Upload a test image
- Add a descriptive prompt
- Verify video generation works
Troubleshooting
Critical: Import Order Issue
Issue: RuntimeError: CUDA has been initialized before importing the 'spaces' package
Solution: This is CRITICAL! The spaces package MUST be imported BEFORE any CUDA-related packages (torch, diffusers, etc.)
Correct import order in app.py:
# IMPORTANT: spaces must be imported first
import spaces
# Standard library imports
import os
# Third-party imports (non-CUDA)
import numpy as np
from PIL import Image
import gradio as gr
# CUDA-related imports (must come after spaces)
import torch
from diffusers import WanPipeline, AutoencoderKLWan
Why this matters: Hugging Face Zero GPU needs to manage CUDA initialization. If torch or other CUDA libraries initialize CUDA before spaces is imported, Zero GPU cannot properly manage GPU allocation.
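The ordering rule itself can be illustrated with a small stdlib check. This is only a sketch of the constraint, not part of the real spaces package (the function name and the package list are ours):

```python
def violates_zerogpu_order(import_sequence):
    """Return True if a CUDA-related package appears before `spaces`.

    import_sequence is a list of module names in the order they were
    imported; the CUDA package names here are illustrative.
    """
    cuda_pkgs = {"torch", "diffusers", "torchvision"}
    for name in import_sequence:
        if name == "spaces":
            return False  # spaces came first: Zero GPU can manage CUDA
        if name in cuda_pkgs:
            return True   # CUDA initialized too early
    return False

print(violates_zerogpu_order(["spaces", "os", "torch"]))  # False: correct order
print(violates_zerogpu_order(["torch", "spaces"]))        # True: broken
```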
Build Fails
Issue: Requirements installation fails
- Solution: Check requirements.txt for compatibility issues
- Ensure the PyTorch version is compatible with CUDA on Zero GPU
- Make sure you are using the latest Gradio version (5.49.0+) for security fixes
Issue: Out of memory during build
- Solution: Zero GPU should have enough memory; check model loading code
Issue: "Can't initialize NVML" warnings
- Solution: These are normal in the Zero GPU environment during build time
- They should not affect runtime when GPU is allocated
Runtime Errors
Issue: "CUDA out of memory"
- Solution: Reduce num_frames or image resolution
- Check that Zero GPU is properly enabled in settings
Issue: "Model not found"
- Solution: Verify internet connection for model download
- Check Hugging Face Hub status
Issue: Generation timeout
- Solution: Reduce inference steps or video length
- Increase the GPU duration in @spaces.GPU(duration=XX) (up to your tier's limit)
Issue: Gradio security vulnerability warning
- Solution: Update to Gradio 5.49.0 or later in requirements.txt
- Check that the README.md YAML front matter has the correct sdk_version: 5.49.0
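If you want to verify the pinned version programmatically, the front matter can be parsed with the stdlib alone; a sketch (the helper name is ours):

```python
import re

def sdk_version_from_readme(readme_text: str):
    """Extract sdk_version from a Space README's YAML front matter."""
    block = re.search(r"^---\s*\n(.*?)\n---", readme_text, re.DOTALL)
    if not block:
        return None
    ver = re.search(r"^sdk_version:\s*([\w.]+)", block.group(1), re.MULTILINE)
    return ver.group(1) if ver else None

sample = """---
title: wan2-video-gen
sdk: gradio
sdk_version: 5.49.0
---
# My Space
"""
print(sdk_version_from_readme(sample))  # 5.49.0
```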
Issue: "ZeroGPU illegal duration! The requested GPU duration (Xs) is larger than the maximum allowed"
- Solution: Reduce the duration parameter in @spaces.GPU(duration=XX)
- For Pro accounts, use 180 seconds or less: @spaces.GPU(duration=180)
- The Free tier is typically limited to 60 seconds
- Optimize your default settings to complete within the time limit:
  - Reduce num_frames (e.g., 73 for 3 seconds instead of 121 for 5 seconds)
  - Reduce num_inference_steps (e.g., 35 instead of 50)
Slow Generation
Issue: Generation takes too long
- Solution: This is expected; video generation is compute-intensive
- Typical time: 2-3 minutes for 3-second video with optimized settings (73 frames, 35 steps)
- Consider reducing num_inference_steps to 25-30 for faster (but lower quality) results
- Note: Generation must complete within 180 seconds (3 minutes) on Pro, 60 seconds on the Free tier
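A rough budget check can tell you before deploying whether a settings combination is likely to fit the window. The ~4 seconds per denoising step below is our ballpark assumption for the default 73-frame clip (roughly consistent with the 2-3 minute figure above); measure your own Space's logs for a real value:

```python
def fits_gpu_budget(num_frames: int, num_steps: int,
                    sec_per_step: float = 4.0, budget_s: int = 180) -> bool:
    """Rough check that a run finishes inside the Zero GPU window.

    sec_per_step is an assumed average cost per denoising step at the
    default 73-frame length; we scale it linearly with frame count.
    """
    est = num_steps * sec_per_step * (num_frames / 73)
    return est <= budget_s

print(fits_gpu_budget(73, 35))   # default settings: within budget
print(fits_gpu_budget(145, 60))  # both sliders maxed: over budget
```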
Optimization Tips
Current Optimized Settings
- Already optimized: num_frames=73 (3 seconds) and num_inference_steps=35
- These settings are designed to complete within the 180-second Zero GPU limit
- For even faster testing, reduce steps to 25-30
Add Caching (Optional)
- Enable example caching with cache_examples=True to pre-generate examples
- Note: This increases build time and storage requirements
- Current setting: cache_examples=False for faster builds
Queue Management
- Current setting: demo.queue(max_size=20)
- Adjust based on expected traffic
- A larger queue serves more concurrent users but uses more resources
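The traffic trade-off is easy to quantify: if each Zero GPU allocation serves one request at a time, the last user in a full queue waits for everyone ahead of them. A sketch, assuming an average generation time of 150 seconds (our rough figure, not a measured one):

```python
def worst_case_wait_minutes(max_size: int, avg_gen_s: float = 150.0) -> float:
    """Worst-case wait for the last user in a full queue, assuming
    requests are served one at a time and each takes avg_gen_s seconds
    (both are simplifying assumptions for sizing, not measured values)."""
    return max_size * avg_gen_s / 60

print(worst_case_wait_minutes(20))  # 50.0 minutes with the current max_size
```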
Customization
Change Default Model
To use a different Wan2.2 variant, modify app.py:
# For larger model with better quality
MODEL_ID = "Wan-AI/Wan2.2-T2V-A14B-Diffusers"
# For image-to-video focused
MODEL_ID = "Wan-AI/Wan2.2-I2V-A14B-Diffusers"
Adjust UI
Modify the Gradio interface in app.py:
- Change default values in sliders
- Add more examples
- Customize theme and styling
Add Features
Consider adding:
- Video upscaling
- Multiple video outputs
- Batch generation
- Download history
- Custom aspect ratios
Monitoring
Check Space Status
- Visit your Space URL
- Check "Settings" → "Logs" for runtime logs
- Monitor usage in "Settings" → "Analytics"
Usage Limits
Zero GPU on Hugging Face has:
- Time limits per session
- Concurrent user limits
- Monthly compute quotas (check your tier)
Support
If you encounter issues:
- Check Logs: Space logs often contain error details
- Hugging Face Forums: https://discuss.huggingface.co/
- Model Issues: Report at Wan-AI's GitHub or model card
- Space Settings: Verify Zero GPU is enabled and quota is available
License
This deployment uses:
- Wan2.2 model (Apache 2.0)
- Gradio (Apache 2.0)
- Diffusers (Apache 2.0)
Ensure compliance with all licenses when deploying.
Happy Deploying! 🚀