A novel attention-free image generation model based on Liquid Neural Networks
LiquidDiffusion is a first-of-its-kind image generation model that replaces attention with Parallel CfC (Closed-form Continuous-depth) blocks from Liquid Neural Network research. No existing paper combines LNNs with image generation; this model fills that gap.
Open LiquidDiffusion_Training.ipynb in Colab to train end-to-end. The pipeline:

```
Pixel Image (3×256×256)
  → [Frozen SD-VAE Encode]  → Latent (4×32×32)
  → [LiquidDiffusion U-Net] → Velocity prediction (4×32×32)
  → [Frozen SD-VAE Decode]  → Generated Image (3×256×256)
```
Each LiquidDiffusionBlock contains:
```
# CfC Eq. 10 adapted for images:
gate   = σ(time_a(t_emb) · f(features) - time_b(t_emb))  # liquid time-gating
out    = gate · g(features) + (1 - gate) · h(features)   # CfC interpolation
α      = exp(-λ · |t_emb|)                               # liquid relaxation
output = α · input + (1 - α) · out                       # time-aware residual
```
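A minimal NumPy sketch of the gating above, for intuition only: `Wf`, `Wg`, `Wh`, `wa`, `wb`, and `lam` are illustrative stand-ins for the learned maps `f`, `g`, `h`, the time projections, and λ (the real blocks operate on 2D feature maps inside the U-Net):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # feature width (illustrative)

# Stand-ins for the learned maps f, g, h and the time projections (all illustrative)
Wf, Wg, Wh = (rng.standard_normal((D, D)) * 0.1 for _ in range(3))
wa = rng.standard_normal(D) * 0.1  # time_a weights
wb = rng.standard_normal(D) * 0.1  # time_b weights
lam = 0.5                          # relaxation rate λ

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def liquid_block(x, t_emb):
    """One liquid time-gated block on features x, given a scalar time embedding."""
    gate = sigmoid(wa * t_emb * (x @ Wf) - wb * t_emb)  # liquid time-gating
    out = gate * (x @ Wg) + (1 - gate) * (x @ Wh)       # CfC interpolation
    alpha = np.exp(-lam * abs(t_emb))                   # liquid relaxation
    return alpha * x + (1 - alpha) * out                # time-aware residual

x = rng.standard_normal((4, D))  # a batch of 4 feature vectors
y = liquid_block(x, t_emb=0.3)
print(y.shape)  # (4, 8)
```

Note that at `t_emb = 0` the relaxation factor α is 1, so the block reduces to the identity; the residual path is explicitly time-aware.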
All datasets below are tested and working (with streaming support):
| Dataset | Images | Description | Native Resolution |
|---|---|---|---|
| huggan/AFHQv2 | 16K | Animal faces (cats, dogs, wildlife) | 512×512 |
| nielsr/CelebA-faces | 202K | Celebrity faces | 178×218 |
| huggan/flowers-102-categories | 8K | Flower photographs | Variable |
| reach-vb/pokemon-blip-captions | 833 | Pokemon illustrations | 1280×1280 |
| huggan/anime-faces | 63K | Anime faces | 64×64 |
| Norod78/cartoon-blip-captions | ~3K | Cartoon characters | 512×512 |
Uses stabilityai/sd-vae-ft-mse (83.7M params, frozen during training):
| Config | Params | 256px VRAM (w/ VAE) | 512px VRAM |
|---|---|---|---|
| tiny | ~23M | ~6 GB | ~12 GB |
| small | ~69M | ~10 GB | ~20 GB |
| base | ~154M | ~16 GB | ~30 GB |
Objective: Rectified Flow, a simple MSE on velocity
```
x_t = (1 - t) · x0 + t · noise               # linear interpolation
v_target = noise - x0                        # constant velocity
loss = MSE(model(x_t, t), v_target)          # that's it!
```
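The objective above can be sketched as a runnable NumPy function; the `oracle` model is a toy stand-in used only to check that an exact velocity prediction yields zero loss:

```python
import numpy as np

rng = np.random.default_rng(0)

def rectified_flow_loss(model, x0, noise, t):
    """Rectified-flow objective: MSE between predicted and constant target velocity."""
    x_t = (1 - t) * x0 + t * noise  # linear interpolation between data and noise
    v_target = noise - x0           # constant target velocity
    v_pred = model(x_t, t)
    return float(np.mean((v_pred - v_target) ** 2))

# Toy check: an oracle "model" that returns the exact velocity gives zero loss.
x0 = rng.standard_normal((2, 4))
noise = rng.standard_normal((2, 4))
oracle = lambda x_t, t: noise - x0
loss = rectified_flow_loss(oracle, x0, noise, t=0.3)
print(loss)  # 0.0
```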
Sampling: Euler ODE integration, 25-50 steps
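Euler sampling integrates `dx/dt = v(x, t)` from `t = 1` (pure noise) down to `t = 0` (data). A minimal sketch, again using a toy oracle velocity so the result can be verified; with a true constant velocity field, Euler integration is exact:

```python
import numpy as np

rng = np.random.default_rng(1)

def euler_sample(model, x1, steps=25):
    """Integrate dx/dt = v(x, t) from t=1 (pure noise) down to t=0 (data)."""
    x = np.array(x1, copy=True)
    dt = 1.0 / steps
    for i in range(steps, 0, -1):
        t = i / steps
        x = x - dt * model(x, t)  # one Euler step toward t=0
    return x

# Toy check: with the exact constant velocity v = noise - x0,
# Euler integration recovers x0 starting from pure noise.
x0 = rng.standard_normal((2, 4))
noise = rng.standard_normal((2, 4))
oracle = lambda x, t: noise - x0
recovered = euler_sample(oracle, noise, steps=25)
print(np.allclose(recovered, x0))  # True
```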
Related work this model builds on:

| Paper | Contribution |
|---|---|
| CfC Networks (Nature MI 2022) | CfC Eq.10, parallelizable closed-form |
| LTC Networks (AAAI 2021) | Liquid time-constant ODE |
| LiquidTAD (2024) | Parallel liquid relaxation |
| USM (CVPR 2025) | U-Net + SSM for diffusion |
| DiffuSSM (2023) | SSM replaces attention in diffusion |
| Rectified Flow (ICLR 2023) | Simple velocity training |
```
├── liquid_diffusion/
│   ├── __init__.py
│   ├── model.py                      # Full model architecture
│   └── trainer.py                    # Trainer + dataset utilities
├── LiquidDiffusion_Training.ipynb    # Complete Colab notebook
├── test_model.py
└── README.md
```
License: MIT