🌊 LiquidFlow — Liquid-SSM Flow Matching Image Generator

A novel lightweight architecture for image generation that combines:

| Component | Source | Role |
|---|---|---|
| Liquid Time-Constant Networks | Hasani et al. 2020 | Adaptive ODE dynamics via closed-form CfC — bounded by construction |
| Selective State Space Models | Gu & Dao 2023 (Mamba) | Linear-time long-range context, parallelizable scanning |
| Zigzag Scanning | ZigMa 2024 | 2D spatial awareness through alternating scan patterns |
| Physics-Informed Loss | Wang et al. 2020, PIDM 2024 | Smoothness + TV regularization for training stability |
| Rectified Flow Matching | Lipman et al. 2022 | ODE-based generation — no noise schedule tuning needed |

🎯 Key Properties

  • Trainable on Google Colab free tier (T4 16GB) and Kaggle
  • Mobile-deployable β€” tiny model is only 6M params (24MB)
  • No custom CUDA kernels β€” pure PyTorch, runs anywhere
  • No training collapse/explosion β€” sigmoid gating in Liquid CfC guarantees bounded dynamics
  • No noise schedule tuning β€” flow matching uses simple linear interpolation

πŸ“ Architecture

Noise x₀ ~ N(0,I)  ──→  LiquidFlow v_θ(xₜ, t)  ──→  Image x₁
                            │
                     ┌──────┴──────┐
                     │  Patchify   │  (image → non-overlapping patches)
                     │  + PosEmb   │  (2D learnable positions)
                     │  + DepthConv│  (local structure preservation)
                     └──────┬──────┘
                            │
               ┌────────────┼────────────┐
               │    L × LiquidSSM Block  │
               │  ┌──────────────────┐   │
               │  │ AdaLN (t-cond)   │   │   ← DiT-style conditioning
               │  │ Zigzag Scan      │   │   ← rotates scan pattern per layer
               │  │ SelectiveSSM     │   │   ← Mamba-style, input-dependent A,B,C,Δ
               │  │ + LiquidCfC      │   │   ← CfC gating: σ(-f_τ)⊙h + (1-σ(-f_τ))⊙f_x
               │  │ + FFN            │   │   ← GELU feed-forward
               │  │ + Skip Connect   │   │   ← U-Net style long skips
               │  └──────────────────┘   │
               └────────────┼────────────┘
                            │
                     ┌──────┴──────┐
                     │  DepthConv  │  (local refinement)
                     │  Unpatchify │  (patches → image)
                     └──────┬──────┘
                            │
                      velocity v_θ (same shape as input)
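The zigzag scan in the diagram can be sketched as a pair of index permutations over the patch grid. This is an illustrative minimal version, not the repo's implementation; `zigzag_order` is a hypothetical name:

```python
import torch

def zigzag_order(H, W, flip_rows=False):
    """Boustrophedon scan order over an H x W grid: odd rows are reversed
    so consecutive tokens in the scan stay spatially adjacent."""
    idx = torch.arange(H * W).view(H, W)
    rows = []
    for r in range(H):
        row = idx[r]
        if (r % 2 == 1) != flip_rows:
            row = row.flip(0)
        rows.append(row)
    order = torch.cat(rows)         # scan position -> token index
    inverse = torch.argsort(order)  # token index -> scan position
    return order, inverse

order, inv = zigzag_order(4, 4)
tokens = torch.randn(16, 8)         # 16 patch tokens, dim 8
scanned = tokens[order]             # reorder before the SSM scan
restored = scanned[inv]             # undo afterwards (lossless)
```

Rotating between such patterns (row-wise, column-wise, and their flips) per layer gives each token multiple scan neighborhoods while every pattern remains exactly invertible.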

Core Innovation: Liquid CfC Cell

Instead of solving the Liquid ODE numerically (sequential, slow):

dx/dt = -[1/τ + f(x,I,t)] * x + f(x,I,t)

We use the Closed-form Continuous-depth (CfC) solution (parallel, fast, stable):

gate = sigmoid(-f_tau(x, h))    # time-constant gating
new_h = gate * h + (1 - gate) * f_x(x, h)  # bounded update

The sigmoid gating guarantees that hidden states stay bounded: the gate lies in (0, 1), so each update is a convex combination of the previous state and the candidate, ruling out explosion or collapse by construction.
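A minimal sketch of such a cell, matching the gating formula above (layer shapes and the tanh-bounded candidate are assumptions, not the repo's exact code):

```python
import torch
import torch.nn as nn

class LiquidCfCCell(nn.Module):
    """Hypothetical minimal CfC-style cell:
    new_h = sigmoid(-f_tau(x,h)) * h + (1 - sigmoid(-f_tau(x,h))) * f_x(x,h).
    The gate lies in (0,1) and the candidate is tanh-bounded, so the hidden
    state is always a convex combination of bounded quantities."""
    def __init__(self, dim):
        super().__init__()
        self.f_tau = nn.Linear(2 * dim, dim)                          # time-constant head
        self.f_x = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())  # bounded candidate

    def forward(self, x, h):
        xh = torch.cat([x, h], dim=-1)
        gate = torch.sigmoid(-self.f_tau(xh))
        return gate * h + (1 - gate) * self.f_x(xh)

cell = LiquidCfCCell(32)
h = torch.zeros(4, 32)
for _ in range(100):                 # repeated updates cannot escape [-1, 1]
    h = cell(torch.randn(4, 32), h)
```

Starting from h = 0, |h| ≤ 1 holds after every step, no matter the inputs — this is the boundedness-by-construction property the section describes.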

Dual-Path Processing

Each LiquidSSM Block has two parallel branches:

  1. SSM Branch: selective scan (Mamba-style) with zigzag patterns → captures global spatial dependencies
  2. Liquid Branch: CfC cell → adds continuous-time adaptive dynamics

A learnable mixing coefficient α balances them: output = α·SSM + (1-α)·Liquid
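The mixing step can be sketched as follows (the per-channel parameterization and sigmoid constraint are assumptions; the repo may use a scalar α):

```python
import torch
import torch.nn as nn

class DualPathMix(nn.Module):
    """Hypothetical sketch of the learnable blend between the two branches.
    alpha is kept in (0, 1) via a sigmoid over a per-channel logit,
    initialized so the blend starts at an even 50/50 split."""
    def __init__(self, dim):
        super().__init__()
        self.alpha_logit = nn.Parameter(torch.zeros(dim))

    def forward(self, ssm_out, liquid_out):
        alpha = torch.sigmoid(self.alpha_logit)   # learnable, in (0, 1)
        return alpha * ssm_out + (1 - alpha) * liquid_out

mix = DualPathMix(8)
out = mix(torch.ones(2, 4, 8), torch.zeros(2, 4, 8))  # even blend at init
```

Constraining α to (0, 1) keeps the output a convex combination of the branches, so neither path can dominate with an unbounded weight during training.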

📊 Model Variants

| Variant | Params | Image Size | Patch | GPU VRAM (bs=16) | Use Case |
|---|---|---|---|---|---|
| tiny | 5.9M | 128×128 | 4 | ~4 GB | Quick experiments, mobile |
| small | 13.7M | 128×128 | 4 | ~8 GB | Production 128×128 |
| base | 37.6M | 256×256 | 8 | ~12 GB | High quality |
| 512 | 38.1M | 512×512 | 16 | ~14 GB | High resolution |

🚀 Quick Start

Colab / Kaggle (Recommended)

Open the notebook: LiquidFlow_Training.ipynb

It has interactive widgets for:

  • Dataset selection (CIFAR-10, Flowers-102, CelebA, Fashion-MNIST, AFHQ, custom folder)
  • Model size and all hyperparameters
  • Auto batch-size adjustment for your GPU

Command Line

pip install torch torchvision einops pillow matplotlib tqdm

# Quick test (CIFAR-10 32Γ—32)
python liquidflow/train.py --model_size tiny --img_size 32 --dataset cifar10 --epochs 50 --batch_size 64

# Production (Flowers 128Γ—128)
python liquidflow/train.py --model_size small --img_size 128 --dataset flowers --epochs 200 --batch_size 16

# Custom images
python liquidflow/train.py --model_size small --img_size 128 --dataset folder --data_dir /path/to/images

Python API

from liquidflow import liquidflow_small, euler_sample, make_grid_image
import torch

model = liquidflow_small(img_size=128)  # 13.7M params
# ... after training ...
model.eval()
images = euler_sample(model, (16, 3, 128, 128), num_steps=50, device='cuda')
grid = make_grid_image(images.clamp(-1,1)*0.5+0.5, nrow=4)
grid.save('generated.png')
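For reference, `euler_sample` amounts to fixed-step Euler integration of the learned velocity field from noise (t=0) to image (t=1). A hypothetical re-implementation, not the package's exact code:

```python
import torch

@torch.no_grad()
def euler_sample_sketch(model, shape, num_steps=50, device='cpu'):
    """Integrate dx/dt = v_theta(x, t) from t=0 (noise) to t=1 (image)
    with fixed-step Euler. Sketch only; the real euler_sample may differ."""
    x = torch.randn(shape, device=device)           # x_0 ~ N(0, I)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = torch.full((shape[0],), k * dt, device=device)
        x = x + model(x, t) * dt                    # x_{k+1} = x_k + v(x_k, t_k)*dt
    return x

# Toy check with a velocity field that contracts toward zero.
toy_field = lambda x, t: -x
out = euler_sample_sketch(toy_field, (2, 3, 8, 8), num_steps=50)
```

Because flow matching trains v_θ toward the straight-line velocity x₁ − x₀, even coarse step counts (the 50 used above) tend to integrate the ODE accurately; Heun simply adds a predictor-corrector step per iteration.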

📦 File Structure

├── liquidflow/
│   ├── __init__.py          # Package exports
│   ├── model.py             # Core architecture (LiquidFlowNet, LiquidCfCCell, SelectiveSSM)
│   ├── losses.py            # Physics-informed flow matching loss + EMA
│   ├── sampling.py          # Euler & Heun ODE samplers
│   └── train.py             # Full training script with CLI
├── LiquidFlow_Training.ipynb  # 📓 Colab/Kaggle notebook
├── smoke_test.py            # Comprehensive CPU test suite (25 tests)
└── README.md

🔬 Physics-Informed Loss

L = L_flow + λ_smooth · L_smooth + λ_tv · L_tv

| Term | Formula | Purpose |
|---|---|---|
| L_flow | ‖v_θ(xₜ,t) - (x₁-x₀)‖² | Learn straight-line velocity field |
| L_smooth | ‖∇²x_pred‖² (Laplacian) | Penalize high-frequency noise |
| L_tv | ‖∇x_pred‖₁ (total variation) | Edge-preserving smoothness |

Physics loss is warmed up over the first 500 steps.
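Putting the three terms and the warm-up together might look like the sketch below. The weight values, the periodic-boundary Laplacian, and the `x_pred = x0 + v_pred` endpoint convention are assumptions, not the repo's exact code:

```python
import torch
import torch.nn.functional as F

def physics_flow_loss(v_pred, x0, x1, step, lambda_smooth=0.01, lambda_tv=0.01,
                      warmup_steps=500):
    """L = L_flow + lam_smooth * L_smooth + lam_tv * L_tv, with the physics
    terms linearly warmed up over the first `warmup_steps` optimizer steps."""
    # Flow matching: regress the straight-line velocity x1 - x0.
    l_flow = F.mse_loss(v_pred, x1 - x0)

    # Predicted endpoint used by the physics terms (one possible convention).
    x_pred = x0 + v_pred

    # Smoothness: squared discrete Laplacian (periodic boundaries via roll).
    lap = (-4 * x_pred
           + torch.roll(x_pred, 1, -1) + torch.roll(x_pred, -1, -1)
           + torch.roll(x_pred, 1, -2) + torch.roll(x_pred, -1, -2))
    l_smooth = lap.pow(2).mean()

    # Total variation: L1 norm of forward differences, edge-preserving.
    l_tv = ((x_pred[..., :, 1:] - x_pred[..., :, :-1]).abs().mean()
            + (x_pred[..., 1:, :] - x_pred[..., :-1, :]).abs().mean())

    warm = min(step / warmup_steps, 1.0)   # linear warm-up of physics terms
    return l_flow + warm * (lambda_smooth * l_smooth + lambda_tv * l_tv)

x0 = torch.zeros(2, 3, 8, 8)
x1 = torch.zeros(2, 3, 8, 8)
perfect = physics_flow_loss(x1 - x0, x0, x1, step=1000)  # exact velocity
```

Warming the physics terms in lets the model first learn a rough velocity field before smoothness pressure kicks in, which is the stated stability motivation.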

🧪 Recommended Experiments

| Goal | Dataset | Model | Image Size | Epochs | Time (T4) |
|---|---|---|---|---|---|
| Sanity check | CIFAR-10 | tiny | 32 | 20 | ~5 min |
| Baseline | CIFAR-10 | tiny | 128 | 100 | ~2 hrs |
| Quality | Flowers-102 | small | 128 | 200 | ~4 hrs |
| Faces | CelebA | small | 128 | 50 | ~6 hrs |
| High-res | CelebA | 512 | 512 | 100 | ~12 hrs |

📱 Mobile Export

The notebook includes TorchScript and ONNX export cells. The tiny model produces a ~24MB file for on-device inference.

✅ Verified (25/25 smoke tests pass)

  • All 4 model variants: forward pass ✓
  • Backward pass: all parameters receive gradients ✓
  • Gradient health: no NaN, no Inf ✓
  • Loss convergence: finite across optimizer steps ✓
  • Individual components: LiquidCfCCell, SelectiveSSM, LiquidSSMBlock ✓
  • Scan patterns: 4 patterns, all invertible ✓
  • Sampling: Euler + Heun produce finite images ✓
  • EMA: apply/restore cycle ✓
  • Checkpoint: save/load round-trip ✓
  • Physics loss: all terms finite and positive ✓

📚 References

  1. Hasani et al., "Liquid Time-Constant Networks", AAAI 2021 (2006.04439)
  2. Hasani et al., "Closed-form Continuous-depth Models", Nature MI 2022
  3. Gu & Dao, "Mamba: Linear-Time Sequence Modeling", 2023 (2312.00752)
  4. Teng et al., "DiM: Diffusion Mamba", 2024 (2405.14224)
  5. Hu et al., "ZigMa: Zigzag Mamba Diffusion", 2024 (2403.13802)
  6. Lipman et al., "Flow Matching for Generative Modeling", ICLR 2023
  7. Raissi et al., "Physics-Informed Neural Networks", JCP 2019 (1711.10561)
  8. Wang et al., "Gradient Pathologies in PINNs", 2020 (2001.04536)
  9. Bastek & Kochmann, "Physics-Informed Diffusion Models", 2024 (2403.14404)
  10. Zhu et al., "Vision Mamba", 2024 (2401.09417)

License

MIT
