AMD support

by Avaredra - opened 4 days ago

4 days ago

Is AMD support possibly coming? I know the main team doesn't have any consumer AMD cards. On ComfyUI, you have to run it with --cpu-vae, and that takes way too long.

mrfakename

4 days ago

Maybe try without using ComfyUI.

Avaredra

4 days ago

Maybe try without using ComfyUI.

Yeah, I couldn't get the Gradio thing to work, but I think you have to be a Gradio genius to get it working with AMD lol, so my failing doesn't necessarily mean it can't work!

mrfakename

4 days ago

What do you mean? The Gradio setup is relatively straightforward.

bluemoehre

3 days ago

Is AMD support possibly coming? I know the main team doesn't have any consumer AMD cards. On ComfyUI, you have to run it with --cpu-vae, and that takes way too long.

I'm running this on AMD Strix Halo using ROCm 7.2:
https://github.com/IgnatBeresnev/comfyui-gfx1151

If you have a different AMD GPU, it may be compatible, or at least you can use this container as a starting point and adapt it.

The required workflow is already provided within ComfyUI / Workflows / ACE-Step 1.5 Music Generation AIO

urtuuuu

3 days ago

edited 3 days ago

"Standard Installation (All Platforms)" on their GitHub page worked with no problems. And it even runs in CPU-only mode. I don't mind waiting several minutes for this goodness.

And it's even better like this, because I can use my computer while it's generating: CPU usage is only 50%, and there is no VRAM usage.

rnczzz

about 20 hours ago

edited about 17 hours ago

Gradio works fine on an AMD RX 7900 XTX (24 GB VRAM) on Ubuntu 24.04 LTS. Just create a venv with Python 3.11, install torch and torchaudio from the pytorch.org ROCm 7.1 nightly, and after that install the requirements.txt from the git repository. Once done, you can run python -m acestep.acestep_v15_pipeline --server-name 127.0.0.1 --port 7680 and open localhost:7680 in the browser.
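
The setup steps above can be sketched roughly as the script below. This is only a sketch under assumptions: the index URL follows the usual pytorch.org nightly pattern but should be checked against the install selector on pytorch.org, and ROCM_INDEX_URL, VENV_DIR, DRY_RUN, run and setup_steps are hypothetical names made up for illustration. It is meant to be run from the root of the cloned repository, and by default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the venv + ROCm nightly + requirements.txt steps (assumptions marked).
set -eu

# ASSUMPTION: nightly ROCm 7.1 wheel index; confirm the exact URL on pytorch.org.
ROCM_INDEX_URL="${ROCM_INDEX_URL:-https://download.pytorch.org/whl/nightly/rocm7.1}"
VENV_DIR="${VENV_DIR:-venv}"   # hypothetical name for the venv directory
DRY_RUN="${DRY_RUN:-1}"        # 1 = print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

setup_steps() {
  # 1. Create the Python 3.11 virtual environment.
  run python3.11 -m venv "$VENV_DIR"
  # 2. Install the ROCm nightly builds of torch and torchaudio first.
  run "$VENV_DIR/bin/pip" install --pre torch torchaudio --index-url "$ROCM_INDEX_URL"
  # 3. Then install the repository's remaining requirements.
  run "$VENV_DIR/bin/pip" install -r requirements.txt
}

setup_steps
```

Setting DRY_RUN=0 makes it execute the commands instead of printing them. Installing torch before requirements.txt mirrors the order described above.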

torchcodec must be compiled manually (see: https://github.com/vllm-project/vllm/blob/main/tools/install_torchcodec_rocm.sh) if you want to train your own LoRA. I get some errors when training LoRAs atm:
RuntimeError: Input type (CUDABFloat16Type) and weight type (CPUBFloat16Type) should be the same

EDIT: turning off offload to CPU, so that everything stays in VRAM, was the solution.
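
For what it's worth, the error above reads as the input tensor being on the GPU while the weights were still on the CPU, which would fit with CPU offload being involved. ROCm builds of PyTorch expose AMD GPUs under the cuda device name, which is why the message says CUDABFloat16Type on an AMD card. A minimal sketch for checking what the venv's PyTorch actually sees; PYTHON_BIN and check_torch_device are hypothetical names, and the default path assumes the venv layout used in this thread.

```shell
#!/usr/bin/env bash
# Sketch: ask the venv's PyTorch whether it is a ROCm build and sees a GPU.
# PYTHON_BIN is a hypothetical variable; adjust it to your interpreter.
PYTHON_BIN="${PYTHON_BIN:-venv/bin/python}"

check_torch_device() {
  # Falls back to a short message if the interpreter or torch is missing.
  "$PYTHON_BIN" - <<'PY' 2>/dev/null || echo "torch not importable with $PYTHON_BIN"
import torch
# torch.version.hip is None on builds without ROCm support.
print("hip runtime:", torch.version.hip)
# ROCm GPUs are reported through the torch.cuda API.
print("gpu visible:", torch.cuda.is_available())
PY
}

check_torch_device
```

If "gpu visible" prints False, the problem is likely in the ROCm or driver setup rather than in the training settings.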

These are the commands I run; you can also put them in a bash script. It looks like the whole LoRA training works on AMD now, in addition to generating AI music :)

I also had to install python3.11-dev with apt install.

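set -e  # stop at the first failing command

# Environment variables for the ROCm setup
export TORCHCODEC_USE_SYSTEM_FFMPEG=1
export TORCH_CUDNN_ENABLED=0
export PYTORCH_ALLOC_CONF=expandable_segments:True
export PYTORCH_NO_HIP_MEMORY_CACHING=1

# Launch the Gradio server; the trailing backslashes keep this as one command
venv/bin/python -m acestep.acestep_v15_pipeline \
  --server-name 0.0.0.0 \
  --port 7680 \
  --init_service True \
  --device cuda \
  --backend pt \
  --offload_dit_to_cpu True \
  --offload_to_cpu False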

LoRA training pre-processing of two songs in the dataset builder tab took half an hour; training the LoRA will probably finish after about 1h30m for 1000 steps.

sysia48

about 11 hours ago

No, this is the end!
