ComfyUI RunPod Serverless Project
Project Overview
Build a RunPod Serverless endpoint running ComfyUI with SageAttention 2.2 for image/video generation. A self-hosted frontend will call the RunPod API.
Architecture
- RunPod Serverless: Hosts ComfyUI worker with GPU inference
- Network Volume: Mounts at /userdata
- Gitea Registry: Hosts Docker image
- Frontend: Self-hosted on home server, calls RunPod API over HTTPS
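For reference, a frontend call to the serverless endpoint looks roughly like the sketch below (endpoint ID, API key variable, and payload values are placeholders; RunPod delivers everything under the input key to the handler):

curl -s -X POST "https://api.runpod.ai/v2/<ENDPOINT_ID>/runsync" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "a test prompt", "workflow": {}}}'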
Reference Environment (extracted from working pod)
Base System
- Ubuntu 22.04.5 LTS (Jammy)
- Python 3.12.12
- CUDA 12.8 (nvcc 12.8.93)
- cuDNN 9.8.0.87
- NCCL 2.25.1
PyTorch Stack
- torch==2.8.0+cu128
- torchvision==0.23.0+cu128
- torchaudio==2.8.0+cu128
- triton==3.4.0
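One way to pin this stack in the image (a sketch; the +cu128 local version tags come from installing against the PyTorch cu128 wheel index, and triton 3.4.0 is normally pulled in as a torch dependency, shown only to make the pin explicit):

pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 \
    --index-url https://download.pytorch.org/whl/cu128
pip install triton==3.4.0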
Key Dependencies
- transformers==4.56.2
- diffusers==0.35.2
- accelerate==1.11.0
- safetensors==0.6.2
- onnxruntime-gpu==1.23.2
- opencv-python==4.12.0.88
- mediapipe==0.10.14
- insightface==0.7.3
- spandrel==0.4.1
- kornia==0.8.2
- einops==0.8.1
- timm==1.0.22
- peft==0.17.1
- gguf==0.17.1
- av==16.0.1 (video)
- imageio-ffmpeg==0.6.0
Nunchaku (prebuilt wheel)
nunchaku @ https://github.com/nunchaku-tech/nunchaku/releases/download/v1.0.2/nunchaku-1.0.2+torch2.8-cp312-cp312-linux_x86_64.whl
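pip accepts the release URL directly, so the Dockerfile can install it with e.g.:

pip install "https://github.com/nunchaku-tech/nunchaku/releases/download/v1.0.2/nunchaku-1.0.2+torch2.8-cp312-cp312-linux_x86_64.whl"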
ComfyUI
- Location: /workspace/ComfyUI
- Uses venv at /workspace/ComfyUI/venv
- Commit: 532e2850794c7b497174a0a42ac0cb1fe5b62499 (Dec 24, 2025)
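To reproduce the pinned revision, something along these lines should work (the /workspace prefix matches the reference pod; the serverless image may use a different path):

git clone https://github.com/comfyanonymous/ComfyUI.git /workspace/ComfyUI
cd /workspace/ComfyUI
git checkout 532e2850794c7b497174a0a42ac0cb1fe5b62499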
Custom Nodes (from CUSTOM_NODES env var + actual install)
ltdrdata/ComfyUI-Manager
jnxmx/ComfyUI_HuggingFace_Downloader
kijai/ComfyUI-KJNodes
Fannovel16/comfyui_controlnet_aux
crystian/ComfyUI-Crystools
Kosinkadink/ComfyUI-VideoHelperSuite
willmiao/ComfyUI-Lora-Manager
city96/ComfyUI-GGUF
Fannovel16/ComfyUI-Frame-Interpolation
nunchaku-tech/ComfyUI-nunchaku
evanspearman/ComfyMath
ssitu/ComfyUI_UltimateSDUpscale
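A possible shape for scripts/install_custom_nodes.sh (a sketch; COMFY_DIR is an assumed install path, and each node's requirements.txt is installed when present):

#!/usr/bin/env bash
set -euo pipefail

COMFY_DIR="${COMFY_DIR:-/workspace/ComfyUI}"   # assumed ComfyUI install path
NODES=(
  ltdrdata/ComfyUI-Manager
  jnxmx/ComfyUI_HuggingFace_Downloader
  kijai/ComfyUI-KJNodes
  Fannovel16/comfyui_controlnet_aux
  crystian/ComfyUI-Crystools
  Kosinkadink/ComfyUI-VideoHelperSuite
  willmiao/ComfyUI-Lora-Manager
  city96/ComfyUI-GGUF
  Fannovel16/ComfyUI-Frame-Interpolation
  nunchaku-tech/ComfyUI-nunchaku
  evanspearman/ComfyMath
  ssitu/ComfyUI_UltimateSDUpscale
)

for repo in "${NODES[@]}"; do
  name="${repo##*/}"
  dir="$COMFY_DIR/custom_nodes/$name"
  # --recursive covers nodes that vendor code via git submodules
  [ -d "$dir" ] || git clone --depth 1 --recursive "https://github.com/$repo.git" "$dir"
  if [ -f "$dir/requirements.txt" ]; then
    pip install -r "$dir/requirements.txt"
  fi
done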
Environment Variables (relevant)
HF_HOME=/workspace/.cache/huggingface
HF_HUB_ENABLE_HF_TRANSFER=1
TRANSFORMERS_CACHE=/workspace/.cache/huggingface/transformers
PYTHONUNBUFFERED=1
LD_LIBRARY_PATH=/usr/local/cuda/lib64
LIBRARY_PATH=/usr/local/cuda/lib64/stubs
Network Volume Mount
- Mount point: /userdata
Technical Requirements
SageAttention 2.2 (Critical)
Must be compiled from source with no build isolation:
# PyTorch with CUDA support must already be installed; --no-build-isolation
# compiles the extension against the existing environment
git clone https://github.com/thu-ml/SageAttention.git
cd SageAttention
pip install triton
export EXT_PARALLEL=4 NVCC_APPEND_FLAGS="--threads 8" MAX_JOBS=32
pip install --no-build-isolation -e .
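A quick smoke test after the build (assumes the package exposes sageattn as in the upstream README):

python -c "from sageattention import sageattn; print('sageattention OK')"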
Network Volume Structure
/userdata/
├── models/
│ ├── checkpoints/
│ ├── loras/
│ ├── vae/
│ ├── controlnet/
│ ├── clip/
│ └── upscale_models/
└── .cache/
└── huggingface/
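A minimal symlink sketch for the image build or start script (ComfyUI path assumed from the reference pod; /userdata only exists once the network volume is attached, so the links dangle until runtime):

COMFY_DIR="${COMFY_DIR:-/workspace/ComfyUI}"   # assumed ComfyUI install path
for d in checkpoints loras vae controlnet clip upscale_models; do
  rm -rf "$COMFY_DIR/models/$d"
  ln -sfn "/userdata/models/$d" "$COMFY_DIR/models/$d"
done
# Optional: point the HF cache at the volume so HF_HOME resolves there
mkdir -p /workspace && rm -rf /workspace/.cache
ln -s /userdata/.cache /workspace/.cache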
Handler Requirements
- Accept JSON input: {"image": "base64", "prompt": "string", "workflow": {}}
- Upload the image to ComfyUI if one is provided
- Inject prompt into workflow at specified node
- Queue the workflow and poll for completion (see the API sketch after this list)
- Return output as base64:
- Images: PNG/JPEG base64
- Videos: MP4 base64 (or presigned URL if >10MB)
- Detect output type from workflow output node
- Timeout handling (max 600s for video generation)
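For reference, the queue/poll flow the handler wraps maps onto ComfyUI's HTTP API roughly as below (a sketch against the default 127.0.0.1:8188 listener; jq is assumed available and the output filename is illustrative):

# Queue a workflow in API format and capture the prompt id
PROMPT_ID=$(curl -s -X POST http://127.0.0.1:8188/prompt \
  -H "Content-Type: application/json" \
  -d "{\"prompt\": $(cat workflows/default_workflow_api.json)}" | jq -r '.prompt_id')

# Poll history until outputs appear (the real handler also enforces the 600 s timeout)
until curl -s "http://127.0.0.1:8188/history/$PROMPT_ID" \
    | jq -e --arg id "$PROMPT_ID" '.[$id].outputs' > /dev/null; do
  sleep 2
done

# Fetch a finished output file; filename/subfolder come from the history entry
curl -s "http://127.0.0.1:8188/view?filename=ComfyUI_00001_.png&type=output" -o out.png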
Dockerfile Requirements
- Base: nvidia/cuda:12.8.1-devel-ubuntu22.04 (or equivalent with CUDA 12.8 devel)
- Python 3.12
- PyTorch 2.8.0+cu128 from pytorch index
- Install nunchaku from GitHub wheel
- Compile SageAttention with --no-build-isolation
- Symlink model directories to /userdata
- Clone and install all custom nodes
- Install ffmpeg for video handling
- Expose handler as entrypoint
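Once the image builds, pushing to the self-hosted Gitea registry is a standard docker push (registry host, owner, and tag are placeholders):

docker build -t gitea.example.lan/<owner>/comfyui-runpod-worker:latest .
docker login gitea.example.lan
docker push gitea.example.lan/<owner>/comfyui-runpod-worker:latest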
File Structure
/project
├── Dockerfile
├── handler.py
├── requirements.txt
├── scripts/
│ └── install_custom_nodes.sh
├── workflows/
│ └── default_workflow_api.json
└── README.md
Tasks
- Create Dockerfile matching reference environment (CUDA 12.8, Python 3.12, PyTorch 2.8)
- Create requirements.txt from extracted pip freeze (pruned to essentials)
- Create install_custom_nodes.sh for all listed custom nodes
- Create handler.py with ComfyUI API integration (image + video output support)
- Document deployment steps in README.md
Notes
- Nick is a Principal Systems Engineer and prefers direct technical communication
- Target deployment: RunPod Serverless with 5090 GPU
- Development machine: RTX 3080 (forward compatible)
- Registry: Self-hosted Gitea
- Output will likely be video; ensure ffmpeg is installed and the handler detects the output type
- Reference pod uses venv - serverless image can install globally
Claude Code Init Command
Read PROJECT.md fully. Build the Dockerfile first, matching the reference environment exactly: CUDA 12.8.1, Python 3.12, PyTorch 2.8.0+cu128, triton 3.4.0. Install nunchaku from the GitHub wheel URL. Compile SageAttention 2.2 with --no-build-isolation. Install all custom nodes listed. Symlink model paths to /userdata. Do not use a venv in the container.