Nick a5adfe060e Initial commit: ComfyUI RunPod Serverless endpoint
- Dockerfile with CUDA 12.8.1, Python 3.12, PyTorch 2.8.0+cu128
- SageAttention 2.2 compiled from source
- Nunchaku wheel installation
- 12 custom nodes pre-installed
- Handler with image/video output support
- Model symlinks to /userdata network volume

2025-12-25 21:59:09 +13:00


ComfyUI RunPod Serverless Project

Project Overview

Build a RunPod Serverless endpoint running ComfyUI with SageAttention 2.2 for image/video generation. A self-hosted frontend will call the RunPod API.

Architecture

  • RunPod Serverless: Hosts ComfyUI worker with GPU inference
  • Network Volume: Mounts at /userdata
  • Gitea Registry: Hosts Docker image
  • Frontend: Self-hosted on home server, calls RunPod API over HTTPS

Reference Environment (extracted from working pod)

Base System

  • Ubuntu 22.04.5 LTS (Jammy)
  • Python 3.12.12
  • CUDA 12.8 (nvcc 12.8.93)
  • cuDNN 9.8.0.87
  • NCCL 2.25.1

PyTorch Stack

  • torch==2.8.0+cu128
  • torchvision==0.23.0+cu128
  • torchaudio==2.8.0+cu128
  • triton==3.4.0

Key Dependencies

  • transformers==4.56.2
  • diffusers==0.35.2
  • accelerate==1.11.0
  • safetensors==0.6.2
  • onnxruntime-gpu==1.23.2
  • opencv-python==4.12.0.88
  • mediapipe==0.10.14
  • insightface==0.7.3
  • spandrel==0.4.1
  • kornia==0.8.2
  • einops==0.8.1
  • timm==1.0.22
  • peft==0.17.1
  • gguf==0.17.1
  • av==16.0.1 (video)
  • imageio-ffmpeg==0.6.0

Nunchaku (prebuilt wheel)

nunchaku @ https://github.com/nunchaku-tech/nunchaku/releases/download/v1.0.2/nunchaku-1.0.2+torch2.8-cp312-cp312-linux_x86_64.whl

ComfyUI

  • Location: /workspace/ComfyUI
  • Uses venv at /workspace/ComfyUI/venv
  • Commit: 532e2850794c7b497174a0a42ac0cb1fe5b62499 (Dec 24, 2025)

Custom Nodes (from the CUSTOM_NODES env var, cross-checked against the actual install)

ltdrdata/ComfyUI-Manager
jnxmx/ComfyUI_HuggingFace_Downloader
kijai/ComfyUI-KJNodes
Fannovel16/comfyui_controlnet_aux
crystian/ComfyUI-Crystools
Kosinkadink/ComfyUI-VideoHelperSuite
willmiao/ComfyUI-Lora-Manager
city96/ComfyUI-GGUF
Fannovel16/ComfyUI-Frame-Interpolation
nunchaku-tech/ComfyUI-nunchaku
evanspearman/ComfyMath
ssitu/ComfyUI_UltimateSDUpscale
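
The install script (task 3) reduces to a loop over these repo slugs. A minimal Python sketch of that logic; the destination path, shallow-clone flag, and per-node requirements handling are assumptions, not extracted from the reference pod:

```python
# Sketch of install_custom_nodes.sh logic. The repo list mirrors the
# nodes above; the destination path is an assumption.
import os
import subprocess

CUSTOM_NODES = [
    "ltdrdata/ComfyUI-Manager",
    "jnxmx/ComfyUI_HuggingFace_Downloader",
    "kijai/ComfyUI-KJNodes",
    "Fannovel16/comfyui_controlnet_aux",
    "crystian/ComfyUI-Crystools",
    "Kosinkadink/ComfyUI-VideoHelperSuite",
    "willmiao/ComfyUI-Lora-Manager",
    "city96/ComfyUI-GGUF",
    "Fannovel16/ComfyUI-Frame-Interpolation",
    "nunchaku-tech/ComfyUI-nunchaku",
    "evanspearman/ComfyMath",
    "ssitu/ComfyUI_UltimateSDUpscale",
]

def clone_command(repo: str, dest: str = "/workspace/ComfyUI/custom_nodes") -> list[str]:
    """Build the git clone invocation for one owner/name repo slug."""
    name = repo.split("/")[1]
    return ["git", "clone", "--depth=1",
            f"https://github.com/{repo}.git", os.path.join(dest, name)]

def install_all(dest: str = "/workspace/ComfyUI/custom_nodes") -> None:
    for repo in CUSTOM_NODES:
        subprocess.run(clone_command(repo, dest), check=True)
        req = os.path.join(dest, repo.split("/")[1], "requirements.txt")
        if os.path.exists(req):  # install per-node Python deps if present
            subprocess.run(["pip", "install", "-r", req], check=True)
```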

Environment Variables (relevant)

HF_HOME=/workspace/.cache/huggingface
HF_HUB_ENABLE_HF_TRANSFER=1
TRANSFORMERS_CACHE=/workspace/.cache/huggingface/transformers
PYTHONUNBUFFERED=1
LD_LIBRARY_PATH=/usr/local/cuda/lib64
LIBRARY_PATH=/usr/local/cuda/lib64/stubs

Network Volume Mount

  • Mount point: /userdata

Technical Requirements

SageAttention 2.2 (Critical)

Must be compiled from source with no build isolation:

git clone https://github.com/thu-ml/SageAttention.git
cd SageAttention
pip install triton  # build-time dependency (triton 3.4.0 already ships with the torch stack)
export EXT_PARALLEL=4 NVCC_APPEND_FLAGS="--threads 8" MAX_JOBS=32  # parallelize the nvcc build
pip install --no-build-isolation -e .  # must see the already-installed torch/triton at build time

Network Volume Structure

/userdata/
├── models/
│   ├── checkpoints/
│   ├── loras/
│   ├── vae/
│   ├── controlnet/
│   ├── clip/
│   └── upscale_models/
└── .cache/
    └── huggingface/
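
The model symlinks from the Dockerfile requirements can be set up against this layout at container start. A minimal sketch, assuming ComfyUI lives at /workspace/ComfyUI and the subdirectory names above:

```python
# Sketch: replace ComfyUI's stock model directories with symlinks into
# the network volume. Paths follow the layout above; adjust as needed.
import os

MODEL_DIRS = ["checkpoints", "loras", "vae", "controlnet", "clip", "upscale_models"]

def link_models(comfy_models: str = "/workspace/ComfyUI/models",
                userdata: str = "/userdata") -> None:
    for name in MODEL_DIRS:
        target = os.path.join(userdata, "models", name)
        link = os.path.join(comfy_models, name)
        os.makedirs(target, exist_ok=True)  # ensure the volume dir exists
        if os.path.islink(link):
            continue                        # already linked (warm container)
        if os.path.isdir(link):
            os.rmdir(link)                  # stock dirs ship empty
        os.symlink(target, link)
```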

Handler Requirements

  • Accept JSON input: {"image": "base64", "prompt": "string", "workflow": {}}
  • Image upload to ComfyUI if provided
  • Inject prompt into workflow at specified node
  • Queue workflow, poll for completion
  • Return output as base64:
    • Images: PNG/JPEG base64
    • Videos: MP4 base64 (or presigned URL if >10MB)
  • Detect output type from workflow output node
  • Timeout handling (max 600s for video generation)
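
The prompt-injection and queue/poll steps above can be sketched as follows, assuming ComfyUI's HTTP API (POST /prompt, GET /history/<id>) on its default port 8188. Image upload, output decoding, and the >10MB presigned-URL path are elided; the node-id convention is illustrative:

```python
# Minimal sketch of the handler's inject -> queue -> poll loop.
import json
import time
import urllib.request

COMFY = "http://127.0.0.1:8188"

def inject_prompt(workflow: dict, node_id: str, text: str) -> dict:
    """Overwrite the text input of one node; returns a deep copy."""
    wf = json.loads(json.dumps(workflow))  # deep copy via JSON round-trip
    wf[node_id]["inputs"]["text"] = text
    return wf

def queue_workflow(workflow: dict) -> str:
    """POST the workflow to ComfyUI and return its prompt_id."""
    req = urllib.request.Request(
        f"{COMFY}/prompt",
        data=json.dumps({"prompt": workflow}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["prompt_id"]

def wait_for_outputs(prompt_id: str, timeout: float = 600.0) -> dict:
    """Poll /history until the job appears or the video timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"{COMFY}/history/{prompt_id}") as resp:
            history = json.load(resp)
        if prompt_id in history:  # finished jobs show up in history
            return history[prompt_id]["outputs"]
        time.sleep(1.0)
    raise TimeoutError(f"workflow {prompt_id} exceeded {timeout}s")
```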

Dockerfile Requirements

  • Base: nvidia/cuda:12.8.1-devel-ubuntu22.04 (or equivalent with CUDA 12.8 devel)
  • Python 3.12
  • PyTorch 2.8.0+cu128 from pytorch index
  • Install nunchaku from GitHub wheel
  • Compile SageAttention with --no-build-isolation
  • Symlink model directories to /userdata
  • Clone and install all custom nodes
  • Install ffmpeg for video handling
  • Expose handler as entrypoint
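
The entrypoint step can be sketched with the RunPod serverless SDK; the validation logic is illustrative, and the queue/poll internals are elided:

```python
# Sketch of the serverless entrypoint. handler() validates input and is
# testable without a GPU; actual generation is elided.
def handler(event: dict) -> dict:
    job = event.get("input", {})
    if "workflow" not in job:
        return {"error": "missing required field: workflow"}
    # upload optional image, inject optional prompt, queue and poll ...
    return {"status": "queued",
            "has_image": "image" in job,
            "has_prompt": "prompt" in job}

if __name__ == "__main__":
    import runpod  # RunPod serverless SDK
    runpod.serverless.start({"handler": handler})
```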

File Structure

/project
├── Dockerfile
├── handler.py
├── requirements.txt
├── scripts/
│   └── install_custom_nodes.sh
├── workflows/
│   └── default_workflow_api.json
└── README.md

Tasks

  1. Create Dockerfile matching reference environment (CUDA 12.8, Python 3.12, PyTorch 2.8)
  2. Create requirements.txt from extracted pip freeze (pruned to essentials)
  3. Create install_custom_nodes.sh for all listed custom nodes
  4. Create handler.py with ComfyUI API integration (image + video output support)
  5. Document deployment steps in README.md

Notes

  • Nick is a Principal Systems Engineer and prefers direct technical communication
  • Target deployment: RunPod Serverless with 5090 GPU
  • Development machine: RTX 3080 (forward compatible)
  • Registry: Self-hosted Gitea
  • Output will likely be video - ensure ffmpeg is installed and the handler detects the output type
  • Reference pod uses venv - serverless image can install globally

Claude Code Init Command

Read PROJECT.md fully. Build the Dockerfile first, matching the reference environment exactly: CUDA 12.8.1, Python 3.12, PyTorch 2.8.0+cu128, triton 3.4.0. Install nunchaku from the GitHub wheel URL. Compile SageAttention 2.2 with --no-build-isolation. Install all custom nodes listed. Symlink model paths to /userdata. Do not use a venv in the container.