Blackwell (sm_100) may not be fully supported yet.
ComfyUI RunPod Serverless
RunPod Serverless endpoint for ComfyUI with SageAttention 2.2, supporting image and video generation workflows.
Stack
- CUDA 12.8.1 / Ubuntu 22.04
- Python 3.12
- PyTorch 2.8.0+cu128
- SageAttention 2.2 (compiled)
- Nunchaku 1.0.2
- 12 custom nodes pre-installed
Prerequisites
- Docker with NVIDIA runtime
- RunPod account with API key
- Network volume created in RunPod
- Container registry (Docker Hub, Gitea, etc.)
Build
docker build -t comfyui-runpod:latest .
Build for a specific platform (for example, when building on an ARM host):
docker build --platform linux/amd64 -t comfyui-runpod:latest .
Push to Registry
Docker Hub:
docker tag comfyui-runpod:latest yourusername/comfyui-runpod:latest
docker push yourusername/comfyui-runpod:latest
Self-hosted Gitea:
docker tag comfyui-runpod:latest git.yourdomain.com/username/comfyui-runpod:latest
docker push git.yourdomain.com/username/comfyui-runpod:latest
Network Volume Setup
Create a network volume in RunPod and populate it with models:
/userdata/
├── models/
│   ├── checkpoints/      # SD, SDXL, Flux models
│   ├── loras/            # LoRA models
│   ├── vae/              # VAE models
│   ├── controlnet/       # ControlNet models
│   ├── clip/             # CLIP models
│   └── upscale_models/   # Upscaler models
└── .cache/
    └── huggingface/      # HF model cache
Upload models to the network volume, via a RunPod pod or rclone, before deploying the serverless endpoint.
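A quick way to confirm the volume matches the layout above is to check for each directory from a pod that has the volume mounted. This is a minimal sketch; the function names `missing_dirs` and `create_layout` are illustrative and not part of this repository.

```python
from pathlib import Path

# Subdirectories from the layout above, relative to the volume root.
EXPECTED_DIRS = [
    "models/checkpoints",
    "models/loras",
    "models/vae",
    "models/controlnet",
    "models/clip",
    "models/upscale_models",
    ".cache/huggingface",
]


def missing_dirs(volume_root: str) -> list[str]:
    """Return the expected subdirectories that do not exist under volume_root."""
    root = Path(volume_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


def create_layout(volume_root: str) -> None:
    """Create any missing directories; existing ones are left untouched."""
    root = Path(volume_root)
    for d in EXPECTED_DIRS:
        (root / d).mkdir(parents=True, exist_ok=True)
```

For example, `missing_dirs("/userdata")` run inside a pod would list whatever still needs to be created before uploading models.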
RunPod Deployment
- Go to RunPod Console > Serverless > New Endpoint
- Configure endpoint:
  - Container Image: yourusername/comfyui-runpod:latest
  - GPU: RTX 4090, RTX 5090, or A100 recommended
  - Network Volume: Select your volume (mounts at /userdata)
  - Active Workers: 0 (scale to zero)
  - Max Workers: Based on budget
  - Idle Timeout: 5-10 seconds
  - Execution Timeout: 600 seconds (for video)
- Deploy and note the Endpoint ID
API Usage
Endpoint URL
https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync
Headers
Authorization: Bearer {RUNPOD_API_KEY}
Content-Type: application/json
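The URL and headers above can be assembled in Python with only the standard library. The sketch below builds the request object without sending it; `build_request` is a hypothetical helper name, and the `route` argument defaults to the `runsync` path shown above.

```python
import json
import urllib.request

API_BASE = "https://api.runpod.ai/v2"


def build_request(endpoint_id: str, api_key: str, payload: dict,
                  route: str = "runsync") -> urllib.request.Request:
    """Build (but do not send) a POST request for the endpoint."""
    return urllib.request.Request(
        url=f"{API_BASE}/{endpoint_id}/{route}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is then a matter of `urllib.request.urlopen(req, timeout=...)` and parsing the JSON body of the response.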
Request Schema
{
"input": {
"workflow": {},
"prompt": "optional prompt text",
"image": "optional base64 image",
"prompt_node_id": "optional node id for prompt",
"image_node_id": "optional node id for image",
"timeout": 300
}
}
Example: Text-to-Image
curl -X POST "https://api.runpod.ai/v2/${ENDPOINT_ID}/runsync" \
-H "Authorization: Bearer ${RUNPOD_API_KEY}" \
-H "Content-Type: application/json" \
-d '{
"input": {
"workflow": '"$(cat workflow_api.json)"',
"prompt": "a photo of a cat in space"
}
}'
Example: Image-to-Video
curl -X POST "https://api.runpod.ai/v2/${ENDPOINT_ID}/runsync" \
-H "Authorization: Bearer ${RUNPOD_API_KEY}" \
-H "Content-Type: application/json" \
-d '{
"input": {
"workflow": '"$(cat i2v_workflow_api.json)"',
"image": "'"$(base64 -w0 input.png)"'",
"prompt": "the cat walks forward",
"timeout": 600
}
}'
Response Schema
Success:
{
"id": "job-id",
"status": "COMPLETED",
"output": {
"status": "success",
"prompt_id": "abc123",
"outputs": [
{
"type": "video",
"filename": "output.mp4",
"data": "base64...",
"size": 1234567
}
]
}
}
Large files (>10MB video):
{
"outputs": [
{
"type": "video",
"filename": "output.mp4",
"path": "/userdata/outputs/output.mp4",
"size": 52428800
}
]
}
Error:
{
"output": {
"error": "error message",
"status": "error"
}
}
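A client has to handle all three response shapes. The sketch below assumes that the large-file `outputs` list is nested under `output` like the inline case (the example above is abbreviated), and that a `path` entry refers to a location on the network volume rather than on the client. `save_outputs` is a hypothetical helper name.

```python
import base64
from pathlib import Path


def save_outputs(response: dict, dest_dir: str) -> list[str]:
    """Write inline outputs to dest_dir; return local paths or volume paths.

    Raises RuntimeError if the response carries an error.
    """
    output = response.get("output") or {}
    if output.get("status") == "error" or "error" in output:
        raise RuntimeError(output.get("error", "unknown error"))

    results = []
    for item in output.get("outputs", []):
        if "data" in item:
            # Inline result: decode base64 and write it locally.
            # Only the file name is used, so nothing is written outside dest_dir.
            target = Path(dest_dir) / Path(item["filename"]).name
            target.write_bytes(base64.b64decode(item["data"]))
            results.append(str(target))
        elif "path" in item:
            # Large result: the file stays on the network volume.
            results.append(item["path"])
    return results
```

Files reported by `path` still need to be fetched from the volume separately, for example from a pod that mounts it.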
Async Execution
For long-running video jobs, use the async endpoint:
# Submit job
curl -X POST "https://api.runpod.ai/v2/${ENDPOINT_ID}/run" \
-H "Authorization: Bearer ${RUNPOD_API_KEY}" \
-H "Content-Type: application/json" \
-d '{"input": {...}}'
# Response: {"id": "job-id", "status": "IN_QUEUE"}
# Poll for result
curl "https://api.runpod.ai/v2/${ENDPOINT_ID}/status/${JOB_ID}" \
-H "Authorization: Bearer ${RUNPOD_API_KEY}"
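The submit-then-poll pattern above can be wrapped in a small loop. In this sketch the status lookup is passed in as a callable so the loop itself stays independent of the HTTP client. The terminal status names other than `COMPLETED` are assumptions based on RunPod's job states and should be checked against the RunPod documentation; `wait_for_job` is an illustrative name.

```python
import time
from typing import Callable

# Assumed terminal job states; only COMPLETED and IN_QUEUE appear in this README.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}


def wait_for_job(fetch_status: Callable[[], dict],
                 interval: float = 2.0,
                 max_wait: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status() until the job reaches a terminal state.

    fetch_status should return the parsed JSON from the /status/{JOB_ID} call.
    Raises TimeoutError once max_wait seconds of sleeping have elapsed.
    """
    waited = 0.0
    while True:
        result = fetch_status()
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if waited >= max_wait:
            raise TimeoutError(f"job still {result.get('status')} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

The caller supplies `fetch_status`, for instance a lambda that performs the GET request shown above and returns the decoded JSON.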
Workflow Export
Export workflows from ComfyUI in API format:
- Open ComfyUI
- Enable Dev Mode in settings
- Click "Save (API Format)"
- Use the exported JSON as the workflow parameter
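A common mistake is submitting the regular UI export instead of the API export. ComfyUI's API format is a flat object keyed by node ID, where each node has `class_type` and `inputs`; the UI format instead has top-level lists such as `nodes` and `links`. The check below is a rough sketch of that distinction, not a full validator, and `looks_like_api_format` is an illustrative name.

```python
import json


def looks_like_api_format(workflow) -> bool:
    """Return True if every top-level entry looks like an API-format node."""
    if not isinstance(workflow, dict) or not workflow:
        return False
    return all(
        isinstance(node, dict) and "class_type" in node and "inputs" in node
        for node in workflow.values()
    )


def load_workflow(path: str) -> dict:
    """Load a workflow file and reject anything that is not in API format."""
    with open(path, encoding="utf-8") as fh:
        workflow = json.load(fh)
    if not looks_like_api_format(workflow):
        raise ValueError(f"{path} does not look like an API-format export")
    return workflow
```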
Custom Nodes Included
- ComfyUI-Manager
- ComfyUI_HuggingFace_Downloader
- ComfyUI-KJNodes
- comfyui_controlnet_aux
- ComfyUI-Crystools
- ComfyUI-VideoHelperSuite
- ComfyUI-Lora-Manager
- ComfyUI-GGUF
- ComfyUI-Frame-Interpolation
- ComfyUI-nunchaku
- ComfyMath
- ComfyUI_UltimateSDUpscale
Troubleshooting
Cold Start Timeout
The first request has to start the ComfyUI server (~30-60s). Increase the idle timeout so workers stay warm longer, or set Active Workers above 0.
Out of Memory
Reduce the batch size or resolution in the workflow. For large models, use GGUF-quantized variants.
Model Not Found
Ensure models are uploaded to the correct /userdata/models/ subdirectory, matching the ComfyUI folder structure.
Video Generation Timeout
Default max is 600s. For longer videos, split the job into segments, or reduce the resolution or frame count.
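Splitting a long video into segments comes down to dividing the frame range into chunks that each finish within the execution timeout. The sketch below only computes the ranges; the per-segment frame budget is a hypothetical value to be measured for a given workflow and GPU, and how each range is fed into the workflow depends on the nodes used.

```python
def split_frames(total_frames: int, max_per_segment: int) -> list[tuple[int, int]]:
    """Return half-open (start, end) frame ranges covering total_frames."""
    if max_per_segment <= 0:
        raise ValueError("max_per_segment must be positive")
    return [
        (start, min(start + max_per_segment, total_frames))
        for start in range(0, total_frames, max_per_segment)
    ]


if __name__ == "__main__":
    # 100 frames with a hypothetical budget of 48 frames per job.
    print(split_frames(100, 48))
```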
Connection Refused
ComfyUI server may have crashed. Check logs in RunPod console. Ensure workflow is valid.
Local Testing
# Build
docker build -t comfyui-runpod:latest .
# Run with GPU
docker run --gpus all -p 8188:8188 \
-v /path/to/models:/userdata/models \
comfyui-runpod:latest
# Test handler
curl -X POST http://localhost:8188/runsync \
-H "Content-Type: application/json" \
-d '{"input": {"workflow": {...}}}'
Environment Variables
| Variable | Default | Description |
|---|---|---|
| HF_HOME | /workspace/.cache/huggingface | HuggingFace cache |
| HF_HUB_ENABLE_HF_TRANSFER | 1 | Fast HF downloads |
| PYTHONUNBUFFERED | 1 | Realtime logs |
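To see which values are actually in effect inside a running container, the defaults from the table can be compared against the process environment. This is a small sketch; `effective_env` is an illustrative name, and `DEFAULTS` simply mirrors the table above.

```python
import os

# Defaults as listed in the table above.
DEFAULTS = {
    "HF_HOME": "/workspace/.cache/huggingface",
    "HF_HUB_ENABLE_HF_TRANSFER": "1",
    "PYTHONUNBUFFERED": "1",
}


def effective_env(environ=None) -> dict:
    """Return each variable's value from environ, falling back to the default."""
    env = os.environ if environ is None else environ
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}


if __name__ == "__main__":
    for name, value in effective_env().items():
        print(f"{name}={value}")
```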