# ComfyUI RunPod Serverless

RunPod Serverless endpoint for ComfyUI with SageAttention 2.2, supporting image and video generation workflows.

## Stack

- CUDA 12.8.1 / Ubuntu 22.04
- Python 3.12
- PyTorch 2.8.0+cu128
- SageAttention 2.2 (compiled)
- Nunchaku 1.0.2
- 12 custom nodes pre-installed

## Prerequisites

- Docker with NVIDIA runtime
- RunPod account with API key
- Network volume created in RunPod
- Container registry (Docker Hub, Gitea, etc.)

## Build

```bash
docker build -t comfyui-runpod:latest .
```

When building on an ARM host (for example Apple Silicon), set the target platform explicitly:

```bash
docker build --platform linux/amd64 -t comfyui-runpod:latest .
```

## Push to Registry

Docker Hub:

```bash
docker tag comfyui-runpod:latest yourusername/comfyui-runpod:latest
docker push yourusername/comfyui-runpod:latest
```

Self-hosted Gitea:

```bash
docker tag comfyui-runpod:latest git.yourdomain.com/username/comfyui-runpod:latest
docker push git.yourdomain.com/username/comfyui-runpod:latest
```

## Network Volume Setup

Create a network volume in RunPod and populate it with models:

```
/userdata/
├── models/
│   ├── checkpoints/     # SD, SDXL, Flux models
│   ├── loras/           # LoRA models
│   ├── vae/             # VAE models
│   ├── controlnet/      # ControlNet models
│   ├── clip/            # CLIP models
│   └── upscale_models/  # Upscaler models
└── .cache/
    └── huggingface/     # HF model cache
```

Upload models to the network volume, either from a RunPod pod or with rclone, before deploying the serverless endpoint.
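
As a convenience, the directory layout above can be created in one step from a pod that has the volume attached. The following is a minimal sketch; `VOLUME_DIRS` and `ensure_layout` are hypothetical names introduced here, not part of this repository.

```python
from pathlib import Path

# Subdirectories taken from the layout above, relative to the volume root.
VOLUME_DIRS = [
    "models/checkpoints",
    "models/loras",
    "models/vae",
    "models/controlnet",
    "models/clip",
    "models/upscale_models",
    ".cache/huggingface",
]


def ensure_layout(root):
    """Create any missing layout directories under root and return them all."""
    paths = []
    for rel in VOLUME_DIRS:
        path = Path(root) / rel
        # exist_ok=True makes the call safe to repeat on a populated volume.
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths
```

Calling `ensure_layout("/userdata")` on the pod leaves existing files untouched and only adds the directories that are missing.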

## RunPod Deployment

1. Go to RunPod Console > Serverless > New Endpoint

2. Configure endpoint:
   - **Container Image**: `yourusername/comfyui-runpod:latest`
   - **GPU**: RTX 4090, RTX 5090, or A100 recommended
   - **Network Volume**: Select your volume (mounts at `/userdata`)
   - **Active Workers**: 0 (scale to zero)
   - **Max Workers**: Based on budget
   - **Idle Timeout**: 5-10 seconds
   - **Execution Timeout**: 600 seconds (for video)

3. Deploy and note the Endpoint ID

## API Usage

### Endpoint URL

```
https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync
```

### Headers

```
Authorization: Bearer {RUNPOD_API_KEY}
Content-Type: application/json
```

### Request Schema

```json
{
  "input": {
    "workflow": {},
    "prompt": "optional prompt text",
    "image": "optional base64 image",
    "prompt_node_id": "optional node id for prompt",
    "image_node_id": "optional node id for image",
    "timeout": 300
  }
}
```
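
A client can assemble this body programmatically. The sketch below assumes the handler accepts requests in which unused optional fields are simply left out, as the examples in this README do; `build_payload` is a hypothetical helper name.

```python
def build_payload(workflow, prompt=None, image=None,
                  prompt_node_id=None, image_node_id=None, timeout=None):
    """Return a request body, leaving out optional fields that were not set."""
    optional = {
        "prompt": prompt,
        "image": image,
        "prompt_node_id": prompt_node_id,
        "image_node_id": image_node_id,
        "timeout": timeout,
    }
    body = {"workflow": workflow}
    # Only include the optional keys the caller actually provided.
    body.update({k: v for k, v in optional.items() if v is not None})
    return {"input": body}
```

Serialize the result with `json.dumps` before sending it as the request body.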

### Example: Text-to-Image

```bash
curl -X POST "https://api.runpod.ai/v2/${ENDPOINT_ID}/runsync" \
  -H "Authorization: Bearer ${RUNPOD_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "workflow": '"$(cat workflow_api.json)"',
      "prompt": "a photo of a cat in space"
    }
  }'
```

### Example: Image-to-Video

```bash
curl -X POST "https://api.runpod.ai/v2/${ENDPOINT_ID}/runsync" \
  -H "Authorization: Bearer ${RUNPOD_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "workflow": '"$(cat i2v_workflow_api.json)"',
      "image": "'"$(base64 -w0 input.png)"'",
      "prompt": "the cat walks forward",
      "timeout": 600
    }
  }'
```
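
The same request can be prepared from Python using only the standard library. This is a sketch rather than an official client: `encode_image` and `build_request` are hypothetical helpers, and the request is built without being sent so it can be inspected first.

```python
import base64
import json
import urllib.request

API_BASE = "https://api.runpod.ai/v2"


def encode_image(path):
    """Base64-encode a file without line wrapping, like `base64 -w0`."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")


def build_request(endpoint_id, api_key, payload, route="runsync"):
    """Build (but do not send) a POST request for the given endpoint."""
    return urllib.request.Request(
        f"{API_BASE}/{endpoint_id}/{route}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen(req, timeout=...)` and parse the response body with `json.load`. Pick a client-side timeout at least as long as the `timeout` value in the payload.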

### Response Schema

Success:

```json
{
  "id": "job-id",
  "status": "COMPLETED",
  "output": {
    "status": "success",
    "prompt_id": "abc123",
    "outputs": [
      {
        "type": "video",
        "filename": "output.mp4",
        "data": "base64...",
        "size": 1234567
      }
    ]
  }
}
```

Large files (video over 10MB) are returned as a `path` on the network volume instead of inline `data`:

```json
{
  "outputs": [
    {
      "type": "video",
      "filename": "output.mp4",
      "path": "/userdata/outputs/output.mp4",
      "size": 52428800
    }
  ]
}
```

Error:

```json
{
  "output": {
    "error": "error message",
    "status": "error"
  }
}
```
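
A client has to handle all three shapes. The sketch below assumes that `outputs` is always nested under `output`, as in the success example, and that an item carries either `data` or `path`; `save_outputs` is a hypothetical helper name.

```python
import base64
from pathlib import Path


def save_outputs(response, dest_dir):
    """Write inline outputs to dest_dir and return a list of result paths."""
    output = response.get("output") or {}
    if output.get("status") == "error" or "error" in output:
        raise RuntimeError(output.get("error", "unknown error"))

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    results = []
    for item in output.get("outputs", []):
        if "data" in item:
            # Inline result: decode the base64 payload and write it locally.
            # Path(...).name drops any directory part of the filename.
            target = dest / Path(item["filename"]).name
            target.write_bytes(base64.b64decode(item["data"]))
            results.append(str(target))
        elif "path" in item:
            # Large result: the file stays on the network volume.
            results.append(item["path"])
    return results
```

Paths returned for large files refer to the network volume, so they still have to be fetched separately, for example from a pod with the same volume attached.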

## Async Execution

For long-running video jobs, submit to the asynchronous `/run` endpoint and poll `/status` for the result:

```bash
# Submit job
curl -X POST "https://api.runpod.ai/v2/${ENDPOINT_ID}/run" \
  -H "Authorization: Bearer ${RUNPOD_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"input": {...}}'

# Response: {"id": "job-id", "status": "IN_QUEUE"}

# Poll for result
curl "https://api.runpod.ai/v2/${ENDPOINT_ID}/status/${JOB_ID}" \
  -H "Authorization: Bearer ${RUNPOD_API_KEY}"
```
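
The polling step can be wrapped in a small loop. In this sketch, `wait_for_job` is a hypothetical helper, and `fetch_status` is any callable that performs the `/status` request and returns the parsed JSON. `COMPLETED` and `IN_QUEUE` come from the examples in this README; the other status names are assumptions about RunPod's job lifecycle.

```python
import time

# COMPLETED appears in the response schema above; the other terminal
# states are assumptions about RunPod's job lifecycle.
TERMINAL = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(job_id) until a terminal status or the deadline."""
    deadline = clock() + timeout
    while True:
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job.get('status')}")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays; in normal use the defaults are fine.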

## Workflow Export

Export workflows from ComfyUI in API format:

1. Open ComfyUI
2. Enable Dev Mode in settings
3. Click "Save (API Format)"
4. Use the exported JSON as the `workflow` parameter
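
Submitting a regular (UI format) save by mistake is easy to do. An API-format export maps node IDs to objects with `class_type` and `inputs`, whereas a regular save has top-level `nodes` and `links` keys. The following heuristic check relies on that difference; `is_api_format` is a hypothetical helper, not part of ComfyUI.

```python
def is_api_format(workflow):
    """Heuristic check that a parsed workflow is in API format."""
    if not isinstance(workflow, dict) or not workflow:
        return False
    # Every top-level value must look like a node definition.
    return all(
        isinstance(node, dict) and "class_type" in node and "inputs" in node
        for node in workflow.values()
    )
```

Running this on the parsed JSON before submitting a job catches the wrong export format without spending a worker start on it.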

## Custom Nodes Included

- ComfyUI-Manager
- ComfyUI_HuggingFace_Downloader
- ComfyUI-KJNodes
- comfyui_controlnet_aux
- ComfyUI-Crystools
- ComfyUI-VideoHelperSuite
- ComfyUI-Lora-Manager
- ComfyUI-GGUF
- ComfyUI-Frame-Interpolation
- ComfyUI-nunchaku
- ComfyMath
- ComfyUI_UltimateSDUpscale

## Troubleshooting

### Cold Start Timeout

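The first request on a cold worker has to start the ComfyUI server, which takes roughly 30-60s. Increase the idle timeout so workers stay warm between requests, or keep at least one active worker.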

### Out of Memory

Reduce the batch size or resolution in the workflow. For large models, use GGUF quantized variants.

### Model Not Found

Ensure each model is uploaded to the correct `/userdata/models/` subdirectory, matching the ComfyUI folder structure.

### Video Generation Timeout

The default maximum execution time is 600s. For longer videos, split the job into segments, or reduce the resolution or frame count.

### Connection Refused

The ComfyUI server may have crashed. Check the logs in the RunPod console and make sure the workflow is valid.

## Local Testing

```bash
# Build
docker build -t comfyui-runpod:latest .

# Run with GPU
docker run --gpus all -p 8188:8188 \
  -v /path/to/models:/userdata/models \
  comfyui-runpod:latest

# Test handler
curl -X POST http://localhost:8188/runsync \
  -H "Content-Type: application/json" \
  -d '{"input": {"workflow": {...}}}'
```

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `HF_HOME` | `/workspace/.cache/huggingface` | HuggingFace cache |
| `HF_HUB_ENABLE_HF_TRANSFER` | `1` | Fast HF downloads |
| `PYTHONUNBUFFERED` | `1` | Realtime logs |