SageAttention was only compiled for A100 (sm80) and H100 (sm90).
RTX 5090 (Blackwell sm120) has no compatible kernel, causing ComfyUI
to crash during generation with "Connection reset by peer".
Switch to PyTorch's native SDPA which works on all architectures.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
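The commit above switches to SDPA unconditionally. As an illustration of the reasoning only, a guarded selection could look like the sketch below. The function name `pick_attention_backend` and the `COMPILED_ARCHS` set are hypothetical and not part of the repository; in a real handler the capability tuple would typically come from `torch.cuda.get_device_capability()`.

```python
from typing import Iterable, Tuple

# Assumption: the architectures the SageAttention build was compiled for,
# taken from the commit message (sm80 and sm90).
COMPILED_ARCHS = {(8, 0), (9, 0)}


def pick_attention_backend(
    capability: Tuple[int, int],
    compiled_archs: Iterable[Tuple[int, int]] = COMPILED_ARCHS,
) -> str:
    """Return "sageattn" only if a kernel was built for this GPU, else "sdpa"."""
    return "sageattn" if tuple(capability) in set(compiled_archs) else "sdpa"


# An RTX 5090 reports compute capability (12, 0), which has no compiled kernel,
# so the sketch falls back to PyTorch's native SDPA.
print(pick_attention_backend((12, 0)))
print(pick_attention_backend((9, 0)))
```

Always using SDPA, as the commit does, is the simpler choice; the guard only matters if SageAttention should remain enabled on the architectures it was built for.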
- Add JobLogger class to handler.py for structured timestamped logging
- Increase MAX_TIMEOUT from 600s to 1200s (20 minutes)
- Add logs column to generated_content table via migration
- Store and display job execution logs in gallery UI
- Add Logs button to gallery items with modal display
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
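For reference, a structured timestamped logger of the kind described in the commit above might be sketched as follows. This is not the actual `JobLogger` from handler.py; the method names `log` and `dump` and the entry fields are assumptions chosen for illustration.

```python
from datetime import datetime, timezone
from typing import Dict, List


class JobLogger:
    """Collects timestamped log entries for one job (illustrative sketch)."""

    def __init__(self, job_id: str) -> None:
        self.job_id = job_id
        self.entries: List[Dict[str, str]] = []

    def log(self, message: str, level: str = "INFO") -> None:
        # Record the entry in memory and echo it to stdout for worker logs.
        entry = {
            "ts": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "level": level,
            "message": message,
        }
        self.entries.append(entry)
        print(f"[{entry['ts']}] [{level}] [{self.job_id}] {message}", flush=True)

    def dump(self) -> str:
        # One line per entry, suitable for storing in a text column.
        return "\n".join(
            f"{e['ts']} {e['level']} {e['message']}" for e in self.entries
        )


logger = JobLogger("job-123")
logger.log("Workflow queued")
logger.log("Generation failed", level="ERROR")
```

The string returned by `dump()` is the kind of value that could be stored in the `logs` column and shown in the gallery modal.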
- Click on video thumbnail to open in large viewer
- Video plays on hover, pauses when mouse leaves
- Close viewer by clicking X or clicking outside video
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace short negative prompt with comprehensive list
- Add X button to clear selected image before generating
- Allow selecting a new image by clearing the current one
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add "Hide Media" / "Show Media" button to both sections
- Blur images and videos when privacy mode is active
- Persist privacy preference in localStorage per section
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Inline onclick handlers on async functions fail silently when
promises reject. This affected delete buttons, edit buttons,
modal close/cancel buttons, and pagination.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The find command searching for .safetensors across /runpod-volume
was timing out after 30 seconds on volumes with many files.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Same issue as the View Progress button: replace the inline onclick
handler with addEventListener to fix silent failures from rejected
async promises.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace inline onclick handlers with proper addEventListener to fix
silent failures from async promise rejections. Add try-catch error
handling to show errors to user instead of failing silently.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Background Job Processor:
- Add src/services/jobProcessor.ts that polls RunPod every 30s for stuck jobs
- Automatically completes or fails jobs that were abandoned (user navigated away)
- Times out jobs after 25 minutes
Client-Side Resume:
- Add GET /api/generate/pending endpoint to fetch user's processing jobs
- Add checkPendingJobs() that runs on login/page load
- Show notification banner when user has jobs generating in background
- Add "View Progress" button to resume polling for a job
Timeout Increases (10min → 25min):
- src/utils/validators.ts: request validation max/default
- src/config.ts: RUNPOD_MAX_TIMEOUT_MS default
- public/js/app.js: client-side polling maxTime
- src/services/jobProcessor.ts: background processor timeout
CI/CD Optimization:
- Add paths-ignore to backend build.yaml to skip rebuilds on frontend-only changes
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
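The decision the background processor makes for each stuck job can be sketched as a pure function. The real implementation lives in src/services/jobProcessor.ts; the Python version below is only an illustration, and the names `PendingJob` and `resolve_job` as well as the exact set of RunPod status strings are assumptions.

```python
from dataclasses import dataclass

TIMEOUT_SECONDS = 25 * 60  # jobs are timed out after 25 minutes

# Assumption: status strings as reported by the RunPod status endpoint.
FAILED_STATUSES = {"FAILED", "CANCELLED", "TIMED_OUT"}


@dataclass
class PendingJob:
    job_id: str
    started_at: float  # epoch seconds when the job was submitted


def resolve_job(job: PendingJob, remote_status: str, now: float) -> str:
    """Decide what to do with a job that is still marked as processing."""
    if remote_status == "COMPLETED":
        return "complete"
    if remote_status in FAILED_STATUSES:
        return "fail"
    if now - job.started_at > TIMEOUT_SECONDS:
        return "timeout"
    return "wait"  # still running; check again on the next 30s poll


job = PendingJob(job_id="abc", started_at=0.0)
print(resolve_job(job, "IN_PROGRESS", now=60.0))
print(resolve_job(job, "IN_PROGRESS", now=26 * 60.0))
```

Keeping the decision separate from the polling loop makes the timeout and status handling easy to test without calling RunPod.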
- Node.js/Express backend with TypeScript
- SQLite database for users, sessions, and content metadata
- Authentication with TOTP and WebAuthn MFA support
- Admin user auto-created on first startup
- User content gallery with view/delete functionality
- RunPod API proxy (keeps API keys server-side)
- Docker setup with CI/CD for Gitea registry
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Empty backend string still triggered inductor. Removing compile_args
connection from WanVideoModelLoader nodes (131, 132) ensures no
torch.compile is applied to the transformer blocks.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The inductor backend was causing cudaErrorInvalidValue during Triton
kernel autotuning on Blackwell architecture. Setting backend to empty
string disables torch.compile entirely.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Disable torch.compile (inductor -> disabled) to reduce cold start time
- Fix handler to detect video type from file extension, not output key
- Fix HTML to check filename extension for video display
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
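Detecting the media type from the file extension rather than from the output key, as the commit above describes, can be sketched like this. The function name `media_type` and the exact extension set are assumptions, not the handler's actual code.

```python
from pathlib import Path

# Assumption: extensions treated as video; adjust to what the workflow emits.
VIDEO_EXTENSIONS = {".mp4", ".webm", ".mov", ".mkv"}


def media_type(filename: str) -> str:
    """Classify an output file as "video" or "image" by its extension."""
    suffix = Path(filename).suffix.lower()
    return "video" if suffix in VIDEO_EXTENSIONS else "image"


print(media_type("Wan22_00001.mp4"))
print(media_type("preview_00001.png"))
```

Lowercasing the suffix keeps the check robust against names such as `OUT.MP4`.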
- Switch to PyTorch nightly with CUDA 12.8 (required for sm_120)
- Target TORCH_CUDA_ARCH_LIST="12.0" for Blackwell
- Remove nunchaku (incompatible with PyTorch nightly)
- Use latest SageAttention (has sm_120 kernel support)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The sm90 kernels use wgmma instructions that can't be compiled for
sm86/sm89 targets. Restricting to 8.0 (A100) and 9.0 (H100) only.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Recent commits broke TORCH_CUDA_ARCH_LIST support, requiring a GPU
during build. Pin to 2aecfa8 which respects the env var.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update SageAttention CUDA arch list to support A100, A10, RTX 4090, L40, H100/H200
- Add interactive HTML test page for RunPod API testing
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Models are stored in /runpod-volume/ComfyUI/models/ on the network
volume, not /runpod-volume/models/. Updated all symlinks to match.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
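A minimal sketch of linking model subdirectories from the network volume into the ComfyUI models directory is shown below. The helper `link_model_dirs` is hypothetical, and the demo runs against a temporary directory rather than the real /runpod-volume paths.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def link_model_dirs(
    volume_models: Path, comfy_models: Path, subdirs: Iterable[str]
) -> List[str]:
    """Symlink each existing subdir of volume_models into comfy_models."""
    linked = []
    comfy_models.mkdir(parents=True, exist_ok=True)
    for name in subdirs:
        source = volume_models / name
        target = comfy_models / name
        if not source.is_dir():
            continue  # nothing to link for this model type
        if target.is_symlink() or target.exists():
            continue  # leave anything that is already in place
        target.symlink_to(source, target_is_directory=True)
        linked.append(name)
    return linked


# Demo in a throwaway directory that mimics the volume layout.
with tempfile.TemporaryDirectory() as tmp:
    volume = Path(tmp) / "runpod-volume" / "ComfyUI" / "models"
    (volume / "diffusion_models").mkdir(parents=True)
    (volume / "text_encoders").mkdir()
    comfy = Path(tmp) / "comfyui" / "models"
    print(link_model_dirs(volume, comfy, ["diffusion_models", "text_encoders", "vae"]))
```

Skipping missing sources and existing targets keeps the setup step safe to run on every cold start.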
Use correct URL from styler00dollar/VSGAN-tensorrt-docker releases.
Also fix path to ckpts/rife/ subdirectory.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The RIFE model is small (~15MB) and required for the workflow.
Pre-downloading avoids runtime download delays.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
WanVideo models are stored in diffusion_models/, and the CLIP text
encoder is in text_encoders/. These were missing from the symlink setup.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The frontend-to-API conversion was using outdated widget names that
don't match the current WanVideo node API. Using the exported API
format workflow directly bypasses this issue.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Bypassed/muted nodes should not be included in the API workflow,
and connections from bypassed nodes should be ignored.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
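The filtering rule from the commit above can be sketched as follows. This is not the repository's `convert_frontend_to_api()`; the function name `to_api_connections` is hypothetical, widget values are omitted, and the sketch assumes the ComfyUI frontend format marks muted nodes with `mode` 2 and bypassed nodes with `mode` 4, with links stored as `[link_id, origin_id, origin_slot, target_id, target_slot, type]`.

```python
from typing import Any, Dict

MODE_MUTED = 2     # assumption: frontend "never" mode
MODE_BYPASSED = 4  # assumption: frontend "bypass" mode


def to_api_connections(workflow: Dict[str, Any]) -> Dict[str, Any]:
    """Build API-style nodes, dropping skipped nodes and links from them."""
    skipped = {
        node["id"]
        for node in workflow["nodes"]
        if node.get("mode", 0) in (MODE_MUTED, MODE_BYPASSED)
    }
    links = {link[0]: link for link in workflow.get("links", [])}

    api: Dict[str, Any] = {}
    for node in workflow["nodes"]:
        if node["id"] in skipped:
            continue  # bypassed/muted nodes are left out entirely
        inputs = {}
        for inp in node.get("inputs", []):
            link = links.get(inp.get("link"))
            if link is None:
                continue  # unconnected input
            origin_id, origin_slot = link[1], link[2]
            if origin_id in skipped:
                continue  # ignore connections coming from skipped nodes
            inputs[inp["name"]] = [str(origin_id), origin_slot]
        api[str(node["id"])] = {"class_type": node["type"], "inputs": inputs}
    return api


example = {
    "nodes": [
        {"id": 1, "type": "LoadImage", "mode": 0},
        {"id": 2, "type": "ImageScale", "mode": 4,
         "inputs": [{"name": "image", "link": 10}]},
        {"id": 3, "type": "SaveImage", "mode": 0,
         "inputs": [{"name": "images", "link": 11},
                    {"name": "extra", "link": 12}]},
    ],
    "links": [
        [10, 1, 0, 2, 0, "IMAGE"],
        [11, 1, 0, 3, 0, "IMAGE"],
        [12, 2, 0, 3, 1, "IMAGE"],
    ],
}
print(to_api_connections(example))
```

Note that this drops connections from bypassed nodes, as the commit describes, rather than rerouting the bypassed node's input through to its consumers.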
- Use REST API to update template's docker image
- Use saveEndpoint mutation with required name field
- Scale workers to 0 and back up to force an image refresh
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Updates the serverless endpoint with new image tag and purges
existing workers to force restart with the new image after build.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The convert_frontend_to_api() function was missing mappings for most
node types, causing "Required input is missing" errors when the API
received workflows in frontend format.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fixes workflow error: MathExpression|pysssss node not found.
These nodes are required by the Wan22-I2V-Remix workflow.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Combine PyTorch + triton install into single layer
- Add pip cache cleanup after each install step
- Change SageAttention to regular install and remove source after build
- Consolidate custom node dependencies into single layer
- Add CLAUDE.md, i2v-workflow.json, update handler.py and PROJECT.md
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>