Reduce build parallelism to avoid OOM during SageAttention compile

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Debian
2026-01-03 20:17:10 +00:00
parent 99fdda5b2b
commit c61aca4074


@@ -62,9 +62,9 @@ RUN pip install -r /tmp/requirements.txt && rm -rf /root/.cache/pip
# Compile SageAttention 2.2 from source with no build isolation
WORKDIR /tmp
-ENV EXT_PARALLEL=2
-ENV NVCC_APPEND_FLAGS="--threads 2"
-ENV MAX_JOBS=4
+ENV EXT_PARALLEL=1
+ENV NVCC_APPEND_FLAGS="--threads 1"
+ENV MAX_JOBS=2
# Target RunPod GPU architectures:
# 8.0 = A100, 8.6 = A10/RTX 3090, 8.9 = RTX 4090/L40, 9.0 = H100/H200
# Note: Blackwell (10.0) not yet supported by SageAttention