Limit SageAttention to A100/H100 due to cross-compile issues
Some checks failed
Build and Push Docker Image / build (push) Has been cancelled
The sm90 kernels use wgmma instructions that can't be compiled for sm86/sm89 targets. Restricting to 8.0 (A100) and 9.0 (H100) only.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@@ -66,9 +66,9 @@ ENV EXT_PARALLEL=2
 ENV NVCC_APPEND_FLAGS="--threads 2"
 ENV MAX_JOBS=4
 # Target RunPod GPU architectures:
-# 8.0 = A100, 8.6 = A10/RTX 3090, 8.9 = RTX 4090/L40, 9.0 = H100/H200
-# Note: Blackwell (10.0) not yet supported by SageAttention
-ENV TORCH_CUDA_ARCH_LIST="8.0;8.6;8.9;9.0"
+# 8.0 = A100, 9.0 = H100/H200
+# Note: 8.6/8.9 excluded due to SageAttention sm90 kernel cross-compile issues
+ENV TORCH_CUDA_ARCH_LIST="8.0;9.0"
 RUN git clone https://github.com/thu-ml/SageAttention.git && \
 cd SageAttention && \
 git checkout 2aecfa8 && \
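Since the image is now built for a narrower set of architectures, a small startup check can report whether the running GPU's compute capability is listed in TORCH_CUDA_ARCH_LIST. The following is a minimal sketch, not part of this commit. The helper names parse_arch_list and is_built_for are hypothetical, and the check is a deliberately conservative exact match: it ignores PTX forward compatibility and same-major binary compatibility, and it handles only numeric entries such as "8.0" or "9.0+PTX".

```python
def parse_arch_list(value):
    """Parse a TORCH_CUDA_ARCH_LIST-style string into (major, minor) tuples.

    Hypothetical helper: handles only numeric entries such as "8.0" or
    "9.0+PTX". Named architectures (e.g. "Ampere") or suffixed ones
    (e.g. "9.0a") raise ValueError.
    """
    archs = []
    # Entries may be separated by semicolons or spaces.
    for item in value.replace(" ", ";").split(";"):
        item = item.strip()
        if not item:
            continue
        # Drop an optional "+PTX" suffix.
        number = item.split("+")[0]
        major, minor = number.split(".")
        archs.append((int(major), int(minor)))
    return archs


def is_built_for(capability, arch_list):
    """Return True if the (major, minor) capability is listed exactly.

    Conservative by design: an exact-match test only, so a False result
    means "not explicitly targeted", not "guaranteed to fail".
    """
    return tuple(capability) in parse_arch_list(arch_list)
```

In a container with PyTorch and a visible GPU, the capability could be taken from torch.cuda.get_device_capability(), which returns a (major, minor) tuple, and the list from the TORCH_CUDA_ARCH_LIST environment variable, assuming it is still set at runtime.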