Add multi-GPU support and HTML test interface

- Update SageAttention CUDA arch list to support A100, A10, RTX 4090, L40, H100/H200
- Add interactive HTML test page for RunPod API testing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 10:25:08 +00:00
parent f69bbc2f45
commit 99fdda5b2b
2 changed files with 615 additions and 2 deletions
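
A quick way to sanity-check an arch list like the one this commit sets is to parse it and compare it against a GPU's compute capability. The sketch below is illustrative only and not part of the commit: `parse_arch_list` and `is_covered` are hypothetical helper names, and the parser assumes plain numeric `major.minor` entries (optionally with a `+PTX` suffix), separated by semicolons or spaces.

```python
def parse_arch_list(value: str) -> set[tuple[int, int]]:
    """Parse a TORCH_CUDA_ARCH_LIST-style string such as "8.0;8.6;8.9;9.0".

    Assumption: every entry is numeric "major.minor", optionally followed
    by "+PTX". Named or suffixed architectures are not handled here.
    """
    archs = set()
    # Treat spaces and semicolons as equivalent separators.
    for entry in value.replace(" ", ";").split(";"):
        entry = entry.strip()
        if not entry:
            continue
        entry = entry.removesuffix("+PTX")
        major, minor = entry.split(".")
        archs.add((int(major), int(minor)))
    return archs


def is_covered(capability: tuple[int, int], arch_list: str) -> bool:
    """Return True if the exact compute capability appears in the arch list.

    This is a strict exact-match check; it deliberately ignores any
    forward compatibility between architectures.
    """
    return tuple(capability) in parse_arch_list(arch_list)


if __name__ == "__main__":
    arch_list = "8.0;8.6;8.9;9.0"  # value set in this commit
    for name, cap in [("A100", (8, 0)), ("RTX 4090", (8, 9)), ("Blackwell", (10, 0))]:
        print(name, cap, "covered" if is_covered(cap, arch_list) else "not covered")
```

On a machine with PyTorch and a GPU, the capability tuple could come from `torch.cuda.get_device_capability()`, which returns `(major, minor)` for the current device.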
--- a/5
+++ b/5
@@ -65,9 +65,10 @@ WORKDIR /tmp
 ENV EXT_PARALLEL=2
 ENV NVCC_APPEND_FLAGS="--threads 2"
 ENV MAX_JOBS=4
-# Target RunPod GPU architectures: H100/H200(9.0)
+# Target RunPod GPU architectures:
+# 8.0 = A100, 8.6 = A10/RTX 3090, 8.9 = RTX 4090/L40, 9.0 = H100/H200
 # Note: Blackwell (10.0) not yet supported by SageAttention
-ENV TORCH_CUDA_ARCH_LIST="9.0"
+ENV TORCH_CUDA_ARCH_LIST="8.0;8.6;8.9;9.0"
 RUN git clone https://github.com/thu-ml/SageAttention.git && \
    cd SageAttention && \
    pip install --no-build-isolation . && \