Fix stuck processing jobs and increase timeouts

Background Job Processor:
- Add src/services/jobProcessor.ts that polls RunPod every 30s for stuck jobs
- Automatically completes or fails jobs that were abandoned (user navigated away)
- Times out jobs after 25 minutes

Client-Side Resume:
- Add GET /api/generate/pending endpoint to fetch user's processing jobs
- Add checkPendingJobs() that runs on login/page load
- Show notification banner when user has jobs generating in background
- Add "View Progress" button to resume polling for a job

Timeout Increases (10min → 25min):
- src/utils/validators.ts: request validation max/default
- src/config.ts: RUNPOD_MAX_TIMEOUT_MS default
- public/js/app.js: client-side polling maxTime
- src/services/jobProcessor.ts: background processor timeout

CI/CD Optimization:
- Add paths-ignore to backend build.yaml to skip rebuilds on frontend-only changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
227
frontend/IMPLEMENTATION_PLAN.md
Normal file
@@ -0,0 +1,227 @@
# Fix: Handle Navigation Away During Video Generation

## Problem

When a user submits a video generation job and navigates away from the page:
1. Client-side polling stops
2. RunPod job continues but results are never fetched
3. Content stays stuck as "processing" forever
4. Video file never gets saved to disk

## Solution: Two-Part Fix

### Part 1: Background Job Processor (Server-Side)

Create a background worker that periodically checks for stuck "processing" jobs and resolves them as completed, failed, or timed out.

**New file: `src/services/jobProcessor.ts`**

```typescript
// Runs every 30 seconds
// Queries: SELECT * FROM generated_content WHERE status = 'processing' AND runpod_job_id IS NOT NULL
// For each job:
//   1. Poll RunPod status
//   2. If COMPLETED: download file, update status to 'completed'
//   3. If FAILED: update status to 'failed' with error message
//   4. If still running and created more than 25 minutes ago: mark as 'failed' (timeout)
```

**Modify: `src/index.ts`**
- Import and start the job processor on server startup
- Stop the job processor in the shutdown handler

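
The startup and shutdown wiring could look roughly like the sketch below. The two processor functions are stand-ins so the block runs on its own; in the real `src/index.ts` they would be imported from the job processor module. `registerShutdown` and `isProcessorRunning` are hypothetical helpers introduced only for illustration, and the signal list is an assumption.

```typescript
// Stand-ins for the exports of src/services/jobProcessor.ts (assumed shapes).
let processorTimer: ReturnType<typeof setInterval> | null = null;

function startJobProcessor(): void {
  if (processorTimer) return; // already started
  processorTimer = setInterval(() => {
    /* processStuckJobs() would run here */
  }, 30_000);
}

function stopJobProcessor(): void {
  if (!processorTimer) return; // nothing to stop
  clearInterval(processorTimer);
  processorTimer = null;
}

// Hypothetical helper, used only to observe state in this sketch.
function isProcessorRunning(): boolean {
  return processorTimer !== null;
}

// Hypothetical helper: stop background work when the process is asked to exit.
// A real server would likely also close its HTTP listener and DB handle here.
function registerShutdown(stop: () => void): void {
  for (const signal of ['SIGINT', 'SIGTERM'] as const) {
    process.once(signal, () => {
      stop();
      process.exit(0);
    });
  }
}

// In src/index.ts, after the database is initialised:
//   startJobProcessor();
//   registerShutdown(stopJobProcessor);
```

Making both functions idempotent keeps a double start or a repeated shutdown signal harmless.
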
### Part 2: Resume Polling on Page Load (Client-Side)

When the user returns to the app, check for their in-progress jobs and resume polling.

**Modify: `public/js/app.js`**

```javascript
// On login/page load:
// 1. Call GET /api/generate/pending to find pending jobs
// 2. For each processing job with a runpod_job_id:
//    - Show notification "You have X jobs in progress"
//    - Optionally auto-resume polling for most recent one
// 3. Update gallery to show real-time status
```

**New API endpoint: `GET /api/generate/pending`**
- Returns the user's jobs that are still processing
- Includes `runpod_job_id` so the client can resume polling

## Files to Modify

1. **`src/services/jobProcessor.ts`** (NEW)
   - `startJobProcessor()` - starts interval
   - `stopJobProcessor()` - cleanup
   - `processStuckJobs()` - main logic

2. **`src/index.ts`**
   - Import jobProcessor
   - Call `startJobProcessor()` after DB init
   - Call `stopJobProcessor()` in shutdown handler

3. **`src/routes/generate.ts`**
   - Add `GET /pending` endpoint for user's processing jobs

4. **`public/js/app.js`**
   - Add `checkPendingJobs()` function
   - Call it after successful login in `showMainApp()`
   - Show UI notification for pending jobs
   - Add "Resume" button or auto-resume latest

## Implementation Details

### jobProcessor.ts

```typescript
import { getDb } from '../db/index.js';
import { getJobStatus } from './runpodService.js';
import { updateContentStatus, saveContentFile } from './contentService.js';
import { logger } from '../utils/logger.js';

// Columns this processor reads from generated_content.
// Assumed types; adjust to match the actual schema.
interface PendingJob {
  id: number;
  runpod_job_id: string;
  created_at: string;
}

let processorInterval: NodeJS.Timeout | null = null;
let isProcessing = false; // prevents overlapping runs when a pass takes longer than the interval

const POLL_INTERVAL = 30_000; // 30 seconds
const JOB_TIMEOUT = 25 * 60 * 1000; // 25 minutes

export function startJobProcessor(): void {
  if (processorInterval) return; // already running
  logger.info('Starting background job processor');
  processorInterval = setInterval(() => void processStuckJobs(), POLL_INTERVAL);
  // Run immediately on startup
  void processStuckJobs();
}

export function stopJobProcessor(): void {
  if (processorInterval) {
    clearInterval(processorInterval);
    processorInterval = null;
    logger.info('Stopped background job processor');
  }
}

async function processStuckJobs(): Promise<void> {
  if (isProcessing) return;
  isProcessing = true;

  try {
    const db = getDb();

    const pendingJobs = db.prepare(`
      SELECT * FROM generated_content
      WHERE status = 'processing' AND runpod_job_id IS NOT NULL
    `).all() as PendingJob[];

    for (const job of pendingJobs) {
      try {
        // Assumes created_at parses to the intended instant
        // (zone-less timestamps are read as local time by Date).
        const age = Date.now() - new Date(job.created_at).getTime();

        // Poll RunPod first so a job that already finished is not discarded as timed out.
        // A failed status request is logged and treated as "still running".
        const status = await getJobStatus(job.runpod_job_id).catch((error: unknown) => {
          logger.error({ error, contentId: job.id }, 'Failed to fetch RunPod status');
          return null;
        });

        if (status?.status === 'COMPLETED' && status.output?.outputs?.[0]) {
          const output = status.output.outputs[0];
          if (output.data) {
            saveContentFile(job.id, output.data);
          } else {
            updateContentStatus(job.id, 'completed', { fileSize: output.size });
          }
          logger.info({ contentId: job.id }, 'Background processor completed job');
        } else if (status?.status === 'FAILED') {
          updateContentStatus(job.id, 'failed', {
            errorMessage: status.error || 'Job failed'
          });
        } else if (age > JOB_TIMEOUT) {
          // Still running (or status unavailable) past the timeout
          updateContentStatus(job.id, 'failed', {
            errorMessage: 'Job timed out'
          });
        }
      } catch (error) {
        logger.error({ error, contentId: job.id }, 'Error processing stuck job');
      }
    }
  } catch (error) {
    // Never let a rejected promise escape the interval callback
    logger.error({ error }, 'Background job processor run failed');
  } finally {
    isProcessing = false;
  }
}
```
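
The per-job branching can also be pulled into a pure function so it is testable without RunPod or the database. This is only a sketch: `decideAction` and the `Action` type are hypothetical names, the timeout is passed in rather than fixed, and the ordering shown (RunPod result first, timeout last) is one possible choice that avoids discarding a job that has already finished.

```typescript
// Hypothetical outcome of inspecting one processing job.
type Action = 'complete' | 'fail' | 'timeout' | 'wait';

// runpodStatus is null when the status request itself failed.
// Only 'COMPLETED' and 'FAILED' are treated specially; anything else counts as still running.
function decideAction(
  runpodStatus: string | null,
  hasOutput: boolean,
  ageMs: number,
  timeoutMs: number,
): Action {
  if (runpodStatus === 'COMPLETED' && hasOutput) return 'complete';
  if (runpodStatus === 'FAILED') return 'fail';
  if (ageMs > timeoutMs) return 'timeout';
  return 'wait';
}

// Example with a 25-minute timeout (the value named in the commit message):
const TIMEOUT_MS = 25 * 60 * 1000;
const finishedLate = decideAction('COMPLETED', true, TIMEOUT_MS + 60_000, TIMEOUT_MS);
const stillRunning = decideAction('IN_PROGRESS', false, 60_000, TIMEOUT_MS);
```

With this ordering, a job that completed while nobody was polling is still saved even if it is older than the timeout.
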

### Frontend changes (app.js)

Add the following to `public/js/app.js` and call `checkPendingJobs()` from `showMainApp()` after a successful login:

```javascript
// Fetch the user's processing jobs and show a banner if there are any
async function checkPendingJobs() {
  try {
    const data = await api('/generate/pending');
    if (data.jobs && data.jobs.length > 0) {
      showPendingJobsNotification(data.jobs);
    }
  } catch (error) {
    console.error('Failed to check pending jobs:', error);
  }
}

function showPendingJobsNotification(jobs) {
  // Avoid stacking banners if this runs more than once (e.g. after re-login)
  document.querySelector('.pending-jobs-banner')?.remove();

  // Create a notification banner
  const banner = document.createElement('div');
  banner.className = 'pending-jobs-banner';
  // resumeLatestJob() must exist as a global function; it is not defined in this plan
  banner.innerHTML = `
    <span>You have ${jobs.length} video(s) generating</span>
    <button onclick="resumeLatestJob()">View Progress</button>
    <button onclick="this.parentElement.remove()">Dismiss</button>
  `;
  document.querySelector('.main-content').prepend(banner);
}
```
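
The banner calls `resumeLatestJob()`, which this plan does not define. One possible shape for the polling part is sketched below in TypeScript. Everything here is an assumption for illustration: the `PendingJob` field types, the `JobState` values, the 5-second interval, and the `fetchState` callback that stands in for whatever status request the app already uses. The idea is that the polling budget is counted from when the job was created, not from when the user came back.

```typescript
// Assumed shape of one entry returned by GET /api/generate/pending.
interface PendingJob {
  id: number;
  runpod_job_id: string;
  prompt: string;
  created_at: string; // assumed to be parseable by Date.parse
}

type JobState = 'processing' | 'completed' | 'failed';

const MAX_POLL_TIME_MS = 25 * 60 * 1000; // 25-minute client-side maxTime
const POLL_EVERY_MS = 5_000; // assumed polling interval

// Time left before the client should give up, counted from job creation.
// Clamped to [0, maxMs] so clock skew cannot extend the budget.
function remainingPollTimeMs(createdAtMs: number, nowMs: number, maxMs: number = MAX_POLL_TIME_MS): number {
  return Math.min(maxMs, Math.max(0, maxMs - (nowMs - createdAtMs)));
}

// Polls until the job leaves 'processing' or the remaining budget is used up.
async function resumePolling(
  job: PendingJob,
  fetchState: (job: PendingJob) => Promise<JobState>,
): Promise<JobState | 'timeout'> {
  const deadline = Date.now() + remainingPollTimeMs(Date.parse(job.created_at), Date.now());
  while (Date.now() < deadline) {
    const state = await fetchState(job);
    if (state !== 'processing') return state;
    await new Promise<void>((resolve) => setTimeout(resolve, POLL_EVERY_MS));
  }
  return 'timeout';
}
```

Since the endpoint orders results by `created_at DESC`, a `resumeLatestJob()` built on this could pass the first entry of `data.jobs` to `resumePolling`.
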

### New endpoint in generate.ts

```typescript
// Get user's pending jobs (assumes auth middleware has already populated req.user)
router.get('/pending', (req, res) => {
  const authReq = req as AuthenticatedRequest;
  const db = getDb();

  // Newest first, so the client can treat jobs[0] as the latest job
  const jobs = db.prepare(`
    SELECT id, runpod_job_id, prompt, created_at
    FROM generated_content
    WHERE user_id = ? AND status = 'processing' AND runpod_job_id IS NOT NULL
    ORDER BY created_at DESC
  `).all(authReq.user!.id);

  res.json({ jobs });
});
```

## CSS Addition (style.css)

```css
.pending-jobs-banner {
  background: linear-gradient(135deg, var(--primary), var(--secondary));
  color: white;
  padding: 12px 20px;
  border-radius: var(--radius);
  margin-bottom: 20px;
  display: flex;
  align-items: center;
  justify-content: space-between;
  gap: 15px;
}

.pending-jobs-banner button {
  background: rgba(255,255,255,0.2);
  border: 1px solid rgba(255,255,255,0.3);
  color: white;
  padding: 6px 12px;
  border-radius: 4px;
  cursor: pointer;
}
```

## Testing

1. Start a generation job
2. Navigate to Gallery while processing
3. Verify background processor picks it up within 30 seconds
4. Verify job completes and file is saved
5. Test timeout scenario (mock a stuck job)
6. Test page reload shows pending jobs notification
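
For step 5, one way to mock a stuck job is to backdate a row's `created_at` past the timeout and let the processor pick it up. The helpers below are a hypothetical sketch: `minutesAgo` and `isTimedOut` are illustrative names, and the ISO 8601 timestamp format is an assumption about how `created_at` is stored.

```typescript
// Returns an ISO 8601 timestamp `minutes` before `now`, for backdating a test row.
function minutesAgo(minutes: number, now: Date = new Date()): string {
  return new Date(now.getTime() - minutes * 60_000).toISOString();
}

// Mirrors the processor's age check: true once the job is older than the timeout.
function isTimedOut(createdAt: string, now: Date, timeoutMs: number): boolean {
  return now.getTime() - new Date(createdAt).getTime() > timeoutMs;
}

// Example with a fixed clock and a 25-minute timeout:
const fixedNow = new Date('2025-01-01T12:00:00.000Z');
const staleCreatedAt = minutesAgo(30, fixedNow); // '2025-01-01T11:30:00.000Z'
const freshCreatedAt = minutesAgo(5, fixedNow);

// A test could then backdate an existing processing row, for example:
//   UPDATE generated_content SET created_at = ? WHERE id = ?
// with staleCreatedAt as the first parameter, and wait one processor interval.
```
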