Debian 688cfe57ed Initial Ralph scaffold for ralph-vibe

2026-01-10 11:59:27 +00:00

The Ralph Method: Comprehensive Framework for 100% Vibe-Coded Greenfield Projects

Executive Summary

The Ralph Method is an iterative AI development methodology that runs autonomous coding agents in persistent loops until task completion. Named after Ralph Wiggum from The Simpsons—perpetually confused but never stopping—it embodies the philosophy: "Better to fail predictably than succeed unpredictably."

Core principle: Iteration beats perfection. The agent sees previous work via git history and modified files, learns from it, and iteratively improves.

Part 1: Philosophy and Prerequisites

The Ralph Mindset

Deterministically bad in an undeterministic world: Failures are predictable and informative
Operator skill matters: Success depends on writing good prompts, not just having a good model
LLMs are mirrors of operator skill: The quality of output reflects the quality of input
Faith in eventual consistency: Ralph will test you. Each failure tunes the system like a guitar

Prerequisites

Claude Code installed and configured
Git initialized in project directory
jq installed (required dependency on Windows/Git Bash)
Clear success criteria for your project
Cost awareness: Opus 4.5 can burn roughly $15-100+/hour depending on context size (see Cost Control in Part 8)

Part 2: PRD Structure for Ralph Consumption

Traditional product requirements documents (PRDs) must be converted to a machine-parseable format. Create PROMPT.md in your project root.

PRD Template

# Project: [NAME]

## Objective
[One sentence describing the end state]

## Application Type
[Web app | Desktop app | CLI tool | Library | Mobile app | Other]

## Architecture
- Interface types: [REST API | GraphQL | IPC | Module boundaries | CLI | File formats]
- Persistence: [Remote DB | Local DB | File-based | In-memory | None]
- Deployment: [Cloud | Self-hosted | Desktop installer | Package registry | App store]

## Tech Stack
- Language: [X]
- Framework: [X]
- Database/Storage: [X]
- Testing: [X]
- CI: [X]

## Completion Criteria
All of the following must be true:
1. Build command passes with zero errors
2. Test command passes with >80% coverage
3. Lint command passes
4. All features documented in README.md
5. [App-type specific criteria, e.g.:]
   - Web: Docker compose brings up working stack
   - Desktop: Installers build for all target platforms
   - CLI: Binary runs and --help works
   - Library: Package publishes to registry (dry-run)

Output `<promise>PROJECT_COMPLETE</promise>` when all criteria met.

## Phase 1: Foundation
- [ ] Initialize project structure
- [ ] Set up CI/linting/formatting
- [ ] Create data layer (DB schema | file handlers | state management)
- [ ] Write failing tests for core models

## Phase 2: Core [Features | UI | Commands | API]
[Adapted to app type]
- [ ] Feature A
  - Acceptance: [binary criterion]
- [ ] Feature B
  - Acceptance: [binary criterion]

## Phase 3: [Integration | System | I/O]
[Adapted to app type:]
- Web: External services, auth, error handling
- Desktop: System integration, file watching, native APIs
- CLI: Input parsing, output formatting, piping support
- Library: Edge cases, error handling, type exports

## Phase 4: [Polish | Packaging | Publishing]
[Adapted to app type:]
- Web: Documentation, performance, deployment config
- Desktop: Installers, auto-update, icons, signing
- CLI: Man pages, shell completions, distribution
- Library: API docs, examples, changelog, publishing

## Constraints
- No external API calls without mocking in tests
- All functions must have docstrings/JSDoc
- [Language-specific, e.g., No `any` types for TypeScript]
- [Additional constraints]

## Abort Conditions
If any of these occur, output `<promise>ABORT_BLOCKED</promise>`:
- Cannot resolve circular dependency after 5 attempts
- External service unavailable after 3 retries
- Architectural issue that requires human decision
- No progress on the same feature after 15 iterations (document the blocking issues in progress.txt first)
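
A wrapper script has to recognize the two signals from the template in the agent's output. The following is a minimal sketch of one way to do that in POSIX shell. `extract_promise` and `promise_status` are hypothetical helper names, not part of Claude Code or the plugin. The sketch reads captured output instead of searching the repository, because PROMPT.md itself quotes both tags.

```shell
# Sketch: pull the promise text out of agent output read on stdin.
# Prints the tag body from the first output line that contains a tag,
# or nothing if no tag is present. (Hypothetical helper.)
extract_promise() {
    sed -n 's|.*<promise>\([^<]*\)</promise>.*|\1|p' | head -n 1
}

# Map the signal to an exit status a wrapper loop can act on:
# 0 = complete, 1 = aborted, 2 = no signal yet, keep iterating.
promise_status() {
    case "$(extract_promise)" in
        PROJECT_COMPLETE) return 0 ;;
        ABORT_BLOCKED)    return 1 ;;
        *)                return 2 ;;
    esac
}
```

For example, `claude --print < PROMPT.md | promise_status` would exit 0 only when the agent printed the completion tag.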

PRD Conversion Rules

Traditional PRD Element	Ralph Format
User stories	Concrete acceptance tests
"Should support X"	Specific test file that must pass
Architecture diagrams	File structure specification
"Nice to have"	Remove entirely or move to Phase 4
Ambiguous requirements	Binary pass/fail criteria
Success metrics	Automated verification commands
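
To illustrate the last two rows, a binary criterion can be expressed as a command whose exit status is the verdict. The sketch below assumes nothing about the project. `check` and `FAILED` are illustrative names, and the commands in the comments are placeholders for the project's own.

```shell
# Sketch: a criterion passes only if its command exits 0.
FAILED=0

check() {
    # $1 = label, remaining arguments = command to run
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $label"
    else
        echo "FAIL  $label"
        FAILED=$((FAILED + 1))
    fi
}

# Replace with the project's own verification commands, for example:
#   check "build" npm run build
#   check "tests" npm test
check "shell works" true

echo "failed criteria: $FAILED"
```

A wrapper can then treat `FAILED` equal to 0 as the only acceptable state before the agent is allowed to claim completion.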

JSON-Based PRD (Matt Pocock Method)

To keep each iteration scoped to one feature and limit context rot, track features in a JSON file:

{
  "project": "MyApp",
  "features": [
    {
      "id": "auth",
      "description": "JWT authentication system",
      "priority": 1,
      "passes": false,
      "acceptance": "npm run test:auth passes"
    },
    {
      "id": "api",
      "description": "REST API endpoints",
      "priority": 2,
      "passes": false,
      "acceptance": "all endpoints return 200 for valid requests"
    }
  ]
}

Prompt the agent to:

Pick the highest priority feature with passes: false
Work ONLY on that feature
Set passes to true once the acceptance criterion is met
Append a progress note to progress.txt and commit

Part 3: Directory Setup

Initial Structure

mkdir my-project && cd my-project
git init

# Core prompt file
cat > PROMPT.md << 'EOF'
[Your structured PRD here]
EOF

# Progress tracker (critical for context persistence)
touch progress.txt

# Agent context files (optional but recommended)
mkdir -p agent_docs
touch agent_docs/tech_stack.md
touch agent_docs/code_patterns.md
touch agent_docs/testing.md

# Initial commit
git add .
git commit -m "Initial prompt and structure"

Recommended Full Structure

my-project/
├── PROMPT.md              # Main task definition
├── progress.txt           # Append-only progress log
├── prd.json               # JSON PRD for feature tracking
├── agent_docs/
│   ├── tech_stack.md      # Tech decisions and rationale
│   ├── code_patterns.md   # Project conventions
│   ├── testing.md         # Test strategy
│   └── resources.md       # External references
├── CLAUDE.md              # Claude Code specific config
├── .cursorrules           # Cursor config (if using)
├── .gitignore
├── README.md
└── src/

Part 4: Execution Methods

Method A: Official Ralph Wiggum Plugin (Recommended)

# Install from the official plugin marketplace (slash commands are typed inside a Claude Code session)
/plugin marketplace add anthropics/claude-code
/plugin install ralph-wiggum@claude-plugins-official

# Basic usage
/ralph-wiggum:ralph-loop "<prompt>" --max-iterations N

# With completion promise
/ralph-wiggum:ralph-loop "<prompt>" --max-iterations N --completion-promise "text"

# Cancel active loop
/ralph-wiggum:cancel-ralph

Example:

/ralph-wiggum:ralph-loop "Build the authentication system as specified in PROMPT.md. Output <promise>AUTH_COMPLETE</promise> when all auth tests pass." --max-iterations 30 --completion-promise "AUTH_COMPLETE"

Method B: Raw Bash Loop (Universal)

#!/bin/bash
# ralph-loop.sh

MAX_ITERATIONS=${1:-30}
ITERATION=0

while [ "$ITERATION" -lt "$MAX_ITERATIONS" ]; do
    echo "=== Iteration $((ITERATION + 1)) of $MAX_ITERATIONS ==="

    # Feed prompt to Claude Code and capture what it prints
    OUTPUT=$(claude --print < PROMPT.md)
    printf '%s\n' "$OUTPUT"

    # Auto-commit progress first, so the final iteration's work is not lost
    git add -A
    git diff --cached --quiet || git commit -m "Ralph iteration $((ITERATION + 1))"

    # Check the agent's output for the completion promise.
    # Do not grep the repository: PROMPT.md quotes the tag and would always match.
    if printf '%s\n' "$OUTPUT" | grep -qF "<promise>PROJECT_COMPLETE</promise>"; then
        echo "Project complete at iteration $((ITERATION + 1))"
        exit 0
    fi

    # Check for abort
    if printf '%s\n' "$OUTPUT" | grep -qF "<promise>ABORT_BLOCKED</promise>"; then
        echo "Agent blocked. Check logs."
        exit 1
    fi

    ITERATION=$((ITERATION + 1))
done

echo "Max iterations reached"

Method C: Enhanced Loop with Safety Controls

#!/bin/bash
set -e

MAX_ITERATIONS=${1:-30}
FAIL_COUNT=0
MAX_CONSECUTIVE_FAILS=5

for i in $(seq 1 "$MAX_ITERATIONS"); do
    echo "=== Iteration $i of $MAX_ITERATIONS ==="
    echo "Started: $(date)" >> progress.txt

    # Run Claude with timeout (10 minutes per iteration) and capture its output
    OUTPUT=$(timeout 600 claude --print < PROMPT.md) || {
        echo "Iteration timed out or failed, continuing..."
        echo "Iteration $i: TIMEOUT OR ERROR" >> progress.txt
    }
    printf '%s\n' "$OUTPUT"

    # Commit whatever got done
    git add -A
    git diff --cached --quiet || git commit -m "Ralph iteration $i"

    # Check completion in the agent's output.
    # Grepping the repository would match PROMPT.md itself, which quotes the tag.
    if printf '%s\n' "$OUTPUT" | grep -qF "<promise>PROJECT_COMPLETE</promise>"; then
        echo "Complete at iteration $i"
        echo "PROJECT COMPLETE at iteration $i" >> progress.txt
        exit 0
    fi

    # Check abort
    if printf '%s\n' "$OUTPUT" | grep -qF "<promise>ABORT_BLOCKED</promise>"; then
        echo "Agent signaled abort"
        exit 1
    fi

    # Verify tests pass (prevents compounding errors)
    if command -v npm &> /dev/null && [ -f package.json ]; then
        if ! npm test 2>/dev/null; then
            FAIL_COUNT=$((FAIL_COUNT + 1))
            echo "Test failure #$FAIL_COUNT" >> progress.txt
            # Stop once the limit of consecutive failures is reached
            if [ "$FAIL_COUNT" -ge "$MAX_CONSECUTIVE_FAILS" ]; then
                echo "Too many consecutive test failures. Stopping."
                exit 1
            fi
        else
            FAIL_COUNT=0
        fi
    fi

    echo "Iteration $i complete" >> progress.txt
done

echo "Max iterations reached without completion"

Method D: Amp/Sourcegraph Loop

while :; do cat PROMPT.md | npx --yes @sourcegraph/amp ; done

Part 5: Prompt Engineering for Ralph

Critical Prompt Elements

Clear Completion Signal

Output <promise>COMPLETE</promise> when:
- All tests pass
- Linting passes
- Build succeeds

Atomic Task Scoping

Work ONLY on the highest-priority incomplete feature.
Do NOT proceed to next feature until current feature passes all acceptance criteria.

Progress Tracking

After each significant change:
1. Commit your work with descriptive message
2. Append progress to progress.txt (use verb 'append', do not modify previous entries)
3. Run tests to verify state
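
The same append-only rule can be followed by a wrapper script when it writes its own entries. A minimal sketch, in which `log_progress` and `PROGRESS_FILE` are illustrative names rather than anything the plugin provides:

```shell
# Sketch: append one timestamped line to the progress log.
PROGRESS_FILE=${PROGRESS_FILE:-progress.txt}

log_progress() {
    # ">>" opens the file in append mode, so earlier entries are never rewritten.
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$PROGRESS_FILE"
}
```

Calling `log_progress "auth: login endpoint passing"` adds a single line and leaves everything above it untouched.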

CI Green Requirement

Each commit MUST pass all tests and type checks.
Never commit code that breaks the build.
Run `npm run build && npm run test && npm run lint` before every commit.

Self-Correction Pattern

If tests fail:
1. Read the error message
2. Identify root cause
3. Implement fix
4. Run tests again
5. Repeat until green
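
Outside the prompt, the "repeat until green" step can be bounded so that it cannot spin forever. The sketch below is one possible shape. `retry_until_green` is a hypothetical helper, and the fix between attempts is still left to the agent.

```shell
# Sketch: rerun a verification command until it exits 0
# or the attempt budget runs out.
retry_until_green() {
    # $1 = maximum attempts, remaining arguments = command to run
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            echo "green after $attempt attempt(s)"
            return 0
        fi
        echo "attempt $attempt failed" >&2
        attempt=$((attempt + 1))
    done
    echo "still red after $max attempts" >&2
    return 1
}
```

A call such as `retry_until_green 5 npm test` returns non-zero after five red runs, which gives the surrounding loop a clear point at which to signal a blocked state.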

Prompt Tuning Process

Geoffrey Huntley's guitar tuning metaphor:

Start with no guardrails: Let Ralph build the playground first
Add signs when Ralph fails: When Ralph falls off the slide, add a sign saying "SLIDE DOWN, DON'T JUMP"
Iterate on failures: Each failure teaches you what guardrails to add
Eventually get a new Ralph: Once prompts are tuned, the defects disappear

Example: Feature Implementation Prompt

# Task: Implement User Authentication

Read PROMPT.md for full project context.

## Current Feature
Implement JWT-based user authentication.

## Requirements
- POST /auth/register - Create new user
- POST /auth/login - Return JWT token
- GET /auth/me - Return current user (protected)
- Password hashing with bcrypt
- Token expiration: 24 hours

## Process
1. Write failing tests first (TDD)
2. Implement minimal code to pass
3. Run tests
4. If failing, debug and fix
5. Refactor if needed
6. Repeat until all green

## Success Criteria
- All tests in tests/auth.test.ts pass
- No linter errors
- Coverage > 80% for auth module

## On Completion
1. Update prd.json: set auth.passes = true
2. Append summary to progress.txt
3. Output <promise>AUTH_DONE</promise>

## If Stuck After 10 Attempts
- Document blocking issues in progress.txt
- List attempted approaches
- Output <promise>AUTH_BLOCKED</promise>

Part 6: AFK Ralph vs HOTL Ralph

AFK (Away From Keyboard) Ralph

For overnight/weekend runs on well-defined tasks:

## AFK Mode Instructions

You are running autonomously. Human will not intervene.

Rules:
1. Never stop to ask questions - make reasonable decisions
2. If blocked, try 3 alternative approaches before marking blocked
3. Commit frequently (every 5-10 minutes of work)
4. Log all decisions to progress.txt
5. Prefer reversible decisions over blocking
6. If uncertain, implement the simpler option
7. Document assumptions made

Safety:
- Never delete data without explicit instruction
- Never modify files outside project directory
- Never make network requests to production systems
- Stop if costs approach $100 in estimated tokens
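
The agent cannot measure its own spend reliably, so a wall-clock cap derived from the budget is a useful backstop. A minimal sketch of the arithmetic, assuming a worst-case hourly burn rate that you supply yourself; `budget_timeout_seconds` is a hypothetical helper:

```shell
# Sketch: convert a dollar budget and an assumed hourly burn rate
# into a timeout in seconds. Both inputs are whole dollars.
budget_timeout_seconds() {
    budget=$1         # total spend allowed, in dollars
    rate_per_hour=$2  # assumed worst-case cost per hour, in dollars
    echo $(( budget * 3600 / rate_per_hour ))
}

# Example: a $100 budget at an assumed $50/hour allows 7200 seconds (2 hours).
budget_timeout_seconds 100 50
```

The result can be passed to `timeout`, for example `timeout "$(budget_timeout_seconds 100 50)" ./ralph-loop.sh 30`. This caps elapsed time only; it is a proxy for cost, not a measurement of it.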

HOTL (Human On The Loop) Ralph

For complex tasks requiring occasional judgment:

## HOTL Mode Instructions

Human is monitoring but not actively directing.

Rules:
1. Work autonomously on implementation details
2. Signal for human input on:
   - Architectural decisions
   - Security-critical code
   - External API integrations
   - Ambiguous requirements
3. Use <question>Your question here</question> tags when blocked
4. Continue with other work while waiting for response
5. Log questions to progress.txt with timestamp

Part 7: Advanced Patterns

Parallel Development with Git Worktrees

# Create isolated worktrees for parallel features
git worktree add ../project-auth -b feature/auth
git worktree add ../project-api -b feature/api
git worktree add ../project-ui -b feature/ui

# Terminal 1: Auth (launch Claude Code in the worktree, then enter the slash command in its session)
cd ../project-auth
claude
/ralph-wiggum:ralph-loop "Implement authentication..." --max-iterations 30

# Terminal 2: API (simultaneously)
cd ../project-api
claude
/ralph-wiggum:ralph-loop "Build REST API..." --max-iterations 30

# Terminal 3: UI (simultaneously)
cd ../project-ui
claude
/ralph-wiggum:ralph-loop "Build frontend components..." --max-iterations 30

# Later: merge completed features
git checkout main
git merge feature/auth
git merge feature/api
git merge feature/ui

Multi-Phase Sequential Development

# Phase 1: Data layer
/ralph-wiggum:ralph-loop "Phase 1: Build data models and database schema. Output <promise>PHASE1_DONE</promise>" --max-iterations 20 --completion-promise "PHASE1_DONE"

# Phase 2: Business logic
/ralph-wiggum:ralph-loop "Phase 2: Build service layer and business logic. Output <promise>PHASE2_DONE</promise>" --max-iterations 25 --completion-promise "PHASE2_DONE"

# Phase 3: API
/ralph-wiggum:ralph-loop "Phase 3: Build API endpoints. Output <promise>PHASE3_DONE</promise>" --max-iterations 25 --completion-promise "PHASE3_DONE"

# Phase 4: Frontend
/ralph-wiggum:ralph-loop "Phase 4: Build UI. Output <promise>PHASE4_DONE</promise>" --max-iterations 30 --completion-promise "PHASE4_DONE"

Overnight Batch Processing

#!/bin/bash
# overnight-work.sh

# Absolute path, so the log stays in one place across the cd calls below
LOG_FILE="$PWD/overnight-$(date +%Y%m%d).log"

echo "Starting overnight batch: $(date)" >> "$LOG_FILE"

# Slash commands such as /ralph-wiggum:ralph-loop only work inside a Claude Code
# session, so this script runs ralph-loop.sh from Method B instead
# (assumed to be copied into each project directory).

# Project 1
cd /path/to/project1 || exit 1
echo "Starting project1..." >> "$LOG_FILE"
timeout 14400 ./ralph-loop.sh 50 >> "$LOG_FILE" 2>&1
echo "Project1 finished: $(date)" >> "$LOG_FILE"

# Project 2
cd /path/to/project2 || exit 1
echo "Starting project2..." >> "$LOG_FILE"
timeout 14400 ./ralph-loop.sh 50 >> "$LOG_FILE" 2>&1
echo "Project2 finished: $(date)" >> "$LOG_FILE"

echo "Batch complete: $(date)" >> "$LOG_FILE"

# Optional: send notification
# curl -X POST "https://hooks.slack.com/..." -d '{"text":"Overnight batch complete"}'

Part 8: Quality Control and Safety

Feedback Loops (Critical)

Tests: Every commit must pass all tests
Types: Full type checking on every commit
Lint: Consistent style enforcement
Build: Verify deployability continuously

## Mandatory Verification Sequence

Before EVERY commit, run:
`npm run typecheck && npm run lint && npm run test && npm run build`

If ANY command fails:
1. Do NOT commit
2. Fix the issue
3. Re-run verification
4. Only commit when all pass

Cost Control

Context Size	Approx Cost/Hour	Recommended Max Iterations
Small (<50k tokens)	$15-25	50
Medium (50-200k tokens)	$25-50	30
Large (>200k tokens)	$50-100+	20

# Set hard limits
/ralph-wiggum:ralph-loop "..." --max-iterations 30

# Monitor during execution
# Check ~/.claude/usage.log or equivalent

Safety Guardrails

## Safety Rules

NEVER:
- Delete production data
- Modify files outside project directory
- Execute commands with sudo
- Make requests to production APIs
- Store credentials in code
- Commit sensitive data to git

ALWAYS:
- Use environment variables for secrets
- Mock external services in tests
- Operate in sandboxed environment
- Maintain ability to rollback via git
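
Prompt rules alone do not guarantee that no credential reaches a commit, so a mechanical check before each commit is a reasonable complement. The sketch below is a rough illustration, not a complete scanner. `scan_for_secrets` is a hypothetical helper and the three patterns are assumptions chosen for the example.

```shell
# Sketch: fail if files under a directory contain strings that look
# like credentials. The patterns are illustrative only.
scan_for_secrets() {
    dir=${1:-.}
    # AWS-style access key IDs, PEM private key headers, and quoted
    # assignments to password/secret/api_key.
    pattern='AKIA[0-9A-Z]{16}|-----BEGIN [A-Z ]*PRIVATE KEY-----|(password|secret|api_key)[[:space:]]*[:=][[:space:]]*"[^"]+"'
    if grep -rnE -e "$pattern" "$dir"; then
        echo "possible credentials found" >&2
        return 1
    fi
    return 0
}
```

Running `scan_for_secrets src` before `git commit` lets the loop refuse the commit when the function returns non-zero. Expect false positives and tune the patterns per project.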

Recovery Procedures

# If Ralph goes off the rails
/ralph-wiggum:cancel-ralph

# Review what happened
git log --oneline -20
git diff HEAD~5

# Rollback if needed
git reset --hard HEAD~N

# Clean up any mess
git clean -fd

# Restart with adjusted prompt

Part 9: When to Use (and Not Use) Ralph

Good For

Greenfield projects with clear specifications
Mechanical tasks: migrations, refactors, test coverage
Well-defined tasks with automatic verification
Batch operations: lint fixes, dependency updates
Documentation generation
Test writing with clear acceptance criteria
Overnight development while you sleep

Not Good For

Ambiguous requirements without binary success criteria
Architectural decisions requiring human judgment
Security-critical code needing human review
Exploratory work without clear goals
Production debugging that depends on runtime context the agent cannot see
UX/design decisions requiring human taste
Complex integrations with undocumented APIs

Decision Matrix

Task Type	Ralph Suitability	Recommended Approach
New API endpoint	High	AFK Ralph
Database migration	High	AFK Ralph
Test coverage	High	AFK Ralph
Security audit	Low	Manual review
UI polish	Medium	HOTL Ralph
Performance tuning	Medium	HOTL Ralph
Architecture design	Low	Manual design + Ralph implementation
Bug investigation	Low	Manual + targeted Ralph

Part 10: Real-World Results

Documented Successes

CURSED Programming Language: 3-month continuous loop, built complete esoteric language
YC Hackathon: 6 repositories shipped overnight for $297 in API costs
$50k Contract: Delivered complete MVP with Ralph for ~$300 in compute
Integration Test Optimization: Reduced runtime from 4 minutes to 2 seconds

Cost Expectations

Project Complexity	Estimated Cost	Typical Iterations
Simple CRUD app	$50-100	20-30
Medium SaaS MVP	$200-500	50-100
Complex system	$500-2000+	100-300

Part 11: Quick Reference

Checklist: Before Starting Ralph

Git initialized
PROMPT.md created with binary completion criteria
progress.txt created (empty)
Test framework configured
Linting configured
Build command working
--max-iterations set appropriately
Cost limits understood
Rollback strategy clear
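
The file-based items in this checklist can be verified mechanically before a run starts. A minimal sketch, in which `preflight` is a hypothetical helper that covers only what is visible on the file system (test, lint, and build setup still need their own commands):

```shell
# Sketch: report which checklist items are missing in a project directory.
# Returns 0 when nothing is missing, 1 otherwise.
preflight() {
    dir=${1:-.}
    missing=0
    # -e rather than -d: in a git worktree, .git is a file, not a directory
    [ -e "$dir/.git" ]         || { echo "missing: git repository"; missing=1; }
    [ -s "$dir/PROMPT.md" ]    || { echo "missing: PROMPT.md (absent or empty)"; missing=1; }
    [ -f "$dir/progress.txt" ] || { echo "missing: progress.txt"; missing=1; }
    return "$missing"
}
```

Used as `preflight . && ./ralph-loop.sh 30`, the loop starts only when every checked item is present.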

Checklist: During Ralph Run

Monitoring progress.txt
Watching git log
Checking test results
Monitoring token usage
Ready to run /ralph-wiggum:cancel-ralph if needed

Checklist: After Ralph Run

Review all commits
Run full test suite
Review code quality
Squash/rebase commits if desired
Document any issues found
Update PRD with lessons learned

Command Reference

# Start loop
/ralph-wiggum:ralph-loop "<prompt>" --max-iterations N --completion-promise "TEXT"

# Cancel loop
/ralph-wiggum:cancel-ralph

# Help
/ralph-wiggum:help

# Raw bash alternative (unbounded: stop with Ctrl-C, or use the capped loop from Method B)
while :; do cat PROMPT.md | claude --print ; done

Appendix A: Complete PROMPT.md Template

# Project: [NAME]

## Context
[2-3 sentences about what this project is]

## Tech Stack
- Runtime: Node.js 20 / Python 3.12 / Go 1.22
- Framework: Express / FastAPI / Gin
- Database: PostgreSQL 16
- ORM: Prisma / SQLAlchemy / GORM
- Testing: Jest / pytest / go test
- Linting: ESLint / ruff / golangci-lint

## Directory Structure

src/
├── controllers/   # HTTP handlers
├── services/      # Business logic
├── models/        # Data models
├── middleware/    # Auth, logging, etc.
├── utils/         # Helpers
└── types/         # Type definitions

tests/
├── unit/
├── integration/
└── e2e/


## Completion Criteria
All must be true:
1. `npm run build` exits 0
2. `npm run test` exits 0 with >80% coverage
3. `npm run lint` exits 0
4. README.md contains API documentation
5. All features in prd.json have passes=true

Output `<promise>PROJECT_COMPLETE</promise>` when ALL criteria met.

## Features

### Phase 1: Foundation (Priority 1)
- [ ] Project setup with all dependencies
- [ ] Database connection and migrations
- [ ] Basic health check endpoint
- [ ] Logging middleware
- [ ] Error handling middleware

### Phase 2: Core (Priority 2)
- [ ] User model and CRUD
- [ ] Authentication (JWT)
- [ ] Authorization middleware
- [ ] [Feature specific to project]
- [ ] [Feature specific to project]

### Phase 3: Integration (Priority 3)
- [ ] [External service]
- [ ] Email notifications
- [ ] File uploads

### Phase 4: Polish (Priority 4)
- [ ] API documentation
- [ ] Rate limiting
- [ ] Caching
- [ ] Performance optimization

## Constraints
- All async operations must have error handling
- No hardcoded credentials
- All endpoints must validate input
- All database operations must use transactions where appropriate
- Test coverage minimum 80%
- No TypeScript `any` types / Python type hints required

## Working Process
1. Read this file and prd.json
2. Select highest priority feature with passes=false
3. Write failing tests first
4. Implement minimal code to pass
5. Verify: `npm run build && npm run test && npm run lint`
6. If fail: fix and retry
7. If pass: commit with descriptive message
8. Append progress to progress.txt
9. Update prd.json feature.passes = true
10. Repeat until all features complete

## Abort Conditions
Output `<promise>ABORT_BLOCKED</promise>` if:
- Circular dependency cannot be resolved after 5 attempts
- External API required but unavailable
- Fundamental architecture change needed
- After 15 iterations on same feature without progress

Document blocking issue in progress.txt before aborting.

## Safety
- Never modify files outside project directory
- Never use sudo
- Never store real credentials
- Always use .env for configuration
- Commit frequently
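
If the agent ticks items in the template as it goes, the checkbox lists double as a progress gauge. The sketch below assumes completed items are marked `- [x]`, which the template does not state explicitly. `task_counts` is an illustrative helper name.

```shell
# Sketch: count open and completed checkbox items in a task file.
task_counts() {
    file=${1:-PROMPT.md}
    # grep -c exits non-zero when the count is 0, hence the "|| true"
    open=$(grep -c '^[[:space:]]*- \[ \]' "$file" || true)
    done_count=$(grep -c '^[[:space:]]*- \[[xX]\]' "$file" || true)
    echo "open=$open done=$done_count"
}
```

A wrapper could log the result of `task_counts PROMPT.md` after every iteration to show whether the loop is still making headway.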

Appendix B: Troubleshooting

"Claude keeps stopping early"

Cause: Completion promise detected incorrectly, or unclear criteria
Fix: Make completion criteria more specific, use unique promise text

"Tests keep failing in loop"

Cause: Breaking changes accumulating
Fix: Add stronger "CI green" requirement, reduce iteration count, add checkpoint validation

"Context getting too large"

Cause: Too much history accumulating
Fix: Split the work into phases that each start with a fresh context, summarize progress.txt periodically
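
Summarizing needs judgment, but a simpler stand-in is to archive older entries so the agent reads only the recent tail. The sketch below does that and nothing more. `rotate_progress` and the `.archive` suffix are assumptions made for the example.

```shell
# Sketch: keep the last N lines of a log in place and move older
# lines to "<file>.archive", so no entry is ever discarded.
rotate_progress() {
    file=$1
    keep=$2
    total=$(wc -l < "$file" | tr -d ' ')
    if [ "$total" -le "$keep" ]; then
        return 0
    fi
    head -n $((total - keep)) "$file" >> "$file.archive"
    tail -n "$keep" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

For instance, `rotate_progress progress.txt 200` between phases keeps the working log short while the full history stays available in `progress.txt.archive`.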

"Ralph taking wrong direction"

Cause: Ambiguous requirements
Fix: Add explicit constraints, use the "sign next to slide" pattern: add guardrails for specific failure modes

"Loop running forever"

Cause: Impossible completion criteria
Fix: Always set --max-iterations, add abort conditions, monitor manually

"Costs spiraling"

Cause: Large context, too many iterations
Fix: Break into smaller phases, set strict iteration limits, use smaller models for simpler tasks

Framework version: 1.0
Last updated: January 2026
Based on techniques by Geoffrey Huntley, Matt Pocock, and the Claude Code community
