The Ralph Method: Comprehensive Framework for 100% Vibe-Coded Greenfield Projects

Executive Summary

The Ralph Method is an iterative AI development methodology that runs autonomous coding agents in persistent loops until task completion. Named after Ralph Wiggum from The Simpsons—perpetually confused but never stopping—it embodies the philosophy: "Better to fail predictably than succeed unpredictably."

Core principle: Iteration beats perfection. The agent sees previous work via git history and modified files, learns from it, and iteratively improves.


Part 1: Philosophy and Prerequisites

The Ralph Mindset

  • Deterministically bad in an undeterministic world: Failures are predictable and informative
  • Operator skill matters: Success depends on writing good prompts, not just having a good model
  • LLMs are mirrors of operator skill: The quality of output reflects the quality of input
  • Faith in eventual consistency: Ralph will test you. Each failure tunes the system like a guitar

Prerequisites

  1. Claude Code installed and configured
  2. Git initialized in project directory
  3. jq installed (required dependency on Windows/Git Bash)
  4. Clear success criteria for your project
  5. Cost awareness: Opus 4.5 burns $15-75/hour on large contexts

Part 2: PRD Structure for Ralph Consumption

Traditional PRDs must be converted to a machine-parseable format before the agent can work from them. Create PROMPT.md in your project root.

PRD Template

# Project: [NAME]

## Objective
[One sentence describing the end state]

## Application Type
[Web app | Desktop app | CLI tool | Library | Mobile app | Other]

## Architecture
- Interface types: [REST API | GraphQL | IPC | Module boundaries | CLI | File formats]
- Persistence: [Remote DB | Local DB | File-based | In-memory | None]
- Deployment: [Cloud | Self-hosted | Desktop installer | Package registry | App store]

## Tech Stack
- Language: [X]
- Framework: [X]
- Database/Storage: [X]
- Testing: [X]
- CI: [X]

## Completion Criteria
All of the following must be true:
1. Build command passes with zero errors
2. Test command passes with >80% coverage
3. Lint command passes
4. All features documented in README.md
5. [App-type specific criteria, e.g.:]
   - Web: Docker compose brings up working stack
   - Desktop: Installers build for all target platforms
   - CLI: Binary runs and --help works
   - Library: Package publishes to registry (dry-run)

Output `<promise>PROJECT_COMPLETE</promise>` when all criteria met.

## Phase 1: Foundation
- [ ] Initialize project structure
- [ ] Set up CI/linting/formatting
- [ ] Create data layer (DB schema | file handlers | state management)
- [ ] Write failing tests for core models

## Phase 2: Core [Features | UI | Commands | API]
[Adapted to app type]
- [ ] Feature A
  - Acceptance: [binary criterion]
- [ ] Feature B
  - Acceptance: [binary criterion]

## Phase 3: [Integration | System | I/O]
[Adapted to app type:]
- Web: External services, auth, error handling
- Desktop: System integration, file watching, native APIs
- CLI: Input parsing, output formatting, piping support
- Library: Edge cases, error handling, type exports

## Phase 4: [Polish | Packaging | Publishing]
[Adapted to app type:]
- Web: Documentation, performance, deployment config
- Desktop: Installers, auto-update, icons, signing
- CLI: Man pages, shell completions, distribution
- Library: API docs, examples, changelog, publishing

## Constraints
- No external API calls without mocking in tests
- All functions must have docstrings/JSDoc
- [Language-specific, e.g., No `any` types for TypeScript]
- [Additional constraints]

## Abort Conditions
If any of these occur, output `<promise>ABORT_BLOCKED</promise>`:
- Cannot resolve circular dependency after 5 attempts
- External service unavailable after 3 retries
- Architectural issue that requires human decision
- No progress on the same feature after 15 iterations (document the blocking issues in progress.txt first)

PRD Conversion Rules

| Traditional PRD Element | Ralph Format |
|-------------------------|--------------|
| User stories | Concrete acceptance tests |
| "Should support X" | Specific test file that must pass |
| Architecture diagrams | File structure specification |
| "Nice to have" | Remove entirely or move to Phase 4 |
| Ambiguous requirements | Binary pass/fail criteria |
| Success metrics | Automated verification commands |
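
The "automated verification commands" idea can be sketched as a small wrapper that turns several commands into one binary result. This is a minimal sketch, not part of any tool: the function name `run_checks` is hypothetical, and the command strings are whatever your project uses.

```bash
# Hypothetical helper: run each verification command in order and
# report a single pass/fail result. Commands are passed as strings.
run_checks() {
    local cmd
    for cmd in "$@"; do
        if bash -c "$cmd" >/dev/null 2>&1; then
            echo "PASS: $cmd"
        else
            echo "FAIL: $cmd"
            return 1   # stop at the first failing check
        fi
    done
    echo "ALL CHECKS PASSED"
}
```

A call such as `run_checks "npm run build" "npm run test" "npm run lint"` would exit 0 only if all three commands succeed, which is the kind of binary criterion the table asks for.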

JSON-Based PRD (Matt Pocock Method)

To scope each iteration to a single feature and prevent context rot, track features in a JSON file (prd.json in the directory layout of Part 3):

{
  "project": "MyApp",
  "features": [
    {
      "id": "auth",
      "description": "JWT authentication system",
      "priority": 1,
      "passes": false,
      "acceptance": "npm run test:auth passes"
    },
    {
      "id": "api",
      "description": "REST API endpoints",
      "priority": 2,
      "passes": false,
      "acceptance": "all endpoints return 200 for valid requests"
    }
  ]
}

Prompt the agent to:

  1. Pick the highest priority feature with passes: false
  2. Work ONLY on that feature
  3. Update the passing status when acceptance criteria met
  4. Append a progress note to progress.txt and commit
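
The selection and update steps can also be checked from the shell with jq. The sketch below assumes the prd.json shape shown above and assumes that a lower `priority` number means higher priority; the function names `next_feature` and `mark_passed` are hypothetical.

```bash
# Sketch: print the id of the next feature to work on, or nothing
# if every feature already passes. Assumes priority 1 comes first.
next_feature() {
    local prd_file="${1:-prd.json}"
    jq -r '[.features[] | select(.passes == false)]
           | sort_by(.priority)
           | (.[0].id // empty)' "$prd_file"
}

# Sketch: set passes=true for one feature, rewriting via a temp file
# so a failed jq run never truncates the original.
mark_passed() {
    local feature_id="$1" prd_file="${2:-prd.json}" tmp
    tmp=$(mktemp)
    jq --arg id "$feature_id" \
       '(.features[] | select(.id == $id) | .passes) = true' \
       "$prd_file" > "$tmp" && mv "$tmp" "$prd_file"
}
```

A wrapper script could call `next_feature` before each iteration to confirm the agent is working on the feature you expect.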

Part 3: Directory Setup

Initial Structure

mkdir my-project && cd my-project
git init

# Core prompt file
cat > PROMPT.md << 'EOF'
[Your structured PRD here]
EOF

# Progress tracker (critical for context persistence)
touch progress.txt

# Agent context files (optional but recommended)
mkdir -p agent_docs
touch agent_docs/tech_stack.md
touch agent_docs/code_patterns.md
touch agent_docs/testing.md

# Initial commit
git add .
git commit -m "Initial prompt and structure"

Recommended Structure

my-project/
├── PROMPT.md              # Main task definition
├── progress.txt           # Append-only progress log
├── prd.json               # JSON PRD for feature tracking
├── agent_docs/
│   ├── tech_stack.md      # Tech decisions and rationale
│   ├── code_patterns.md   # Project conventions
│   ├── testing.md         # Test strategy
│   └── resources.md       # External references
├── CLAUDE.md              # Claude Code specific config
├── .cursorrules           # Cursor config (if using)
├── .gitignore
├── README.md
└── src/

Part 4: Execution Methods

Method A: Claude Code Plugin

# Install from official plugin marketplace
/plugin marketplace add anthropics/claude-code
/plugin install ralph-wiggum@claude-plugins-official

# Basic usage
/ralph-wiggum:ralph-loop "<prompt>" --max-iterations N

# With completion promise
/ralph-wiggum:ralph-loop "<prompt>" --max-iterations N --completion-promise "text"

# Cancel active loop
/ralph-wiggum:cancel-ralph

Example:

/ralph-wiggum:ralph-loop "Build the authentication system as specified in PROMPT.md. Output <promise>AUTH_COMPLETE</promise> when all auth tests pass." --max-iterations 30 --completion-promise "AUTH_COMPLETE"

Method B: Raw Bash Loop (Universal)

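#!/bin/bash
# ralph-loop.sh

MAX_ITERATIONS=${1:-30}
ITERATION=0

# Keep each iteration's output in a temp file outside the repo.
# A recursive grep of the project would always match, because
# PROMPT.md itself quotes both promise strings.
OUTPUT_FILE=$(mktemp)
trap 'rm -f "$OUTPUT_FILE"' EXIT

while [ "$ITERATION" -lt "$MAX_ITERATIONS" ]; do
    echo "=== Iteration $((ITERATION + 1)) of $MAX_ITERATIONS ==="

    # Feed prompt to Claude Code and keep a copy of what it prints
    claude --print < PROMPT.md | tee "$OUTPUT_FILE"

    # Auto-commit progress (before the checks, so the final iteration is saved too)
    git add -A
    git diff --cached --quiet || git commit -m "Ralph iteration $((ITERATION + 1))"

    # Check this iteration's output for the completion promise
    if grep -qF "<promise>PROJECT_COMPLETE</promise>" "$OUTPUT_FILE"; then
        echo "Project complete at iteration $((ITERATION + 1))"
        exit 0
    fi

    # Check for abort
    if grep -qF "<promise>ABORT_BLOCKED</promise>" "$OUTPUT_FILE"; then
        echo "Agent blocked. Check logs."
        exit 1
    fi

    ITERATION=$((ITERATION + 1))
done

echo "Max iterations reached"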
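
One way to detect the promise is to inspect only what the agent printed during the iteration, since a search of the whole project directory would also match the promise strings quoted in PROMPT.md. The sketch below assumes the `<promise>NAME</promise>` tag format used in this document; the function name `last_promise` is hypothetical.

```bash
# Sketch: read agent output on stdin and print the name inside the
# last <promise>...</promise> tag, or nothing if no tag is present.
last_promise() {
    # sed prints the captured name for every line containing a tag
    # (the greedy .* keeps the last tag on a line); tail keeps the
    # last such line.
    sed -n 's/.*<promise>\([A-Za-z0-9_]*\)<\/promise>.*/\1/p' | tail -n 1
}
```

A loop could then branch on the result, for example `case "$(last_promise < run.log)" in PROJECT_COMPLETE) exit 0 ;; ABORT_BLOCKED) exit 1 ;; esac`, where `run.log` stands for wherever the iteration output was saved.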

Method C: Enhanced Loop with Safety Controls

#!/bin/bash
set -e

MAX_ITERATIONS=${1:-30}
FAIL_COUNT=0
MAX_CONSECUTIVE_FAILS=5

# Agent output goes to a temp file outside the repo. Grepping the whole
# tree would always match, because PROMPT.md quotes both promise strings.
OUTPUT_FILE=$(mktemp)
trap 'rm -f "$OUTPUT_FILE"' EXIT

for i in $(seq 1 "$MAX_ITERATIONS"); do
    echo "=== Iteration $i of $MAX_ITERATIONS ==="
    echo "Started: $(date)" >> progress.txt

    # Run Claude with timeout (10 minutes per iteration)
    timeout 600 claude --print < PROMPT.md > "$OUTPUT_FILE" 2>&1 || {
        echo "Iteration timed out or exited with an error, continuing..."
        echo "Iteration $i: TIMEOUT OR ERROR" >> progress.txt
    }
    cat "$OUTPUT_FILE"

    # Commit whatever got done
    git add -A
    git diff --cached --quiet || git commit -m "Ralph iteration $i"

    # Check completion in this iteration's output
    if grep -qF "<promise>PROJECT_COMPLETE</promise>" "$OUTPUT_FILE"; then
        echo "Complete at iteration $i"
        echo "PROJECT COMPLETE at iteration $i" >> progress.txt
        exit 0
    fi

    # Check abort
    if grep -qF "<promise>ABORT_BLOCKED</promise>" "$OUTPUT_FILE"; then
        echo "Agent signaled abort"
        exit 1
    fi

    # Verify tests pass (prevents compounding errors)
    if command -v npm &> /dev/null && [ -f package.json ]; then
        if ! npm test 2>/dev/null; then
            FAIL_COUNT=$((FAIL_COUNT + 1))
            echo "Test failure #$FAIL_COUNT" >> progress.txt
            # -ge stops on the 5th consecutive failure, matching the variable name
            if [ "$FAIL_COUNT" -ge "$MAX_CONSECUTIVE_FAILS" ]; then
                echo "Too many consecutive test failures. Stopping."
                exit 1
            fi
        else
            FAIL_COUNT=0
        fi
    fi

    echo "Iteration $i complete" >> progress.txt
done

echo "Max iterations reached without completion"

Method D: Amp/Sourcegraph Loop

while :; do cat PROMPT.md | npx --yes @sourcegraph/amp ; done

Part 5: Prompt Engineering for Ralph

Critical Prompt Elements

  1. Clear Completion Signal

Output <promise>COMPLETE</promise> when:
- All tests pass
- Linting passes
- Build succeeds

  2. Atomic Task Scoping

Work ONLY on the highest-priority incomplete feature.
Do NOT proceed to next feature until current feature passes all acceptance criteria.

  3. Progress Tracking

After each significant change:
1. Commit your work with descriptive message
2. Append progress to progress.txt (use verb 'append', do not modify previous entries)
3. Run tests to verify state

  4. CI Green Requirement

Each commit MUST pass all tests and type checks.
Never commit code that breaks the build.
Run `npm run build && npm run test && npm run lint` before every commit.

  5. Self-Correction Pattern

If tests fail:
1. Read the error message
2. Identify root cause
3. Implement fix
4. Run tests again
5. Repeat until green
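
The append-only rule for progress.txt can be illustrated from the shell side. This is a minimal sketch for wrapper scripts, assuming a one-line-per-entry format with a UTC timestamp; the function name `log_progress` and the entry format are hypothetical.

```bash
# Sketch: append one timestamped line to the progress log without
# touching earlier entries.
log_progress() {
    local message="$1" log_file="${2:-progress.txt}"
    # >> opens the file in append mode, so existing lines are never rewritten
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$message" >> "$log_file"
}
```

For example, `log_progress "auth: login endpoint passing"` adds a single line at the end of progress.txt and leaves every earlier entry as it was.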

Prompt Tuning Process

Geoffrey Huntley's guitar tuning metaphor:

  1. Start with no guardrails: Let Ralph build the playground first
  2. Add signs when Ralph fails: When Ralph falls off the slide, add a sign saying "SLIDE DOWN, DON'T JUMP"
  3. Iterate on failures: Each failure teaches you what guardrails to add
  4. Eventually get a new Ralph: Once prompts are tuned, the defects disappear

Example: Feature Implementation Prompt

# Task: Implement User Authentication

Read PROMPT.md for full project context.

## Current Feature
Implement JWT-based user authentication.

## Requirements
- POST /auth/register - Create new user
- POST /auth/login - Return JWT token
- GET /auth/me - Return current user (protected)
- Password hashing with bcrypt
- Token expiration: 24 hours

## Process
1. Write failing tests first (TDD)
2. Implement minimal code to pass
3. Run tests
4. If failing, debug and fix
5. Refactor if needed
6. Repeat until all green

## Success Criteria
- All tests in tests/auth.test.ts pass
- No linter errors
- Coverage > 80% for auth module

## On Completion
1. Update prd.json: set auth.passes = true
2. Append summary to progress.txt
3. Output <promise>AUTH_DONE</promise>

## If Stuck After 10 Attempts
- Document blocking issues in progress.txt
- List attempted approaches
- Output <promise>AUTH_BLOCKED</promise>

Part 6: AFK Ralph vs HOTL Ralph

AFK (Away From Keyboard) Ralph

For overnight/weekend runs on well-defined tasks:

## AFK Mode Instructions

You are running autonomously. Human will not intervene.

Rules:
1. Never stop to ask questions - make reasonable decisions
2. If blocked, try 3 alternative approaches before marking blocked
3. Commit frequently (every 5-10 minutes of work)
4. Log all decisions to progress.txt
5. Prefer reversible decisions over blocking
6. If uncertain, implement the simpler option
7. Document assumptions made

Safety:
- Never delete data without explicit instruction
- Never modify files outside project directory
- Never make network requests to production systems
- Stop if costs approach $100 in estimated tokens

HOTL (Human On The Loop) Ralph

For complex tasks requiring occasional judgment:

## HOTL Mode Instructions

Human is monitoring but not actively directing.

Rules:
1. Work autonomously on implementation details
2. Signal for human input on:
   - Architectural decisions
   - Security-critical code
   - External API integrations
   - Ambiguous requirements
3. Use <question>Your question here</question> tags when blocked
4. Continue with other work while waiting for response
5. Log questions to progress.txt with timestamp
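
On the human side, the `<question>` tags can be pulled out of captured agent output so they are easy to spot. The sketch below assumes each question fits on one line and contains no `<` character; the function name `extract_questions` is hypothetical.

```bash
# Sketch: read agent output on stdin and print the text of each
# <question>...</question> tag, one per line.
extract_questions() {
    # grep exits non-zero when nothing matches; "|| true" keeps the
    # pipeline usable under "set -e" or "set -o pipefail"
    { grep -o '<question>[^<]*</question>' || true; } \
        | sed -e 's/^<question>//' -e 's/<\/question>$//'
}
```

Piping a saved iteration log through `extract_questions` gives a short list of open questions to answer before the next run.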

Part 7: Advanced Patterns

Parallel Development with Git Worktrees

# Create isolated worktrees for parallel features
git worktree add ../project-auth -b feature/auth
git worktree add ../project-api -b feature/api
git worktree add ../project-ui -b feature/ui

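# In each terminal, start a Claude Code session inside the worktree
# and enter the slash command in that session (it is not a shell command).

# Terminal 1: Auth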
cd ../project-auth
/ralph-wiggum:ralph-loop "Implement authentication..." --max-iterations 30

# Terminal 2: API (simultaneously)
cd ../project-api
/ralph-wiggum:ralph-loop "Build REST API..." --max-iterations 30

# Terminal 3: UI (simultaneously)
cd ../project-ui
/ralph-wiggum:ralph-loop "Build frontend components..." --max-iterations 30

# Later: merge completed features
git checkout main
git merge feature/auth
git merge feature/api
git merge feature/ui
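
When the feature list grows, the worktree setup can be wrapped in a loop. This sketch follows the `../project-<name>` and `feature/<name>` naming used above; the function name `add_feature_worktrees` is hypothetical, and it assumes it is run from the main checkout of a repository that already has at least one commit.

```bash
# Sketch: create one worktree and one branch per feature name.
add_feature_worktrees() {
    local name
    for name in "$@"; do
        # -b creates the branch; the worktree is placed next to the main checkout
        git worktree add -b "feature/$name" "../project-$name" || return 1
    done
}
```

Calling `add_feature_worktrees auth api ui` would produce the same three worktrees as the commands above.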

Multi-Phase Sequential Development

# Phase 1: Data layer
/ralph-wiggum:ralph-loop "Phase 1: Build data models and database schema. Output <promise>PHASE1_DONE</promise>" --max-iterations 20 --completion-promise "PHASE1_DONE"

# Phase 2: Business logic
/ralph-wiggum:ralph-loop "Phase 2: Build service layer and business logic. Output <promise>PHASE2_DONE</promise>" --max-iterations 25 --completion-promise "PHASE2_DONE"

# Phase 3: API
/ralph-wiggum:ralph-loop "Phase 3: Build API endpoints. Output <promise>PHASE3_DONE</promise>" --max-iterations 25 --completion-promise "PHASE3_DONE"

# Phase 4: Frontend
/ralph-wiggum:ralph-loop "Phase 4: Build UI. Output <promise>PHASE4_DONE</promise>" --max-iterations 30 --completion-promise "PHASE4_DONE"

Overnight Batch Processing

#!/bin/bash
# overnight-work.sh
# Slash commands only run inside a Claude Code session, so this script
# calls ralph-loop.sh (Method B) instead. It assumes a copy of that
# script exists in each project directory.

# Absolute path, so the log stays in one place after each cd
LOG_FILE="$PWD/overnight-$(date +%Y%m%d).log"

echo "Starting overnight batch: $(date)" >> "$LOG_FILE"

# Project 1 (4-hour cap, 50 iterations)
cd /path/to/project1 || exit 1
echo "Starting project1..." >> "$LOG_FILE"
timeout 14400 ./ralph-loop.sh 50 >> "$LOG_FILE" 2>&1
echo "Project1 finished: $(date)" >> "$LOG_FILE"

# Project 2 (4-hour cap, 50 iterations)
cd /path/to/project2 || exit 1
echo "Starting project2..." >> "$LOG_FILE"
timeout 14400 ./ralph-loop.sh 50 >> "$LOG_FILE" 2>&1
echo "Project2 finished: $(date)" >> "$LOG_FILE"

echo "Batch complete: $(date)" >> "$LOG_FILE"

# Optional: send notification
# curl -X POST "https://hooks.slack.com/..." -d '{"text":"Overnight batch complete"}'

Part 8: Quality Control and Safety

Feedback Loops (Critical)

  1. Tests: Every commit must pass all tests
  2. Types: Full type checking on every commit
  3. Lint: Consistent style enforcement
  4. Build: Verify deployability continuously

## Mandatory Verification Sequence

Before EVERY commit, run:
```bash
npm run typecheck && npm run lint && npm run test && npm run build
```

If ANY command fails:

  1. Do NOT commit
  2. Fix the issue
  3. Re-run verification
  4. Only commit when all pass

Cost Control

| Context Size | Approx Cost/Hour | Recommended Max Iterations |
|-------------|------------------|---------------------------|
| Small (<50k tokens) | $15-25 | 50 |
| Medium (50-200k tokens) | $25-50 | 30 |
| Large (>200k tokens) | $50-100+ | 20 |

```bash
# Set hard limits
/ralph-wiggum:ralph-loop "..." --max-iterations 30

# Monitor during execution
# Check ~/.claude/usage.log or equivalent
```
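
The hourly rates in the table combine with the iteration cap and a per-iteration timeout to give a rough worst-case spend. The sketch below assumes every iteration runs for its full timeout (the 600-second figure comes from Method C) and uses integer arithmetic, so results are rounded down; the function name `worst_case_cost` is hypothetical.

```bash
# Sketch: worst-case cost in whole dollars for a capped run.
#   $1 = max iterations
#   $2 = timeout per iteration, in seconds
#   $3 = hourly rate in dollars (taken from the table above)
worst_case_cost() {
    local iterations="$1" timeout_seconds="$2" rate_per_hour="$3"
    echo $(( iterations * timeout_seconds * rate_per_hour / 3600 ))
}
```

For a medium context, `worst_case_cost 30 600 50` prints 250: 30 iterations of 10 minutes is 5 hours, and 5 hours at $50/hour is $250. At the low end of the same row, `worst_case_cost 30 600 25` prints 125.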

Safety Guardrails

## Safety Rules

NEVER:
- Delete production data
- Modify files outside project directory
- Execute commands with sudo
- Make requests to production APIs
- Store credentials in code
- Commit sensitive data to git

ALWAYS:
- Use environment variables for secrets
- Mock external services in tests
- Operate in sandboxed environment
- Maintain ability to rollback via git

Recovery Procedures

# If Ralph goes off the rails
/ralph-wiggum:cancel-ralph

# Review what happened
git log --oneline -20
git diff HEAD~5

# Rollback if needed
git reset --hard HEAD~N

# Clean up any mess
git clean -fd

# Restart with adjusted prompt

Part 9: When to Use (and Not Use) Ralph

Good For

  • Greenfield projects with clear specifications
  • Mechanical tasks: migrations, refactors, test coverage
  • Well-defined tasks with automatic verification
  • Batch operations: lint fixes, dependency updates
  • Documentation generation
  • Test writing with clear acceptance criteria
  • Overnight development while you sleep

Not Good For

  • Ambiguous requirements without binary success criteria
  • Architectural decisions requiring human judgment
  • Security-critical code needing human review
  • Exploratory work without clear goals
  • Production debugging requiring context
  • UX/design decisions requiring human taste
  • Complex integrations with undocumented APIs

Decision Matrix

| Task Type | Ralph Suitability | Recommended Approach |
|-----------|-------------------|----------------------|
| New API endpoint | High | AFK Ralph |
| Database migration | High | AFK Ralph |
| Test coverage | High | AFK Ralph |
| Security audit | Low | Manual review |
| UI polish | Medium | HOTL Ralph |
| Performance tuning | Medium | HOTL Ralph |
| Architecture design | Low | Manual design + Ralph implementation |
| Bug investigation | Low | Manual + targeted Ralph |

Part 10: Real-World Results

Documented Successes

  • CURSED Programming Language: 3-month continuous loop, built complete esoteric language
  • YC Hackathon: 6 repositories shipped overnight for $297 in API costs
  • $50k Contract: Delivered complete MVP with Ralph for ~$300 in compute
  • Integration Test Optimization: Reduced runtime from 4 minutes to 2 seconds

Cost Expectations

| Project Complexity | Estimated Cost | Typical Iterations |
|--------------------|----------------|--------------------|
| Simple CRUD app | $50-100 | 20-30 |
| Medium SaaS MVP | $200-500 | 50-100 |
| Complex system | $500-2000+ | 100-300 |

Part 11: Quick Reference

Checklist: Before Starting Ralph

  • Git initialized
  • PROMPT.md created with binary completion criteria
  • progress.txt created (empty)
  • Test framework configured
  • Linting configured
  • Build command working
  • --max-iterations set appropriately
  • Cost limits understood
  • Rollback strategy clear
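
The file-based items in this checklist can be verified automatically before a run starts. This is a minimal sketch covering only the git, PROMPT.md, and progress.txt items; the function name `preflight` is hypothetical, and the remaining items (test framework, linting, build) depend on your stack.

```bash
# Sketch: check the file-based checklist items in a project directory.
# Prints each missing item and returns non-zero if anything is missing.
preflight() {
    local dir="${1:-.}" missing=0
    # .git is a directory in a normal checkout and a file in a worktree
    [ -e "$dir/.git" ]         || { echo "missing: git repository"; missing=1; }
    # -s requires PROMPT.md to exist and be non-empty
    [ -s "$dir/PROMPT.md" ]    || { echo "missing: PROMPT.md (absent or empty)"; missing=1; }
    [ -f "$dir/progress.txt" ] || { echo "missing: progress.txt"; missing=1; }
    return "$missing"
}
```

A loop script could begin with `preflight . || exit 1` so that a run never starts from an incomplete setup.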

Checklist: During Ralph Run

  • Monitoring progress.txt
  • Watching git log
  • Checking test results
  • Monitoring token usage
  • Ready to /ralph-wiggum:cancel-ralph if needed

Checklist: After Ralph Run

  • Review all commits
  • Run full test suite
  • Review code quality
  • Squash/rebase commits if desired
  • Document any issues found
  • Update PRD with lessons learned

Command Reference

# Start loop
/ralph-wiggum:ralph-loop "<prompt>" --max-iterations N --completion-promise "TEXT"

# Cancel loop
/ralph-wiggum:cancel-ralph

# Help
/ralph-wiggum:help

# Raw bash alternative
while :; do cat PROMPT.md | claude --print ; done

Appendix A: Complete PROMPT.md Template

# Project: [NAME]

## Context
[2-3 sentences about what this project is]

## Tech Stack
- Runtime: Node.js 20 / Python 3.12 / Go 1.22
- Framework: Express / FastAPI / Gin
- Database: PostgreSQL 16
- ORM: Prisma / SQLAlchemy / GORM
- Testing: Jest / pytest / go test
- Linting: ESLint / ruff / golangci-lint

## Directory Structure

src/
├── controllers/   # HTTP handlers
├── services/      # Business logic
├── models/        # Data models
├── middleware/    # Auth, logging, etc.
├── utils/         # Helpers
└── types/         # Type definitions

tests/
├── unit/
├── integration/
└── e2e/


## Completion Criteria
All must be true:
1. `npm run build` exits 0
2. `npm run test` exits 0 with >80% coverage
3. `npm run lint` exits 0
4. README.md contains API documentation
5. All features in prd.json have passes=true

Output `<promise>PROJECT_COMPLETE</promise>` when ALL criteria met.

## Features

### Phase 1: Foundation (Priority 1)
- [ ] Project setup with all dependencies
- [ ] Database connection and migrations
- [ ] Basic health check endpoint
- [ ] Logging middleware
- [ ] Error handling middleware

### Phase 2: Core (Priority 2)
- [ ] User model and CRUD
- [ ] Authentication (JWT)
- [ ] Authorization middleware
- [ ] [Feature specific to project]
- [ ] [Feature specific to project]

### Phase 3: Integration (Priority 3)
- [ ] [External service]
- [ ] Email notifications
- [ ] File uploads

### Phase 4: Polish (Priority 4)
- [ ] API documentation
- [ ] Rate limiting
- [ ] Caching
- [ ] Performance optimization

## Constraints
- All async operations must have error handling
- No hardcoded credentials
- All endpoints must validate input
- All database operations must use transactions where appropriate
- Test coverage minimum 80%
- No TypeScript `any` types / Python type hints required

## Working Process
1. Read this file and prd.json
2. Select highest priority feature with passes=false
3. Write failing tests first
4. Implement minimal code to pass
5. Verify: `npm run build && npm run test && npm run lint`
6. If fail: fix and retry
7. If pass: commit with descriptive message
8. Append progress to progress.txt
9. Update prd.json feature.passes = true
10. Repeat until all features complete

## Abort Conditions
Output `<promise>ABORT_BLOCKED</promise>` if:
- Circular dependency cannot be resolved after 5 attempts
- External API required but unavailable
- Fundamental architecture change needed
- After 15 iterations on same feature without progress

Document blocking issue in progress.txt before aborting.

## Safety
- Never modify files outside project directory
- Never use sudo
- Never store real credentials
- Always use .env for configuration
- Commit frequently

Appendix B: Troubleshooting

"Claude keeps stopping early"

Cause: Completion promise detected incorrectly or unclear criteria
Fix: Make completion criteria more specific, use unique promise text

"Tests keep failing in loop"

Cause: Breaking changes accumulating
Fix: Add stronger "CI green" requirement, reduce iteration count, add checkpoint validation

"Context getting too large"

Cause: Too much history accumulating
Fix: Use phases with fresh context each, summarize progress.txt periodically

"Ralph taking wrong direction"

Cause: Ambiguous requirements
Fix: Add explicit constraints, use the "sign next to slide" pattern (add guardrails for specific failure modes)

"Loop running forever"

Cause: Impossible completion criteria
Fix: Always set --max-iterations, add abort conditions, monitor manually

"Costs spiraling"

Cause: Large context, too many iterations
Fix: Break into smaller phases, set strict iteration limits, use smaller models for simpler tasks


Framework version: 1.0
Last updated: January 2026
Based on techniques by Geoffrey Huntley, Matt Pocock, and the Claude Code community