commit 4801fb2cc202f7e4e2fe485ec7d8105ff13a1347
Author: hc <1328308360@qq.com>
Date:   Tue Mar 31 17:29:53 2026 +0800

    Initial commit: design spec and implementation plan
    
    - Design spec: docs/superpowers/specs/2026-03-31-paper-replication-agent-design.md
    - Implementation plan: docs/superpowers/plans/2026-03-31-paper-replication-agent.md
    - Existing agent: .opencode/agents/paper-image-extractor.md

diff --git a/.opencode/agents/paper-image-extractor.md b/.opencode/agents/paper-image-extractor.md
new file mode 100644
index 0000000..dc7b468
--- /dev/null
+++ b/.opencode/agents/paper-image-extractor.md
@@ -0,0 +1,18 @@
+---
+description: Extracts images from a paper's Markdown file and generates text interpretations of them to guide paper replication
+mode: subagent
+tools:
+  write: true
+  edit: true
+  bash: true
+---
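+You are an agent dedicated to "paper image recognition and understanding".
+
+Your core tasks are:
+1. Receive or locate the paper Markdown (.md) file specified by the user.
+2. Read the file and extract every image link or path it contains (e.g. experiment charts, network architecture diagrams, algorithm pseudocode, equation screenshots).
+3. Analyze these images using your visual understanding capabilities or related tools, extracting the key information and deeper meaning in each image.
+4. Convert the visual information in these images into a detailed text version. The text should be clear and professional enough to directly guide other code-generation models in replicating the paper's code.
+5. Consolidate the final results; you may output them directly to the user or save them as a dedicated document (e.g. `image_understanding.md`) for later stages.
+
+Make sure your image analysis is accurate, especially regarding model architecture and data flow, which are critical to replication.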
diff --git a/docs/superpowers/plans/2026-03-31-paper-replication-agent.md b/docs/superpowers/plans/2026-03-31-paper-replication-agent.md
new file mode 100644
index 0000000..9a07609
--- /dev/null
+++ b/docs/superpowers/plans/2026-03-31-paper-replication-agent.md
@@ -0,0 +1,2603 @@
+# Paper Replication Agent Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Build a Paper Replication Agent System that automates ML/DL paper reproduction with PyTorch code generation and TDD-driven validation.
+
+**Architecture:** Primary Agent (paper-director) orchestrates 4 subagents (paper-analyzer, paper-image-extractor, code-writer, test-runner) through file-based context handoff. Skills provide domain-specific guidance. Commands provide entry points.
+
+**Tech Stack:** OpenCode agents (Markdown), Skills (Markdown), Commands (Markdown), JSON config
+
+**Spec:** `docs/superpowers/specs/2026-03-31-paper-replication-agent-design.md`
+
+---
+
+## File Structure
+
+| File | Responsibility | Action |
+|------|----------------|--------|
+| `.opencode/agents/paper-director.md` | Primary agent - orchestrates workflow, manages checkpoints | Create |
+| `.opencode/agents/paper-analyzer.md` | Subagent - parses paper text, creates replication plan | Create |
+| `.opencode/agents/paper-image-extractor.md` | Subagent - extracts and understands paper images | Create |
+| `.opencode/agents/code-writer.md` | Subagent - generates PyTorch code with TDD | Create |
+| `.opencode/agents/test-runner.md` | Subagent - runs tests, creates replication report | Create |
+| `.opencode/skills/paper-parsing/SKILL.md` | Skill - paper analysis methodology | Create |
+| `.opencode/skills/code-generation/SKILL.md` | Skill - code generation from paper | Create |
+| `.opencode/skills/pytorch-patterns/SKILL.md` | Skill - PyTorch best practices | Create |
+| `.opencode/skills/verification/SKILL.md` | Skill - result verification methodology | Create |
+| `.opencode/skills/environment-management/SKILL.md` | Skill - Conda + uv environment setup | Create |
+| `.opencode/commands/replicate.md` | Command - /replicate entry point | Create |
+| `.opencode/commands/verify.md` | Command - /verify entry point | Create |
+| `opencode.json` | Project configuration | Create |
+| `workspace/.gitkeep` | Workspace directory placeholder | Create |
+
+---
+
+### Task 1: Create paper-director Agent (Primary)
+
+**Files:**
+- Create: `.opencode/agents/paper-director.md`
+
+- [ ] **Step 1: Create agents directory**
+
+```bash
+mkdir -p .opencode/agents
+```
+
+- [ ] **Step 2: Write paper-director.md**
+
+Create `.opencode/agents/paper-director.md`:
+
+```markdown
+---
+name: paper-director
+description: |
+  Primary agent for ML/DL paper replication. Orchestrates the complete workflow:
+  1. Creates workspace directories
+  2. Dispatches paper-image-extractor to analyze images
+  3. Dispatches paper-analyzer to parse paper and create replication plan
+  4. Presents human checkpoint for approval
+  5. Generates tests and dispatches code-writer
+  6. Dispatches test-runner for final verification
+  Use when: User wants to replicate a paper, or runs /replicate command.
+mode: primary
+model: inherit
+---
+
+# Paper Replication Director
+
+You are the orchestrator for ML/DL paper replication projects. Your role is to manage the complete workflow from paper analysis to working PyTorch code.
+
+## Core Responsibilities
+
+1. **Workspace Management**: Create and organize project directories
+2. **Workflow Orchestration**: Dispatch subagents in correct sequence
+3. **Quality Control**: Ensure outputs meet standards before proceeding
+4. **Human Checkpoint**: Present analysis results for user approval
+5. **Error Recovery**: Handle failures gracefully
+
+## Workflow
+
+### Phase 1: Paper Analysis
+
+When given a paper (Markdown file or text):
+
+1. **Create workspace directory**:
+   ```
+   workspace/{paper_name}/
+   ├── analysis/
+   ├── src/
+   │   ├── models/
+   │   ├── training/
+   │   └── utils/
+   ├── tests/
+   ├── docs/
+   └── reports/
+   ```
+
+2. **Dispatch @paper-image-extractor**:
+   - Input: Paper file path
+   - Output: `analysis/image_understanding.md`
+   - Wait for completion before proceeding
+
+3. **Dispatch @paper-analyzer**:
+   - Input: Paper file + `analysis/image_understanding.md`
+   - Output: `analysis/paper_structure.md` + `analysis/replication_plan.md`
+   - Wait for completion before proceeding
+
+4. **Human Checkpoint** - Present to user:
+   ```
+   ## Paper Analysis Complete
+   
+   ### Basic Information
+   - Title: {title}
+   - Core contribution: {summary}
+   
+   ### Model Architecture
+   {architecture_description}
+   
+   ### Replication Targets
+   {list_of_figures_to_replicate}
+   
+   ### Implementation Plan
+   {planned_modules}
+   
+   ### Risks and Limitations
+   {identified_risks}
+   
+   ---
+   Please review and confirm to proceed, or provide corrections.
+   ```
+
+### Phase 2: Code Generation (TDD Mode)
+
+After user approval:
+
+1. **Load Skills**:
+   - Load `code-generation` skill
+   - Load `pytorch-patterns` skill
+   - Load `environment-management` skill
+
+2. **Generate Test Cases**:
+   - Create test files based on replication plan
+   - Tests should verify model architecture, forward pass, loss computation
+
+3. **Dispatch @code-writer** iteratively:
+   - For each module in replication plan:
+     - Provide: Analysis docs + relevant test files
+     - Expect: Implementation that passes tests
+   - Iterate until all tests pass (max 3 retries per module)
+
+4. **Generate Documentation**:
+   - Create `docs/README.md` with usage instructions
+
+### Phase 3: Verification
+
+1. **Dispatch @test-runner**:
+   - Run complete test suite
+   - Compare with paper's expected results
+   - Generate `reports/replication_report.md`
+
+2. **Present Final Report** to user
+
+## Error Handling
+
+| Error | Action |
+|-------|--------|
+| Paper file not found | Ask user to provide correct path |
+| Image extraction fails | Mark images as "unable to parse", continue |
+| Test fails after 3 retries | Mark module as "needs manual intervention", continue with others |
+| Missing dependencies | Suggest installation commands |
+
+## Output Format
+
+Always structure your responses clearly:
+- Use headers for phases
+- Show progress indicators
+- Highlight decisions requiring user input
+- Summarize completed work before asking for confirmation
+```
+
+- [ ] **Step 3: Verify file creation**
+
+```bash
+cat .opencode/agents/paper-director.md
+```
+
+Expected: File contents match the markdown above.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add .opencode/agents/paper-director.md
+git commit -m "feat(agents): add paper-director primary agent
+
+Orchestrates ML/DL paper replication workflow with human checkpoint."
+```
+
+---
+
+### Task 2: Create paper-analyzer Agent (Subagent)
+
+**Files:**
+- Create: `.opencode/agents/paper-analyzer.md`
+
+- [ ] **Step 1: Write paper-analyzer.md**
+
+Create `.opencode/agents/paper-analyzer.md`:
+
+```markdown
+---
+name: paper-analyzer
+description: |
+  Subagent that parses ML/DL paper text content and creates structured analysis.
+  Produces paper_structure.md (what the paper contains) and replication_plan.md (what to implement).
+  Requires image_understanding.md as input for complete analysis.
+mode: subagent
+model: inherit
+permission:
+  edit: allow
+  bash: deny
+---
+
+# Paper Analyzer
+
+You analyze ML/DL papers and produce structured documentation for replication.
+
+## Required Inputs
+
+1. **Paper content**: Markdown file or plain text
+2. **Image understanding**: `image_understanding.md` from paper-image-extractor
+
+## Required Outputs
+
+### 1. paper_structure.md
+
+```markdown
+# Paper Structure Analysis
+
+## Basic Information
+- **Title**: 
+- **Authors**: 
+- **Year**: 
+- **Venue**: 
+
+## Abstract Summary
+{2-3 sentence summary of core contribution}
+
+## Problem Statement
+{What problem does this paper solve?}
+
+## Key Contributions
+1. {contribution 1}
+2. {contribution 2}
+...
+
+## Method Overview
+
+### Architecture
+{Text description of model architecture}
+{Reference to architecture diagrams from image_understanding.md}
+
+### Key Components
+| Component | Description | Implementation Priority |
+|-----------|-------------|------------------------|
+| {name} | {what it does} | {high/medium/low} |
+
+### Mathematical Formulation
+{Key equations in LaTeX}
+
+$$
+L = L_{task} + \lambda L_{reg}
+$$
+
+### Training Details
+- **Optimizer**: 
+- **Learning rate**: 
+- **Batch size**: 
+- **Epochs**: 
+- **Hardware**: 
+
+## Experiments
+
+### Datasets
+| Dataset | Size | Purpose |
+|---------|------|---------|
+| {name} | {size} | {train/eval/test} |
+
+### Metrics
+- {metric 1}: {description}
+- {metric 2}: {description}
+
+### Key Results
+{Reference to result figures from image_understanding.md}
+{Numerical results to reproduce}
+
+## Appendix Notes
+{Any supplementary material findings}
+```
+
+### 2. replication_plan.md
+
+```markdown
+# Replication Plan
+
+## Scope
+{What will be replicated vs. what is out of scope}
+
+## Implementation Order
+
+### Module 1: {name}
+- **File**: `src/models/{filename}.py`
+- **Dependencies**: None
+- **Test file**: `tests/test_{filename}.py`
+- **Acceptance criteria**:
+  - [ ] Forward pass produces correct output shape
+  - [ ] Gradient flow verified
+  - [ ] {specific behavior from paper}
+
+### Module 2: {name}
+...
+
+## Replication Targets
+
+### Figure X: {description}
+- **Type**: {architecture diagram / training curve / comparison table}
+- **Data source**: {what computation produces this}
+- **Priority**: {high/medium/low}
+- **Expected values**: {numerical ranges if applicable}
+
+## Environment Requirements
+- Python >= 3.10
+- PyTorch >= 2.0
+- {other dependencies}
+
+## Estimated Effort
+- Core model: {X hours}
+- Training pipeline: {X hours}
+- Evaluation: {X hours}
+
+## Known Challenges
+1. {challenge}: {mitigation strategy}
+```
+
+## Analysis Methodology
+
+When analyzing a paper:
+
+1. **First pass**: Extract basic info (title, authors, abstract)
+2. **Method pass**: Understand architecture and algorithms
+3. **Experiment pass**: Identify what needs to be reproduced
+4. **Integration pass**: Combine with image_understanding.md
+5. **Planning pass**: Create actionable replication plan
+
+## Quality Checklist
+
+Before completing:
+- [ ] All sections of paper_structure.md filled
+- [ ] Image descriptions integrated from image_understanding.md
+- [ ] Replication plan has clear module boundaries
+- [ ] Each module has testable acceptance criteria
+- [ ] Dependencies between modules identified
+- [ ] Numerical targets extracted where available
+```
+
+- [ ] **Step 2: Verify file creation**
+
+```bash
+cat .opencode/agents/paper-analyzer.md
+```
+
+Expected: File contents match the markdown above.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add .opencode/agents/paper-analyzer.md
+git commit -m "feat(agents): add paper-analyzer subagent
+
+Parses paper text and creates replication plan with testable criteria."
+```
+
+---
+
+### Task 3: Create paper-image-extractor Agent (Subagent)
+
+**Files:**
+- Create: `.opencode/agents/paper-image-extractor.md`
+
+- [ ] **Step 1: Write paper-image-extractor.md**
+
+Create `.opencode/agents/paper-image-extractor.md`:
+
+```markdown
+---
+name: paper-image-extractor
+description: |
+  Subagent that extracts and understands images from ML/DL papers.
+  Analyzes architecture diagrams, experiment plots, algorithm pseudocode, and equations.
+  Output is used by paper-analyzer to create complete replication plan.
+mode: subagent
+model: inherit
+permission:
+  edit: allow
+  bash:
+    "*": deny
+    "ls *": allow
+---
+
+# Paper Image Extractor
+
+You extract and analyze images from ML/DL papers, producing detailed text descriptions that enable code replication.
+
+## Required Input
+
+- Paper file path (Markdown with image references)
+
+## Required Output
+
+`image_understanding.md` in the analysis directory.
+
+## Output Format
+
+```markdown
+# Image Understanding
+
+## Summary
+- Total images found: {N}
+- Architecture diagrams: {N}
+- Experiment figures: {N}
+- Algorithm/pseudocode: {N}
+- Equations/tables: {N}
+
+---
+
+## Image 1: {caption or identifier}
+
+**Type**: Architecture Diagram | Experiment Plot | Algorithm | Equation | Table | Other
+
+**Location**: {file path or URL}
+
+**Description**:
+{Detailed text description of what the image shows}
+
+### For Architecture Diagrams:
+
+**Components**:
+| Layer/Block | Input Shape | Output Shape | Parameters |
+|-------------|-------------|--------------|------------|
+| {name} | {shape} | {shape} | {count if shown} |
+
+**Data Flow**:
+1. Input → {first operation}
+2. {intermediate steps}
+3. → Output
+
+**Key Details**:
+- {notable architectural choices}
+- {skip connections, attention mechanisms, etc.}
+
+### For Experiment Plots:
+
+**Axes**:
+- X-axis: {label} (range: {min}-{max})
+- Y-axis: {label} (range: {min}-{max})
+
+**Data Series**:
+| Series | Description | Key Points |
+|--------|-------------|------------|
+| {name/color} | {what it represents} | {peak value, convergence point, etc.} |
+
+**Numerical Extraction**:
+- At x={value}: y≈{value}
+- Final value: {value}
+- Best result: {value}
+
+**Trends**:
+- {observed patterns}
+
+### For Algorithm/Pseudocode:
+
+**Algorithm Name**: {name}
+
+**Inputs**: {list}
+**Outputs**: {list}
+
+**Steps**:
+1. {step 1}
+2. {step 2}
+...
+
+**Python Translation Hint**:
+```python
+# Suggested structure
+def algorithm_name(inputs):
+    # step 1
+    # step 2
+    return outputs
+```
+
+### For Equations:
+
+**Equation**:
+$$
+{LaTeX representation}
+$$
+
+**Variables**:
+- {symbol}: {meaning}
+
+**Implementation Notes**:
+- {how to compute this in PyTorch}
+
+---
+
+## Image 2: ...
+```
+
+## Analysis Guidelines
+
+### Architecture Diagrams
+- Identify all layers/blocks and their connections
+- Note input/output shapes when visible
+- Capture skip connections, residual paths
+- Identify attention mechanisms, normalization layers
+- Note any dimension annotations
+
+### Experiment Plots
+- Extract actual numerical values where possible
+- Identify which curve corresponds to the paper's method
+- Note baseline comparisons
+- Capture convergence behavior
+- Identify error bars or confidence intervals
+
+### Algorithm Pseudocode
+- Convert to structured steps
+- Identify loops, conditions
+- Note any hyperparameters mentioned
+- Suggest PyTorch equivalents
+
+### Equations
+- Transcribe to LaTeX
+- Define all variables
+- Note how to implement in code
+
+## Replication Priority
+
+Mark each image with replication priority:
+- **HIGH**: Core architecture, main results to reproduce
+- **MEDIUM**: Training curves, ablation studies
+- **LOW**: Conceptual diagrams, background figures
+
+## Quality Checklist
+
+Before completing:
+- [ ] All images in paper cataloged
+- [ ] Architecture diagrams have layer-by-layer breakdown
+- [ ] Experiment figures have numerical values extracted
+- [ ] Equations transcribed to LaTeX
+- [ ] Replication priorities assigned
+- [ ] Output enables paper-analyzer to create complete plan
+```
+
+- [ ] **Step 2: Verify file creation**
+
+```bash
+cat .opencode/agents/paper-image-extractor.md
+```
+
+Expected: File contents match the markdown above.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add .opencode/agents/paper-image-extractor.md
+git commit -m "feat(agents): add paper-image-extractor subagent
+
+Analyzes paper images to extract architecture details and numerical results."
+```
+
+---
+
+### Task 4: Create code-writer Agent (Subagent)
+
+**Files:**
+- Create: `.opencode/agents/code-writer.md`
+
+- [ ] **Step 1: Write code-writer.md**
+
+Create `.opencode/agents/code-writer.md`:
+
+```markdown
+---
+name: code-writer
+description: |
+  Subagent that generates PyTorch code based on paper analysis.
+  Works in TDD mode: receives test files, writes code to pass tests.
+  Also manages project environment using Conda + uv.
+mode: subagent
+model: inherit
+permission:
+  edit: allow
+  bash:
+    "*": allow
+---
+
+# Code Writer
+
+You generate PyTorch code to replicate ML/DL papers, working in strict TDD mode.
+
+## Required Inputs
+
+1. `paper_structure.md` - Paper analysis
+2. `image_understanding.md` - Image analysis
+3. `replication_plan.md` - Implementation plan
+4. Test files for the module to implement
+
+## Working Mode: TDD
+
+**Iron Rule**: Write code ONLY to make failing tests pass.
+
+1. Receive test file
+2. Run test to verify it fails
+3. Write minimal code to pass
+4. Run test to verify it passes
+5. Refactor if needed (keeping tests green)
+
+## Environment Setup
+
+Before writing any code, ensure environment is ready:
+
+### Step 1: Check/Create Conda Base
+
+```bash
+# Check if ai_base exists
+conda env list | grep ai_base
+
+# If not exists, create it
+conda create -n ai_base python=3.10 -y
+```
+
+### Step 2: Create Project Environment
+
+```bash
+cd workspace/{paper_name}
+
+# Create uv virtual environment using Conda's Python
+uv venv --python $(conda run -n ai_base which python)
+
+# On Windows:
+# uv venv --python $(conda run -n ai_base python -c "import sys; print(sys.executable)")
+```
+
+### Step 3: Create pyproject.toml
+
+```toml
+[project]
+name = "{paper_name}"
+version = "0.1.0"
+requires-python = ">=3.10"
+dependencies = [
+    "torch>=2.0.0",
+    "numpy>=1.24.0",
+    "matplotlib>=3.7.0",
+    "tqdm>=4.65.0",
+]
+
+[project.optional-dependencies]
+dev = [
+    "pytest>=7.0.0",
+    "pytest-cov>=4.0.0",
+]
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+```
+
+### Step 4: Install Dependencies
+
+```bash
+# Activate and install
+source .venv/bin/activate  # Linux/Mac
+# .venv\Scripts\activate   # Windows
+
+uv pip install -e ".[dev]"
+```
+
+## Code Generation Guidelines
+
+### Model Architecture
+
+```python
+"""
+{module_name}.py
+
+Implements {component} from "{paper_title}"
+Reference: Section {X}, Figure {Y}
+"""
+
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+from typing import Optional, Tuple
+
+
+class {ComponentName}(nn.Module):
+    """
+    {Brief description from paper}
+    
+    Args:
+        {param}: {description}
+    
+    Paper reference:
+        - Architecture: Figure {X}
+        - Equation: ({Y})
+    """
+    
+    def __init__(self, {params}):
+        super().__init__()
+        # Initialize layers
+        
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        """
+        Forward pass.
+        
+        Args:
+            x: Input tensor of shape {expected_shape}
+            
+        Returns:
+            Output tensor of shape {output_shape}
+        """
+        # Implementation
+        return output
+```
+
+### Training Scripts
+
+```python
+"""
+train.py
+
+Training script for {paper_title} replication.
+"""
+
+import torch
+from torch.utils.data import DataLoader
+from tqdm import tqdm
+
+def train_epoch(model, dataloader, optimizer, criterion, device):
+    """Single training epoch."""
+    model.train()
+    total_loss = 0.0
+    
+    for batch in tqdm(dataloader, desc="Training"):
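+        # Training step: forward pass, loss, backward pass, optimizer step
+        pass  # placeholder; accumulate the batch loss, e.g. total_loss += loss.item()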
+    
+    return total_loss / len(dataloader)
+
+
+def main():
+    # Configuration from paper
+    config = {
+        "lr": 1e-4,  # Section X
+        "batch_size": 32,  # Section X
+        "epochs": 100,
+    }
+    
+    # Setup
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+    
+    # Model, optimizer, criterion
+    # ...
+    
+    # Training loop
+    for epoch in range(config["epochs"]):
+        loss = train_epoch(model, train_loader, optimizer, criterion, device)
+        print(f"Epoch {epoch+1}: Loss = {loss:.4f}")
+
+
+if __name__ == "__main__":
+    main()
+```
+
+## File Organization
+
+```
+src/
+├── __init__.py
+├── models/
+│   ├── __init__.py
+│   ├── {main_model}.py
+│   └── {component}.py
+├── training/
+│   ├── __init__.py
+│   ├── train.py
+│   ├── losses.py
+│   └── optimizers.py
+└── utils/
+    ├── __init__.py
+    ├── data.py
+    └── metrics.py
+```
+
+## Quality Checklist
+
+Before completing each module:
+- [ ] All tests pass
+- [ ] Type hints on all public functions
+- [ ] Docstrings with paper references
+- [ ] Input/output shapes documented
+- [ ] No hardcoded magic numbers (use config)
+- [ ] Device-agnostic (CPU/GPU)
+```
+
+- [ ] **Step 2: Verify file creation**
+
+```bash
+cat .opencode/agents/code-writer.md
+```
+
+Expected: File contents match the markdown above.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add .opencode/agents/code-writer.md
+git commit -m "feat(agents): add code-writer subagent
+
+Generates PyTorch code in TDD mode with environment management."
+```
+
+---
+
+### Task 5: Create test-runner Agent (Subagent)
+
+**Files:**
+- Create: `.opencode/agents/test-runner.md`
+
+- [ ] **Step 1: Write test-runner.md**
+
+Create `.opencode/agents/test-runner.md`:
+
+```markdown
+---
+name: test-runner
+description: |
+  Subagent that runs tests, verifies code correctness, and generates replication reports.
+  Compares results with paper's expected values and documents any differences.
+mode: subagent
+model: inherit
+permission:
+  edit: allow
+  bash:
+    "*": allow
+---
+
+# Test Runner
+
+You run tests, verify replication correctness, and generate comprehensive reports.
+
+## Required Inputs
+
+1. Generated code in `src/`
+2. Test files in `tests/`
+3. `replication_plan.md` with expected results
+
+## Required Outputs
+
+1. Test execution results
+2. `reports/replication_report.md`
+
+## Workflow
+
+### Step 1: Run Test Suite
+
+```bash
+cd workspace/{paper_name}
+source .venv/bin/activate
+
+# Run all tests with coverage
+pytest tests/ -v --cov=src --cov-report=term-missing
+```
+
+### Step 2: Verify Replication Targets
+
+For each target in replication_plan.md:
+
+1. Run the relevant computation
+2. Compare with expected values
+3. Calculate deviation
+
+### Step 3: Generate Report
+
+## Report Format
+
+```markdown
+# Replication Report: {Paper Title}
+
+**Date**: {date}
+**Status**: {Complete | Partial | Failed}
+
+## Summary
+
+| Metric | Status |
+|--------|--------|
+| Tests Passing | {X}/{Y} |
+| Code Coverage | {X}% |
+| Replication Accuracy | {qualitative} |
+
+## Test Results
+
+### Unit Tests
+
+| Test | Status | Time |
+|------|--------|------|
+| test_model_forward | PASS | 0.1s |
+| test_loss_computation | PASS | 0.05s |
+| ... | ... | ... |
+
+### Failed Tests (if any)
+
+#### {test_name}
+- **Error**: {error message}
+- **Expected**: {expected}
+- **Actual**: {actual}
+- **Likely cause**: {analysis}
+
+## Replication Targets
+
+### Figure X: {description}
+
+**Status**: Replicated | Partially Replicated | Not Replicated
+
+**Paper Values**:
+| Metric | Paper | Ours | Deviation |
+|--------|-------|------|-----------|
+| {metric} | {value} | {value} | {%} |
+
+**Analysis**:
+{explanation of any differences}
+
+### Table Y: {description}
+
+...
+
+## Code Quality
+
+- **Type Safety**: {assessment}
+- **Documentation**: {assessment}
+- **Test Coverage**: {percentage}
+
+## Reproducibility Checklist
+
+- [ ] Environment setup documented
+- [ ] Random seeds set
+- [ ] Hyperparameters match paper
+- [ ] Data preprocessing matches paper
+- [ ] Evaluation metrics match paper
+
+## Known Differences from Paper
+
+1. **{difference}**: {explanation and justification}
+
+## Recommendations
+
+1. {recommendation for improvement}
+
+## Appendix: Full Test Output
+
+```
+{pytest output}
+```
+```
+
+## Deviation Thresholds
+
+| Deviation | Classification |
+|-----------|----------------|
+| < 1% | Excellent match |
+| 1% to < 5% | Acceptable |
+| 5% to 10% | Needs investigation |
+| > 10% | Significant difference |
+
+## Analysis Guidelines
+
+When results differ from paper:
+
+1. Check implementation against paper equations
+2. Verify hyperparameters
+3. Check data preprocessing
+4. Consider numerical precision differences
+5. Note if paper has known errata
+
+## Quality Checklist
+
+Before completing:
+- [ ] All tests executed
+- [ ] Coverage report generated
+- [ ] Each replication target evaluated
+- [ ] Deviations analyzed and explained
+- [ ] Recommendations provided
+- [ ] Report is self-contained
+```
+
+- [ ] **Step 2: Verify file creation**
+
+```bash
+cat .opencode/agents/test-runner.md
+```
+
+Expected: File contents match the markdown above.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add .opencode/agents/test-runner.md
+git commit -m "feat(agents): add test-runner subagent
+
+Runs tests and generates comprehensive replication reports."
+```
+
+---
+
+### Task 6: Create paper-parsing Skill
+
+**Files:**
+- Create: `.opencode/skills/paper-parsing/SKILL.md`
+
+- [ ] **Step 1: Create skills directory**
+
+```bash
+mkdir -p .opencode/skills/paper-parsing
+```
+
+- [ ] **Step 2: Write SKILL.md**
+
+Create `.opencode/skills/paper-parsing/SKILL.md`:
+
+```markdown
+---
+name: paper-parsing
+description: Use when analyzing ML/DL papers to ensure comprehensive extraction of all relevant information
+---
+
+# Paper Parsing Methodology
+
+## Overview
+
+Systematic approach to parsing ML/DL papers for replication. Emphasizes **completeness** and **openness** to avoid missing critical details.
+
+**Announce at start:** "I'm using the paper-parsing skill to ensure comprehensive paper analysis."
+
+## Core Philosophy
+
+1. **Completeness over speed**: Better to extract too much than miss something
+2. **Open-ended discovery**: Papers contain unique insights; don't force into templates
+3. **Cross-reference**: Information appears in multiple places; cross-check
+4. **Explicit uncertainty**: Mark unclear items rather than guessing
+
+## Paper Sections Checklist
+
+### Abstract
+- [ ] Core contribution identified
+- [ ] Key results/numbers extracted
+- [ ] Problem domain understood
+
+### Introduction
+- [ ] Problem motivation clear
+- [ ] Gap in existing work identified
+- [ ] Proposed solution summarized
+- [ ] Claimed contributions listed
+
+### Related Work
+- [ ] Key prior methods identified
+- [ ] Differences from this work noted
+- [ ] Potential baselines for comparison
+
+### Method / Approach
+- [ ] Architecture fully described
+- [ ] All components identified
+- [ ] Mathematical formulation complete
+- [ ] Training procedure detailed
+- [ ] Loss functions specified
+- [ ] Hyperparameters listed
+
+### Experiments
+- [ ] Datasets listed with sizes
+- [ ] Evaluation metrics defined
+- [ ] Baseline comparisons noted
+- [ ] Ablation studies cataloged
+- [ ] Key numerical results extracted
+
+### Appendix / Supplementary
+- [ ] Additional implementation details
+- [ ] Extended results
+- [ ] Proofs or derivations
+- [ ] Code references
+
+## Information Extraction Patterns
+
+### Architecture Details
+
+Look for:
+- Layer types and configurations
+- Activation functions
+- Normalization methods
+- Attention mechanisms
+- Skip connections
+- Input/output dimensions
+
+Common locations:
+- Method section figures
+- Architecture diagrams
+- Table of hyperparameters
+- Appendix implementation details
+
+### Training Configuration
+
+| Parameter | Typical Locations |
+|-----------|-------------------|
+| Learning rate | Experiments, Appendix |
+| Batch size | Experiments, Appendix |
+| Optimizer | Method, Appendix |
+| Epochs | Experiments |
+| Hardware | Experiments, Appendix |
+| Training time | Experiments |
+
+### Numerical Results
+
+Extract from:
+- Main results tables
+- Comparison figures
+- Ablation tables
+- Training curves (approximate values)
+
+Format as:
+| Metric | Dataset | Value | Conditions |
+|--------|---------|-------|------------|
+| Accuracy | CIFAR-10 | 95.2% | ResNet-50 backbone |
+
+## Common Omissions to Watch For
+
+1. **Initialization**: Often in appendix or not mentioned
+2. **Data augmentation**: May be standard but unspecified
+3. **Early stopping criteria**: Often implied
+4. **Evaluation protocol**: Train/val/test split details
+5. **Random seeds**: Reproducibility details
+6. **Software versions**: PyTorch, CUDA versions
+
+## Quality Verification
+
+Before completing analysis:
+
+1. **Coverage check**: Every section reviewed?
+2. **Consistency check**: Numbers match across sections?
+3. **Completeness check**: Could someone implement from this?
+4. **Ambiguity check**: Unclear items marked?
+
+## Output Quality Markers
+
+Good analysis:
+- Specific numbers, not "good performance"
+- Exact layer configs, not "standard ResNet"
+- Explicit uncertainty markers
+- Cross-references between sections
+
+Poor analysis:
+- Vague descriptions
+- Missing hyperparameters
+- No numerical targets
+- Assumptions without noting them
+
+## Red Flags
+
+If you notice:
+- "Implementation details in code" → Check GitHub link
+- "Standard settings" → Look up the standard
+- "Following [citation]" → May need to read that paper
+- Inconsistent numbers → Note the discrepancy
+```
+
+- [ ] **Step 3: Verify file creation**
+
+```bash
+cat .opencode/skills/paper-parsing/SKILL.md
+```
+
+Expected: File contents match the markdown above.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add .opencode/skills/paper-parsing/SKILL.md
+git commit -m "feat(skills): add paper-parsing skill
+
+Comprehensive methodology for ML/DL paper analysis."
+```
+
+---
+
+### Task 7: Create code-generation Skill
+
+**Files:**
+- Create: `.opencode/skills/code-generation/SKILL.md`
+
+- [ ] **Step 1: Create skill directory**
+
+```bash
+mkdir -p .opencode/skills/code-generation
+```
+
+- [ ] **Step 2: Write SKILL.md**
+
+Create `.opencode/skills/code-generation/SKILL.md`:
+
+```markdown
+---
+name: code-generation
+description: Use when generating PyTorch code from paper analysis to ensure correct mapping from paper to code
+---
+
+# Code Generation from Papers
+
+## Overview
+
+Guidelines for translating paper descriptions into working PyTorch code.
+
+**Announce at start:** "I'm using the code-generation skill to ensure accurate paper-to-code translation."
+
+## Core Principles
+
+1. **Traceability**: Every code block should reference paper section/equation
+2. **Testability**: Write code that can be unit tested
+3. **Readability**: Prefer clarity over cleverness
+4. **Modularity**: One component per file
+
+## Paper-to-Code Mapping
+
+### Architecture Diagrams → nn.Module
+
+| Diagram Element | PyTorch Equivalent |
+|-----------------|-------------------|
+| Box/Block | nn.Module subclass |
+| Arrow | forward() call chain |
+| Split | Multiple outputs / tuple |
+| Merge | torch.cat / torch.add |
+| Skip connection | Residual addition |
+
+### Equations → Tensor Operations
+
+| Notation | PyTorch |
+|----------|---------|
+| $Wx + b$ | `nn.Linear(in, out)` |
+| $\sigma(x)$ | `torch.sigmoid(x)` or `nn.Sigmoid()` |
+| $\text{softmax}(x)$ | `F.softmax(x, dim=-1)` |
+| $\|x\|_2$ | `torch.linalg.vector_norm(x, ord=2)` |
+| $x \odot y$ | `x * y` (element-wise) |
+| $x^T y$ | `torch.matmul(x.T, y)` or `x.T @ y` |
+| $\sum_i$ | `torch.sum(x, dim=d)` (`d` is the axis indexed by $i$) |
+| $\mathbb{E}[x]$ | `torch.mean(x)` |
+
+### Loss Functions
+
+| Paper Description | PyTorch |
+|-------------------|---------|
+| Cross-entropy | `nn.CrossEntropyLoss()` |
+| MSE / L2 | `nn.MSELoss()` |
+| L1 | `nn.L1Loss()` |
+| BCE | `nn.BCEWithLogitsLoss()` (expects raw logits) |
+| KL divergence | `nn.KLDivLoss(reduction="batchmean")` (input must be log-probabilities) |
+| Custom | Subclass or functional |
+
+## Code Structure Template
+
+```python
+"""
+{component_name}.py
+
+Implements {what} from "{paper_title}" ({year})
+
+Paper Reference:
+- Section: {section_number}
+- Equation: ({equation_number})
+- Figure: {figure_number}
+
+Author: Auto-generated for paper replication
+"""
+
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+from typing import Optional, Tuple, List
+
+
+class {ComponentName}(nn.Module):
+    """
+    {One-line description}
+    
+    From paper: "{exact quote or paraphrase}"
+    
+    Args:
+        {param1}: {description} (paper: {where specified})
+        {param2}: {description}
+    
+    Shape:
+        - Input: {shape description}
+        - Output: {shape description}
+    
+    Example:
+        >>> layer = {ComponentName}(dim=512)
+        >>> x = torch.randn(32, 100, 512)
+        >>> out = layer(x)
+        >>> out.shape
+        torch.Size([32, 100, 512])
+    """
+    
+    def __init__(
+        self,
+        {param1}: {type},
+        {param2}: {type} = {default},
+    ):
+        super().__init__()
+        
+        # Paper Section X.Y: "{description}"
+        self.layer1 = nn.Linear(...)
+        
+        # Equation (N): ...
+        self.layer2 = nn.LayerNorm(...)
+        
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        """
+        Forward pass implementing Equation (N).
+        
+        Args:
+            x: Input tensor of shape (batch, seq, dim)
+            
+        Returns:
+            Output tensor of shape (batch, seq, dim)
+        """
+        # Step 1: ... (Eq. N, first term)
+        h = self.layer1(x)
+        
+        # Step 2: ... (Eq. N, second term)
+        out = self.layer2(h)
+        
+        return out
+```
+
+## Common Patterns
+
+### Residual Connection
+
+```python
+# Paper: "We add a residual connection"
+out = self.sublayer(x) + x
+```
+
+### Layer Normalization
+
+```python
+# Paper: "Pre-LN Transformer" (normalize inside the residual branch)
+x = x + self.attention(self.norm(x))
+
+# Paper: "Post-LN Transformer" (normalize after the residual addition)
+x = self.norm(x + self.attention(x))
+```
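+
+A minimal sketch of the two placements as complete modules, assuming a single `nn.Linear` as a stand-in for the attention or feed-forward sublayer. The class names `PreLNBlock` and `PostLNBlock` are illustrative and do not come from any paper:
+
+```python
+import torch
+import torch.nn as nn
+
+
+class PreLNBlock(nn.Module):
+    """Residual block that normalizes the sublayer input (Pre-LN)."""
+
+    def __init__(self, dim: int):
+        super().__init__()
+        self.norm = nn.LayerNorm(dim)
+        self.sublayer = nn.Linear(dim, dim)  # stand-in for attention or an MLP
+
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        # The residual path carries x unchanged; only the branch is normalized
+        return x + self.sublayer(self.norm(x))
+
+
+class PostLNBlock(nn.Module):
+    """Residual block that normalizes after the residual addition (Post-LN)."""
+
+    def __init__(self, dim: int):
+        super().__init__()
+        self.norm = nn.LayerNorm(dim)
+        self.sublayer = nn.Linear(dim, dim)  # stand-in for attention or an MLP
+
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        return self.norm(x + self.sublayer(x))
+
+
+x = torch.randn(2, 5, 16)
+pre_out = PreLNBlock(16)(x)
+post_out = PostLNBlock(16)(x)
+print(pre_out.shape, post_out.shape)
+```
+
+Both blocks preserve the input shape. Which placement a paper uses is often only visible in its architecture figure, so it is worth recording the source in a code comment.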
+
+### Multi-Head Attention
+
+```python
+# Paper: "Standard multi-head attention with h heads"
+self.attention = nn.MultiheadAttention(
+    embed_dim=d_model,
+    num_heads=h,
+    dropout=dropout,
+    batch_first=True,
+)
+```
+
+### Custom Activation
+
+```python
+# Paper: "We use GELU activation"
+x = F.gelu(x)
+
+# Paper: "We use Swish/SiLU activation"
+x = F.silu(x)
+```
+
+## Handling Ambiguity
+
+When paper is unclear:
+
+1. **Check code repository** if available
+2. **Follow common practice** for the architecture type
+3. **Document assumption** in code comment
+4. **Add TODO** for verification
+
+```python
+# TODO: Paper unclear on initialization. Using PyTorch default.
+# See: https://github.com/paper/repo for reference implementation
+self.linear = nn.Linear(in_dim, out_dim)
+```
+
+## Verification Checklist
+
+Before completing a module:
+
+- [ ] All equations implemented
+- [ ] Shapes documented and verified
+- [ ] Paper references in comments
+- [ ] Type hints complete
+- [ ] Example in docstring works
+- [ ] No hardcoded dimensions (use params)
+- [ ] Gradient flow verified (no in-place ops breaking autograd)
+```
+
+- [ ] **Step 3: Verify file creation**
+
+```bash
+cat .opencode/skills/code-generation/SKILL.md
+```
+
+Expected: File contents match the markdown above.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add .opencode/skills/code-generation/SKILL.md
+git commit -m "feat(skills): add code-generation skill
+
+Paper-to-PyTorch code translation guidelines."
+```
+
+---
+
+### Task 8: Create pytorch-patterns Skill
+
+**Files:**
+- Create: `.opencode/skills/pytorch-patterns/SKILL.md`
+
+- [ ] **Step 1: Create skill directory**
+
+```bash
+mkdir -p .opencode/skills/pytorch-patterns
+```
+
+- [ ] **Step 2: Write SKILL.md**
+
+Create `.opencode/skills/pytorch-patterns/SKILL.md`:
+
+```markdown
+---
+name: pytorch-patterns
+description: Use when writing PyTorch code to follow best practices and common patterns
+---
+
+# PyTorch Best Practices
+
+## Overview
+
+Established patterns for writing clean, efficient, and maintainable PyTorch code.
+
+**Announce at start:** "I'm using the pytorch-patterns skill for best practice code."
+
+## Model Definition
+
+### Basic Module
+
+```python
+import torch
+import torch.nn as nn
+from typing import Optional
+
+
+class MyModel(nn.Module):
+    def __init__(self, config: dict):
+        super().__init__()
+        self.config = config
+        
+        # Define layers
+        self.encoder = nn.Linear(config["input_dim"], config["hidden_dim"])
+        self.decoder = nn.Linear(config["hidden_dim"], config["output_dim"])
+        
+        # Initialize weights
+        self._init_weights()
+    
+    def _init_weights(self):
+        """Initialize weights following paper's specification."""
+        for module in self.modules():
+            if isinstance(module, nn.Linear):
+                nn.init.xavier_uniform_(module.weight)
+                if module.bias is not None:
+                    nn.init.zeros_(module.bias)
+    
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        h = self.encoder(x)
+        h = torch.relu(h)
+        out = self.decoder(h)
+        return out
+```
+
+### Model with Multiple Outputs
+
+```python
+from typing import Tuple, NamedTuple
+
+
+class ModelOutput(NamedTuple):
+    logits: torch.Tensor
+    hidden_states: torch.Tensor
+    attention_weights: Optional[torch.Tensor] = None
+
+
+class MultiOutputModel(nn.Module):
+    def forward(self, x: torch.Tensor) -> ModelOutput:
+        # ... computation ...
+        return ModelOutput(
+            logits=logits,
+            hidden_states=hidden,
+            attention_weights=attn if self.return_attention else None,
+        )
+```
+
+## Device Management
+
+### Automatic Device Handling
+
+```python
+class DeviceAwareModel(nn.Module):
+    @property
+    def device(self) -> torch.device:
+        """Get model's device from first parameter."""
+        return next(self.parameters()).device
+    
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        # Input automatically on correct device if caller handles it
+        # For internal tensors:
+        mask = torch.ones(x.size(0), device=self.device)
+        return x * mask
+```
+
+### Training Script Device Setup
+
+```python
+def get_device() -> torch.device:
+    """Get best available device."""
+    if torch.cuda.is_available():
+        return torch.device("cuda")
+    elif torch.backends.mps.is_available():
+        return torch.device("mps")
+    return torch.device("cpu")
+
+
+device = get_device()
+model = MyModel(config).to(device)
+
+# The DataLoader yields CPU batches; move each one to the model's device
+for batch in dataloader:
+    inputs = batch["inputs"].to(device)
+    targets = batch["targets"].to(device)
+```
+
+## Training Loop
+
+### Standard Pattern
+
+```python
+from torch.utils.data import DataLoader
+from tqdm import tqdm
+
+
+def train_epoch(
+    model: nn.Module,
+    dataloader: DataLoader,
+    optimizer: torch.optim.Optimizer,
+    criterion: nn.Module,
+    device: torch.device,
+    scheduler: Optional[torch.optim.lr_scheduler.LRScheduler] = None,
+) -> float:
+    """Train for one epoch."""
+    model.train()
+    total_loss = 0.0
+    num_batches = 0
+    
+    for batch in tqdm(dataloader, desc="Training"):
+        # Move to device
+        inputs = batch["inputs"].to(device)
+        targets = batch["targets"].to(device)
+        
+        # Forward pass
+        optimizer.zero_grad()
+        outputs = model(inputs)
+        loss = criterion(outputs, targets)
+        
+        # Backward pass
+        loss.backward()
+        
+        # Gradient clipping (if needed)
+        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
+        
+        # Update
+        optimizer.step()
+        if scheduler is not None:
+            # Per-batch schedule; call step() once per epoch instead if the paper decays per epoch
+            scheduler.step()
+        
+        total_loss += loss.item()
+        num_batches += 1
+    
+    return total_loss / num_batches
+
+
+@torch.no_grad()
+def evaluate(
+    model: nn.Module,
+    dataloader: DataLoader,
+    criterion: nn.Module,
+    device: torch.device,
+) -> Tuple[float, float]:
+    """Evaluate model."""
+    model.eval()
+    total_loss = 0.0
+    correct = 0
+    total = 0
+    
+    for batch in dataloader:
+        inputs = batch["inputs"].to(device)
+        targets = batch["targets"].to(device)
+        
+        outputs = model(inputs)
+        loss = criterion(outputs, targets)
+        
+        total_loss += loss.item()
+        preds = outputs.argmax(dim=-1)
+        correct += (preds == targets).sum().item()
+        total += targets.size(0)
+    
+    return total_loss / len(dataloader), correct / total
+```
+
+## Data Loading
+
+### Custom Dataset
+
+```python
+from torch.utils.data import Dataset, DataLoader
+
+
+class PaperDataset(Dataset):
+    def __init__(self, data_path: str, transform=None):
+        self.data = self._load_data(data_path)
+        self.transform = transform
+    
+    def _load_data(self, path: str):
+        # Load from disk
+        pass
+    
+    def __len__(self) -> int:
+        return len(self.data)
+    
+    def __getitem__(self, idx: int) -> dict:
+        item = self.data[idx]
+        if self.transform:
+            item = self.transform(item)
+        return item
+
+
+def get_dataloader(
+    dataset: Dataset,
+    batch_size: int,
+    shuffle: bool = True,
+    num_workers: int = 4,
+) -> DataLoader:
+    return DataLoader(
+        dataset,
+        batch_size=batch_size,
+        shuffle=shuffle,
+        num_workers=num_workers,
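+        pin_memory=True,  # Faster host-to-GPU transfer
+        # Assumes shuffle=True means training: drop the ragged last batch there,
+        # but keep every sample when evaluating so metrics cover the full set
+        drop_last=shuffle,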
+    )
+```
+
+## Checkpointing
+
+### Save and Load
+
+```python
+def save_checkpoint(
+    model: nn.Module,
+    optimizer: torch.optim.Optimizer,
+    epoch: int,
+    loss: float,
+    path: str,
+):
+    """Save training checkpoint."""
+    torch.save({
+        "epoch": epoch,
+        "model_state_dict": model.state_dict(),
+        "optimizer_state_dict": optimizer.state_dict(),
+        "loss": loss,
+    }, path)
+
+
+def load_checkpoint(
+    path: str,
+    model: nn.Module,
+    optimizer: Optional[torch.optim.Optimizer] = None,
+) -> dict:
+    """Load training checkpoint."""
+    checkpoint = torch.load(path, weights_only=True)
+    model.load_state_dict(checkpoint["model_state_dict"])
+    if optimizer is not None:
+        optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
+    return checkpoint
+```
+
+## Reproducibility
+
+### Set Seeds
+
+```python
+import random
+import numpy as np
+import torch
+
+
+def set_seed(seed: int = 42):
+    """Set all random seeds for reproducibility."""
+    random.seed(seed)
+    np.random.seed(seed)
+    torch.manual_seed(seed)
+    torch.cuda.manual_seed_all(seed)
+    
+    # For deterministic behavior (may impact performance)
+    torch.backends.cudnn.deterministic = True
+    torch.backends.cudnn.benchmark = False
+```
+
+## Common Gotchas
+
+### In-place Operations
+
+```python
+# RISKY: in-place ops raise an autograd error when the overwritten values are needed for backward
+x += 1
+x[:, 0] = 0
+
+# SAFE: out-of-place ops create a new tensor
+x = x + 1
+x = torch.cat([torch.zeros_like(x[:, :1]), x[:, 1:]], dim=1)
+```
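+
+Whether an in-place edit fails depends on whether autograd saved the overwritten tensor. The sketch below contrasts the two cases under the assumption of default CPU tensors; the variable names are illustrative:
+
+```python
+import torch
+
+# Case 1: exp() saves its output for the backward pass, so editing that
+# output in place invalidates the graph and backward() raises.
+a = torch.ones(3, requires_grad=True)
+y = a.exp()
+y.add_(1)
+try:
+    y.sum().backward()
+    inplace_failed = False
+except RuntimeError:
+    inplace_failed = True
+
+# Case 2: multiplying by a constant does not need its output for backward,
+# so an in-place edit of the result is harmless here.
+b = torch.ones(3, requires_grad=True)
+z = b * 2
+z += 1
+z.sum().backward()
+
+print(inplace_failed)  # True
+print(b.grad)          # tensor([2., 2., 2.])
+```
+
+Because the failure only appears at `backward()`, a gradient-flow unit test is a cheap way to catch it early.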
+
+### Detaching for Metrics
+
+```python
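+# BAD: Stores a device tensor every step (a differentiable value such as the loss would also keep its graph alive)
+accuracy = (preds == targets).float().mean()
+all_accs.append(accuracy)  # Memory grows with every step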
+
+# GOOD: Detach for logging
+accuracy = (preds == targets).float().mean().item()
+all_accs.append(accuracy)
+```
+
+### Mixed Precision
+
+```python
+# Note: recent PyTorch releases deprecate torch.cuda.amp in favor of torch.amp
+from torch.cuda.amp import autocast, GradScaler
+
+scaler = GradScaler()
+
+for batch in dataloader:
+    inputs = batch["inputs"].to(device)
+    targets = batch["targets"].to(device)
+    optimizer.zero_grad()
+    
+    with autocast():
+        outputs = model(inputs)
+        loss = criterion(outputs, targets)
+    
+    scaler.scale(loss).backward()
+    scaler.step(optimizer)
+    scaler.update()
+```
+```
+
+- [ ] **Step 3: Verify file creation**
+
+```bash
+cat .opencode/skills/pytorch-patterns/SKILL.md
+```
+
+Expected: File contents match the markdown above.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add .opencode/skills/pytorch-patterns/SKILL.md
+git commit -m "feat(skills): add pytorch-patterns skill
+
+PyTorch best practices and common patterns."
+```
+
+---
+
+### Task 9: Create verification Skill
+
+**Files:**
+- Create: `.opencode/skills/verification/SKILL.md`
+
+- [ ] **Step 1: Create skill directory**
+
+```bash
+mkdir -p .opencode/skills/verification
+```
+
+- [ ] **Step 2: Write SKILL.md**
+
+Create `.opencode/skills/verification/SKILL.md`:
+
+```markdown
+---
+name: verification
+description: Use when verifying replication results against paper's reported values
+---
+
+# Replication Verification
+
+## Overview
+
+Systematic approach to verifying that replicated code produces results matching the original paper.
+
+**Announce at start:** "I'm using the verification skill to validate replication accuracy."
+
+## Verification Levels
+
+### Level 1: Code Correctness
+- Unit tests pass
+- No runtime errors
+- Gradient flow works
+
+### Level 2: Behavioral Match
+- Output shapes correct
+- Value ranges reasonable
+- Edge cases handled
+
+### Level 3: Numerical Match
+- Results within tolerance of paper
+- Trends match (even if absolute values differ)
+- Statistical significance considered
+
+## Test Design for Replication
+
+### Shape Tests
+
+```python
+def test_model_output_shape():
+    """Verify model produces correct output shape per paper."""
+    model = MyModel(config)
+    x = torch.randn(batch_size, seq_len, input_dim)
+    out = model(x)
+    
+    # Paper Section 3.2: "Output dimension is 512"
+    assert out.shape == (batch_size, seq_len, 512)
+```
+
+### Value Range Tests
+
+```python
+def test_attention_weights_sum():
+    """Attention weights should sum to 1 (paper Eq. 3)."""
+    model = AttentionLayer(config)
+    x = torch.randn(batch_size, seq_len, dim)
+    _, attn_weights = model(x, return_attention=True)
+    
+    # Softmax output sums to 1 along the key dimension, whatever the leading dims are
+    row_sums = attn_weights.sum(dim=-1)
+    assert torch.allclose(row_sums, torch.ones_like(row_sums), atol=1e-6)
+```
+
+### Gradient Tests
+
+```python
+def test_gradient_flow():
+    """Verify gradients flow through all parameters."""
+    model = MyModel(config)
+    x = torch.randn(batch_size, input_dim, requires_grad=True)
+    out = model(x)
+    loss = out.sum()
+    loss.backward()
+    
+    for name, param in model.named_parameters():
+        assert param.grad is not None, f"No gradient for {name}"
+        assert not torch.isnan(param.grad).any(), f"NaN gradient for {name}"
+```
+
+### Numerical Match Tests
+
+```python
+def test_loss_value_reasonable():
+    """Loss should be in expected range per paper Figure 2."""
+    model = MyModel(config)
+    # ... setup ...
+    
+    loss = compute_loss(model, data)
+    
+    # Paper reports initial loss ~2.3 (cross-entropy on 10 classes)
+    assert 2.0 < loss.item() < 3.0, f"Initial loss {loss.item()} outside expected range"
+```
+
+## Comparison Methodology
+
+### Absolute Comparison
+
+```python
+def compare_absolute(paper_value: float, our_value: float, tolerance: float = 0.01):
+    """Compare with absolute tolerance."""
+    diff = abs(paper_value - our_value)
+    return diff <= tolerance, diff
+```
+
+### Relative Comparison
+
+```python
+def compare_relative(paper_value: float, our_value: float, tolerance: float = 0.05):
+    """Compare with relative tolerance (5% default)."""
+    if paper_value == 0:
+        return our_value == 0, abs(our_value)
+    relative_diff = abs(paper_value - our_value) / abs(paper_value)
+    return relative_diff <= tolerance, relative_diff
+```
+
+### Statistical Comparison
+
+```python
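+from statistics import NormalDist
+from typing import List
+
+import numpy as np
+
+
+def compare_with_variance(
+    paper_mean: float,
+    paper_std: float,
+    our_values: List[float],
+    confidence: float = 0.95,
+):
+    """Compare considering paper's reported variance."""
+    our_mean = np.mean(our_values)
+    our_std = np.std(our_values)
+
+    # Two-sided z threshold for the requested confidence (about 1.96 at 0.95)
+    threshold = NormalDist().inv_cdf((1 + confidence) / 2)
+    combined_std = np.sqrt(paper_std**2 + our_std**2)
+    if combined_std == 0:
+        # No spread on either side: only an exact match can pass
+        same = paper_mean == our_mean
+        return same, 0.0 if same else float("inf")
+    z_score = abs(paper_mean - our_mean) / combined_std
+
+    return z_score < threshold, z_score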
+```
+
+## Common Difference Sources
+
+### Acceptable Differences
+
+| Source | Typical Impact | Mitigation |
+|--------|---------------|------------|
+| Random seed | 1-2% | Run multiple seeds |
+| Floating point | < 0.1% | Use float64 for verification |
+| Framework differences | 1-3% | Document and accept |
+| Hardware differences | 0.5-1% | Note in report |
+
+### Concerning Differences
+
+| Source | Typical Impact | Action |
+|--------|---------------|--------|
+| Wrong architecture | > 10% | Review code vs paper |
+| Wrong hyperparameters | 5-20% | Verify all settings |
+| Data preprocessing | Variable | Match paper exactly |
+| Evaluation protocol | Variable | Check train/val/test split |
+
+## Verification Checklist
+
+### Before Comparison
+
+- [ ] Seeds set for reproducibility
+- [ ] Same evaluation data as paper
+- [ ] Same preprocessing pipeline
+- [ ] Same evaluation metrics
+
+### During Comparison
+
+- [ ] Run multiple times with different seeds
+- [ ] Record mean and standard deviation
+- [ ] Compare trends, not just final values
+- [ ] Check intermediate checkpoints if available
+
+### After Comparison
+
+- [ ] Document all differences
+- [ ] Explain likely causes
+- [ ] Determine if differences are acceptable
+- [ ] Suggest improvements if needed
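+
+The "During Comparison" items (several seeds, mean and standard deviation) can be sketched as a small helper. This is an illustration only: `summarize_runs` and `fake_experiment` are hypothetical names, and `fake_experiment` stands in for a full train-and-evaluate run:
+
+```python
+import random
+import statistics
+from typing import Callable, Dict, Sequence
+
+
+def summarize_runs(
+    run_fn: Callable[[int], float],
+    seeds: Sequence[int],
+) -> Dict[str, float]:
+    """Run `run_fn` once per seed and report mean and sample std."""
+    values = [run_fn(seed) for seed in seeds]
+    std = statistics.stdev(values) if len(values) > 1 else 0.0
+    return {"mean": statistics.fmean(values), "std": std, "runs": len(values)}
+
+
+def fake_experiment(seed: int) -> float:
+    """Hypothetical stand-in for a full train-and-evaluate run."""
+    rng = random.Random(seed)
+    return 0.90 + rng.uniform(-0.01, 0.01)
+
+
+summary = summarize_runs(fake_experiment, seeds=[0, 1, 2, 3, 4])
+print(f"{summary['mean']:.4f} ± {summary['std']:.4f} over {summary['runs']} runs")
+```
+
+Reporting the number of runs next to the mean and standard deviation makes it easier to judge how much weight the comparison can bear.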
+
+## Report Template
+
+```markdown
+## Verification Result: {Metric Name}
+
+**Paper Value**: {value} ± {std}
+**Our Value**: {value} ± {std}
+**Difference**: {absolute} ({relative}%)
+
+**Status**: MATCH | ACCEPTABLE | INVESTIGATE | MISMATCH
+
+**Analysis**:
+{explanation of difference}
+
+**Confidence**: {HIGH | MEDIUM | LOW}
+{reasoning for confidence level}
+```
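+
+One possible way to fill in the template programmatically is sketched below. The status thresholds (1%, 5%, 10% relative difference) are assumptions made for illustration; only the 5% figure echoes the default relative tolerance used earlier in this skill. `MetricComparison`, `classify`, and `render` are hypothetical names:
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class MetricComparison:
+    name: str
+    paper_value: float
+    our_value: float
+
+    @property
+    def absolute_diff(self) -> float:
+        return abs(self.paper_value - self.our_value)
+
+    @property
+    def relative_diff(self) -> float:
+        if self.paper_value == 0:
+            return 0.0 if self.our_value == 0 else float("inf")
+        return self.absolute_diff / abs(self.paper_value)
+
+
+def classify(comparison: MetricComparison) -> str:
+    """Map relative difference to a status (assumed thresholds)."""
+    rel = comparison.relative_diff
+    if rel <= 0.01:
+        return "MATCH"
+    if rel <= 0.05:
+        return "ACCEPTABLE"
+    if rel <= 0.10:
+        return "INVESTIGATE"
+    return "MISMATCH"
+
+
+def render(comparison: MetricComparison) -> str:
+    """Render the numeric part of the report template as Markdown."""
+    return "\n".join([
+        f"## Verification Result: {comparison.name}",
+        "",
+        f"**Paper Value**: {comparison.paper_value}",
+        f"**Our Value**: {comparison.our_value}",
+        f"**Difference**: {comparison.absolute_diff:.4f} ({comparison.relative_diff:.1%})",
+        "",
+        f"**Status**: {classify(comparison)}",
+    ])
+
+
+# Hypothetical numbers, for illustration only
+example = MetricComparison("Top-1 accuracy", paper_value=0.90, our_value=0.88)
+print(render(example))
+```
+
+The analysis and confidence fields are left to the agent, since they need reasoning that a fixed threshold cannot supply.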
+```
+
+- [ ] **Step 3: Verify file creation**
+
+```bash
+cat .opencode/skills/verification/SKILL.md
+```
+
+Expected: File contents match the markdown above.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add .opencode/skills/verification/SKILL.md
+git commit -m "feat(skills): add verification skill
+
+Replication result verification methodology."
+```
+
+---
+
+### Task 10: Create environment-management Skill
+
+**Files:**
+- Create: `.opencode/skills/environment-management/SKILL.md`
+
+- [ ] **Step 1: Create skill directory**
+
+```bash
+mkdir -p .opencode/skills/environment-management
+```
+
+- [ ] **Step 2: Write SKILL.md**
+
+Create `.opencode/skills/environment-management/SKILL.md`:
+
+```markdown
+---
+name: environment-management
+description: Use when setting up Python environment for ML/DL paper replication using Conda + uv
+---
+
+# Environment Management (Conda + uv)
+
+## Overview
+
+Hybrid approach using Conda for system-level dependencies and uv for project isolation.
+
+**Announce at start:** "I'm using the environment-management skill for Conda + uv setup."
+
+## Architecture
+
+```
+┌─────────────────────────────────────────┐
+│           Conda (System Base)           │
+│  - Python interpreter                    │
+│  - CUDA toolkit                          │
+│  - System-level C++ libraries            │
+└─────────────────────────────────────────┘
+                    │
+                    │ provides Python
+                    ▼
+┌─────────────────────────────────────────┐
+│        uv (Project Isolation)           │
+│  - Per-project .venv                     │
+│  - Fast dependency resolution            │
+│  - Reproducible installs                 │
+└─────────────────────────────────────────┘
+```
+
+## Setup Commands
+
+### Step 1: Conda Base Environment
+
+Check if base exists:
+```bash
+conda env list | grep ai_base
+```
+
+Create if needed:
+```bash
+# Linux/Mac
+conda create -n ai_base python=3.10 cuda-toolkit=11.8 -y
+
+# Windows (CUDA from NVIDIA, not conda)
+conda create -n ai_base python=3.10 -y
+```
+
+### Step 2: Project Environment
+
+```bash
+cd workspace/{paper_name}
+
+# Get Conda Python path
+# Linux/Mac:
+PYTHON_PATH=$(conda run -n ai_base which python)
+
+# Windows:
+# PYTHON_PATH=$(conda run -n ai_base python -c "import sys; print(sys.executable)")
+
+# Create uv venv
+uv venv --python "$PYTHON_PATH"
+```
+
+### Step 3: Activate and Install
+
+```bash
+# Linux/Mac
+source .venv/bin/activate
+
+# Windows
+.venv\Scripts\activate
+
+# Install dependencies
+uv pip install -e ".[dev]"
+```
+
+## pyproject.toml Template
+
+```toml
+[project]
+name = "{paper_name}"
+version = "0.1.0"
+description = "Replication of {paper_title}"
+requires-python = ">=3.10"
+
+dependencies = [
+    # Core ML
+    "torch>=2.0.0",
+    "numpy>=1.24.0",
+    
+    # Visualization
+    "matplotlib>=3.7.0",
+    "seaborn>=0.12.0",
+    
+    # Utilities
+    "tqdm>=4.65.0",
+    "pyyaml>=6.0",
+]
+
+[project.optional-dependencies]
+dev = [
+    "pytest>=7.0.0",
+    "pytest-cov>=4.0.0",
+    "black>=23.0.0",
+    "ruff>=0.0.260",
+]
+
+# Add based on paper requirements
+vision = [
+    "torchvision>=0.15.0",
+    "pillow>=9.5.0",
+]
+
+nlp = [
+    "transformers>=4.30.0",
+    "tokenizers>=0.13.0",
+    "datasets>=2.12.0",
+]
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.pytest.ini_options]
+testpaths = ["tests"]
+python_files = ["test_*.py"]
+addopts = "-v --tb=short"
+
+[tool.black]
+line-length = 88
+target-version = ["py310"]
+
+[tool.ruff]
+line-length = 88
+select = ["E", "F", "I", "N", "W"]
+```
+
+## PyTorch + CUDA Compatibility
+
+| CUDA Version | PyTorch Version | Install Command |
+|--------------|-----------------|-----------------|
+| 11.8 | 2.0+ | `uv pip install torch --index-url https://download.pytorch.org/whl/cu118` |
+| 12.1 | 2.1+ | `uv pip install torch --index-url https://download.pytorch.org/whl/cu121` |
+| CPU only | Any | `uv pip install torch --index-url https://download.pytorch.org/whl/cpu` |
+
+## Environment Verification
+
+```bash
+# Check Python
+python --version
+
+# Check PyTorch
+python -c "import torch; print(f'PyTorch {torch.__version__}')"
+
+# Check CUDA
+python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
+python -c "import torch; print(f'CUDA version: {torch.version.cuda}')"
+
+# Check GPU
+python -c "import torch; print(f'GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else \"N/A\"}')"
+```
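+
+The same checks can be collected into one script so the result can be pasted into a report. A minimal sketch, assuming only the attributes already used in the commands above; `environment_report` is a hypothetical helper name, and it degrades gracefully when PyTorch is not installed:
+
+```python
+import importlib.util
+import platform
+import sys
+from typing import Dict
+
+
+def environment_report() -> Dict[str, str]:
+    """Collect interpreter, PyTorch, and CUDA details as strings."""
+    report = {
+        "python": platform.python_version(),
+        "executable": sys.executable,
+    }
+    if importlib.util.find_spec("torch") is None:
+        report["torch"] = "not installed"
+        return report
+
+    import torch  # imported lazily so the script also runs without PyTorch
+
+    report["torch"] = torch.__version__
+    report["cuda_available"] = str(torch.cuda.is_available())
+    report["cuda_version"] = str(torch.version.cuda)
+    if torch.cuda.is_available():
+        report["gpu"] = torch.cuda.get_device_name(0)
+    return report
+
+
+if __name__ == "__main__":
+    for key, value in environment_report().items():
+        print(f"{key}: {value}")
+```
+
+Printing `executable` also confirms that the interpreter in use is the one inside `.venv` rather than the Conda base.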
+
+## Troubleshooting
+
+### CUDA Not Found
+
+```bash
+# Check NVIDIA driver
+nvidia-smi
+
+# Reinstall PyTorch with correct CUDA
+uv pip install torch --index-url https://download.pytorch.org/whl/cu118 --force-reinstall
+```
+
+### Dependency Conflicts
+
+```bash
+# Clear cache and reinstall
+uv cache clean
+uv pip install -e ".[dev]" --force-reinstall
+```
+
+### Permission Errors (Windows)
+
+```powershell
+# Run as Administrator or:
+Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
+```
+
+## Best Practices
+
+1. **One environment per paper**: Don't mix dependencies
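+2. **Pin versions in pyproject.toml**: The template's `>=` bounds are only minimums; record exact versions (for example with `uv pip freeze`) for reproducibility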
+3. **Use dev dependencies**: Keep test tools separate
+4. **Document CUDA version**: In README.md
+5. **Commit pyproject.toml**: Not .venv/
+
+## Quick Reference
+
+```bash
+# Full setup sequence (Linux/Mac)
+conda activate ai_base || conda create -n ai_base python=3.10 cuda-toolkit=11.8 -y && conda activate ai_base
+cd workspace/{paper_name}
+uv venv --python $(which python)
+source .venv/bin/activate
+uv pip install -e ".[dev]"
+pytest tests/ -v
+```
+```
+
+- [ ] **Step 3: Verify file creation**
+
+```bash
+cat .opencode/skills/environment-management/SKILL.md
+```
+
+Expected: File contents match the markdown above.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add .opencode/skills/environment-management/SKILL.md
+git commit -m "feat(skills): add environment-management skill
+
+Conda + uv hybrid environment setup for ML projects."
+```
+
+---
+
+### Task 11: Create /replicate Command
+
+**Files:**
+- Create: `.opencode/commands/replicate.md`
+
+- [ ] **Step 1: Create commands directory**
+
+```bash
+mkdir -p .opencode/commands
+```
+
+- [ ] **Step 2: Write replicate.md**
+
+Create `.opencode/commands/replicate.md`:
+
+```markdown
+---
+description: Start paper replication workflow
+agent: paper-director
+---
+
+Start the paper replication workflow for the specified paper.
+
+## Input
+
+Paper file: $ARGUMENTS
+
+If no file specified, ask the user to provide the path to a paper (Markdown file or paste text directly).
+
+## Workflow
+
+1. Validate paper file exists (if path provided)
+2. Extract paper name from filename or ask user
+3. Create workspace directory: `workspace/{paper_name}/`
+4. Begin Phase 1: Paper Analysis
+   - Dispatch @paper-image-extractor
+   - Dispatch @paper-analyzer
+5. Present Human Checkpoint with analysis summary
+6. After approval, begin Phase 2: Code Generation (TDD)
+7. Begin Phase 3: Verification
+8. Present final replication report
+
+## Example Usage
+
+```
+/replicate workspace/attention_is_all_you_need.md
+```
+
+Or without arguments:
+```
+/replicate
+> Please provide the path to your paper or paste the content directly.
+```
+```
+
+- [ ] **Step 3: Verify file creation**
+
+```bash
+cat .opencode/commands/replicate.md
+```
+
+Expected: File contents match the markdown above.
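+
+Workflow steps 2 and 3 of the command (derive the paper name, then build `workspace/{paper_name}/`) might look like the sketch below. The normalization rule (lowercase, non-alphanumeric runs collapsed to underscores) is an assumption, and both function names are hypothetical:
+
+```python
+import re
+from pathlib import Path
+
+
+def paper_name_from_path(paper_path: str) -> str:
+    """Derive a filesystem-friendly project name from the paper's filename."""
+    stem = Path(paper_path).stem
+    name = re.sub(r"[^a-z0-9]+", "_", stem.lower()).strip("_")
+    if not name:
+        # Nothing usable in the filename: the command would ask the user instead
+        raise ValueError(f"Cannot derive a paper name from {paper_path!r}")
+    return name
+
+
+def workspace_for(paper_path: str, root: str = "workspace") -> Path:
+    """Return the project directory `workspace/{paper_name}/` for a paper."""
+    return Path(root) / paper_name_from_path(paper_path)
+
+
+print(workspace_for("workspace/attention_is_all_you_need.md"))
+```
+
+Filenames without ASCII letters or digits raise `ValueError` here, which corresponds to the "or ask user" branch of step 2.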
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add .opencode/commands/replicate.md
+git commit -m "feat(commands): add /replicate command
+
+Entry point for paper replication workflow."
+```
+
+---
+
+### Task 12: Create /verify Command
+
+**Files:**
+- Create: `.opencode/commands/verify.md`
+
+- [ ] **Step 1: Write verify.md**
+
+Create `.opencode/commands/verify.md`:
+
+```markdown
+---
+description: Verify replication results for a completed project
+agent: paper-director
+---
+
+Verify the replication results for an existing project.
+
+## Input
+
+Project directory: $ARGUMENTS
+
+If no directory specified, list available projects in workspace/ and ask user to select.
+
+## Workflow
+
+1. Validate project directory exists
+2. Check required files exist:
+   - `analysis/paper_structure.md`
+   - `analysis/replication_plan.md`
+   - `src/` with code
+   - `tests/` with tests
+3. Dispatch @test-runner to:
+   - Run test suite
+   - Compare results with paper
+   - Generate/update `reports/replication_report.md`
+4. Present verification summary
+
+## Example Usage
+
+```
+/verify workspace/attention_is_all_you_need/
+```
+
+Or without arguments:
+```
+/verify
+> Available projects:
+> 1. attention_is_all_you_need
+> 2. resnet
+> Please select a project to verify.
+```
+```
+
+- [ ] **Step 2: Verify file creation**
+
+```bash
+cat .opencode/commands/verify.md
+```
+
+Expected: File contents match the markdown above.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add .opencode/commands/verify.md
+git commit -m "feat(commands): add /verify command
+
+Entry point for verification of existing replication projects."
+```
+
+---
+
+### Task 13: Create opencode.json Configuration
+
+**Files:**
+- Create: `opencode.json`
+
+- [ ] **Step 1: Write opencode.json**
+
+Create `opencode.json`:
+
+```json
+{
+  "$schema": "https://opencode.ai/config.json",
+  "default_agent": "paper-director",
+  "agent": {
+    "paper-director": {
+      "mode": "primary"
+    },
+    "paper-analyzer": {
+      "mode": "subagent"
+    },
+    "paper-image-extractor": {
+      "mode": "subagent"
+    },
+    "code-writer": {
+      "mode": "subagent"
+    },
+    "test-runner": {
+      "mode": "subagent"
+    }
+  }
+}
+```
+
+- [ ] **Step 2: Verify file creation**
+
+```bash
+cat opencode.json
+```
+
+Expected: Valid JSON matching above.
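+
+Beyond checking that the file is valid JSON, a short script can check that the configuration is internally consistent. This is a sketch under stated assumptions: it only knows the two modes that appear in this plan, it is not an official OpenCode validator, and `check_config` is a hypothetical helper:
+
+```python
+import json
+from typing import List
+
+# Modes that appear in this plan's config; anything else is flagged for review
+EXPECTED_MODES = {"primary", "subagent"}
+
+
+def check_config(config: dict) -> List[str]:
+    """Return a list of human-readable problems (empty when none are found)."""
+    problems: List[str] = []
+    agents = config.get("agent", {})
+    default = config.get("default_agent")
+
+    if default not in agents:
+        problems.append(f"default_agent {default!r} is not defined under 'agent'")
+    elif agents[default].get("mode") != "primary":
+        problems.append(f"default_agent {default!r} is not a primary agent")
+
+    for name, settings in agents.items():
+        mode = settings.get("mode")
+        if mode not in EXPECTED_MODES:
+            problems.append(f"agent {name!r} has unexpected mode {mode!r}")
+    return problems
+
+
+sample = json.loads("""
+{
+  "default_agent": "paper-director",
+  "agent": {
+    "paper-director": {"mode": "primary"},
+    "paper-analyzer": {"mode": "subagent"}
+  }
+}
+""")
+print(check_config(sample))  # []
+```
+
+To check the real file, load it with `json.load(open("opencode.json"))` and pass the result to `check_config`.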
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add opencode.json
+git commit -m "feat: add opencode.json project configuration
+
+Sets paper-director as default agent with subagent definitions."
+```
+
+---
+
+### Task 14: Create Workspace Directory
+
+**Files:**
+- Create: `workspace/.gitkeep`
+
+- [ ] **Step 1: Create workspace directory**
+
+```bash
+mkdir -p workspace
+```
+
+- [ ] **Step 2: Create .gitkeep**
+
+```bash
+touch workspace/.gitkeep
+```
+
+Or on Windows:
+```powershell
+New-Item -ItemType File -Path workspace/.gitkeep -Force
+```
+
+- [ ] **Step 3: Verify directory creation**
+
+```bash
+ls -la workspace/
+```
+
+Expected: Directory exists with .gitkeep file.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add workspace/.gitkeep
+git commit -m "feat: add workspace directory for paper replication projects
+
+Papers placed here will be processed by the replication agents."
+```
+
+---
+
+### Task 15: Final Verification
+
+**Files:**
+- Read: All created files
+
+- [ ] **Step 1: Verify directory structure**
+
+```bash
+find .opencode -type f -name "*.md" | sort
+```
+
+Expected output:
+```
+.opencode/agents/code-writer.md
+.opencode/agents/paper-analyzer.md
+.opencode/agents/paper-director.md
+.opencode/agents/paper-image-extractor.md
+.opencode/agents/test-runner.md
+.opencode/commands/replicate.md
+.opencode/commands/verify.md
+.opencode/skills/code-generation/SKILL.md
+.opencode/skills/environment-management/SKILL.md
+.opencode/skills/paper-parsing/SKILL.md
+.opencode/skills/pytorch-patterns/SKILL.md
+.opencode/skills/verification/SKILL.md
+```
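+
+Comparing the `find` output by eye is error-prone, so the same check can be scripted. A minimal sketch, assuming it is run from the repository root; `missing_files` is a hypothetical helper and the list simply repeats the expected output above:
+
+```python
+from pathlib import Path
+from typing import Iterable, List
+
+EXPECTED_FILES = [
+    ".opencode/agents/code-writer.md",
+    ".opencode/agents/paper-analyzer.md",
+    ".opencode/agents/paper-director.md",
+    ".opencode/agents/paper-image-extractor.md",
+    ".opencode/agents/test-runner.md",
+    ".opencode/commands/replicate.md",
+    ".opencode/commands/verify.md",
+    ".opencode/skills/code-generation/SKILL.md",
+    ".opencode/skills/environment-management/SKILL.md",
+    ".opencode/skills/paper-parsing/SKILL.md",
+    ".opencode/skills/pytorch-patterns/SKILL.md",
+    ".opencode/skills/verification/SKILL.md",
+]
+
+
+def missing_files(root: Path, expected: Iterable[str] = EXPECTED_FILES) -> List[str]:
+    """Return the expected paths that do not exist as files under `root`."""
+    return [rel for rel in expected if not (root / rel).is_file()]
+
+
+if __name__ == "__main__":
+    missing = missing_files(Path("."))
+    print("All files present" if not missing else f"Missing: {missing}")
+```
+
+An empty result means all 12 files from the expected listing exist.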
+
+- [ ] **Step 2: Verify opencode.json**
+
+```bash
+cat opencode.json | python -m json.tool
+```
+
+Expected: Valid JSON output, no errors.
+
+- [ ] **Step 3: Verify workspace exists**
+
+```bash
+ls workspace/
+```
+
+Expected: .gitkeep file present.
+
+- [ ] **Step 4: Run OpenCode to verify agents load**
+
+```bash
+opencode --help
+```
+
+Then in OpenCode:
+```
+/help
+```
+
+Verify that `/replicate` and `/verify` commands appear.
+
+- [ ] **Step 5: Test agent switching**
+
+In OpenCode, press Tab to cycle agents. Verify `paper-director` is available.
+
+- [ ] **Step 6: Test subagent mention**
+
+```
+@paper-analyzer Can you help me?
+```
+
+Verify subagent responds.
+
+- [ ] **Step 7: Final commit summary**
+
+```bash
+git log --oneline -15
+```
+
+Expected: All feature commits present.
+
+---
+
+## Self-Review Checklist
+
+- [x] **Spec coverage**: All 5 agents, 5 skills, 2 commands, config file defined
+- [x] **No placeholders**: All code blocks complete
+- [x] **Consistent naming**: Agent/skill names match throughout
+- [x] **File paths exact**: All paths specified completely
+- [x] **Commits granular**: Each task has a commit step
diff --git a/docs/superpowers/specs/2026-03-31-paper-replication-agent-design.md b/docs/superpowers/specs/2026-03-31-paper-replication-agent-design.md
new file mode 100644
index 0000000..e8323fe
--- /dev/null
+++ b/docs/superpowers/specs/2026-03-31-paper-replication-agent-design.md
@@ -0,0 +1,757 @@
+# Paper Replication Agent Design Specification
+
+**Date**: 2026-03-31
+**Status**: Draft
+**Author**: OpenCode AI
+
+---
+
+## 1. Overview
+
+### 1.1 Purpose
+
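+Design a paper replication agent system on the OpenCode platform, focused on automated replication of machine learning and deep learning papers. The system can:
+
+- Parse paper content and images
+- Automatically generate PyTorch replication code
+- Run tests to verify code correctness
+- Generate a replication report and compare results with the original paper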
+
+### 1.2 Scope
+
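+- **Target paper type**: Machine learning / deep learning papers
+- **Input format**: Markdown files, plain-text content
+- **Output framework**: PyTorch
+- **Degree of automation**: Human review after parsing, then automatic execution (TDD mode)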
+
+### 1.3 Success Criteria
+
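+1. Accurately parse the paper's structure and key information
+2. Understand the architecture diagrams and experimental figures in the paper
+3. Generate runnable PyTorch code
+4. Code passes unit-test verification
+5. Generate a replication report that explicitly notes differences from the original paper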
+
+---
+
+## 2. Architecture
+
+### 2.1 System Architecture
+
+The system uses a **primary-agent orchestration + specialized subagents** architecture.
+
+```
+┌─────────────────────────────────────────────────────────┐
+│                   paper-director                         │
+│                  (Primary Agent)                         │
+│     Orchestration / Quality Control / Human Review      │
+└─────────────────────────────────────────────────────────┘
+                          │
+        ┌─────────────────┼─────────────────┐
+        ▼                 ▼                 ▼
+┌───────────────┐ ┌───────────────┐ ┌───────────────┐
+│paper-analyzer │ │paper-image-   │ │ code-writer   │
+│   (Subagent)  │ │extractor      │ │  (Subagent)   │
+│ Paper parsing │ │  (Subagent)   │ │Code generation│
+└───────────────┘ │Image analysis │ └───────────────┘
+                  └───────────────┘
+                                          │
+                                          ▼
+                                  ┌───────────────┐
+                                  │ test-runner   │
+                                  │  (Subagent)   │
+                                  │ Test & verify │
+                                  └───────────────┘
+```
+
+### 2.2 File Structure
+
+```
+PaperTool/
+├── .opencode/
+│   ├── agents/                    # Agent definitions
+│   │   ├── paper-director.md      # Primary agent (orchestrator)
+│   │   ├── paper-analyzer.md      # Paper-parsing subagent
+│   │   ├── paper-image-extractor.md # Image-understanding subagent
+│   │   ├── code-writer.md         # Code-generation subagent
+│   │   └── test-runner.md         # Test-and-verification subagent
+│   │
+│   ├── skills/                    # Project-level skills
+│   │   ├── paper-parsing/         # Paper-parsing skill
+│   │   │   └── SKILL.md
+│   │   ├── code-generation/       # Code-generation skill
+│   │   │   └── SKILL.md
+│   │   ├── pytorch-patterns/      # PyTorch best practices
+│   │   │   └── SKILL.md
+│   │   ├── verification/          # Verification and comparison skill
+│   │   │   └── SKILL.md
+│   │   └── environment-management/ # Environment-management skill (Conda + uv)
+│   │       └── SKILL.md
+│   │
+│   └── commands/                  # Shortcut commands
+│       ├── replicate.md           # /replicate starts the replication workflow
+│       └── verify.md              # /verify checks the replication results
+│
+├── docs/superpowers/specs/        # Design documents
+│
+├── workspace/                     # Working area
+│   ├── paper_name.md              # Paper source file (Markdown)
+│   └── {paper_name}/              # One working directory per paper
+│       ├── .venv/                 # Project virtual environment created by uv
+│       ├── pyproject.toml         # Project dependency configuration
+│       ├── analysis/              # Parsing results
+│       │   ├── paper_structure.md # Paper structure analysis
+│       │   ├── image_understanding.md # Image understanding
+│       │   └── replication_plan.md # Replication plan
+│       ├── src/                   # Generated code
+│       │   ├── models/            # Model definitions
+│       │   ├── training/          # Training scripts
+│       │   └── utils/             # Utility functions
+│       ├── tests/                 # Unit tests
+│       ├── docs/                  # Usage documentation
+│       │   └── README.md
+│       └── reports/               # Replication reports
+│           └── replication_report.md
+│
+└── opencode.json                  # Project configuration
+```
+
+### 2.3 Environment Management Strategy
+
+The Python environment is managed with a hybrid **Conda + uv** setup:
+
+#### 2.3.1 Architecture
+
+```
+┌─────────────────────────────────────────────────────────┐
+│                   Conda (system base)                   │
+│  ├─ ai_base environment                                 │
+│  │   ├─ Python 3.10+                                    │
+│  │   ├─ cuda-toolkit                                    │
+│  │   └─ Low-level C++ build toolchain                   │
+│  │                                                      │
+│  └─ No pure-Python application packages installed       │
+└─────────────────────────────────────────────────────────┘
+                          │
+                          │ Provides the Python interpreter
+                          ▼
+┌─────────────────────────────────────────────────────────┐
+│               uv (per-project isolation)                │
+│  ├─ workspace/paper_a/.venv/                            │
+│  │   └─ torch, transformers, ...                        │
+│  │                                                      │
+│  ├─ workspace/paper_b/.venv/                            │
+│  │   └─ torch, jax, ...                                 │
+│  │                                                      │
+│  └─ One lightweight virtual environment per project     │
+└─────────────────────────────────────────────────────────┘
+```
+
+#### 2.3.2 Environment Setup Flow
+
+```
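+Start code generation
+    │
+    ▼
+Does the Conda ai_base environment exist?
+    │
+    ├─ No → create the Conda environment:
+    │       conda create -n ai_base python=3.10 cuda-toolkit -y
+    │
+    ▼
+Enter the project directory workspace/{paper_name}/
+    │
+    ▼
+Does .venv exist?
+    │
+    ├─ No → create the uv virtual environment:
+    │       uv venv --python $(conda run -n ai_base which python)
+    │
+    ▼
+Activate the environment and install dependencies from pyproject.toml:
+    uv pip install -e .
+    │
+    ▼
+Continue with code generation / testing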
+```
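+
+The decision flow above can be sketched as a small planning function. This is a minimal sketch rather than part of the design: `plan_environment_setup` is a hypothetical helper, the caller is assumed to have already checked whether `ai_base` exists, and the install step uses the `uv pip install -e .` form from section 5.5.
+
+```python
+from pathlib import Path
+
+
+def plan_environment_setup(conda_env_exists: bool, project_dir: Path) -> list[str]:
+    """Return the shell commands that still need to run for this project."""
+    commands: list[str] = []
+    if not conda_env_exists:
+        # The base environment is missing: create it first.
+        commands.append("conda create -n ai_base python=3.10 cuda-toolkit -y")
+    if not (project_dir / ".venv").is_dir():
+        # No project virtual environment yet: create one on top of the Conda Python.
+        commands.append("uv venv --python $(conda run -n ai_base which python)")
+    # Dependencies are (re)installed on every run.
+    commands.append("uv pip install -e .")
+    return commands
+```
+
+Returning the commands instead of executing them keeps the sketch easy to test; an agent with the `bash` tool would run them in order inside `workspace/{paper_name}/`.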
+
+#### 2.3.3 Implementation: Skill vs Subagent
+
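+**Recommended approach: use a Skill**
+
+**Responsibilities of the `environment-management` Skill**:
+
+1. Provide best-practice guidance for Conda + uv
+2. Provide command templates for detecting and creating environments
+3. Provide a pyproject.toml template
+4. Provide common dependency configurations (PyTorch + CUDA)
+
+**Responsibilities of the `code-writer` Subagent**:
+
+1. Invoke the `environment-management` Skill before generating code
+2. Run the environment detection and creation commands
+3. Generate the project's pyproject.toml
+4. Install dependencies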
+
+---
+
+## 3. Workflow
+
+### 3.1 Complete Workflow
+
+```
+User provides a paper (Markdown/Text)
+         │
+         ▼
+┌─────────────────────────────────────────────────────────┐
+│ Phase 1: Paper parsing                                  │
+│ ├─ paper-director creates the workspace directory       │
+│ ├─ Call @paper-image-extractor to analyze images        │
+│ │   └─ Output: image_understanding.md                   │
+│ ├─ Call @paper-analyzer to parse the paper structure    │
+│ │   └─ Input: paper + image_understanding.md            │
+│ │   └─ Output: paper_structure.md + replication_plan.md │
+│ └─ Summarize the parsing results                        │
+└─────────────────────────────────────────────────────────┘
+         │
+         ▼
+┌─────────────────────────────────────────────────────────┐
+│ 🔴 Human review checkpoint                              │
+│ ├─ Show the paper structure analysis                    │
+│ ├─ Show the image understanding results                 │
+│ ├─ Show the replication plan and expected outputs       │
+│ ├─ Show the list of figures/tables to replicate         │
+│ └─ Wait for the user to confirm or correct              │
+└─────────────────────────────────────────────────────────┘
+         │
+         ▼
+┌─────────────────────────────────────────────────────────┐
+│ Phase 2: Code generation (TDD mode)                     │
+│ ├─ paper-director writes test cases from the plan       │
+│ ├─ Call @code-writer to generate the model code         │
+│ │   └─ Input: analysis documents + test cases           │
+│ │   └─ Output: src/models/*.py                          │
+│ ├─ Run the tests to verify code correctness             │
+│ ├─ If tests fail, fix and iterate                       │
+│ ├─ Call @code-writer to generate training scripts       │
+│ │   └─ Output: src/training/*.py                        │
+│ └─ Generate usage documentation docs/README.md          │
+└─────────────────────────────────────────────────────────┘
+         │
+         ▼
+┌─────────────────────────────────────────────────────────┐
+│ Phase 3: Verification and report                        │
+│ ├─ Call @test-runner to run the full test suite         │
+│ ├─ Try running the training script (executability)      │
+│ ├─ Compare against the paper's expected results         │
+│ ├─ Analyze differences and explain their causes         │
+│ └─ Generate replication_report.md                       │
+└─────────────────────────────────────────────────────────┘
+         │
+         ▼
+      Final outputs
+```
+
+### 3.2 Human Checkpoint Details
+
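+The human review checkpoint presents the following for the user to confirm:
+
+1. **Basic paper information**
+
+   - Title, authors, publication year
+   - Summary of the core contributions
+
+2. **Model architecture understanding**
+
+   - Text description of the architecture diagrams
+   - List of key components
+
+3. **Experiment replication targets**
+
+   - List of figures and tables to replicate
+   - Data source for each figure or table
+   - Expected numeric ranges
+
+4. **Replication plan**
+
+   - Code module breakdown
+   - Implementation order
+   - Estimated effort
+
+5. **Risks and limitations**
+
+   - Parts that may not be reproducible
+   - Required external resources (datasets, etc.)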
+
+---
+
+## 4. Agent Specifications
+
+### 4.1 paper-director (Primary Agent)
+
+**Mode**: `primary`
+
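+**Description**: Orchestrator and quality-control owner for paper replication projects
+
+**Responsibilities**:
+
+- Receive the user's paper replication request
+- Create and manage the workspace directory
+- Plan the replication workflow and create a task list
+- Dispatch the subagents to carry out specific tasks
+- Pause at the human review checkpoint and wait for user confirmation
+- Aggregate the results and produce the final replication report
+- Handle exceptions and error recovery
+
+**Tools**: all enabled
+
+**Model**: inherit (uses the user's configured model)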
+
+### 4.2 paper-analyzer (Subagent)
+
+**Mode**: `subagent`
+
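+**Description**: Parses the paper text, extracts structured information, and plans the replication tasks
+
+**Input**:
+
+- Paper Markdown file or plain text
+- `image_understanding.md` (image understanding results)
+
+**Output**:
+
+- `paper_structure.md` - paper structure analysis
+- `replication_plan.md` - replication plan
+
+**Output Content**:
+
+- Paper title, authors, abstract
+- Core contributions
+- Methodology overview
+- Model architecture description (text)
+- Loss functions and optimizers
+- Experimental setup (datasets, hyperparameters)
+- Key formulas (LaTeX)
+- **Experiment replication targets** (based on image understanding)
+  - List of figures and tables to replicate
+  - Data source for each figure or table
+  - Expected numeric ranges
+  - Replication priority
+- List of features to implement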
+
+**Tools**: read, write, edit
+
+**Model**: inherit
+
+### 4.3 paper-image-extractor (Subagent)
+
+**Mode**: `subagent`
+
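+**Description**: Extracts and interprets the images in a paper
+
+**Input**:
+
+- Paper Markdown file (including image links/paths)
+
+**Output**:
+
+- `image_understanding.md` - image understanding document
+
+**Output Content** (per image):
+
+- Image type (architecture diagram / experimental figure / algorithm pseudocode / formula)
+- Detailed text description
+- Architecture diagrams: data flow, layer structure, connections
+- Experimental figures: extracted values, trend analysis, key data points
+- Algorithm pseudocode: prose description, step-by-step breakdown
+- Key takeaways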
+
+**Tools**: read, write, edit, bash
+
+**Model**: inherit
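+
+The first step of this subagent, collecting the image links or paths from the Markdown source, could look roughly like the sketch below. `IMAGE_PATTERN` and `extract_image_refs` are illustrative names, not part of the design. The pattern assumes the standard inline `![alt](target)` syntax only; HTML `<img>` tags and reference-style images are not handled.
+
+```python
+import re
+
+# Inline Markdown image: ![alt](target) or ![alt](target "optional title").
+IMAGE_PATTERN = re.compile(r'!\[([^\]]*)\]\(\s*([^)\s]+)(?:\s+"[^"]*")?\s*\)')
+
+
+def extract_image_refs(markdown_text: str) -> list[tuple[str, str]]:
+    """Return (alt_text, target) pairs in the order they appear."""
+    return [(m.group(1), m.group(2)) for m in IMAGE_PATTERN.finditer(markdown_text)]
+```
+
+Each returned target is either a local path or a URL; an unreachable target would be marked as unparseable rather than aborting the run, as described in section 8.1.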
+
+### 4.4 code-writer (Subagent)
+
+**Mode**: `subagent`
+
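+**Description**: Generates PyTorch code from the analysis results and manages the project environment
+
+**Input**:
+
+- `paper_structure.md` - paper structure analysis
+- `image_understanding.md` - image understanding
+- Test cases (TDD mode)
+
+**Output**:
+
+- `.venv/` - project virtual environment created by uv
+- `pyproject.toml` - project dependency configuration
+- `src/models/*.py` - model definitions
+- `src/training/*.py` - training scripts
+- `src/utils/*.py` - utility functions
+- `docs/README.md` - usage documentation
+
+**Working Mode**: TDD-driven + environment management
+
+1. **Invoke the `environment-management` Skill**
+2. Detect the Conda base environment and create it if it does not exist
+3. Create the project's uv virtual environment
+4. Generate pyproject.toml
+5. Install dependencies
+6. Receive the test cases
+7. Write code that makes the tests pass
+8. Iterate on fixes until all tests pass (subject to the retry limit in section 8.2)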
+
+**Tools**: read, write, edit, bash
+
+**Model**: inherit
+
+### 4.5 test-runner (Subagent)
+
+**Mode**: `subagent`
+
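+**Description**: Runs the tests and verifies the replication results
+
+**Input**:
+
+- Generated code
+- Test cases
+- Expected results from the paper
+
+**Output**:
+
+- Test run results
+- `replication_report.md` - replication report
+
+**Report Content**:
+
+- Test pass rate
+- Verification that the code runs
+- Comparison with the paper's results
+- Explanation of differences and analysis of their causes
+- Suggestions for improvement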
+
+**Tools**: read, write, edit, bash
+
+**Model**: inherit
+
+---
+
+## 5. Skills Specifications
+
+### 5.1 paper-parsing
+
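+**Purpose**: Guides systematic, comprehensive parsing of ML/DL papers
+
+**Core Philosophy**: Emphasize **openness** and **comprehensiveness** to avoid missed content and one-sided interpretation
+
+**Content**:
+
+1. **Open-ended prompting framework**
+
+   - Types of information each part of a paper may contain (not a fixed template)
+   - A "may be present" checklist for each part
+   - Encouragement to identify what is unique about the paper
+
+2. **Comprehensiveness checklist**
+
+   - Abstract → Were the core contributions extracted?
+   - Introduction → Are the problem background and motivation understood?
+   - Related Work → Were the key differences from existing methods identified?
+   - Method → Are the method details fully understood?
+   - Experiments → Were all experiments that need replication identified?
+   - Appendix → Was the supplementary material checked?
+
+3. **Task decomposition guide**
+
+   - How to split a complex paper into executable subtasks
+   - Dependency identification
+   - Prioritization principles
+
+4. **Example templates**
+
+   - Parsing examples for different types of papers
+   - Reminders of commonly missed points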
+
+### 5.2 code-generation
+
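+**Purpose**: Guides generation of high-quality code from a paper
+
+**Content**:
+
+- Rules for mapping paper descriptions to code
+- Modular design principles
+- Code style conventions
+- Common implementation patterns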
+
+### 5.3 pytorch-patterns
+
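+**Purpose**: Provides best-practice templates for PyTorch code
+
+**Content**:
+
+- Model definition template (nn.Module)
+- Training loop template
+- Reference implementations of common layers
+- Performance optimization tips
+- Device management (CPU/GPU)
+- Data loading best practices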
+
+### 5.4 verification
+
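+**Purpose**: Guides verification of the replication results
+
+**Content**:
+
+- Test case design guide
+- Methods for comparing against the paper's results
+- Difference analysis framework
+- Checklist of common causes of differences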
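+
+One possible shape for the comparison step is a relative-tolerance check per metric. The sketch below is an assumption for illustration only: the `MetricComparison` class, its field names, and the 5% default tolerance are hypothetical and are not specified anywhere in this design.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class MetricComparison:
+    """One reproduced metric compared against the value reported in the paper."""
+
+    name: str
+    paper_value: float
+    reproduced_value: float
+    relative_tolerance: float = 0.05  # hypothetical default, not from the spec
+
+    @property
+    def relative_difference(self) -> float:
+        if self.paper_value == 0:
+            # Relative difference is undefined at zero; fall back to the absolute gap.
+            return abs(self.reproduced_value)
+        return abs(self.reproduced_value - self.paper_value) / abs(self.paper_value)
+
+    @property
+    def within_tolerance(self) -> bool:
+        return self.relative_difference <= self.relative_tolerance
+```
+
+With made-up numbers, `MetricComparison("accuracy", 0.80, 0.78).within_tolerance` holds, while a reproduced value of `0.70` would fall outside the tolerance and be carried into the difference analysis.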
+
+### 5.5 environment-management
+
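+**Purpose**: Guides management of project environments with the hybrid Conda + uv setup
+
+**Content**:
+
+1. **Environment architecture overview**
+
+   - Conda's responsibilities as the system base
+   - uv's responsibilities for project isolation
+   - How the two work together
+
+2. **Conda base environment configuration**
+
+   - ai_base environment creation command
+   - Required system-level dependencies (cuda-toolkit, etc.)
+   - Environment detection script
+
+3. **uv project environment configuration**
+
+   - .venv creation command template
+   - pyproject.toml template
+   - Common ML dependency configurations
+   - PyTorch + CUDA version compatibility
+
+4. **Environment management command list**
+
+   ```bash
+   # Check whether the Conda environment exists
+   conda env list | grep ai_base
+
+   # Create the Conda base environment
+   conda create -n ai_base python=3.10 cuda-toolkit -y
+
+   # Get the path of the Conda Python interpreter
+   conda run -n ai_base which python
+
+   # Create the uv virtual environment
+   uv venv --python $(conda run -n ai_base which python)
+
+   # Activate the environment and install dependencies
+   source .venv/bin/activate  # Linux/Mac
+   .venv\Scripts\activate     # Windows
+   uv pip install -e .
+   ```
+
+5. **pyproject.toml template**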
+   
+   ```toml
+   [project]
+   name = "{paper_name}"
+   version = "0.1.0"
+   requires-python = ">=3.10"
+   dependencies = [
+       "torch>=2.0.0",
+       "numpy>=1.24.0",
+       "matplotlib>=3.7.0",
+       "pytest>=7.0.0",
+   ]
+   
+   [project.optional-dependencies]
+   dev = ["pytest", "black", "ruff"]
+   ```
+
+---
+
+## 6. Commands Specifications
+
+### 6.1 /replicate
+
+**Purpose**: Starts the paper replication workflow with a single command
+
+**Usage**:
+
+```
+/replicate path/to/paper.md
+```
+
+**Behavior**:
+
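+1. Verify that the input file exists
+2. Extract the paper name from the file name
+3. Create the `workspace/{paper_name}/` directory structure
+4. Switch to the paper-director primary agent
+5. Start the full replication workflow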
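+
+Steps 1 to 3 amount to a small amount of path handling. The following is a minimal sketch under stated assumptions: `prepare_workspace` and `WORKSPACE_SUBDIRS` are hypothetical names, the paper name is taken to be the file name without its extension, and the subdirectories are copied from the layout in section 2.2.
+
+```python
+from pathlib import Path
+
+# Per-paper subdirectories, following the file structure in section 2.2.
+WORKSPACE_SUBDIRS = (
+    "analysis",
+    "src/models",
+    "src/training",
+    "src/utils",
+    "tests",
+    "docs",
+    "reports",
+)
+
+
+def prepare_workspace(paper_path: Path, workspace_root: Path) -> Path:
+    """Validate the input file and create workspace/{paper_name}/ with its subdirectories."""
+    if not paper_path.is_file():
+        raise FileNotFoundError(f"Paper file not found: {paper_path}")
+    paper_name = paper_path.stem  # "path/to/paper.md" -> "paper"
+    paper_dir = workspace_root / paper_name
+    for subdir in WORKSPACE_SUBDIRS:
+        # exist_ok makes the call safe to repeat for the same paper.
+        (paper_dir / subdir).mkdir(parents=True, exist_ok=True)
+    return paper_dir
+```
+
+The `.venv/` directory and `pyproject.toml` are deliberately left out here, since the design assigns their creation to `code-writer`.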
+
+### 6.2 /verify
+
+**Purpose**: Verifies previously generated replication code
+
+**Usage**:
+
+```
+/verify workspace/paper_name/
+```
+
+**Behavior**:
+
+1. Check the workspace directory structure
+2. Run all tests
+3. Compare against the paper's results
+4. Generate or update the verification report
+
+---
+
+## 7. Data Flow
+
+### 7.1 Analysis Phase Data Flow
+
+```
+paper.md
+   │
+   ├──────────────────────────────────┐
+   │                                  │
+   ▼                                  ▼
+paper-image-extractor    (waits for image analysis)
+   │                                  │
+   └─> image_understanding.md ────────┘
+                                      │
+                                      ▼
+                              paper-analyzer
+                                      │
+                                      ├─> paper_structure.md
+                                      └─> replication_plan.md
+```
+
+### 7.2 Code Generation Phase Data Flow
+
+```
+paper_structure.md + image_understanding.md
+                    │
+                    ▼
+            paper-director
+           (writes test cases)
+                    │
+                    ▼
+             code-writer
+                    │
+   ┌────────────────┼────────────────┐
+   ▼                ▼                ▼
+src/models/    src/training/    src/utils/
+   │                │                │
+   └────────────────┴────────────────┘
+                    │
+                    ▼
+             Run tests (TDD)
+                    │
+          ┌────────┴────────┐
+          ▼                 ▼
+        Pass              Fail → back to code-writer for fixes
+          │
+          ▼
+      docs/README.md
+```
+
+### 7.3 Verification Phase Data Flow
+
+```
+src/* + tests/* + replication_plan.md
+                    │
+                    ▼
+              test-runner
+                    │
+   ┌────────────────┼────────────────┐
+   ▼                ▼                ▼
+Run tests    Compare results   Analyze diffs
+   │                │                │
+   └────────────────┴────────────────┘
+                    │
+                    ▼
+         replication_report.md
+```
+
+---
+
+## 8. Error Handling
+
+### 8.1 Analysis Phase Errors
+
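+| Error type | Handling |
+| --- | --- |
+| Paper file not found | Ask the user to check the path |
+| Image inaccessible | Mark it as unparseable and continue with the other images |
+| Incomplete parsing results | Show them at the human review checkpoint and ask the user to fill the gaps |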
+
+### 8.2 Code Generation Phase Errors
+
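+| Error type | Handling |
+| --- | --- |
+| Test failure | Iterate on fixes, up to 3 attempts; if still failing, flag for manual intervention and continue with other modules |
+| Missing dependency | Ask the user to install it |
+| Syntax error | Fix automatically |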
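+
+The retry policy for failing tests can be sketched as a bounded loop. This is illustrative only: `fix_until_green` is a hypothetical name, and `run_tests` and `apply_fix` are hypothetical callbacks standing in for a test run and a `code-writer` fix attempt.
+
+```python
+from typing import Callable
+
+MAX_FIX_ATTEMPTS = 3  # retry limit from the table above
+
+
+def fix_until_green(
+    run_tests: Callable[[], bool],
+    apply_fix: Callable[[int], None],
+    max_attempts: int = MAX_FIX_ATTEMPTS,
+) -> str:
+    """Return "passed", or "needs_human" once max_attempts fixes have failed."""
+    if run_tests():
+        return "passed"
+    for attempt in range(1, max_attempts + 1):
+        apply_fix(attempt)  # e.g. hand the failure output back to code-writer
+        if run_tests():
+            return "passed"
+    # Flag the module for manual intervention; the caller moves on to other modules.
+    return "needs_human"
+```
+
+Returning a status instead of raising lets the orchestrator record the module as needing manual intervention and continue, which matches the handling described in the table.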
+
+### 8.3 Verification Phase Errors
+
+| Error type | Handling |
+| --- | --- |
+| Test run failure | Record the error and note it in the report |
+| Large result discrepancy | Analyze the cause and explain it in the report |
+
+---
+
+## 9. Configuration
+
+### 9.1 opencode.json
+
+```json
+{
+  "$schema": "https://opencode.ai/config.json",
+  "default_agent": "paper-director",
+  "agent": {
+    "paper-director": {
+      "mode": "primary"
+    }
+  }
+}
+```
+
+---
+
+## 10. Future Enhancements
+
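+1. **Support more input formats**: direct PDF parsing, arXiv URLs
+2. **Support more frameworks**: TensorFlow, JAX
+3. **Automatic dataset preparation**: download and preprocess common datasets automatically
+4. **Result visualization**: generate comparison charts automatically
+5. **Incremental replication**: support partial replication and resuming from where a run stopped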
+
+---
+
+## Appendix A: Glossary
+
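+| Term | Description |
+| --- | --- |
+| Primary Agent | The main agent that the user interacts with directly |
+| Subagent | A child agent dispatched by the primary agent to perform a specific task |
+| Skill | Domain-specific guidance provided to an agent |
+| TDD | Test-driven development: write the tests first, then the code |
+| Workspace | The directory that holds the replication results for each paper |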
+
+---
+
+## Appendix B: References
+
+- OpenCode Agents Documentation: https://opencode.ai/docs/zh-cn/agents/
+- Superpowers Skills: https://github.com/obra/superpowers
+- PyTorch Documentation: https://pytorch.org/docs/