# Paper Replication Agent Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Build a Paper Replication Agent System that automates ML/DL paper reproduction with PyTorch code generation and TDD-driven validation.

**Architecture:** Primary Agent (paper-director) orchestrates 4 subagents (paper-analyzer, paper-image-extractor, code-writer, test-runner) through file-based context handoff. Skills provide domain-specific guidance. Commands provide entry points.

**Tech Stack:** OpenCode agents (Markdown), Skills (Markdown), Commands (Markdown), JSON config

**Spec:** `docs/superpowers/specs/2026-03-31-paper-replication-agent-design.md`

---

## File Structure

| File | Responsibility | Action |
|------|----------------|--------|
| `.opencode/agents/paper-director.md` | Primary agent - orchestrates workflow, manages checkpoints | Create |
| `.opencode/agents/paper-analyzer.md` | Subagent - parses paper text, creates replication plan | Create |
| `.opencode/agents/paper-image-extractor.md` | Subagent - extracts and understands paper images | Create |
| `.opencode/agents/code-writer.md` | Subagent - generates PyTorch code with TDD | Create |
| `.opencode/agents/test-runner.md` | Subagent - runs tests, creates replication report | Create |
| `.opencode/skills/paper-parsing/SKILL.md` | Skill - paper analysis methodology | Create |
| `.opencode/skills/code-generation/SKILL.md` | Skill - code generation from paper | Create |
| `.opencode/skills/pytorch-patterns/SKILL.md` | Skill - PyTorch best practices | Create |
| `.opencode/skills/verification/SKILL.md` | Skill - result verification methodology | Create |
| `.opencode/skills/environment-management/SKILL.md` | Skill - Conda + uv environment setup | Create |
| `.opencode/commands/replicate.md` | Command - /replicate entry point | Create |
| `.opencode/commands/verify.md` | Command - /verify entry point | Create |
| `opencode.json` | Project configuration | Create |
| `workspace/.gitkeep` | Workspace directory placeholder | Create |

---

### Task 1: Create paper-director Agent (Primary)

**Files:**
- Create: `.opencode/agents/paper-director.md`

- [ ] **Step 1: Create agents directory**

```bash
mkdir -p .opencode/agents
```

- [ ] **Step 2: Write paper-director.md**

Create `.opencode/agents/paper-director.md`:

```markdown
---
name: paper-director
description: |
  Primary agent for ML/DL paper replication. Orchestrates the complete workflow:
  1. Creates workspace directories
  2. Dispatches paper-image-extractor to analyze images
  3. Dispatches paper-analyzer to parse paper and create replication plan
  4. Presents human checkpoint for approval
  5. Generates tests and dispatches code-writer
  6. Dispatches test-runner for final verification
  Use when: User wants to replicate a paper, or runs /replicate command.
mode: primary
model: inherit
---

# Paper Replication Director

You are the orchestrator for ML/DL paper replication projects. Your role is to manage the complete workflow from paper analysis to working PyTorch code.

## Core Responsibilities

1. **Workspace Management**: Create and organize project directories
2. **Workflow Orchestration**: Dispatch subagents in correct sequence
3. **Quality Control**: Ensure outputs meet standards before proceeding
4. **Human Checkpoint**: Present analysis results for user approval
5. **Error Recovery**: Handle failures gracefully

## Workflow

### Phase 1: Paper Analysis

When given a paper (Markdown file or text):

1. **Create workspace directory**:
   ```
   workspace/{paper_name}/
   ├── analysis/
   ├── src/
   │   ├── models/
   │   ├── training/
   │   └── utils/
   ├── tests/
   ├── docs/
   └── reports/
   ```

2. **Dispatch @paper-image-extractor**:
   - Input: Paper file path
   - Output: `analysis/image_understanding.md`
   - Wait for completion before proceeding

3. **Dispatch @paper-analyzer**:
   - Input: Paper file + `analysis/image_understanding.md`
   - Output: `analysis/paper_structure.md` + `analysis/replication_plan.md`
   - Wait for completion before proceeding

4. **Human Checkpoint** - Present to user:
   ```
   ## Paper Analysis Complete

   ### Basic Information
   - Title: {title}
   - Core contribution: {summary}

   ### Model Architecture
   {architecture_description}

   ### Replication Targets
   {list_of_figures_to_replicate}

   ### Implementation Plan
   {planned_modules}

   ### Risks and Limitations
   {identified_risks}

   ---
   Please review and confirm to proceed, or provide corrections.
   ```

### Phase 2: Code Generation (TDD Mode)

After user approval:

1. **Load Skills**:
   - Load `code-generation` skill
   - Load `pytorch-patterns` skill
   - Load `environment-management` skill

2. **Generate Test Cases**:
   - Create test files based on replication plan
   - Tests should verify model architecture, forward pass, loss computation

3. **Dispatch @code-writer** iteratively:
   - For each module in replication plan:
     - Provide: Analysis docs + relevant test files
     - Expect: Implementation that passes tests
   - Iterate until all tests pass (max 3 retries per module)

4. **Generate Documentation**:
   - Create `docs/README.md` with usage instructions

### Phase 3: Verification

1. **Dispatch @test-runner**:
   - Run complete test suite
   - Compare with paper's expected results
   - Generate `reports/replication_report.md`

2. **Present Final Report** to user

## Error Handling

| Error | Action |
|-------|--------|
| Paper file not found | Ask user to provide correct path |
| Image extraction fails | Mark images as "unable to parse", continue |
| Test fails after 3 retries | Mark module as "needs manual intervention", continue with others |
| Missing dependencies | Suggest installation commands |

## Output Format

Always structure your responses clearly:
- Use headers for phases
- Show progress indicators
- Highlight decisions requiring user input
- Summarize completed work before asking for confirmation
```

- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/agents/paper-director.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/agents/paper-director.md
git commit -m "feat(agents): add paper-director primary agent

Orchestrates ML/DL paper replication workflow with human checkpoint."
```

---

### Task 2: Create paper-analyzer Agent (Subagent)

**Files:**
- Create: `.opencode/agents/paper-analyzer.md`

- [ ] **Step 1: Write paper-analyzer.md**

Create `.opencode/agents/paper-analyzer.md`:

```markdown
---
name: paper-analyzer
description: |
  Subagent that parses ML/DL paper text content and creates structured analysis.
  Produces paper_structure.md (what the paper contains) and replication_plan.md (what to implement).
  Requires image_understanding.md as input for complete analysis.
mode: subagent
model: inherit
permission:
  edit: allow
  bash: deny
---

# Paper Analyzer

You analyze ML/DL papers and produce structured documentation for replication.

## Required Inputs

1. **Paper content**: Markdown file or plain text
2. **Image understanding**: `image_understanding.md` from paper-image-extractor

## Required Outputs

### 1. paper_structure.md

```markdown
# Paper Structure Analysis

## Basic Information
- **Title**:
- **Authors**:
- **Year**:
- **Venue**:

## Abstract Summary
{2-3 sentence summary of core contribution}

## Problem Statement
{What problem does this paper solve?}

## Key Contributions
1. {contribution 1}
2. {contribution 2}
...

## Method Overview

### Architecture
{Text description of model architecture}
{Reference to architecture diagrams from image_understanding.md}

### Key Components
| Component | Description | Implementation Priority |
|-----------|-------------|------------------------|
| {name} | {what it does} | {high/medium/low} |

### Mathematical Formulation
{Key equations in LaTeX}

$$
L = L_{task} + \lambda L_{reg}
$$

### Training Details
- **Optimizer**:
- **Learning rate**:
- **Batch size**:
- **Epochs**:
- **Hardware**:

## Experiments

### Datasets
| Dataset | Size | Purpose |
|---------|------|---------|
| {name} | {size} | {train/eval/test} |

### Metrics
- {metric 1}: {description}
- {metric 2}: {description}

### Key Results
{Reference to result figures from image_understanding.md}
{Numerical results to reproduce}

## Appendix Notes
{Any supplementary material findings}
```

### 2. replication_plan.md

```markdown
# Replication Plan

## Scope
{What will be replicated vs. what is out of scope}

## Implementation Order

### Module 1: {name}
- **File**: `src/models/{filename}.py`
- **Dependencies**: None
- **Test file**: `tests/test_{filename}.py`
- **Acceptance criteria**:
  - [ ] Forward pass produces correct output shape
  - [ ] Gradient flow verified
  - [ ] {specific behavior from paper}

### Module 2: {name}
...

## Replication Targets

### Figure X: {description}
- **Type**: {architecture diagram / training curve / comparison table}
- **Data source**: {what computation produces this}
- **Priority**: {high/medium/low}
- **Expected values**: {numerical ranges if applicable}

## Environment Requirements
- Python >= 3.10
- PyTorch >= 2.0
- {other dependencies}

## Estimated Effort
- Core model: {X hours}
- Training pipeline: {X hours}
- Evaluation: {X hours}

## Known Challenges
1. {challenge}: {mitigation strategy}
```

## Analysis Methodology

When analyzing a paper:

1. **First pass**: Extract basic info (title, authors, abstract)
2. **Method pass**: Understand architecture and algorithms
3. **Experiment pass**: Identify what needs to be reproduced
4. **Integration pass**: Combine with image_understanding.md
5. **Planning pass**: Create actionable replication plan

## Quality Checklist

Before completing:
- [ ] All sections of paper_structure.md filled
- [ ] Image descriptions integrated from image_understanding.md
- [ ] Replication plan has clear module boundaries
- [ ] Each module has testable acceptance criteria
- [ ] Dependencies between modules identified
- [ ] Numerical targets extracted where available
```

- [ ] **Step 2: Verify file creation**

```bash
cat .opencode/agents/paper-analyzer.md
```

Expected: File contents match the markdown above.

- [ ] **Step 3: Commit**

```bash
git add .opencode/agents/paper-analyzer.md
git commit -m "feat(agents): add paper-analyzer subagent

Parses paper text and creates replication plan with testable criteria."
```

---

### Task 3: Create paper-image-extractor Agent (Subagent)

**Files:**
- Create: `.opencode/agents/paper-image-extractor.md`

- [ ] **Step 1: Write paper-image-extractor.md**

Create `.opencode/agents/paper-image-extractor.md`:

```markdown
---
name: paper-image-extractor
description: |
  Subagent that extracts and understands images from ML/DL papers.
  Analyzes architecture diagrams, experiment plots, algorithm pseudocode, and equations.
  Output is used by paper-analyzer to create complete replication plan.
mode: subagent
model: inherit
permission:
  edit: allow
  bash:
    "*": deny
    "ls *": allow
---

# Paper Image Extractor

You extract and analyze images from ML/DL papers, producing detailed text descriptions that enable code replication.

## Required Input

- Paper file path (Markdown with image references)

## Required Output

`image_understanding.md` in the analysis directory.

## Output Format

```markdown
# Image Understanding

## Summary
- Total images found: {N}
- Architecture diagrams: {N}
- Experiment figures: {N}
- Algorithm/pseudocode: {N}
- Equations/tables: {N}

---

## Image 1: {caption or identifier}

**Type**: Architecture Diagram | Experiment Plot | Algorithm | Equation | Table | Other

**Location**: {file path or URL}

**Description**:
{Detailed text description of what the image shows}

### For Architecture Diagrams:

**Components**:
| Layer/Block | Input Shape | Output Shape | Parameters |
|-------------|-------------|--------------|------------|
| {name} | {shape} | {shape} | {count if shown} |

**Data Flow**:
1. Input → {first operation}
2. {intermediate steps}
3. → Output

**Key Details**:
- {notable architectural choices}
- {skip connections, attention mechanisms, etc.}

### For Experiment Plots:

**Axes**:
- X-axis: {label} (range: {min}-{max})
- Y-axis: {label} (range: {min}-{max})

**Data Series**:
| Series | Description | Key Points |
|--------|-------------|------------|
| {name/color} | {what it represents} | {peak value, convergence point, etc.} |

**Numerical Extraction**:
- At x={value}: y≈{value}
- Final value: {value}
- Best result: {value}

**Trends**:
- {observed patterns}

### For Algorithm/Pseudocode:

**Algorithm Name**: {name}

**Inputs**: {list}

**Outputs**: {list}

**Steps**:
1. {step 1}
2. {step 2}
...

**Python Translation Hint**:
```python
# Suggested structure
def algorithm_name(inputs):
    # step 1
    # step 2
    return outputs
```

### For Equations:

**Equation**:
$$
{LaTeX representation}
$$

**Variables**:
- {symbol}: {meaning}

**Implementation Notes**:
- {how to compute this in PyTorch}

---

## Image 2: ...
```

## Analysis Guidelines

### Architecture Diagrams
- Identify all layers/blocks and their connections
- Note input/output shapes when visible
- Capture skip connections, residual paths
- Identify attention mechanisms, normalization layers
- Note any dimension annotations

### Experiment Plots
- Extract actual numerical values where possible
- Identify which curve corresponds to the paper's method
- Note baseline comparisons
- Capture convergence behavior
- Identify error bars or confidence intervals

### Algorithm Pseudocode
- Convert to structured steps
- Identify loops, conditions
- Note any hyperparameters mentioned
- Suggest PyTorch equivalents

### Equations
- Transcribe to LaTeX
- Define all variables
- Note how to implement in code

## Replication Priority

Mark each image with replication priority:
- **HIGH**: Core architecture, main results to reproduce
- **MEDIUM**: Training curves, ablation studies
- **LOW**: Conceptual diagrams, background figures

## Quality Checklist

Before completing:
- [ ] All images in paper cataloged
- [ ] Architecture diagrams have layer-by-layer breakdown
- [ ] Experiment figures have numerical values extracted
- [ ] Equations transcribed to LaTeX
- [ ] Replication priorities assigned
- [ ] Output enables paper-analyzer to create complete plan
```

- [ ] **Step 2: Verify file creation**

```bash
cat .opencode/agents/paper-image-extractor.md
```

Expected: File contents match the markdown above.

- [ ] **Step 3: Commit**

```bash
git add .opencode/agents/paper-image-extractor.md
git commit -m "feat(agents): add paper-image-extractor subagent

Analyzes paper images to extract architecture details and numerical results."
```

---

### Task 4: Create code-writer Agent (Subagent)

**Files:**
- Create: `.opencode/agents/code-writer.md`

- [ ] **Step 1: Write code-writer.md**

Create `.opencode/agents/code-writer.md`:

```markdown
---
name: code-writer
description: |
  Subagent that generates PyTorch code based on paper analysis.
  Works in TDD mode: receives test files, writes code to pass tests.
  Also manages project environment using Conda + uv.
mode: subagent
model: inherit
permission:
  edit: allow
  bash:
    "*": allow
---

# Code Writer

You generate PyTorch code to replicate ML/DL papers, working in strict TDD mode.

## Required Inputs

1. `paper_structure.md` - Paper analysis
2. `image_understanding.md` - Image analysis
3. `replication_plan.md` - Implementation plan
4. Test files for the module to implement

## Working Mode: TDD

**Iron Rule**: Write code ONLY to make failing tests pass.

1. Receive test file
2. Run test to verify it fails
3. Write minimal code to pass
4. Run test to verify it passes
5. Refactor if needed (keeping tests green)

## Environment Setup

Before writing any code, ensure environment is ready:

### Step 1: Check/Create Conda Base

```bash
# Check if ai_base exists
conda env list | grep ai_base

# If it does not exist, create it
conda create -n ai_base python=3.10 -y
```

### Step 2: Create Project Environment

```bash
cd workspace/{paper_name}

# Create uv virtual environment using Conda's Python
uv venv --python $(conda run -n ai_base which python)

# On Windows:
# uv venv --python $(conda run -n ai_base python -c "import sys; print(sys.executable)")
```

### Step 3: Create pyproject.toml

```toml
[project]
name = "{paper_name}"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    "torch>=2.0.0",
    "numpy>=1.24.0",
    "matplotlib>=3.7.0",
    "tqdm>=4.65.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.0.0",
    "pytest-cov>=4.0.0",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
```

### Step 4: Install Dependencies

```bash
# Activate and install
source .venv/bin/activate  # Linux/Mac
# .venv\Scripts\activate  # Windows

uv pip install -e ".[dev]"
```

## Code Generation Guidelines

### Model Architecture

```python
"""
{module_name}.py

Implements {component} from "{paper_title}"
Reference: Section {X}, Figure {Y}
"""

import torch
import torch.nn as nn
import torch.nn.functional as F
from typing import Optional, Tuple


class {ComponentName}(nn.Module):
    """
    {Brief description from paper}

    Args:
        {param}: {description}

    Paper reference:
        - Architecture: Figure {X}
        - Equation: ({Y})
    """

    def __init__(self, {params}):
        super().__init__()
        # Initialize layers

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Forward pass.

        Args:
            x: Input tensor of shape {expected_shape}

        Returns:
            Output tensor of shape {output_shape}
        """
        # Implementation
        return output
```

### Training Scripts

```python
"""
train.py

Training script for {paper_title} replication.
"""

import torch
from torch.utils.data import DataLoader
from tqdm import tqdm


def train_epoch(model, dataloader, optimizer, criterion, device):
    """Single training epoch."""
    model.train()
    total_loss = 0.0

    for batch in tqdm(dataloader, desc="Training"):
        # Training step
        pass

    return total_loss / len(dataloader)


def main():
    # Configuration from paper
    config = {
        "lr": 1e-4,  # Section X
        "batch_size": 32,  # Section X
        "epochs": 100,
    }

    # Setup
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Model, optimizer, criterion
    # ...

    # Training loop
    for epoch in range(config["epochs"]):
        loss = train_epoch(model, train_loader, optimizer, criterion, device)
        print(f"Epoch {epoch+1}: Loss = {loss:.4f}")


if __name__ == "__main__":
    main()
```

## File Organization

```
src/
├── __init__.py
├── models/
│   ├── __init__.py
│   ├── {main_model}.py
│   └── {component}.py
├── training/
│   ├── __init__.py
│   ├── train.py
│   ├── losses.py
│   └── optimizers.py
└── utils/
    ├── __init__.py
    ├── data.py
    └── metrics.py
```

## Quality Checklist

Before completing each module:
- [ ] All tests pass
- [ ] Type hints on all public functions
- [ ] Docstrings with paper references
- [ ] Input/output shapes documented
- [ ] No hardcoded magic numbers (use config)
- [ ] Device-agnostic (CPU/GPU)
```

- [ ] **Step 2: Verify file creation**

```bash
cat .opencode/agents/code-writer.md
```

Expected: File contents match the markdown above.

- [ ] **Step 3: Commit**

```bash
git add .opencode/agents/code-writer.md
git commit -m "feat(agents): add code-writer subagent

Generates PyTorch code in TDD mode with environment management."
```

---

### Task 5: Create test-runner Agent (Subagent)

**Files:**
- Create: `.opencode/agents/test-runner.md`

- [ ] **Step 1: Write test-runner.md**

Create `.opencode/agents/test-runner.md`:

```markdown
---
name: test-runner
description: |
  Subagent that runs tests, verifies code correctness, and generates replication reports.
  Compares results with paper's expected values and documents any differences.
mode: subagent
model: inherit
permission:
  edit: allow
  bash:
    "*": allow
---

# Test Runner

You run tests, verify replication correctness, and generate comprehensive reports.

## Required Inputs

1. Generated code in `src/`
2. Test files in `tests/`
3. `replication_plan.md` with expected results

## Required Outputs

1. Test execution results
2. `reports/replication_report.md`

## Workflow

### Step 1: Run Test Suite

```bash
cd workspace/{paper_name}
source .venv/bin/activate

# Run all tests with coverage
pytest tests/ -v --cov=src --cov-report=term-missing
```

### Step 2: Verify Replication Targets

For each target in replication_plan.md:
1. Run the relevant computation
2. Compare with expected values
3. Calculate deviation

### Step 3: Generate Report

## Report Format

```markdown
# Replication Report: {Paper Title}

**Date**: {date}
**Status**: {Complete | Partial | Failed}

## Summary

| Metric | Status |
|--------|--------|
| Tests Passing | {X}/{Y} |
| Code Coverage | {X}% |
| Replication Accuracy | {qualitative} |

## Test Results

### Unit Tests
| Test | Status | Time |
|------|--------|------|
| test_model_forward | PASS | 0.1s |
| test_loss_computation | PASS | 0.05s |
| ... | ... | ... |

### Failed Tests (if any)

#### {test_name}
- **Error**: {error message}
- **Expected**: {expected}
- **Actual**: {actual}
- **Likely cause**: {analysis}

## Replication Targets

### Figure X: {description}

**Status**: Replicated | Partially Replicated | Not Replicated

**Paper Values**:
| Metric | Paper | Ours | Deviation |
|--------|-------|------|-----------|
| {metric} | {value} | {value} | {%} |

**Analysis**:
{explanation of any differences}

### Table Y: {description}
...

## Code Quality

- **Type Safety**: {assessment}
- **Documentation**: {assessment}
- **Test Coverage**: {percentage}

## Reproducibility Checklist

- [ ] Environment setup documented
- [ ] Random seeds set
- [ ] Hyperparameters match paper
- [ ] Data preprocessing matches paper
- [ ] Evaluation metrics match paper

## Known Differences from Paper

1. **{difference}**: {explanation and justification}

## Recommendations

1. {recommendation for improvement}

## Appendix: Full Test Output

```
{pytest output}
```
```

## Deviation Thresholds

| Deviation | Classification |
|-----------|----------------|
| < 1% | Excellent match |
| 1-5% | Acceptable |
| 5-10% | Needs investigation |
| > 10% | Significant difference |

## Analysis Guidelines

When results differ from paper:

1. Check implementation against paper equations
2. Verify hyperparameters
3. Check data preprocessing
4. Consider numerical precision differences
5. Note if paper has known errata

## Quality Checklist

Before completing:
- [ ] All tests executed
- [ ] Coverage report generated
- [ ] Each replication target evaluated
- [ ] Deviations analyzed and explained
- [ ] Recommendations provided
- [ ] Report is self-contained
```

- [ ] **Step 2: Verify file creation**

```bash
cat .opencode/agents/test-runner.md
```

Expected: File contents match the markdown above.

- [ ] **Step 3: Commit**

```bash
git add .opencode/agents/test-runner.md
git commit -m "feat(agents): add test-runner subagent

Runs tests and generates comprehensive replication reports."
```

---

### Task 6: Create paper-parsing Skill

**Files:**
- Create: `.opencode/skills/paper-parsing/SKILL.md`

- [ ] **Step 1: Create skills directory**

```bash
mkdir -p .opencode/skills/paper-parsing
```

- [ ] **Step 2: Write SKILL.md**

Create `.opencode/skills/paper-parsing/SKILL.md`:

```markdown
---
name: paper-parsing
description: Use when analyzing ML/DL papers to ensure comprehensive extraction of all relevant information
---

# Paper Parsing Methodology

## Overview

Systematic approach to parsing ML/DL papers for replication. Emphasizes **completeness** and **openness** to avoid missing critical details.

**Announce at start:** "I'm using the paper-parsing skill to ensure comprehensive paper analysis."

## Core Philosophy

1. **Completeness over speed**: Better to extract too much than miss something
2. **Open-ended discovery**: Papers contain unique insights; don't force into templates
3. **Cross-reference**: Information appears in multiple places; cross-check
4. **Explicit uncertainty**: Mark unclear items rather than guessing

## Paper Sections Checklist

### Abstract
- [ ] Core contribution identified
- [ ] Key results/numbers extracted
- [ ] Problem domain understood

### Introduction
- [ ] Problem motivation clear
- [ ] Gap in existing work identified
- [ ] Proposed solution summarized
- [ ] Claimed contributions listed

### Related Work
- [ ] Key prior methods identified
- [ ] Differences from this work noted
- [ ] Potential baselines for comparison

### Method / Approach
- [ ] Architecture fully described
- [ ] All components identified
- [ ] Mathematical formulation complete
- [ ] Training procedure detailed
- [ ] Loss functions specified
- [ ] Hyperparameters listed

### Experiments
- [ ] Datasets listed with sizes
- [ ] Evaluation metrics defined
- [ ] Baseline comparisons noted
- [ ] Ablation studies cataloged
- [ ] Key numerical results extracted

### Appendix / Supplementary
- [ ] Additional implementation details
- [ ] Extended results
- [ ] Proofs or derivations
- [ ] Code references

## Information Extraction Patterns

### Architecture Details

Look for:
- Layer types and configurations
- Activation functions
- Normalization methods
- Attention mechanisms
- Skip connections
- Input/output dimensions

Common locations:
- Method section figures
- Architecture diagrams
- Table of hyperparameters
- Appendix implementation details

### Training Configuration

| Parameter | Typical Locations |
|-----------|-------------------|
| Learning rate | Experiments, Appendix |
| Batch size | Experiments, Appendix |
| Optimizer | Method, Appendix |
| Epochs | Experiments |
| Hardware | Experiments, Appendix |
| Training time | Experiments |

### Numerical Results

Extract from:
- Main results tables
- Comparison figures
- Ablation tables
- Training curves (approximate values)

Format as:
| Metric | Dataset | Value | Conditions |
|--------|---------|-------|------------|
| Accuracy | CIFAR-10 | 95.2% | ResNet-50 backbone |

## Common Omissions to Watch For

1. **Initialization**: Often in appendix or not mentioned
2. **Data augmentation**: May be standard but unspecified
3. **Early stopping criteria**: Often implied
4. **Evaluation protocol**: Train/val/test split details
5. **Random seeds**: Reproducibility details
6. **Software versions**: PyTorch, CUDA versions

## Quality Verification

Before completing analysis:

1. **Coverage check**: Every section reviewed?
2. **Consistency check**: Numbers match across sections?
3. **Completeness check**: Could someone implement from this?
4. **Ambiguity check**: Unclear items marked?

## Output Quality Markers

Good analysis:
- Specific numbers, not "good performance"
- Exact layer configs, not "standard ResNet"
- Explicit uncertainty markers
- Cross-references between sections

Poor analysis:
- Vague descriptions
- Missing hyperparameters
- No numerical targets
- Assumptions without noting them

## Red Flags

If you notice:
- "Implementation details in code" → Check GitHub link
- "Standard settings" → Look up the standard
- "Following [citation]" → May need to read that paper
- Inconsistent numbers → Note the discrepancy
```

- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/skills/paper-parsing/SKILL.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/skills/paper-parsing/SKILL.md
git commit -m "feat(skills): add paper-parsing skill

Comprehensive methodology for ML/DL paper analysis."
```

---

### Task 7: Create code-generation Skill

**Files:**
- Create: `.opencode/skills/code-generation/SKILL.md`

- [ ] **Step 1: Create skill directory**

```bash
mkdir -p .opencode/skills/code-generation
```

- [ ] **Step 2: Write SKILL.md**

Create `.opencode/skills/code-generation/SKILL.md`:

```markdown
---
name: code-generation
description: Use when generating PyTorch code from paper analysis to ensure correct mapping from paper to code
---

# Code Generation from Papers

## Overview

Guidelines for translating paper descriptions into working PyTorch code.

**Announce at start:** "I'm using the code-generation skill to ensure accurate paper-to-code translation."

## Core Principles

1. **Traceability**: Every code block should reference paper section/equation
2. **Testability**: Write code that can be unit tested
3. **Readability**: Prefer clarity over cleverness
4. **Modularity**: One component per file

## Paper-to-Code Mapping

### Architecture Diagrams → nn.Module

| Diagram Element | PyTorch Equivalent |
|-----------------|-------------------|
| Box/Block | nn.Module subclass |
| Arrow | forward() call chain |
| Split | Multiple outputs / tuple |
| Merge | torch.cat / torch.add |
| Skip connection | Residual addition |

### Equations → Tensor Operations

| Notation | PyTorch |
|----------|---------|
| $Wx + b$ | `nn.Linear(in, out)` |
| $\sigma(x)$ | `torch.sigmoid(x)` or `nn.Sigmoid()` |
| $\text{softmax}(x)$ | `F.softmax(x, dim=-1)` |
| $\|x\|_2$ | `torch.linalg.vector_norm(x, ord=2)` (`torch.norm` is deprecated) |
| $x \odot y$ | `x * y` (element-wise) |
| $x^T y$ | `torch.matmul(x.T, y)` or `x.T @ y` |
| $\sum_i$ | `torch.sum(x, dim=i)` |
| $\mathbb{E}[x]$ | `torch.mean(x)` |

### Loss Functions

| Paper Description | PyTorch |
|-------------------|---------|
| Cross-entropy | `nn.CrossEntropyLoss()` (takes raw logits) |
| MSE / L2 | `nn.MSELoss()` |
| L1 | `nn.L1Loss()` |
| BCE | `nn.BCEWithLogitsLoss()` (takes raw logits, applies the sigmoid internally) |
| KL divergence | `nn.KLDivLoss()` (input must be log-probabilities) |
| Custom | Subclass or functional |

## Code Structure Template

```python
"""
{component_name}.py

Implements {what} from "{paper_title}" ({year})

Paper Reference:
- Section: {section_number}
- Equation: ({equation_number})
- Figure: {figure_number}

Author: Auto-generated for paper replication
"""

import torch
import torch.nn as nn
import torch.nn.functional as F
from typing import Optional, Tuple, List


class {ComponentName}(nn.Module):
    """
    {One-line description}

    From paper: "{exact quote or paraphrase}"

    Args:
        {param1}: {description} (paper: {where specified})
        {param2}: {description}

    Shape:
        - Input: {shape description}
        - Output: {shape description}

    Example:
        >>> layer = {ComponentName}(dim=512)
        >>> x = torch.randn(32, 100, 512)
        >>> out = layer(x)
        >>> out.shape
        torch.Size([32, 100, 512])
    """

    def __init__(
        self,
        {param1}: {type},
        {param2}: {type} = {default},
    ):
        super().__init__()

        # Paper Section X.Y: "{description}"
        self.layer1 = nn.Linear(...)

        # Equation (N): ...
        self.layer2 = nn.LayerNorm(...)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Forward pass implementing Equation (N).

        Args:
            x: Input tensor of shape (batch, seq, dim)

        Returns:
            Output tensor of shape (batch, seq, dim)
        """
        # Step 1: ... (Eq. N, first term)
        h = self.layer1(x)

        # Step 2: ... (Eq. N, second term)
        out = self.layer2(h)

        return out
```

## Common Patterns

### Residual Connection

```python
# Paper: "We add a residual connection"
out = self.sublayer(x) + x
```

### Layer Normalization

```python
# Paper: "Pre-LN Transformer" (normalize inside the residual branch)
x = x + self.attention(self.norm(x))

# Paper: "Post-LN Transformer" (normalize after the residual addition)
x = x + self.attention(x)
x = self.norm(x)
```

### Multi-Head Attention

```python
# Paper: "Standard multi-head attention with h heads"
self.attention = nn.MultiheadAttention(
    embed_dim=d_model,
    num_heads=h,
    dropout=dropout,
    batch_first=True,
)
```

### Custom Activation

```python
# Paper: "We use GELU activation"
x = F.gelu(x)

# Paper: "We use Swish/SiLU activation"
x = F.silu(x)
```

## Handling Ambiguity

When paper is unclear:

1. **Check code repository** if available
2. **Follow common practice** for the architecture type
3. **Document assumption** in code comment
4. **Add TODO** for verification

```python
# TODO: Paper unclear on initialization. Using PyTorch default.
# See: https://github.com/paper/repo for reference implementation
self.linear = nn.Linear(in_dim, out_dim)
```

## Verification Checklist

Before completing a module:

- [ ] All equations implemented
- [ ] Shapes documented and verified
- [ ] Paper references in comments
- [ ] Type hints complete
- [ ] Example in docstring works
- [ ] No hardcoded dimensions (use params)
- [ ] Gradient flow verified (no in-place ops breaking autograd)
```

- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/skills/code-generation/SKILL.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/skills/code-generation/SKILL.md
git commit -m "feat(skills): add code-generation skill

Paper-to-PyTorch code translation guidelines."
```

---

### Task 8: Create pytorch-patterns Skill

**Files:**
- Create: `.opencode/skills/pytorch-patterns/SKILL.md`

- [ ] **Step 1: Create skill directory**

```bash
mkdir -p .opencode/skills/pytorch-patterns
```

- [ ] **Step 2: Write SKILL.md**

Create `.opencode/skills/pytorch-patterns/SKILL.md`:

```markdown
---
name: pytorch-patterns
description: Use when writing PyTorch code to follow best practices and common patterns
---

# PyTorch Best Practices

## Overview

Established patterns for writing clean, efficient, and maintainable PyTorch code.

**Announce at start:** "I'm using the pytorch-patterns skill for best practice code."
## Model Definition

### Basic Module

```python
import torch
import torch.nn as nn
from typing import Optional


class MyModel(nn.Module):
    def __init__(self, config: dict):
        super().__init__()
        self.config = config

        # Define layers
        self.encoder = nn.Linear(config["input_dim"], config["hidden_dim"])
        self.decoder = nn.Linear(config["hidden_dim"], config["output_dim"])

        # Initialize weights
        self._init_weights()

    def _init_weights(self):
        """Initialize weights following paper's specification."""
        for module in self.modules():
            if isinstance(module, nn.Linear):
                nn.init.xavier_uniform_(module.weight)
                if module.bias is not None:
                    nn.init.zeros_(module.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.encoder(x)
        h = torch.relu(h)
        out = self.decoder(h)
        return out
```

### Model with Multiple Outputs

```python
from typing import NamedTuple, Optional


class ModelOutput(NamedTuple):
    logits: torch.Tensor
    hidden_states: torch.Tensor
    attention_weights: Optional[torch.Tensor] = None


class MultiOutputModel(nn.Module):
    def forward(self, x: torch.Tensor) -> ModelOutput:
        # ... computation ...
        return ModelOutput(
            logits=logits,
            hidden_states=hidden,
            attention_weights=attn if self.return_attention else None,
        )
```

## Device Management

### Automatic Device Handling

```python
class DeviceAwareModel(nn.Module):
    @property
    def device(self) -> torch.device:
        """Get model's device from first parameter."""
        return next(self.parameters()).device

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Input automatically on correct device if caller handles it
        # For internal tensors:
        mask = torch.ones(x.size(0), device=self.device)
        return x * mask
```

### Training Script Device Setup

```python
def get_device() -> torch.device:
    """Get best available device."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    elif torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = get_device()
model = MyModel(config).to(device)

# The DataLoader does not move data; transfer each batch explicitly
for batch in dataloader:
    inputs = batch["inputs"].to(device)
    targets = batch["targets"].to(device)
```

## Training Loop

### Standard Pattern

```python
from typing import Optional, Tuple

from torch.utils.data import DataLoader
from tqdm import tqdm


def train_epoch(
    model: nn.Module,
    dataloader: DataLoader,
    optimizer: torch.optim.Optimizer,
    criterion: nn.Module,
    device: torch.device,
    scheduler: Optional[torch.optim.lr_scheduler.LRScheduler] = None,
) -> float:
    """Train for one epoch."""
    model.train()
    total_loss = 0.0
    num_batches = 0

    for batch in tqdm(dataloader, desc="Training"):
        # Move to device
        inputs = batch["inputs"].to(device)
        targets = batch["targets"].to(device)

        # Forward pass
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)

        # Backward pass
        loss.backward()

        # Gradient clipping (if needed)
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

        # Update
        optimizer.step()
        if scheduler is not None:
            # Per-batch schedule; step once per epoch if the paper specifies that
            scheduler.step()

        total_loss += loss.item()
        num_batches += 1

    return total_loss / num_batches


@torch.no_grad()
def evaluate(
    model: nn.Module,
    dataloader: DataLoader,
    criterion: nn.Module,
    device: torch.device,
) -> Tuple[float,
float]:
    """Evaluate model."""
    model.eval()
    total_loss = 0.0
    correct = 0
    total = 0

    for batch in dataloader:
        inputs = batch["inputs"].to(device)
        targets = batch["targets"].to(device)

        outputs = model(inputs)
        loss = criterion(outputs, targets)

        total_loss += loss.item()
        preds = outputs.argmax(dim=-1)
        correct += (preds == targets).sum().item()
        total += targets.size(0)

    return total_loss / len(dataloader), correct / total
```

## Data Loading

### Custom Dataset

```python
from torch.utils.data import Dataset, DataLoader


class PaperDataset(Dataset):
    def __init__(self, data_path: str, transform=None):
        self.data = self._load_data(data_path)
        self.transform = transform

    def _load_data(self, path: str):
        # Load from disk; must return an indexable collection
        raise NotImplementedError

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, idx: int) -> dict:
        item = self.data[idx]
        if self.transform:
            item = self.transform(item)
        return item


def get_dataloader(
    dataset: Dataset,
    batch_size: int,
    shuffle: bool = True,
    num_workers: int = 4,
) -> DataLoader:
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=shuffle,
        num_workers=num_workers,
        pin_memory=True,  # Faster GPU transfer
        drop_last=True,  # Consistent batch sizes (drops the final partial batch)
    )
```

## Checkpointing

### Save and Load

```python
def save_checkpoint(
    model: nn.Module,
    optimizer: torch.optim.Optimizer,
    epoch: int,
    loss: float,
    path: str,
):
    """Save training checkpoint."""
    torch.save({
        "epoch": epoch,
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
        "loss": loss,
    }, path)


def load_checkpoint(
    path: str,
    model: nn.Module,
    optimizer: Optional[torch.optim.Optimizer] = None,
) -> dict:
    """Load training checkpoint."""
    checkpoint = torch.load(path, weights_only=True)
    model.load_state_dict(checkpoint["model_state_dict"])
    if optimizer is not None:
        optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
    return checkpoint
```

## Reproducibility

### Set Seeds

```python
import random
import numpy as np
import torch


def set_seed(seed: int = 42):
    """Set all
    random seeds for reproducibility."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

    # For deterministic behavior (may impact performance)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

## Common Gotchas

### In-place Operations

```python
# BAD: Can break autograd (fails when the original values are needed for backward)
x += 1
x[:, 0] = 0

# GOOD: Creates new tensor
x = x + 1
x = torch.cat([torch.zeros_like(x[:, :1]), x[:, 1:]], dim=1)
```

### Detaching for Metrics

```python
# BAD: Stores tensors (and any graph attached to them) for the whole run
accuracy = (preds == targets).float().mean()
all_accs.append(accuracy)  # Memory grows with every step

# GOOD: Convert to a Python float for logging
accuracy = (preds == targets).float().mean().item()
all_accs.append(accuracy)
```

### Mixed Precision

```python
# Newer PyTorch releases expose the same classes under torch.amp
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()

for batch in dataloader:
    inputs = batch["inputs"].to(device)
    targets = batch["targets"].to(device)

    optimizer.zero_grad()

    with autocast():
        outputs = model(inputs)
        loss = criterion(outputs, targets)

    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```
```

- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/skills/pytorch-patterns/SKILL.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/skills/pytorch-patterns/SKILL.md
git commit -m "feat(skills): add pytorch-patterns skill

PyTorch best practices and common patterns."
```

---

### Task 9: Create verification Skill

**Files:**
- Create: `.opencode/skills/verification/SKILL.md`

- [ ] **Step 1: Create skill directory**

```bash
mkdir -p .opencode/skills/verification
```

- [ ] **Step 2: Write SKILL.md**

Create `.opencode/skills/verification/SKILL.md`:

```markdown
---
name: verification
description: Use when verifying replication results against paper's reported values
---

# Replication Verification

## Overview

Systematic approach to verifying that replicated code produces results matching the original paper.

**Announce at start:** "I'm using the verification skill to validate replication accuracy."
## Verification Levels

### Level 1: Code Correctness
- Unit tests pass
- No runtime errors
- Gradient flow works

### Level 2: Behavioral Match
- Output shapes correct
- Value ranges reasonable
- Edge cases handled

### Level 3: Numerical Match
- Results within tolerance of paper
- Trends match (even if absolute values differ)
- Statistical significance considered

## Test Design for Replication

### Shape Tests

```python
def test_model_output_shape():
    """Verify model produces correct output shape per paper."""
    model = MyModel(config)
    x = torch.randn(batch_size, seq_len, input_dim)
    out = model(x)

    # Paper Section 3.2: "Output dimension is 512"
    assert out.shape == (batch_size, seq_len, 512)
```

### Value Range Tests

```python
def test_attention_weights_sum():
    """Attention weights should sum to 1 (paper Eq. 3)."""
    model = AttentionLayer(config)
    x = torch.randn(batch_size, seq_len, dim)
    _, attn_weights = model(x, return_attention=True)

    # Softmax output sums to 1
    assert torch.allclose(attn_weights.sum(dim=-1), torch.ones(batch_size, seq_len))
```

### Gradient Tests

```python
def test_gradient_flow():
    """Verify gradients flow through all parameters."""
    model = MyModel(config)
    x = torch.randn(batch_size, input_dim, requires_grad=True)
    out = model(x)
    loss = out.sum()
    loss.backward()

    for name, param in model.named_parameters():
        assert param.grad is not None, f"No gradient for {name}"
        assert not torch.isnan(param.grad).any(), f"NaN gradient for {name}"
```

### Numerical Match Tests

```python
def test_loss_value_reasonable():
    """Loss should be in expected range per paper Figure 2."""
    model = MyModel(config)
    # ... setup ...
    loss = compute_loss(model, data)

    # Paper reports initial loss ~2.3 (cross-entropy on 10 classes)
    assert 2.0 < loss.item() < 3.0, f"Initial loss {loss.item()} outside expected range"
```

## Comparison Methodology

### Absolute Comparison

```python
def compare_absolute(paper_value: float, our_value: float, tolerance: float = 0.01):
    """Compare with absolute tolerance."""
    diff = abs(paper_value - our_value)
    return diff <= tolerance, diff
```

### Relative Comparison

```python
def compare_relative(paper_value: float, our_value: float, tolerance: float = 0.05):
    """Compare with relative tolerance (5% default)."""
    if paper_value == 0:
        return our_value == 0, abs(our_value)
    relative_diff = abs(paper_value - our_value) / abs(paper_value)
    return relative_diff <= tolerance, relative_diff
```

### Statistical Comparison

```python
from statistics import NormalDist
from typing import List

import numpy as np


def compare_with_variance(
    paper_mean: float,
    paper_std: float,
    our_values: List[float],
    confidence: float = 0.95,
):
    """Compare considering paper's reported variance."""
    our_mean = np.mean(our_values)
    our_std = np.std(our_values)

    # Two-sided z threshold for the requested confidence (about 1.96 at 0.95,
    # i.e. roughly "within 2 standard deviations")
    z_threshold = NormalDist().inv_cdf(0.5 + confidence / 2)

    combined_std = np.sqrt(paper_std**2 + our_std**2)
    if combined_std == 0:
        # No variance on either side: only an exact match passes
        matches = bool(our_mean == paper_mean)
        return matches, 0.0 if matches else float("inf")

    z_score = abs(paper_mean - our_mean) / combined_std
    return bool(z_score < z_threshold), float(z_score)
```

## Common Difference Sources

### Acceptable Differences

| Source | Typical Impact | Mitigation |
|--------|---------------|------------|
| Random seed | 1-2% | Run multiple seeds |
| Floating point | < 0.1% | Use float64 for verification |
| Framework differences | 1-3% | Document and accept |
| Hardware differences | 0.5-1% | Note in report |

### Concerning Differences

| Source | Typical Impact | Action |
|--------|---------------|--------|
| Wrong architecture | > 10% | Review code vs paper |
| Wrong hyperparameters | 5-20% | Verify all settings |
| Data preprocessing | Variable | Match paper exactly |
| Evaluation protocol | Variable | Check train/val/test split |

## Verification Checklist

### Before Comparison
- [ ] Seeds set for
reproducibility
- [ ] Same evaluation data as paper
- [ ] Same preprocessing pipeline
- [ ] Same evaluation metrics

### During Comparison
- [ ] Run multiple times with different seeds
- [ ] Record mean and standard deviation
- [ ] Compare trends, not just final values
- [ ] Check intermediate checkpoints if available

### After Comparison
- [ ] Document all differences
- [ ] Explain likely causes
- [ ] Determine if differences are acceptable
- [ ] Suggest improvements if needed

## Report Template

```markdown
## Verification Result: {Metric Name}

**Paper Value**: {value} ± {std}
**Our Value**: {value} ± {std}
**Difference**: {absolute} ({relative}%)

**Status**: MATCH | ACCEPTABLE | INVESTIGATE | MISMATCH

**Analysis**:
{explanation of difference}

**Confidence**: {HIGH | MEDIUM | LOW}
{reasoning for confidence level}
```
```

- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/skills/verification/SKILL.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/skills/verification/SKILL.md
git commit -m "feat(skills): add verification skill

Replication result verification methodology."
```

---

### Task 10: Create environment-management Skill

**Files:**
- Create: `.opencode/skills/environment-management/SKILL.md`

- [ ] **Step 1: Create skill directory**

```bash
mkdir -p .opencode/skills/environment-management
```

- [ ] **Step 2: Write SKILL.md**

Create `.opencode/skills/environment-management/SKILL.md`:

```markdown
---
name: environment-management
description: Use when setting up Python environment for ML/DL paper replication using Conda + uv
---

# Environment Management (Conda + uv)

## Overview

Hybrid approach using Conda for system-level dependencies and uv for project isolation.

**Announce at start:** "I'm using the environment-management skill for Conda + uv setup."
## Architecture

```
┌─────────────────────────────────────────┐
│           Conda (System Base)           │
│  - Python interpreter                   │
│  - CUDA toolkit                         │
│  - System-level C++ libraries           │
└─────────────────────────────────────────┘
                    │
                    │ provides Python
                    ▼
┌─────────────────────────────────────────┐
│         uv (Project Isolation)          │
│  - Per-project .venv                    │
│  - Fast dependency resolution           │
│  - Reproducible installs                │
└─────────────────────────────────────────┘
```

## Setup Commands

### Step 1: Conda Base Environment

Check if base exists:

```bash
conda env list | grep ai_base
```

Create if needed:

```bash
# Linux/Mac
conda create -n ai_base python=3.10 cuda-toolkit=11.8 -y

# Windows (CUDA from NVIDIA, not conda)
conda create -n ai_base python=3.10 -y
```

### Step 2: Project Environment

```bash
cd workspace/{paper_name}

# Get Conda Python path
# Linux/Mac:
PYTHON_PATH=$(conda run -n ai_base which python)
# Windows:
# PYTHON_PATH=$(conda run -n ai_base python -c "import sys; print(sys.executable)")

# Create uv venv (quoted so paths with spaces work)
uv venv --python "$PYTHON_PATH"
```

### Step 3: Activate and Install

```bash
# Linux/Mac
source .venv/bin/activate

# Windows
.venv\Scripts\activate

# Install dependencies
uv pip install -e ".[dev]"
```

## pyproject.toml Template

```toml
[project]
name = "{paper_name}"
version = "0.1.0"
description = "Replication of {paper_title}"
requires-python = ">=3.10"
dependencies = [
    # Core ML
    "torch>=2.0.0",
    "numpy>=1.24.0",

    # Visualization
    "matplotlib>=3.7.0",
    "seaborn>=0.12.0",

    # Utilities
    "tqdm>=4.65.0",
    "pyyaml>=6.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.0.0",
    "pytest-cov>=4.0.0",
    "black>=23.0.0",
    "ruff>=0.0.260",
]

# Add based on paper requirements
vision = [
    "torchvision>=0.15.0",
    "pillow>=9.5.0",
]

nlp = [
    "transformers>=4.30.0",
    "tokenizers>=0.13.0",
    "datasets>=2.12.0",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
addopts = "-v --tb=short"

[tool.black]
line-length = 88
target-version = ["py310"]

[tool.ruff]
line-length = 88
select = ["E", "F", "I", "N", "W"]
```

## PyTorch + CUDA Compatibility

| CUDA Version | PyTorch Version | Install Command |
|--------------|-----------------|-----------------|
| 11.8 | 2.0+ | `uv pip install torch --index-url https://download.pytorch.org/whl/cu118` |
| 12.1 | 2.1+ | `uv pip install torch --index-url https://download.pytorch.org/whl/cu121` |
| CPU only | Any | `uv pip install torch --index-url https://download.pytorch.org/whl/cpu` |

## Environment Verification

```bash
# Check Python
python --version

# Check PyTorch
python -c "import torch; print(f'PyTorch {torch.__version__}')"

# Check CUDA
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
python -c "import torch; print(f'CUDA version: {torch.version.cuda}')"

# Check GPU
python -c "import torch; print(f'GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else \"N/A\"}')"
```

## Troubleshooting

### CUDA Not Found

```bash
# Check NVIDIA driver
nvidia-smi

# Reinstall PyTorch with correct CUDA
uv pip install torch --index-url https://download.pytorch.org/whl/cu118 --force-reinstall
```

### Dependency Conflicts

```bash
# Clear cache and reinstall
uv cache clean
uv pip install -e ".[dev]" --force-reinstall
```

### Permission Errors (Windows)

```powershell
# Run as Administrator or:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```

## Best Practices

1. **One environment per paper**: Don't mix dependencies
2. **Pin versions in pyproject.toml**: For reproducibility
3. **Use dev dependencies**: Keep test tools separate
4. **Document CUDA version**: In README.md
5.
**Commit pyproject.toml**: Not .venv/

## Quick Reference

```bash
# Full setup sequence (Linux/Mac)
conda activate ai_base || conda create -n ai_base python=3.10 cuda-toolkit=11.8 -y && conda activate ai_base
cd workspace/{paper_name}
uv venv --python "$(which python)"
source .venv/bin/activate
uv pip install -e ".[dev]"
pytest tests/ -v
```
```

- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/skills/environment-management/SKILL.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/skills/environment-management/SKILL.md
git commit -m "feat(skills): add environment-management skill

Conda + uv hybrid environment setup for ML projects."
```

---

### Task 11: Create /replicate Command

**Files:**
- Create: `.opencode/commands/replicate.md`

- [ ] **Step 1: Create commands directory**

```bash
mkdir -p .opencode/commands
```

- [ ] **Step 2: Write replicate.md**

Create `.opencode/commands/replicate.md`:

```markdown
---
description: Start paper replication workflow
agent: paper-director
---

Start the paper replication workflow for the specified paper.

## Input

Paper file: $ARGUMENTS

If no file specified, ask the user to provide the path to a paper (Markdown file or paste text directly).

## Workflow

1. Validate paper file exists (if path provided)
2. Extract paper name from filename or ask user
3. Create workspace directory: `workspace/{paper_name}/`
4. Begin Phase 1: Paper Analysis
   - Dispatch @paper-image-extractor
   - Dispatch @paper-analyzer
5. Present Human Checkpoint with analysis summary
6. After approval, begin Phase 2: Code Generation (TDD)
7. Begin Phase 3: Verification
8. Present final replication report

## Example Usage

```
/replicate workspace/attention_is_all_you_need.md
```

Or without arguments:

```
/replicate
> Please provide the path to your paper or paste the content directly.
```
```

- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/commands/replicate.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/commands/replicate.md
git commit -m "feat(commands): add /replicate command

Entry point for paper replication workflow."
```

---

### Task 12: Create /verify Command

**Files:**
- Create: `.opencode/commands/verify.md`

- [ ] **Step 1: Write verify.md**

Create `.opencode/commands/verify.md`:

```markdown
---
description: Verify replication results for a completed project
agent: paper-director
---

Verify the replication results for an existing project.

## Input

Project directory: $ARGUMENTS

If no directory specified, list available projects in workspace/ and ask user to select.

## Workflow

1. Validate project directory exists
2. Check required files exist:
   - `analysis/paper_structure.md`
   - `analysis/replication_plan.md`
   - `src/` with code
   - `tests/` with tests
3. Dispatch @test-runner to:
   - Run test suite
   - Compare results with paper
   - Generate/update `reports/replication_report.md`
4. Present verification summary

## Example Usage

```
/verify workspace/attention_is_all_you_need/
```

Or without arguments:

```
/verify
> Available projects:
> 1. attention_is_all_you_need
> 2. resnet
> Please select a project to verify.
```
```

- [ ] **Step 2: Verify file creation**

```bash
cat .opencode/commands/verify.md
```

Expected: File contents match the markdown above.

- [ ] **Step 3: Commit**

```bash
git add .opencode/commands/verify.md
git commit -m "feat(commands): add /verify command

Entry point for verification of existing replication projects."
```

---

### Task 13: Create opencode.json Configuration

**Files:**
- Create: `opencode.json`

- [ ] **Step 1: Write opencode.json**

Create `opencode.json`:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "default_agent": "paper-director",
  "agent": {
    "paper-director": {
      "mode": "primary"
    },
    "paper-analyzer": {
      "mode": "subagent"
    },
    "paper-image-extractor": {
      "mode": "subagent"
    },
    "code-writer": {
      "mode": "subagent"
    },
    "test-runner": {
      "mode": "subagent"
    }
  }
}
```

- [ ] **Step 2: Verify file creation**

```bash
cat opencode.json
```

Expected: Valid JSON matching above.

- [ ] **Step 3: Commit**

```bash
git add opencode.json
git commit -m "feat: add opencode.json project configuration

Sets paper-director as default agent with subagent definitions."
```

---

### Task 14: Create Workspace Directory

**Files:**
- Create: `workspace/.gitkeep`

- [ ] **Step 1: Create workspace directory**

```bash
mkdir -p workspace
```

- [ ] **Step 2: Create .gitkeep**

```bash
touch workspace/.gitkeep
```

Or on Windows:

```powershell
New-Item -ItemType File -Path workspace/.gitkeep -Force
```

- [ ] **Step 3: Verify directory creation**

```bash
ls -la workspace/
```

Expected: Directory exists with .gitkeep file.

- [ ] **Step 4: Commit**

```bash
git add workspace/.gitkeep
git commit -m "feat: add workspace directory for paper replication projects

Papers placed here will be processed by the replication agents."
```

---

### Task 15: Final Verification

**Files:**
- Read: All created files

- [ ] **Step 1: Verify directory structure**

```bash
find .opencode -type f -name "*.md" | sort
```

Expected output:

```
.opencode/agents/code-writer.md
.opencode/agents/paper-analyzer.md
.opencode/agents/paper-director.md
.opencode/agents/paper-image-extractor.md
.opencode/agents/test-runner.md
.opencode/commands/replicate.md
.opencode/commands/verify.md
.opencode/skills/code-generation/SKILL.md
.opencode/skills/environment-management/SKILL.md
.opencode/skills/paper-parsing/SKILL.md
.opencode/skills/pytorch-patterns/SKILL.md
.opencode/skills/verification/SKILL.md
```

- [ ] **Step 2: Verify opencode.json**

```bash
cat opencode.json | python -m json.tool
```

Expected: Valid JSON output, no errors.

- [ ] **Step 3: Verify workspace exists**

```bash
ls workspace/
```

Expected: .gitkeep file present.

- [ ] **Step 4: Run OpenCode to verify agents load**

```bash
opencode --help
```

Then in OpenCode:

```
/help
```

Verify that `/replicate` and `/verify` commands appear.

- [ ] **Step 5: Test agent switching**

In OpenCode, press Tab to cycle agents. Verify `paper-director` is available.

- [ ] **Step 6: Test subagent mention**

```
@paper-analyzer Can you help me?
```

Verify subagent responds.

- [ ] **Step 7: Final commit summary**

```bash
git log --oneline -15
```

Expected: All feature commits present.

---

## Self-Review Checklist

- [x] **Spec coverage**: All 5 agents, 5 skills, 2 commands, config file defined
- [x] **No placeholders**: All code blocks complete
- [x] **Consistent naming**: Agent/skill names match throughout
- [x] **File paths exact**: All paths specified completely
- [x] **Commits granular**: Each task has a commit step
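
The structure checks in Task 15 can also be scripted. The following is a minimal sketch, assuming it is run from the repository root; the `EXPECTED_FILES` list and the `find_missing` helper are hypothetical names introduced here for illustration and are not part of the spec. The paths are copied from the File Structure table at the top of this plan.

```python
from pathlib import Path
from typing import List

# Paths copied from the File Structure table of this plan.
EXPECTED_FILES: List[str] = [
    ".opencode/agents/paper-director.md",
    ".opencode/agents/paper-analyzer.md",
    ".opencode/agents/paper-image-extractor.md",
    ".opencode/agents/code-writer.md",
    ".opencode/agents/test-runner.md",
    ".opencode/skills/paper-parsing/SKILL.md",
    ".opencode/skills/code-generation/SKILL.md",
    ".opencode/skills/pytorch-patterns/SKILL.md",
    ".opencode/skills/verification/SKILL.md",
    ".opencode/skills/environment-management/SKILL.md",
    ".opencode/commands/replicate.md",
    ".opencode/commands/verify.md",
    "opencode.json",
    "workspace/.gitkeep",
]


def find_missing(root: str = ".") -> List[str]:
    """Return the expected paths that are not regular files under root."""
    base = Path(root)
    return [rel for rel in EXPECTED_FILES if not (base / rel).is_file()]


if __name__ == "__main__":
    # Hypothetical helper script: report what is missing, then a summary line.
    missing = find_missing()
    for rel in missing:
        print(f"MISSING: {rel}")
    present = len(EXPECTED_FILES) - len(missing)
    print(f"{present}/{len(EXPECTED_FILES)} expected files present")
```

This only confirms that the files exist; it does not check their contents, so the `cat` comparisons in each task's verify step still apply.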