- Design spec: docs/superpowers/specs/2026-03-31-paper-replication-agent-design.md
- Implementation plan: docs/superpowers/plans/2026-03-31-paper-replication-agent.md
- Existing agent: .opencode/agents/paper-image-extractor.md
# Paper Replication Agent Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Build a Paper Replication Agent System that automates ML/DL paper reproduction with PyTorch code generation and TDD-driven validation.

**Architecture:** Primary Agent (paper-director) orchestrates 4 subagents (paper-analyzer, paper-image-extractor, code-writer, test-runner) through file-based context handoff. Skills provide domain-specific guidance. Commands provide entry points.

**Tech Stack:** OpenCode agents (Markdown), Skills (Markdown), Commands (Markdown), JSON config

**Spec:** `docs/superpowers/specs/2026-03-31-paper-replication-agent-design.md`

---

## File Structure

| File | Responsibility | Action |
|------|----------------|--------|
| `.opencode/agents/paper-director.md` | Primary agent - orchestrates workflow, manages checkpoints | Create |
| `.opencode/agents/paper-analyzer.md` | Subagent - parses paper text, creates replication plan | Create |
| `.opencode/agents/paper-image-extractor.md` | Subagent - extracts and understands paper images | Create |
| `.opencode/agents/code-writer.md` | Subagent - generates PyTorch code with TDD | Create |
| `.opencode/agents/test-runner.md` | Subagent - runs tests, creates replication report | Create |
| `.opencode/skills/paper-parsing/SKILL.md` | Skill - paper analysis methodology | Create |
| `.opencode/skills/code-generation/SKILL.md` | Skill - code generation from paper | Create |
| `.opencode/skills/pytorch-patterns/SKILL.md` | Skill - PyTorch best practices | Create |
| `.opencode/skills/verification/SKILL.md` | Skill - result verification methodology | Create |
| `.opencode/skills/environment-management/SKILL.md` | Skill - Conda + uv environment setup | Create |
| `.opencode/commands/replicate.md` | Command - /replicate entry point | Create |
| `.opencode/commands/verify.md` | Command - /verify entry point | Create |
| `opencode.json` | Project configuration | Create |
| `workspace/.gitkeep` | Workspace directory placeholder | Create |

---

### Task 1: Create paper-director Agent (Primary)

**Files:**
- Create: `.opencode/agents/paper-director.md`

- [ ] **Step 1: Create agents directory**

```bash
mkdir -p .opencode/agents
```

- [ ] **Step 2: Write paper-director.md**

Create `.opencode/agents/paper-director.md`:

```markdown
---
name: paper-director
description: |
  Primary agent for ML/DL paper replication. Orchestrates the complete workflow:
  1. Creates workspace directories
  2. Dispatches paper-image-extractor to analyze images
  3. Dispatches paper-analyzer to parse paper and create replication plan
  4. Presents human checkpoint for approval
  5. Generates tests and dispatches code-writer
  6. Dispatches test-runner for final verification
  Use when: User wants to replicate a paper, or runs /replicate command.
mode: primary
model: inherit
---

# Paper Replication Director

You are the orchestrator for ML/DL paper replication projects. Your role is to manage the complete workflow from paper analysis to working PyTorch code.

## Core Responsibilities

1. **Workspace Management**: Create and organize project directories
2. **Workflow Orchestration**: Dispatch subagents in correct sequence
3. **Quality Control**: Ensure outputs meet standards before proceeding
4. **Human Checkpoint**: Present analysis results for user approval
5. **Error Recovery**: Handle failures gracefully

## Workflow

### Phase 1: Paper Analysis

When given a paper (Markdown file or text):

1. **Create workspace directory**:
   ```
   workspace/{paper_name}/
   ├── analysis/
   ├── src/
   │   ├── models/
   │   ├── training/
   │   └── utils/
   ├── tests/
   ├── docs/
   └── reports/
   ```

2. **Dispatch @paper-image-extractor**:
   - Input: Paper file path
   - Output: `analysis/image_understanding.md`
   - Wait for completion before proceeding

3. **Dispatch @paper-analyzer**:
   - Input: Paper file + `analysis/image_understanding.md`
   - Output: `analysis/paper_structure.md` + `analysis/replication_plan.md`
   - Wait for completion before proceeding

4. **Human Checkpoint** - Present to user:
   ```
   ## Paper Analysis Complete

   ### Basic Information
   - Title: {title}
   - Core contribution: {summary}

   ### Model Architecture
   {architecture_description}

   ### Replication Targets
   {list_of_figures_to_replicate}

   ### Implementation Plan
   {planned_modules}

   ### Risks and Limitations
   {identified_risks}

   ---
   Please review and confirm to proceed, or provide corrections.
   ```

### Phase 2: Code Generation (TDD Mode)

After user approval:

1. **Load Skills**:
   - Load `code-generation` skill
   - Load `pytorch-patterns` skill
   - Load `environment-management` skill

2. **Generate Test Cases**:
   - Create test files based on replication plan
   - Tests should verify model architecture, forward pass, loss computation

3. **Dispatch @code-writer** iteratively:
   - For each module in replication plan:
     - Provide: Analysis docs + relevant test files
     - Expect: Implementation that passes tests
     - Iterate until all tests pass (max 3 retries per module)

4. **Generate Documentation**:
   - Create `docs/README.md` with usage instructions

### Phase 3: Verification

1. **Dispatch @test-runner**:
   - Run complete test suite
   - Compare with paper's expected results
   - Generate `reports/replication_report.md`

2. **Present Final Report** to user

## Error Handling

| Error | Action |
|-------|--------|
| Paper file not found | Ask user to provide correct path |
| Image extraction fails | Mark images as "unable to parse", continue |
| Test fails after 3 retries | Mark module as "needs manual intervention", continue with others |
| Missing dependencies | Suggest installation commands |

## Output Format

Always structure your responses clearly:
- Use headers for phases
- Show progress indicators
- Highlight decisions requiring user input
- Summarize completed work before asking for confirmation
```

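For reference, the workspace layout that Phase 1 of the agent file asks for could be created with a few lines of Python. This is only a sketch: `create_workspace`, `WORKSPACE_SUBDIRS`, and the `root` parameter are hypothetical names introduced here, and the plan itself leaves the mechanism to the agent.

```python
from pathlib import Path

# Subdirectories from the Phase 1 workspace tree, relative to workspace/{paper_name}/
WORKSPACE_SUBDIRS = (
    "analysis",
    "src/models",
    "src/training",
    "src/utils",
    "tests",
    "docs",
    "reports",
)


def create_workspace(paper_name: str, root: Path = Path("workspace")) -> Path:
    """Create the per-paper directory tree and return its base path.

    Hypothetical helper for illustration; not part of the agent definition.
    """
    base = root / paper_name
    for subdir in WORKSPACE_SUBDIRS:
        # parents=True creates src/ implicitly; exist_ok=True makes reruns safe
        (base / subdir).mkdir(parents=True, exist_ok=True)
    return base
```

Because `exist_ok=True` is set, calling the helper again for the same paper leaves an existing workspace untouched, which suits a workflow that may be resumed after the human checkpoint.
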
- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/agents/paper-director.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/agents/paper-director.md
git commit -m "feat(agents): add paper-director primary agent

Orchestrates ML/DL paper replication workflow with human checkpoint."
```

---

### Task 2: Create paper-analyzer Agent (Subagent)

**Files:**
- Create: `.opencode/agents/paper-analyzer.md`

- [ ] **Step 1: Write paper-analyzer.md**

Create `.opencode/agents/paper-analyzer.md`:

```markdown
---
name: paper-analyzer
description: |
  Subagent that parses ML/DL paper text content and creates structured analysis.
  Produces paper_structure.md (what the paper contains) and replication_plan.md (what to implement).
  Requires image_understanding.md as input for complete analysis.
mode: subagent
model: inherit
permission:
  edit: allow
  bash: deny
---

# Paper Analyzer

You analyze ML/DL papers and produce structured documentation for replication.

## Required Inputs

1. **Paper content**: Markdown file or plain text
2. **Image understanding**: `image_understanding.md` from paper-image-extractor

## Required Outputs

### 1. paper_structure.md

```markdown
# Paper Structure Analysis

## Basic Information
- **Title**:
- **Authors**:
- **Year**:
- **Venue**:

## Abstract Summary
{2-3 sentence summary of core contribution}

## Problem Statement
{What problem does this paper solve?}

## Key Contributions
1. {contribution 1}
2. {contribution 2}
...

## Method Overview

### Architecture
{Text description of model architecture}
{Reference to architecture diagrams from image_understanding.md}

### Key Components
| Component | Description | Implementation Priority |
|-----------|-------------|------------------------|
| {name} | {what it does} | {high/medium/low} |

### Mathematical Formulation
{Key equations in LaTeX}

$$
L = L_{task} + \lambda L_{reg}
$$

### Training Details
- **Optimizer**:
- **Learning rate**:
- **Batch size**:
- **Epochs**:
- **Hardware**:

## Experiments

### Datasets
| Dataset | Size | Purpose |
|---------|------|---------|
| {name} | {size} | {train/eval/test} |

### Metrics
- {metric 1}: {description}
- {metric 2}: {description}

### Key Results
{Reference to result figures from image_understanding.md}
{Numerical results to reproduce}

## Appendix Notes
{Any supplementary material findings}
```

### 2. replication_plan.md

```markdown
# Replication Plan

## Scope
{What will be replicated vs. what is out of scope}

## Implementation Order

### Module 1: {name}
- **File**: `src/models/{filename}.py`
- **Dependencies**: None
- **Test file**: `tests/test_{filename}.py`
- **Acceptance criteria**:
  - [ ] Forward pass produces correct output shape
  - [ ] Gradient flow verified
  - [ ] {specific behavior from paper}

### Module 2: {name}
...

## Replication Targets

### Figure X: {description}
- **Type**: {architecture diagram / training curve / comparison table}
- **Data source**: {what computation produces this}
- **Priority**: {high/medium/low}
- **Expected values**: {numerical ranges if applicable}

## Environment Requirements
- Python >= 3.10
- PyTorch >= 2.0
- {other dependencies}

## Estimated Effort
- Core model: {X hours}
- Training pipeline: {X hours}
- Evaluation: {X hours}

## Known Challenges
1. {challenge}: {mitigation strategy}
```

## Analysis Methodology

When analyzing a paper:

1. **First pass**: Extract basic info (title, authors, abstract)
2. **Method pass**: Understand architecture and algorithms
3. **Experiment pass**: Identify what needs to be reproduced
4. **Integration pass**: Combine with image_understanding.md
5. **Planning pass**: Create actionable replication plan

## Quality Checklist

Before completing:
- [ ] All sections of paper_structure.md filled
- [ ] Image descriptions integrated from image_understanding.md
- [ ] Replication plan has clear module boundaries
- [ ] Each module has testable acceptance criteria
- [ ] Dependencies between modules identified
- [ ] Numerical targets extracted where available
```

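The "Implementation Order" section of `replication_plan.md` lists a **Dependencies** field per module, so a valid build order can be derived mechanically. The sketch below shows one way to do that with the standard library's `graphlib`. The module names and the `implementation_order` helper are hypothetical, and the plan does not prescribe this approach.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical modules: each key maps to the set of modules it depends on
module_dependencies = {
    "attention": set(),
    "encoder_block": {"attention"},
    "model": {"encoder_block"},
    "loss": set(),
    "train": {"model", "loss"},
}


def implementation_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return module names so that every module follows its dependencies.

    Raises graphlib.CycleError if the plan contains a circular dependency.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = implementation_order(module_dependencies)
print(order)
```

A `CycleError` here would be a useful early signal that module boundaries in the plan need to be redrawn before any tests are generated.
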
- [ ] **Step 2: Verify file creation**

```bash
cat .opencode/agents/paper-analyzer.md
```

Expected: File contents match the markdown above.

- [ ] **Step 3: Commit**

```bash
git add .opencode/agents/paper-analyzer.md
git commit -m "feat(agents): add paper-analyzer subagent

Parses paper text and creates replication plan with testable criteria."
```

---

### Task 3: Create paper-image-extractor Agent (Subagent)

**Files:**
- Create: `.opencode/agents/paper-image-extractor.md`

- [ ] **Step 1: Write paper-image-extractor.md**

Create `.opencode/agents/paper-image-extractor.md`:

```markdown
---
name: paper-image-extractor
description: |
  Subagent that extracts and understands images from ML/DL papers.
  Analyzes architecture diagrams, experiment plots, algorithm pseudocode, and equations.
  Output is used by paper-analyzer to create complete replication plan.
mode: subagent
model: inherit
permission:
  edit: allow
  bash:
    "*": deny
    "ls *": allow
---

# Paper Image Extractor

You extract and analyze images from ML/DL papers, producing detailed text descriptions that enable code replication.

## Required Input

- Paper file path (Markdown with image references)

## Required Output

`image_understanding.md` in the analysis directory.

## Output Format

```markdown
# Image Understanding

## Summary
- Total images found: {N}
- Architecture diagrams: {N}
- Experiment figures: {N}
- Algorithm/pseudocode: {N}
- Equations/tables: {N}

---

## Image 1: {caption or identifier}

**Type**: Architecture Diagram | Experiment Plot | Algorithm | Equation | Table | Other

**Location**: {file path or URL}

**Description**:
{Detailed text description of what the image shows}

### For Architecture Diagrams:

**Components**:
| Layer/Block | Input Shape | Output Shape | Parameters |
|-------------|-------------|--------------|------------|
| {name} | {shape} | {shape} | {count if shown} |

**Data Flow**:
1. Input → {first operation}
2. {intermediate steps}
3. → Output

**Key Details**:
- {notable architectural choices}
- {skip connections, attention mechanisms, etc.}

### For Experiment Plots:

**Axes**:
- X-axis: {label} (range: {min}-{max})
- Y-axis: {label} (range: {min}-{max})

**Data Series**:
| Series | Description | Key Points |
|--------|-------------|------------|
| {name/color} | {what it represents} | {peak value, convergence point, etc.} |

**Numerical Extraction**:
- At x={value}: y≈{value}
- Final value: {value}
- Best result: {value}

**Trends**:
- {observed patterns}

### For Algorithm/Pseudocode:

**Algorithm Name**: {name}

**Inputs**: {list}
**Outputs**: {list}

**Steps**:
1. {step 1}
2. {step 2}
...

**Python Translation Hint**:
```python
# Suggested structure
def algorithm_name(inputs):
    # step 1
    # step 2
    return outputs
```

### For Equations:

**Equation**:
$$
{LaTeX representation}
$$

**Variables**:
- {symbol}: {meaning}

**Implementation Notes**:
- {how to compute this in PyTorch}

---

## Image 2: ...
```

## Analysis Guidelines

### Architecture Diagrams
- Identify all layers/blocks and their connections
- Note input/output shapes when visible
- Capture skip connections, residual paths
- Identify attention mechanisms, normalization layers
- Note any dimension annotations

### Experiment Plots
- Extract actual numerical values where possible
- Identify which curve corresponds to the paper's method
- Note baseline comparisons
- Capture convergence behavior
- Identify error bars or confidence intervals

### Algorithm Pseudocode
- Convert to structured steps
- Identify loops, conditions
- Note any hyperparameters mentioned
- Suggest PyTorch equivalents

### Equations
- Transcribe to LaTeX
- Define all variables
- Note how to implement in code

## Replication Priority

Mark each image with replication priority:
- **HIGH**: Core architecture, main results to reproduce
- **MEDIUM**: Training curves, ablation studies
- **LOW**: Conceptual diagrams, background figures

## Quality Checklist

Before completing:
- [ ] All images in paper cataloged
- [ ] Architecture diagrams have layer-by-layer breakdown
- [ ] Experiment figures have numerical values extracted
- [ ] Equations transcribed to LaTeX
- [ ] Replication priorities assigned
- [ ] Output enables paper-analyzer to create complete plan
```

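The extractor's input is a Markdown paper with image references, and its checklist requires that every image be cataloged. A rough way to enumerate inline references up front is sketched below. `IMAGE_PATTERN` and `find_image_references` are hypothetical names, and the pattern is a simplification: it covers only inline `![alt](target)` syntax, not reference-style images or raw HTML `<img>` tags.

```python
import re

# Matches inline Markdown images: ![alt text](target "optional title")
IMAGE_PATTERN = re.compile(
    r'!\[(?P<alt>[^\]]*)\]\((?P<target>[^)\s]+)(?:\s+"[^"]*")?\)'
)


def find_image_references(markdown_text: str) -> list[dict[str, str]]:
    """Return the alt text and target of each inline image, in document order."""
    return [
        {"alt": match.group("alt"), "target": match.group("target")}
        for match in IMAGE_PATTERN.finditer(markdown_text)
    ]
```

The length of the returned list gives a count to compare against "Total images found" in the summary section of `image_understanding.md`.
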
- [ ] **Step 2: Verify file creation**

```bash
cat .opencode/agents/paper-image-extractor.md
```

Expected: File contents match the markdown above.

- [ ] **Step 3: Commit**

```bash
git add .opencode/agents/paper-image-extractor.md
git commit -m "feat(agents): add paper-image-extractor subagent

Analyzes paper images to extract architecture details and numerical results."
```

---

### Task 4: Create code-writer Agent (Subagent)

**Files:**
- Create: `.opencode/agents/code-writer.md`

- [ ] **Step 1: Write code-writer.md**

Create `.opencode/agents/code-writer.md`:

```markdown
---
name: code-writer
description: |
  Subagent that generates PyTorch code based on paper analysis.
  Works in TDD mode: receives test files, writes code to pass tests.
  Also manages project environment using Conda + uv.
mode: subagent
model: inherit
permission:
  edit: allow
  bash:
    "*": allow
---

# Code Writer

You generate PyTorch code to replicate ML/DL papers, working in strict TDD mode.

## Required Inputs

1. `paper_structure.md` - Paper analysis
2. `image_understanding.md` - Image analysis
3. `replication_plan.md` - Implementation plan
4. Test files for the module to implement

## Working Mode: TDD

**Iron Rule**: Write code ONLY to make failing tests pass.

1. Receive test file
2. Run test to verify it fails
3. Write minimal code to pass
4. Run test to verify it passes
5. Refactor if needed (keeping tests green)

## Environment Setup

Before writing any code, ensure environment is ready:

### Step 1: Check/Create Conda Base

```bash
# Check if ai_base exists
conda env list | grep ai_base

# If not exists, create it
conda create -n ai_base python=3.10 -y
```

### Step 2: Create Project Environment

```bash
cd workspace/{paper_name}

# Create uv virtual environment using Conda's Python
uv venv --python $(conda run -n ai_base which python)

# On Windows:
# uv venv --python $(conda run -n ai_base python -c "import sys; print(sys.executable)")
```

### Step 3: Create pyproject.toml

```toml
[project]
name = "{paper_name}"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    "torch>=2.0.0",
    "numpy>=1.24.0",
    "matplotlib>=3.7.0",
    "tqdm>=4.65.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.0.0",
    "pytest-cov>=4.0.0",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
```

### Step 4: Install Dependencies

```bash
# Activate and install
source .venv/bin/activate  # Linux/Mac
# .venv\Scripts\activate  # Windows

uv pip install -e ".[dev]"
```

## Code Generation Guidelines

### Model Architecture

```python
"""
{module_name}.py

Implements {component} from "{paper_title}"
Reference: Section {X}, Figure {Y}
"""

import torch
import torch.nn as nn
import torch.nn.functional as F
from typing import Optional, Tuple


class {ComponentName}(nn.Module):
    """
    {Brief description from paper}

    Args:
        {param}: {description}

    Paper reference:
        - Architecture: Figure {X}
        - Equation: ({Y})
    """

    def __init__(self, {params}):
        super().__init__()
        # Initialize layers

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Forward pass.

        Args:
            x: Input tensor of shape {expected_shape}

        Returns:
            Output tensor of shape {output_shape}
        """
        # Implementation
        return output
```

### Training Scripts

```python
"""
train.py

Training script for {paper_title} replication.
"""

import torch
from torch.utils.data import DataLoader
from tqdm import tqdm


def train_epoch(model, dataloader, optimizer, criterion, device):
    """Single training epoch; returns the mean loss per batch."""
    model.train()
    total_loss = 0.0

    for batch in tqdm(dataloader, desc="Training"):
        # Assumes each batch is an (inputs, targets) pair; adapt to the paper's data format
        inputs, targets = batch
        inputs, targets = inputs.to(device), targets.to(device)

        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()

    return total_loss / len(dataloader)


def main():
    # Configuration from paper
    config = {
        "lr": 1e-4,  # Section X
        "batch_size": 32,  # Section X
        "epochs": 100,
    }

    # Setup
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Model, optimizer, criterion
    # ...

    # Training loop
    for epoch in range(config["epochs"]):
        loss = train_epoch(model, train_loader, optimizer, criterion, device)
        print(f"Epoch {epoch+1}: Loss = {loss:.4f}")


if __name__ == "__main__":
    main()
```

## File Organization

```
src/
├── __init__.py
├── models/
│   ├── __init__.py
│   ├── {main_model}.py
│   └── {component}.py
├── training/
│   ├── __init__.py
│   ├── train.py
│   ├── losses.py
│   └── optimizers.py
└── utils/
    ├── __init__.py
    ├── data.py
    └── metrics.py
```

## Quality Checklist

Before completing each module:
- [ ] All tests pass
- [ ] Type hints on all public functions
- [ ] Docstrings with paper references
- [ ] Input/output shapes documented
- [ ] No hardcoded magic numbers (use config)
- [ ] Device-agnostic (CPU/GPU)
```

- [ ] **Step 2: Verify file creation**

```bash
cat .opencode/agents/code-writer.md
```

Expected: File contents match the markdown above.

- [ ] **Step 3: Commit**

```bash
git add .opencode/agents/code-writer.md
git commit -m "feat(agents): add code-writer subagent

Generates PyTorch code in TDD mode with environment management."
```

---

### Task 5: Create test-runner Agent (Subagent)

**Files:**
- Create: `.opencode/agents/test-runner.md`

- [ ] **Step 1: Write test-runner.md**

Create `.opencode/agents/test-runner.md`:

```markdown
---
name: test-runner
description: |
  Subagent that runs tests, verifies code correctness, and generates replication reports.
  Compares results with paper's expected values and documents any differences.
mode: subagent
model: inherit
permission:
  edit: allow
  bash:
    "*": allow
---

# Test Runner

You run tests, verify replication correctness, and generate comprehensive reports.

## Required Inputs

1. Generated code in `src/`
2. Test files in `tests/`
3. `replication_plan.md` with expected results

## Required Outputs

1. Test execution results
2. `reports/replication_report.md`

## Workflow

### Step 1: Run Test Suite

```bash
cd workspace/{paper_name}
source .venv/bin/activate

# Run all tests with coverage
pytest tests/ -v --cov=src --cov-report=term-missing
```

### Step 2: Verify Replication Targets

For each target in replication_plan.md:

1. Run the relevant computation
2. Compare with expected values
3. Calculate deviation

### Step 3: Generate Report

## Report Format

```markdown
# Replication Report: {Paper Title}

**Date**: {date}
**Status**: {Complete | Partial | Failed}

## Summary

| Metric | Status |
|--------|--------|
| Tests Passing | {X}/{Y} |
| Code Coverage | {X}% |
| Replication Accuracy | {qualitative} |

## Test Results

### Unit Tests

| Test | Status | Time |
|------|--------|------|
| test_model_forward | PASS | 0.1s |
| test_loss_computation | PASS | 0.05s |
| ... | ... | ... |

### Failed Tests (if any)

#### {test_name}
- **Error**: {error message}
- **Expected**: {expected}
- **Actual**: {actual}
- **Likely cause**: {analysis}

## Replication Targets

### Figure X: {description}

**Status**: Replicated | Partially Replicated | Not Replicated

**Paper Values**:
| Metric | Paper | Ours | Deviation |
|--------|-------|------|-----------|
| {metric} | {value} | {value} | {%} |

**Analysis**:
{explanation of any differences}

### Table Y: {description}

...

## Code Quality

- **Type Safety**: {assessment}
- **Documentation**: {assessment}
- **Test Coverage**: {percentage}

## Reproducibility Checklist

- [ ] Environment setup documented
- [ ] Random seeds set
- [ ] Hyperparameters match paper
- [ ] Data preprocessing matches paper
- [ ] Evaluation metrics match paper

## Known Differences from Paper

1. **{difference}**: {explanation and justification}

## Recommendations

1. {recommendation for improvement}

## Appendix: Full Test Output

```
{pytest output}
```
```

## Deviation Thresholds

| Deviation | Classification |
|-----------|----------------|
| < 1% | Excellent match |
| 1-5% | Acceptable |
| 5-10% | Needs investigation |
| > 10% | Significant difference |

## Analysis Guidelines

When results differ from paper:

1. Check implementation against paper equations
2. Verify hyperparameters
3. Check data preprocessing
4. Consider numerical precision differences
5. Note if paper has known errata

## Quality Checklist

Before completing:
- [ ] All tests executed
- [ ] Coverage report generated
- [ ] Each replication target evaluated
- [ ] Deviations analyzed and explained
- [ ] Recommendations provided
- [ ] Report is self-contained
```

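The "Calculate deviation" step and the Deviation Thresholds table in the agent file can be made concrete with a small helper. The sketch below rests on two assumptions that the plan does not state: deviation is measured relative to the paper's value, and a value of exactly 5% falls in the "Acceptable" band, since the table leaves that boundary ambiguous. Both function names are hypothetical.

```python
def relative_deviation(paper_value: float, ours: float) -> float:
    """Absolute deviation from the paper's value, as a percentage of that value."""
    if paper_value == 0:
        raise ValueError("relative deviation is undefined for a paper value of 0")
    return abs(ours - paper_value) / abs(paper_value) * 100.0


def classify_deviation(deviation_pct: float) -> str:
    """Map a percentage deviation onto the Deviation Thresholds table.

    Assumption: exactly 5% counts as "Acceptable"; the table does not say.
    """
    if deviation_pct < 1.0:
        return "Excellent match"
    if deviation_pct <= 5.0:
        return "Acceptable"
    if deviation_pct <= 10.0:
        return "Needs investigation"
    return "Significant difference"
```

For metrics that are already percentages, such as accuracy, it is worth stating in the report whether the deviation is relative or in absolute percentage points, because the two readings can land in different bands.
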
- [ ] **Step 2: Verify file creation**
|
|
|
|
```bash
|
|
cat .opencode/agents/test-runner.md
|
|
```
|
|
|
|
Expected: File contents match the markdown above.
|
|
|
|
- [ ] **Step 3: Commit**
|
|
|
|
```bash
|
|
git add .opencode/agents/test-runner.md
|
|
git commit -m "feat(agents): add test-runner subagent
|
|
|
|
Runs tests and generates comprehensive replication reports."
|
|
```
|
|
|
|
---
|
|
|
|
### Task 6: Create paper-parsing Skill
|
|
|
|
**Files:**
|
|
- Create: `.opencode/skills/paper-parsing/SKILL.md`
|
|
|
|
- [ ] **Step 1: Create skills directory**
|
|
|
|
```bash
|
|
mkdir -p .opencode/skills/paper-parsing
|
|
```
|
|
|
|
- [ ] **Step 2: Write SKILL.md**
|
|
|
|
Create `.opencode/skills/paper-parsing/SKILL.md`:
|
|
|
|
```markdown
|
|
---
|
|
name: paper-parsing
|
|
description: Use when analyzing ML/DL papers to ensure comprehensive extraction of all relevant information
|
|
---
|
|
|
|
# Paper Parsing Methodology
|
|
|
|
## Overview
|
|
|
|
Systematic approach to parsing ML/DL papers for replication. Emphasizes **completeness** and **openness** to avoid missing critical details.
|
|
|
|
**Announce at start:** "I'm using the paper-parsing skill to ensure comprehensive paper analysis."
|
|
|
|
## Core Philosophy
|
|
|
|
1. **Completeness over speed**: Better to extract too much than miss something
|
|
2. **Open-ended discovery**: Papers contain unique insights; don't force into templates
|
|
3. **Cross-reference**: Information appears in multiple places; cross-check
|
|
4. **Explicit uncertainty**: Mark unclear items rather than guessing
|
|
|
|
## Paper Sections Checklist
|
|
|
|
### Abstract
|
|
- [ ] Core contribution identified
|
|
- [ ] Key results/numbers extracted
|
|
- [ ] Problem domain understood
|
|
|
|
### Introduction
|
|
- [ ] Problem motivation clear
|
|
- [ ] Gap in existing work identified
|
|
- [ ] Proposed solution summarized
|
|
- [ ] Claimed contributions listed
|
|
|
|
### Related Work
|
|
- [ ] Key prior methods identified
|
|
- [ ] Differences from this work noted
|
|
- [ ] Potential baselines for comparison
|
|
|
|
### Method / Approach
|
|
- [ ] Architecture fully described
|
|
- [ ] All components identified
|
|
- [ ] Mathematical formulation complete
|
|
- [ ] Training procedure detailed
|
|
- [ ] Loss functions specified
|
|
- [ ] Hyperparameters listed
|
|
|
|
### Experiments
|
|
- [ ] Datasets listed with sizes
|
|
- [ ] Evaluation metrics defined
|
|
- [ ] Baseline comparisons noted
|
|
- [ ] Ablation studies cataloged
|
|
- [ ] Key numerical results extracted
|
|
|
|
### Appendix / Supplementary
|
|
- [ ] Additional implementation details
|
|
- [ ] Extended results
|
|
- [ ] Proofs or derivations
|
|
- [ ] Code references
|
|
|
|
## Information Extraction Patterns
|
|
|
|
### Architecture Details
|
|
|
|
Look for:
|
|
- Layer types and configurations
|
|
- Activation functions
|
|
- Normalization methods
|
|
- Attention mechanisms
|
|
- Skip connections
|
|
- Input/output dimensions
|
|
|
|
Common locations:
|
|
- Method section figures
|
|
- Architecture diagrams
|
|
- Table of hyperparameters
|
|
- Appendix implementation details
|
|
|
|
### Training Configuration
|
|
|
|
| Parameter | Typical Locations |
|
|
|-----------|-------------------|
|
|
| Learning rate | Experiments, Appendix |
|
|
| Batch size | Experiments, Appendix |
|
|
| Optimizer | Method, Appendix |
|
|
| Epochs | Experiments |
|
|
| Hardware | Experiments, Appendix |
|
|
| Training time | Experiments |
|
|
|
|
### Numerical Results
|
|
|
|
Extract from:
|
|
- Main results tables
|
|
- Comparison figures
|
|
- Ablation tables
|
|
- Training curves (approximate values)
|
|
|
|
Format as:
|
|
| Metric | Dataset | Value | Conditions |
|
|
|--------|---------|-------|------------|
|
|
| Accuracy | CIFAR-10 | 95.2% | ResNet-50 backbone |
|
|
|
|
## Common Omissions to Watch For
|
|
|
|
1. **Initialization**: Often in appendix or not mentioned
|
|
2. **Data augmentation**: May be standard but unspecified
|
|
3. **Early stopping criteria**: Often implied
|
|
4. **Evaluation protocol**: Train/val/test split details
|
|
5. **Random seeds**: Reproducibility details
|
|
6. **Software versions**: PyTorch, CUDA versions
|
|
|
|
## Quality Verification
|
|
|
|
Before completing analysis:
|
|
|
|
1. **Coverage check**: Every section reviewed?
|
|
2. **Consistency check**: Numbers match across sections?
|
|
3. **Completeness check**: Could someone implement from this?
|
|
4. **Ambiguity check**: Unclear items marked?
|
|
|
|
## Output Quality Markers
|
|
|
|
Good analysis:
|
|
- Specific numbers, not "good performance"
|
|
- Exact layer configs, not "standard ResNet"
|
|
- Explicit uncertainty markers
|
|
- Cross-references between sections
|
|
|
|
Poor analysis:
|
|
- Vague descriptions
|
|
- Missing hyperparameters
|
|
- No numerical targets
|
|
- Assumptions without noting them
|
|
|
|
## Red Flags
|
|
|
|
If you notice:
|
|
- "Implementation details in code" → Check GitHub link
|
|
- "Standard settings" → Look up the standard
|
|
- "Following [citation]" → May need to read that paper
|
|
- Inconsistent numbers → Note the discrepancy
|
|
```
|
|
|
|
- [ ] **Step 3: Verify file creation**
|
|
|
|
```bash
|
|
cat .opencode/skills/paper-parsing/SKILL.md
|
|
```
|
|
|
|
Expected: File contents match the markdown above.
|
|
|
|
- [ ] **Step 4: Commit**

```bash
git add .opencode/skills/paper-parsing/SKILL.md
git commit -m "feat(skills): add paper-parsing skill

Comprehensive methodology for ML/DL paper analysis."
```

---

### Task 7: Create code-generation Skill

**Files:**
- Create: `.opencode/skills/code-generation/SKILL.md`

- [ ] **Step 1: Create skill directory**

```bash
mkdir -p .opencode/skills/code-generation
```

- [ ] **Step 2: Write SKILL.md**

Create `.opencode/skills/code-generation/SKILL.md`:

```markdown
---
name: code-generation
description: Use when generating PyTorch code from paper analysis to ensure correct mapping from paper to code
---

# Code Generation from Papers

## Overview

Guidelines for translating paper descriptions into working PyTorch code.

**Announce at start:** "I'm using the code-generation skill to ensure accurate paper-to-code translation."

## Core Principles

1. **Traceability**: Every code block should reference the paper section or equation it implements
2. **Testability**: Write code that can be unit tested
3. **Readability**: Prefer clarity over cleverness
4. **Modularity**: One component per file

## Paper-to-Code Mapping

### Architecture Diagrams → nn.Module

| Diagram Element | PyTorch Equivalent |
|-----------------|-------------------|
| Box/Block | nn.Module subclass |
| Arrow | forward() call chain |
| Split | Multiple outputs / tuple |
| Merge | torch.cat / torch.add |
| Skip connection | Residual addition |

### Equations → Tensor Operations

| Notation | PyTorch |
|----------|---------|
| $Wx + b$ | `nn.Linear(in, out)` |
| $\sigma(x)$ | `torch.sigmoid(x)` or `nn.Sigmoid()` |
| $\text{softmax}(x)$ | `F.softmax(x, dim=-1)` |
| $\|x\|_2$ | `torch.linalg.vector_norm(x, ord=2)` (legacy: `torch.norm(x, p=2)`) |
| $x \odot y$ | `x * y` (element-wise) |
| $x^T y$ | `torch.matmul(x.T, y)` or `x.T @ y` |
| $\sum_i$ | `torch.sum(x, dim=i)` |
| $\mathbb{E}[x]$ | `torch.mean(x)` |

### Loss Functions

| Paper Description | PyTorch |
|-------------------|---------|
| Cross-entropy | `nn.CrossEntropyLoss()` (expects raw logits) |
| MSE / L2 | `nn.MSELoss()` |
| L1 | `nn.L1Loss()` |
| BCE | `nn.BCEWithLogitsLoss()` (raw logits) or `nn.BCELoss()` (probabilities) |
| KL divergence | `nn.KLDivLoss(reduction="batchmean")` (input must be log-probabilities) |
| Custom | Subclass or functional |

## Code Structure Template

```python
"""
{component_name}.py

Implements {what} from "{paper_title}" ({year})

Paper Reference:
- Section: {section_number}
- Equation: ({equation_number})
- Figure: {figure_number}

Author: Auto-generated for paper replication
"""

import torch
import torch.nn as nn
import torch.nn.functional as F
from typing import Optional, Tuple, List


class {ComponentName}(nn.Module):
    """
    {One-line description}

    From paper: "{exact quote or paraphrase}"

    Args:
        {param1}: {description} (paper: {where specified})
        {param2}: {description}

    Shape:
        - Input: {shape description}
        - Output: {shape description}

    Example:
        >>> layer = {ComponentName}(dim=512)
        >>> x = torch.randn(32, 100, 512)
        >>> out = layer(x)
        >>> out.shape
        torch.Size([32, 100, 512])
    """

    def __init__(
        self,
        {param1}: {type},
        {param2}: {type} = {default},
    ):
        super().__init__()

        # Paper Section X.Y: "{description}"
        self.layer1 = nn.Linear(...)

        # Equation (N): ...
        self.layer2 = nn.LayerNorm(...)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Forward pass implementing Equation (N).

        Args:
            x: Input tensor of shape (batch, seq, dim)

        Returns:
            Output tensor of shape (batch, seq, dim)
        """
        # Step 1: ... (Eq. N, first term)
        h = self.layer1(x)

        # Step 2: ... (Eq. N, second term)
        out = self.layer2(h)

        return out
```

## Common Patterns

### Residual Connection

```python
# Paper: "We add a residual connection"
out = self.sublayer(x) + x
```

### Layer Normalization

```python
# Paper: "Pre-LN Transformer" (normalize the sublayer input, keep the residual path unnormalized)
x = x + self.attention(self.norm(x))

# Paper: "Post-LN Transformer" (add the residual first, then normalize)
x = x + self.attention(x)
x = self.norm(x)
```

### Multi-Head Attention

```python
# Paper: "Standard multi-head attention with h heads"
# Note: the module is called as attention(query, key, value) and
# returns a tuple (output, attention_weights).
self.attention = nn.MultiheadAttention(
    embed_dim=d_model,
    num_heads=h,
    dropout=dropout,
    batch_first=True,
)
```

### Custom Activation

```python
# Paper: "We use GELU activation"
x = F.gelu(x)

# Paper: "We use Swish/SiLU activation"
x = F.silu(x)
```

## Handling Ambiguity

When paper is unclear:

1. **Check code repository** if available
2. **Follow common practice** for the architecture type
3. **Document assumption** in code comment
4. **Add TODO** for verification

```python
# TODO: Paper unclear on initialization. Using PyTorch default.
# See: https://github.com/paper/repo for reference implementation
self.linear = nn.Linear(in_dim, out_dim)
```

## Verification Checklist

Before completing a module:

- [ ] All equations implemented
- [ ] Shapes documented and verified
- [ ] Paper references in comments
- [ ] Type hints complete
- [ ] Example in docstring works
- [ ] No hardcoded dimensions (use params)
- [ ] Gradient flow verified (no in-place ops breaking autograd)
```

- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/skills/code-generation/SKILL.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/skills/code-generation/SKILL.md
git commit -m "feat(skills): add code-generation skill

Paper-to-PyTorch code translation guidelines."
```

---

### Task 8: Create pytorch-patterns Skill

**Files:**
- Create: `.opencode/skills/pytorch-patterns/SKILL.md`

- [ ] **Step 1: Create skill directory**

```bash
mkdir -p .opencode/skills/pytorch-patterns
```

- [ ] **Step 2: Write SKILL.md**

Create `.opencode/skills/pytorch-patterns/SKILL.md`:

```markdown
---
name: pytorch-patterns
description: Use when writing PyTorch code to follow best practices and common patterns
---

# PyTorch Best Practices

## Overview

Established patterns for writing clean, efficient, and maintainable PyTorch code.

**Announce at start:** "I'm using the pytorch-patterns skill for best practice code."

## Model Definition

### Basic Module

```python
import torch
import torch.nn as nn
from typing import Optional


class MyModel(nn.Module):
    def __init__(self, config: dict):
        super().__init__()
        self.config = config

        # Define layers
        self.encoder = nn.Linear(config["input_dim"], config["hidden_dim"])
        self.decoder = nn.Linear(config["hidden_dim"], config["output_dim"])

        # Initialize weights
        self._init_weights()

    def _init_weights(self):
        """Initialize weights following paper's specification."""
        # Xavier is a placeholder; substitute the scheme the paper specifies.
        for module in self.modules():
            if isinstance(module, nn.Linear):
                nn.init.xavier_uniform_(module.weight)
                if module.bias is not None:
                    nn.init.zeros_(module.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.encoder(x)
        h = torch.relu(h)
        out = self.decoder(h)
        return out
```

### Model with Multiple Outputs

```python
from typing import NamedTuple, Optional

import torch
import torch.nn as nn


class ModelOutput(NamedTuple):
    logits: torch.Tensor
    hidden_states: torch.Tensor
    attention_weights: Optional[torch.Tensor] = None


class MultiOutputModel(nn.Module):
    def __init__(self, return_attention: bool = False):
        super().__init__()
        self.return_attention = return_attention

    def forward(self, x: torch.Tensor) -> ModelOutput:
        # ... computation producing logits, hidden, attn ...
        return ModelOutput(
            logits=logits,
            hidden_states=hidden,
            attention_weights=attn if self.return_attention else None,
        )
```

## Device Management

### Automatic Device Handling

```python
class DeviceAwareModel(nn.Module):
    @property
    def device(self) -> torch.device:
        """Get model's device from first parameter."""
        # Raises StopIteration if the model has no parameters.
        return next(self.parameters()).device

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The caller is responsible for moving inputs to the model's device.
        # Tensors created inside forward() must be placed there explicitly:
        mask = torch.ones(x.size(0), device=self.device)
        return x * mask
```

### Training Script Device Setup

```python
def get_device() -> torch.device:
    """Get best available device."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    elif torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = get_device()
model = MyModel(config).to(device)

# The DataLoader yields CPU tensors; move each batch to the device in the loop
for batch in dataloader:
    inputs = batch["inputs"].to(device)
    targets = batch["targets"].to(device)
```

## Training Loop

### Standard Pattern

```python
from typing import Optional, Tuple

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from tqdm import tqdm


def train_epoch(
    model: nn.Module,
    dataloader: DataLoader,
    optimizer: torch.optim.Optimizer,
    criterion: nn.Module,
    device: torch.device,
    scheduler: Optional[torch.optim.lr_scheduler._LRScheduler] = None,
) -> float:
    """Train for one epoch."""
    model.train()
    total_loss = 0.0
    num_batches = 0

    for batch in tqdm(dataloader, desc="Training"):
        # Move to device
        inputs = batch["inputs"].to(device)
        targets = batch["targets"].to(device)

        # Forward pass (clear stale gradients first)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)

        # Backward pass
        loss.backward()

        # Gradient clipping (if needed)
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

        # Update
        optimizer.step()
        if scheduler is not None:
            # Steps once per batch; move after the loop for per-epoch schedules
            scheduler.step()

        total_loss += loss.item()
        num_batches += 1

    return total_loss / num_batches


@torch.no_grad()
def evaluate(
    model: nn.Module,
    dataloader: DataLoader,
    criterion: nn.Module,
    device: torch.device,
) -> Tuple[float, float]:
    """Evaluate model and return (mean loss per batch, accuracy)."""
    model.eval()
    total_loss = 0.0
    correct = 0
    total = 0

    for batch in dataloader:
        inputs = batch["inputs"].to(device)
        targets = batch["targets"].to(device)

        outputs = model(inputs)
        loss = criterion(outputs, targets)

        total_loss += loss.item()
        preds = outputs.argmax(dim=-1)
        correct += (preds == targets).sum().item()
        total += targets.size(0)

    return total_loss / len(dataloader), correct / total
```

## Data Loading

### Custom Dataset

```python
from torch.utils.data import Dataset, DataLoader


class PaperDataset(Dataset):
    def __init__(self, data_path: str, transform=None):
        self.data = self._load_data(data_path)
        self.transform = transform

    def _load_data(self, path: str):
        # Load from disk; must return an indexable collection of samples
        raise NotImplementedError("Implement loading for the paper's dataset")

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, idx: int) -> dict:
        item = self.data[idx]
        if self.transform:
            item = self.transform(item)
        return item


def get_dataloader(
    dataset: Dataset,
    batch_size: int,
    shuffle: bool = True,
    num_workers: int = 4,
) -> DataLoader:
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=shuffle,
        num_workers=num_workers,
        pin_memory=True,  # Faster GPU transfer
        drop_last=True,   # Consistent batch sizes; use False for evaluation so no samples are skipped
    )
```

## Checkpointing

### Save and Load

```python
def save_checkpoint(
    model: nn.Module,
    optimizer: torch.optim.Optimizer,
    epoch: int,
    loss: float,
    path: str,
):
    """Save training checkpoint."""
    torch.save({
        "epoch": epoch,
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
        "loss": loss,
    }, path)


def load_checkpoint(
    path: str,
    model: nn.Module,
    optimizer: Optional[torch.optim.Optimizer] = None,
) -> dict:
    """Load training checkpoint."""
    checkpoint = torch.load(path, weights_only=True)
    model.load_state_dict(checkpoint["model_state_dict"])
    if optimizer is not None:
        optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
    return checkpoint
```

## Reproducibility

### Set Seeds

```python
import random
import numpy as np
import torch


def set_seed(seed: int = 42):
    """Set all random seeds for reproducibility."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

    # For deterministic behavior (may impact performance)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

## Common Gotchas

### In-place Operations

```python
# RISKY: in-place ops can overwrite values autograd needs for backward
# (RuntimeError) and are not allowed on leaf tensors that require grad
x += 1
x[:, 0] = 0

# SAFER: out-of-place ops create new tensors
x = x + 1
x = torch.cat([torch.zeros_like(x[:, :1]), x[:, 1:]], dim=1)
```

### Detaching for Metrics

```python
# BAD: stores a tensor every step, so device memory stays allocated
# (for values that require grad, such as the loss, the graph is kept alive too)
accuracy = (preds == targets).float().mean()
all_accs.append(accuracy)  # Memory grows over time!

# GOOD: convert to a Python float for logging
accuracy = (preds == targets).float().mean().item()
all_accs.append(accuracy)
```

### Mixed Precision

```python
# Newer PyTorch releases prefer the torch.amp equivalents of these names
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()

for batch in dataloader:
    inputs = batch["inputs"].to(device)
    targets = batch["targets"].to(device)
    optimizer.zero_grad()

    with autocast():
        outputs = model(inputs)
        loss = criterion(outputs, targets)

    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```
```

- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/skills/pytorch-patterns/SKILL.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/skills/pytorch-patterns/SKILL.md
git commit -m "feat(skills): add pytorch-patterns skill

PyTorch best practices and common patterns."
```

---

### Task 9: Create verification Skill

**Files:**
- Create: `.opencode/skills/verification/SKILL.md`

- [ ] **Step 1: Create skill directory**

```bash
mkdir -p .opencode/skills/verification
```

- [ ] **Step 2: Write SKILL.md**

Create `.opencode/skills/verification/SKILL.md`:

```markdown
---
name: verification
description: Use when verifying replication results against paper's reported values
---

# Replication Verification

## Overview

Systematic approach to verifying that replicated code produces results matching the original paper.

**Announce at start:** "I'm using the verification skill to validate replication accuracy."

## Verification Levels

### Level 1: Code Correctness
- Unit tests pass
- No runtime errors
- Gradient flow works

### Level 2: Behavioral Match
- Output shapes correct
- Value ranges reasonable
- Edge cases handled

### Level 3: Numerical Match
- Results within tolerance of paper
- Trends match (even if absolute values differ)
- Statistical significance considered

## Test Design for Replication

### Shape Tests

```python
def test_model_output_shape():
    """Verify model produces correct output shape per paper."""
    model = MyModel(config)
    x = torch.randn(batch_size, seq_len, input_dim)
    out = model(x)

    # Paper Section 3.2: "Output dimension is 512"
    assert out.shape == (batch_size, seq_len, 512)
```

### Value Range Tests

```python
def test_attention_weights_sum():
    """Attention weights should sum to 1 (paper Eq. 3)."""
    model = AttentionLayer(config)
    x = torch.randn(batch_size, seq_len, dim)
    _, attn_weights = model(x, return_attention=True)

    # Softmax output sums to 1
    assert torch.allclose(attn_weights.sum(dim=-1), torch.ones(batch_size, seq_len))
```

### Gradient Tests

```python
def test_gradient_flow():
    """Verify gradients flow through all parameters."""
    model = MyModel(config)
    x = torch.randn(batch_size, input_dim, requires_grad=True)
    out = model(x)
    loss = out.sum()
    loss.backward()

    for name, param in model.named_parameters():
        assert param.grad is not None, f"No gradient for {name}"
        assert not torch.isnan(param.grad).any(), f"NaN gradient for {name}"
```

### Numerical Match Tests

```python
def test_loss_value_reasonable():
    """Loss should be in expected range per paper Figure 2."""
    model = MyModel(config)
    # ... setup ...

    loss = compute_loss(model, data)

    # Paper reports initial loss ~2.3 (cross-entropy on 10 classes)
    assert 2.0 < loss.item() < 3.0, f"Initial loss {loss.item()} outside expected range"
```

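The ~2.3 figure in the test above is not specific to any one paper: a classifier that spreads probability uniformly over $C$ classes has cross-entropy $\ln C$, so an untrained 10-class model should start near $\ln 10$. The arithmetic can be sketched with the standard library alone (the helper name is illustrative):

```python
import math


def uniform_cross_entropy(num_classes: int) -> float:
    """Cross-entropy (in nats) of a uniform prediction over num_classes."""
    # -log(1 / C) = log(C)
    return math.log(num_classes)


expected_initial_loss = uniform_cross_entropy(10)
print(f"{expected_initial_loss:.4f}")  # 2.3026
```

The same reasoning gives a sanity range for other class counts, for example roughly 6.9 for 1000 classes.
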
## Comparison Methodology

### Absolute Comparison

```python
def compare_absolute(paper_value: float, our_value: float, tolerance: float = 0.01):
    """Compare with absolute tolerance."""
    diff = abs(paper_value - our_value)
    return diff <= tolerance, diff
```

### Relative Comparison

```python
def compare_relative(paper_value: float, our_value: float, tolerance: float = 0.05):
    """Compare with relative tolerance (5% default)."""
    if paper_value == 0:
        return our_value == 0, abs(our_value)
    relative_diff = abs(paper_value - our_value) / abs(paper_value)
    return relative_diff <= tolerance, relative_diff
```

### Statistical Comparison

```python
from statistics import NormalDist
from typing import List

import numpy as np


def compare_with_variance(
    paper_mean: float,
    paper_std: float,
    our_values: List[float],
    confidence: float = 0.95,
):
    """Compare considering paper's reported variance."""
    our_mean = np.mean(our_values)
    our_std = np.std(our_values)

    # Combine both spreads; with zero spread only an exact match passes
    combined_std = np.sqrt(paper_std**2 + our_std**2)
    if combined_std == 0:
        same = our_mean == paper_mean
        return same, 0.0 if same else float("inf")
    z_score = abs(paper_mean - our_mean) / combined_std

    # Two-sided threshold for the requested confidence (about 1.96 at 0.95)
    threshold = NormalDist().inv_cdf((1 + confidence) / 2)
    return z_score < threshold, z_score
```

## Common Difference Sources

### Acceptable Differences

| Source | Typical Impact | Mitigation |
|--------|---------------|------------|
| Random seed | 1-2% | Run multiple seeds |
| Floating point | < 0.1% | Use float64 for verification |
| Framework differences | 1-3% | Document and accept |
| Hardware differences | 0.5-1% | Note in report |

### Concerning Differences

| Source | Typical Impact | Action |
|--------|---------------|--------|
| Wrong architecture | > 10% | Review code vs paper |
| Wrong hyperparameters | 5-20% | Verify all settings |
| Data preprocessing | Variable | Match paper exactly |
| Evaluation protocol | Variable | Check train/val/test split |

## Verification Checklist

### Before Comparison

- [ ] Seeds set for reproducibility
- [ ] Same evaluation data as paper
- [ ] Same preprocessing pipeline
- [ ] Same evaluation metrics

### During Comparison

- [ ] Run multiple times with different seeds
- [ ] Record mean and standard deviation
- [ ] Compare trends, not just final values
- [ ] Check intermediate checkpoints if available

### After Comparison

- [ ] Document all differences
- [ ] Explain likely causes
- [ ] Determine if differences are acceptable
- [ ] Suggest improvements if needed

## Report Template

```markdown
## Verification Result: {Metric Name}

**Paper Value**: {value} ± {std}
**Our Value**: {value} ± {std}
**Difference**: {absolute} ({relative}%)

**Status**: MATCH | ACCEPTABLE | INVESTIGATE | MISMATCH

**Analysis**:
{explanation of difference}

**Confidence**: {HIGH | MEDIUM | LOW}
{reasoning for confidence level}
```
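
One way to fill in the **Status** field is to threshold the relative difference. The sketch below is only an illustration: the cut-offs (1%, 5%, 10%) are assumptions chosen for the example, not values taken from any paper or tool, and `classify_status` is a hypothetical helper.

```python
def classify_status(paper_value: float, our_value: float) -> str:
    """Map the relative difference to a report status (assumed thresholds)."""
    if paper_value == 0:
        return "MATCH" if our_value == 0 else "INVESTIGATE"
    relative_diff = abs(paper_value - our_value) / abs(paper_value)
    if relative_diff <= 0.01:   # assumed: within 1%
        return "MATCH"
    if relative_diff <= 0.05:   # assumed: within 5%
        return "ACCEPTABLE"
    if relative_diff <= 0.10:   # assumed: within 10%
        return "INVESTIGATE"
    return "MISMATCH"


status = classify_status(paper_value=95.2, our_value=94.1)
```

Whatever thresholds are chosen, recording them in the report keeps the status reproducible across runs.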
```

- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/skills/verification/SKILL.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/skills/verification/SKILL.md
git commit -m "feat(skills): add verification skill

Replication result verification methodology."
```

---

### Task 10: Create environment-management Skill

**Files:**
- Create: `.opencode/skills/environment-management/SKILL.md`

- [ ] **Step 1: Create skill directory**

```bash
mkdir -p .opencode/skills/environment-management
```

- [ ] **Step 2: Write SKILL.md**

Create `.opencode/skills/environment-management/SKILL.md`:

```markdown
---
name: environment-management
description: Use when setting up Python environment for ML/DL paper replication using Conda + uv
---

# Environment Management (Conda + uv)

## Overview

Hybrid approach using Conda for system-level dependencies and uv for project isolation.

**Announce at start:** "I'm using the environment-management skill for Conda + uv setup."

## Architecture

```
┌─────────────────────────────────────────┐
│           Conda (System Base)           │
│  - Python interpreter                   │
│  - CUDA toolkit                         │
│  - System-level C++ libraries           │
└─────────────────────────────────────────┘
                    │
                    │ provides Python
                    ▼
┌─────────────────────────────────────────┐
│         uv (Project Isolation)          │
│  - Per-project .venv                    │
│  - Fast dependency resolution           │
│  - Reproducible installs                │
└─────────────────────────────────────────┘
```

## Setup Commands

### Step 1: Conda Base Environment

Check if base exists:
```bash
conda env list | grep ai_base
```

Create if needed:
```bash
# Linux/Mac
conda create -n ai_base python=3.10 cuda-toolkit=11.8 -y

# Windows (CUDA from NVIDIA, not conda)
conda create -n ai_base python=3.10 -y
```

### Step 2: Project Environment

```bash
cd workspace/{paper_name}

# Get Conda Python path
# Linux/Mac:
PYTHON_PATH=$(conda run -n ai_base which python)

# Windows:
# PYTHON_PATH=$(conda run -n ai_base python -c "import sys; print(sys.executable)")

# Create uv venv (quoted so paths containing spaces work)
uv venv --python "$PYTHON_PATH"
```

### Step 3: Activate and Install

```bash
# Linux/Mac
source .venv/bin/activate

# Windows
.venv\Scripts\activate

# Install dependencies
uv pip install -e ".[dev]"
```

## pyproject.toml Template

```toml
[project]
name = "{paper_name}"
version = "0.1.0"
description = "Replication of {paper_title}"
requires-python = ">=3.10"

dependencies = [
    # Core ML
    "torch>=2.0.0",
    "numpy>=1.24.0",

    # Visualization
    "matplotlib>=3.7.0",
    "seaborn>=0.12.0",

    # Utilities
    "tqdm>=4.65.0",
    "pyyaml>=6.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.0.0",
    "pytest-cov>=4.0.0",
    "black>=23.0.0",
    "ruff>=0.0.260",
]

# Add based on paper requirements
vision = [
    "torchvision>=0.15.0",
    "pillow>=9.5.0",
]

nlp = [
    "transformers>=4.30.0",
    "tokenizers>=0.13.0",
    "datasets>=2.12.0",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
addopts = "-v --tb=short"

[tool.black]
line-length = 88
target-version = ["py310"]

[tool.ruff]
line-length = 88
select = ["E", "F", "I", "N", "W"]
```

## PyTorch + CUDA Compatibility

| CUDA Version | PyTorch Version | Install Command |
|--------------|-----------------|-----------------|
| 11.8 | 2.0+ | `uv pip install torch --index-url https://download.pytorch.org/whl/cu118` |
| 12.1 | 2.1+ | `uv pip install torch --index-url https://download.pytorch.org/whl/cu121` |
| CPU only | Any | `uv pip install torch --index-url https://download.pytorch.org/whl/cpu` |

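When the install step is scripted, the table can be turned into a lookup. The following is a minimal sketch, assuming only the three rows above are supported; `torch_install_command` is a hypothetical helper, and the mapping would need extending for other CUDA releases.

```python
# Wheel index suffixes taken from the compatibility table above.
INDEX_SUFFIX = {"11.8": "cu118", "12.1": "cu121", None: "cpu"}


def torch_install_command(cuda_version=None) -> str:
    """Build the uv install command for a CUDA version (None means CPU only)."""
    if cuda_version not in INDEX_SUFFIX:
        raise ValueError(f"No wheel index known for CUDA {cuda_version}")
    suffix = INDEX_SUFFIX[cuda_version]
    return f"uv pip install torch --index-url https://download.pytorch.org/whl/{suffix}"


command = torch_install_command("11.8")
```

Raising on an unknown version, rather than silently falling back to the CPU wheel, makes a mismatched environment visible early.
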
## Environment Verification

```bash
# Check Python
python --version

# Check PyTorch
python -c "import torch; print(f'PyTorch {torch.__version__}')"

# Check CUDA
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
python -c "import torch; print(f'CUDA version: {torch.version.cuda}')"

# Check GPU
python -c "import torch; print(f'GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else \"N/A\"}')"
```

## Troubleshooting

### CUDA Not Found

```bash
# Check NVIDIA driver
nvidia-smi

# Reinstall PyTorch with correct CUDA
uv pip install torch --index-url https://download.pytorch.org/whl/cu118 --force-reinstall
```

### Dependency Conflicts

```bash
# Clear cache and reinstall
uv cache clean
uv pip install -e ".[dev]" --force-reinstall
```

### Permission Errors (Windows)

```powershell
# Run as Administrator or:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```

## Best Practices

1. **One environment per paper**: Don't mix dependencies
2. **Pin versions in pyproject.toml**: For reproducibility
3. **Use dev dependencies**: Keep test tools separate
4. **Document CUDA version**: In README.md
5. **Commit pyproject.toml**: Not .venv/

## Quick Reference

```bash
# Full setup sequence (Linux/Mac)
conda activate ai_base || conda create -n ai_base python=3.10 cuda-toolkit=11.8 -y && conda activate ai_base
cd workspace/{paper_name}
uv venv --python $(which python)
source .venv/bin/activate
uv pip install -e ".[dev]"
pytest tests/ -v
```
```

- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/skills/environment-management/SKILL.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/skills/environment-management/SKILL.md
git commit -m "feat(skills): add environment-management skill

Conda + uv hybrid environment setup for ML projects."
```

---

### Task 11: Create /replicate Command

**Files:**
- Create: `.opencode/commands/replicate.md`

- [ ] **Step 1: Create commands directory**

```bash
mkdir -p .opencode/commands
```

- [ ] **Step 2: Write replicate.md**

Create `.opencode/commands/replicate.md`:

```markdown
---
description: Start paper replication workflow
agent: paper-director
---

Start the paper replication workflow for the specified paper.

## Input

Paper file: $ARGUMENTS

If no file specified, ask the user to provide the path to a paper (Markdown file or paste text directly).

## Workflow

1. Validate paper file exists (if path provided)
2. Extract paper name from filename or ask user
3. Create workspace directory: `workspace/{paper_name}/`
4. Begin Phase 1: Paper Analysis
   - Dispatch @paper-image-extractor
   - Dispatch @paper-analyzer
5. Present Human Checkpoint with analysis summary
6. After approval, begin Phase 2: Code Generation (TDD)
7. Begin Phase 3: Verification
8. Present final replication report

## Example Usage

```
/replicate workspace/attention_is_all_you_need.md
```

Or without arguments:
```
/replicate
> Please provide the path to your paper or paste the content directly.
```
```

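Step 2 of the workflow derives a paper name from the filename. How that is done is left to the agent; one possible normalization is sketched below with the standard library. The function name and the lowercase-with-underscores rule are assumptions, chosen to match the `attention_is_all_you_need` example.

```python
import re
from pathlib import Path


def paper_name_from_path(path: str) -> str:
    """Derive a workspace-friendly name from a paper's file path."""
    stem = Path(path).stem.lower()
    # Collapse every run of non-alphanumeric characters into one underscore.
    return re.sub(r"[^a-z0-9]+", "_", stem).strip("_")


name = paper_name_from_path("workspace/Attention Is All You Need.md")
workspace_dir = Path("workspace") / name
```

A filename that is already normalized passes through unchanged, so running the helper twice yields the same workspace directory.
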
- [ ] **Step 3: Verify file creation**

```bash
cat .opencode/commands/replicate.md
```

Expected: File contents match the markdown above.

- [ ] **Step 4: Commit**

```bash
git add .opencode/commands/replicate.md
git commit -m "feat(commands): add /replicate command

Entry point for paper replication workflow."
```

---

### Task 12: Create /verify Command

**Files:**
- Create: `.opencode/commands/verify.md`

- [ ] **Step 1: Write verify.md**

Create `.opencode/commands/verify.md`:

```markdown
---
description: Verify replication results for a completed project
agent: paper-director
---

Verify the replication results for an existing project.

## Input

Project directory: $ARGUMENTS

If no directory specified, list available projects in workspace/ and ask user to select.

## Workflow

1. Validate project directory exists
2. Check required files exist:
   - `analysis/paper_structure.md`
   - `analysis/replication_plan.md`
   - `src/` with code
   - `tests/` with tests
3. Dispatch @test-runner to:
   - Run test suite
   - Compare results with paper
   - Generate/update `reports/replication_report.md`
4. Present verification summary

## Example Usage

```
/verify workspace/attention_is_all_you_need/
```

Or without arguments:
```
/verify
> Available projects:
> 1. attention_is_all_you_need
> 2. resnet
> Please select a project to verify.
```
```

- [ ] **Step 2: Verify file creation**

```bash
cat .opencode/commands/verify.md
```

Expected: File contents match the markdown above.

- [ ] **Step 3: Commit**

```bash
git add .opencode/commands/verify.md
git commit -m "feat(commands): add /verify command

Entry point for verification of existing replication projects."
```

---

### Task 13: Create opencode.json Configuration

**Files:**
- Create: `opencode.json`

- [ ] **Step 1: Write opencode.json**

Create `opencode.json`:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "default_agent": "paper-director",
  "agent": {
    "paper-director": {
      "mode": "primary"
    },
    "paper-analyzer": {
      "mode": "subagent"
    },
    "paper-image-extractor": {
      "mode": "subagent"
    },
    "code-writer": {
      "mode": "subagent"
    },
    "test-runner": {
      "mode": "subagent"
    }
  }
}
```

- [ ] **Step 2: Verify file creation**

```bash
cat opencode.json
```

Expected: Valid JSON matching above.

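A structural check can catch mistakes that plain JSON validation misses, such as a `default_agent` that is never declared. The following is a minimal sketch, assuming the layout shown above (a `default_agent` string and an `agent` map with a `mode` per entry) and only the two modes used in this plan; `check_config` is a hypothetical helper, not part of OpenCode.

```python
import json


def check_config(raw: str) -> list:
    """Return a list of problems found in an opencode.json document."""
    config = json.loads(raw)
    agents = config.get("agent", {})
    problems = []
    default = config.get("default_agent")
    if default not in agents:
        problems.append(f"default_agent {default!r} is not declared")
    elif agents[default].get("mode") != "primary":
        problems.append(f"default_agent {default!r} is not a primary agent")
    for name, settings in agents.items():
        # Assumption: this plan only uses the "primary" and "subagent" modes.
        if settings.get("mode") not in ("primary", "subagent"):
            problems.append(f"agent {name!r} has unexpected mode")
    return problems


sample = """
{
  "default_agent": "paper-director",
  "agent": {
    "paper-director": {"mode": "primary"},
    "paper-analyzer": {"mode": "subagent"}
  }
}
"""
problems = check_config(sample)
```

An empty list means the configuration is internally consistent; for the real file, the string would come from reading `opencode.json`.
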
- [ ] **Step 3: Commit**
|
|
|
|
```bash
|
|
git add opencode.json
|
|
git commit -m "feat: add opencode.json project configuration
|
|
|
|
Sets paper-director as default agent with subagent definitions."
|
|
```
|
|
|
|
---
|
|
|
|
### Task 14: Create Workspace Directory
|
|
|
|
**Files:**
|
|
- Create: `workspace/.gitkeep`
|
|
|
|
- [ ] **Step 1: Create workspace directory**
|
|
|
|
```bash
|
|
mkdir -p workspace
|
|
```
|
|
|
|
- [ ] **Step 2: Create .gitkeep**
|
|
|
|
```bash
|
|
touch workspace/.gitkeep
|
|
```
|
|
|
|
Or on Windows:
|
|
```powershell
|
|
New-Item -ItemType File -Path workspace/.gitkeep -Force
|
|
```

- [ ] **Step 3: Verify directory creation**

```bash
ls -la workspace/
```

Expected: Directory exists with .gitkeep file.

- [ ] **Step 4: Commit**

```bash
git add workspace/.gitkeep
git commit -m "feat: add workspace directory for paper replication projects

Papers placed here will be processed by the replication agents."
```

---

### Task 15: Final Verification

**Files:**
- Read: All created files

- [ ] **Step 1: Verify directory structure**

```bash
find .opencode -type f -name "*.md" | sort
```

Expected output:
```
.opencode/agents/code-writer.md
.opencode/agents/paper-analyzer.md
.opencode/agents/paper-director.md
.opencode/agents/paper-image-extractor.md
.opencode/agents/test-runner.md
.opencode/commands/replicate.md
.opencode/commands/verify.md
.opencode/skills/code-generation/SKILL.md
.opencode/skills/environment-management/SKILL.md
.opencode/skills/paper-parsing/SKILL.md
.opencode/skills/pytorch-patterns/SKILL.md
.opencode/skills/verification/SKILL.md
```
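
The comparison against the expected list can also be automated rather than checked by eye. The sketch below assumes the twelve paths listed above are the complete set. `missing_files` is a hypothetical helper name, and it only reports absent files; it does not inspect their contents.

```python
from pathlib import Path

# Paths relative to the .opencode directory, copied from the expected output above.
EXPECTED = [
    "agents/code-writer.md",
    "agents/paper-analyzer.md",
    "agents/paper-director.md",
    "agents/paper-image-extractor.md",
    "agents/test-runner.md",
    "commands/replicate.md",
    "commands/verify.md",
    "skills/code-generation/SKILL.md",
    "skills/environment-management/SKILL.md",
    "skills/paper-parsing/SKILL.md",
    "skills/pytorch-patterns/SKILL.md",
    "skills/verification/SKILL.md",
]


def missing_files(root, expected=EXPECTED):
    """Return the expected paths (relative to root) that do not exist as regular files."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


# An empty list means every expected file is present.
print(missing_files(".opencode"))
```

Run from the repository root, this prints the paths that still need to be created.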

- [ ] **Step 2: Verify opencode.json**

```bash
python -m json.tool opencode.json
```

Expected: Valid JSON output, no errors.

- [ ] **Step 3: Verify workspace exists**

```bash
ls -a workspace/
```

Expected: `.gitkeep` file present. (The `-a` flag is needed because `ls` hides dotfiles by default.)

- [ ] **Step 4: Run OpenCode to verify agents load**

```bash
opencode --help
```

Then in OpenCode:
```
/help
```

Verify that `/replicate` and `/verify` commands appear.

- [ ] **Step 5: Test agent switching**

In OpenCode, press Tab to cycle through agents. Verify that `paper-director` is available.

- [ ] **Step 6: Test subagent mention**

```
@paper-analyzer Can you help me?
```

Verify that the subagent responds.

- [ ] **Step 7: Final commit summary**

```bash
git log --oneline -15
```

Expected: All feature commits present.

---

## Self-Review Checklist

- [x] **Spec coverage**: All 5 agents, 5 skills, 2 commands, config file defined
- [x] **No placeholders**: All code blocks complete
- [x] **Consistent naming**: Agent/skill names match throughout
- [x] **File paths exact**: All paths specified completely
- [x] **Commits granular**: Each task has a commit step