hc cd6e1ebd27 feat(skills): add code-generation skill

2026-03-31 17:39:57 +08:00

name	description
code-generation	Use when generating PyTorch code from paper analysis to ensure correct mapping from paper to code

Code Generation from Papers

Overview

Guidelines for translating paper descriptions into working PyTorch code.

Announce at start: "I'm using the code-generation skill to ensure accurate paper-to-code translation."

Core Principles

Traceability: Every code block should reference the paper section or equation it implements
Testability: Write code that can be unit tested
Readability: Prefer clarity over cleverness
Modularity: One component per file

Paper-to-Code Mapping

Architecture Diagrams → nn.Module

Diagram Element	PyTorch Equivalent
Box/Block	nn.Module subclass
Arrow	forward() call chain
Split	Multiple outputs / tuple
Merge	torch.cat / torch.add
Skip connection	Residual addition
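
The mapping above can be sketched on a small made-up diagram: one input that splits into two branches, merges by concatenation, and is wrapped in a skip connection. `TwoBranchBlock` and its layer names are hypothetical and do not come from any particular paper.

```python
import torch
import torch.nn as nn


class TwoBranchBlock(nn.Module):
    """Hypothetical block: split -> two branches -> merge (concat) -> skip."""

    def __init__(self, dim: int):
        super().__init__()
        # Box/Block: each box in the diagram becomes a submodule.
        self.branch_a = nn.Linear(dim, dim)
        self.branch_b = nn.Linear(dim, dim)
        # Concatenation doubles the feature size, so project back to dim.
        self.project = nn.Linear(2 * dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split: the same input feeds both branches.
        a = self.branch_a(x)
        b = self.branch_b(x)
        # Merge: concatenate along the feature axis.
        merged = torch.cat([a, b], dim=-1)
        # Skip connection: add the block input back (residual addition).
        return self.project(merged) + x


x = torch.randn(4, 10, 16)
out = TwoBranchBlock(dim=16)(x)
```

If the diagram merges by addition instead, `a + b` replaces `torch.cat` and the projection layer is no longer needed to restore the width.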

Equations → Tensor Operations

Notation	PyTorch
`Wx + b`	`nn.Linear(in, out)`
`\sigma(x)`	`torch.sigmoid(x)` or `nn.Sigmoid()`
`\text{softmax}(x)`	`F.softmax(x, dim=-1)`
`\|x\|_2`	`torch.linalg.vector_norm(x, ord=2)` (`torch.norm` is deprecated)
`x \odot y`	`x * y` (element-wise)
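`x^T y`	`torch.dot(x, y)` for 1-D vectors; `x.T @ y` for 2-D column vectors
`\sum_i x_i`	`torch.sum(x, dim=d)`, where `d` is the axis indexed by `i`
`\mathbb{E}[x]`	`torch.mean(x)` (sample estimate; pass `dim` for the axis the expectation is taken over)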
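
A few of the rows above, checked on small random tensors. This is a sketch only: the tensor sizes and variable names are made up, and the choice of reduction axis is an assumption that has to be matched to the paper's index convention.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(5)
y = torch.randn(5)

# x^T y for 1-D vectors is the inner product.
inner = torch.dot(x, y)
# The same value via explicit column vectors of shape (5, 1).
inner_2d = (x.unsqueeze(1).T @ y.unsqueeze(1)).squeeze()

# ||x||_2
l2 = torch.linalg.vector_norm(x, ord=2)

# x ⊙ y (element-wise product)
hadamard = x * y

# softmax over the last axis
probs = F.softmax(x, dim=-1)

# Reductions on a (batch, features) tensor: the axis must be chosen explicitly.
batch = torch.randn(3, 5)
row_sums = batch.sum(dim=1)     # sum over the feature index, shape (3,)
batch_mean = batch.mean(dim=0)  # sample mean over the batch, shape (5,)
```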

Loss Functions

Paper Description	PyTorch
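Cross-entropy	`nn.CrossEntropyLoss()` (expects raw logits; applies log-softmax internally)
MSE / L2	`nn.MSELoss()`
L1	`nn.L1Loss()`
BCE	`nn.BCEWithLogitsLoss()` on logits, or `nn.BCELoss()` on probabilities
KL divergence	`nn.KLDivLoss(reduction="batchmean")` (input must be log-probabilities)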
Custom	Subclass or functional
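
The input conventions of these losses are easy to get wrong, so a short sketch may help. It compares the logit and probability forms of BCE, checks `nn.KLDivLoss` against the textbook formula, and shows one way to write a custom loss as a subclass. `WeightedSumLoss` and its `lam` weight are hypothetical, not taken from any paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WeightedSumLoss(nn.Module):
    """Hypothetical custom loss: L = L_mse + lam * L_l1."""

    def __init__(self, lam: float = 0.1):
        super().__init__()
        self.lam = lam

    def forward(self, pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        return F.mse_loss(pred, target) + self.lam * F.l1_loss(pred, target)


torch.manual_seed(0)
logits = torch.randn(8, 1)
targets = torch.randint(0, 2, (8, 1)).float()

# BCEWithLogitsLoss takes raw logits; BCELoss takes probabilities in [0, 1].
loss_from_logits = nn.BCEWithLogitsLoss()(logits, targets)
loss_from_probs = nn.BCELoss()(torch.sigmoid(logits), targets)

# KLDivLoss(input, target) computes KL(target || input distribution);
# input is given as log-probabilities, target as probabilities.
p = F.softmax(torch.randn(8, 5), dim=-1)
log_q = F.log_softmax(torch.randn(8, 5), dim=-1)
kl = nn.KLDivLoss(reduction="batchmean")(log_q, p)
kl_manual = (p * (p.log() - log_q)).sum(dim=-1).mean()

custom = WeightedSumLoss(lam=0.5)(torch.zeros(4), torch.ones(4))
```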

Code Structure Template

"""
{component_name}.py

Implements {what} from "{paper_title}" ({year})

Paper Reference:
- Section: {section_number}
- Equation: ({equation_number})
- Figure: {figure_number}

Author: Auto-generated for paper replication
"""

import torch
import torch.nn as nn
import torch.nn.functional as F
from typing import Optional, Tuple, List


class {ComponentName}(nn.Module):
    """
    {One-line description}
    
    From paper: "{exact quote or paraphrase}"
    
    Args:
        {param1}: {description} (paper: {where specified})
        {param2}: {description}
    
    Shape:
        - Input: {shape description}
        - Output: {shape description}
    
    Example:
        >>> layer = {ComponentName}(dim=512)
        >>> x = torch.randn(32, 100, 512)
        >>> out = layer(x)
        >>> out.shape
        torch.Size([32, 100, 512])
    """
    
    def __init__(
        self,
        {param1}: {type},
        {param2}: {type} = {default},
    ):
        super().__init__()
        
        # Paper Section X.Y: "{description}"
        self.layer1 = nn.Linear(...)
        
        # Equation (N): ...
        self.layer2 = nn.LayerNorm(...)
        
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Forward pass implementing Equation (N).
        
        Args:
            x: Input tensor of shape (batch, seq, dim)
            
        Returns:
            Output tensor of shape (batch, seq, dim)
        """
        # Step 1: ... (Eq. N, first term)
        h = self.layer1(x)
        
        # Step 2: ... (Eq. N, second term)
        out = self.layer2(h)
        
        return out
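
As an illustration, here is the template filled in for one well-known component, the position-wise feed-forward network from "Attention Is All You Need" (2017). The section and equation numbers in the comments are given as recalled and should be checked against the paper. The small sizes in the usage lines are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PositionwiseFeedForward(nn.Module):
    """
    Position-wise feed-forward network.

    From paper: FFN(x) = max(0, x W1 + b1) W2 + b2
    ("Attention Is All You Need", 2017, Section 3.3, Eq. 2)

    Args:
        d_model: Model width (paper: 512 in the base model)
        d_ff: Hidden width (paper: 2048 in the base model)

    Shape:
        - Input: (batch, seq, d_model)
        - Output: (batch, seq, d_model)
    """

    def __init__(self, d_model: int = 512, d_ff: int = 2048):
        super().__init__()
        # Eq. (2), inner affine map: x W1 + b1
        self.w1 = nn.Linear(d_model, d_ff)
        # Eq. (2), outer affine map: (.) W2 + b2
        self.w2 = nn.Linear(d_ff, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Step 1: max(0, x W1 + b1)
        h = F.relu(self.w1(x))
        # Step 2: h W2 + b2
        return self.w2(h)


layer = PositionwiseFeedForward(d_model=64, d_ff=256)
out = layer(torch.randn(2, 10, 64))
```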

Common Patterns

Residual Connection

# Paper: "We add a residual connection"
out = self.sublayer(x) + x

Layer Normalization

# Paper: "Pre-LN Transformer" (normalize inside the residual branch)
x = x + self.attention(self.norm(x))

# Paper: "Post-LN Transformer"
x = x + self.attention(x)
x = self.norm(x)
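
The two orderings differ only in where the normalization sits relative to the residual addition. A minimal sketch with a hypothetical `SublayerWrapper` (the class name and the `pre_ln` flag are illustrative) makes the difference explicit:

```python
import torch
import torch.nn as nn


class SublayerWrapper(nn.Module):
    """Hypothetical wrapper: LayerNorm plus residual around any sublayer."""

    def __init__(self, dim: int, sublayer: nn.Module, pre_ln: bool = True):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.sublayer = sublayer
        self.pre_ln = pre_ln

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.pre_ln:
            # Pre-LN: normalize the sublayer input; the residual path stays unnormalized.
            return x + self.sublayer(self.norm(x))
        # Post-LN: add the residual first, then normalize the sum.
        return self.norm(x + self.sublayer(x))


torch.manual_seed(0)
x = torch.randn(2, 5, 16)
pre_out = SublayerWrapper(16, nn.Linear(16, 16), pre_ln=True)(x)
post_out = SublayerWrapper(16, nn.Linear(16, 16), pre_ln=False)(x)
```

With the default LayerNorm parameters, the Post-LN output has approximately zero mean over the feature axis, while the Pre-LN output does not, because the raw `x` is added back afterwards.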

Multi-Head Attention

# Paper: "Standard multi-head attention with h heads"
self.attention = nn.MultiheadAttention(
    embed_dim=d_model,
    num_heads=h,
    dropout=dropout,
    batch_first=True,
)
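
One detail worth noting when wiring this into `forward()`: `nn.MultiheadAttention` takes separate query, key, and value arguments and returns a tuple of the output and the attention weights. A self-attention call might look like the sketch below; the sizes are arbitrary and dropout is set to zero only to keep the example deterministic.

```python
import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=64, num_heads=8, dropout=0.0, batch_first=True)

# With batch_first=True the expected layout is (batch, seq, embed_dim).
x = torch.randn(2, 10, 64)

# Self-attention: query, key, and value are all the same tensor.
attn_out, attn_weights = mha(x, x, x)
```

By default the returned weights are averaged over heads, giving shape `(batch, target_len, source_len)`.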

Custom Activation

# Paper: "We use GELU activation"
x = F.gelu(x)

# Paper: "We use Swish/SiLU activation"
x = F.silu(x)
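
When a paper gives the activation as a formula rather than a name, it can help to check the closed form against the built-in. The sketch below does this for SiLU and GELU. It also computes the tanh approximation of GELU, which some implementations use, so the variant is worth confirming against the paper or its code.

```python
import math
import torch
import torch.nn.functional as F

x = torch.linspace(-3.0, 3.0, steps=13)

# SiLU / Swish: x * sigmoid(x)
silu_manual = x * torch.sigmoid(x)

# Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2)))
gelu_exact = 0.5 * x * (1.0 + torch.erf(x / math.sqrt(2.0)))

# Tanh approximation of GELU, selected with the `approximate` argument.
gelu_tanh = F.gelu(x, approximate="tanh")

# Largest gap between the exact and approximate forms on this grid.
max_gap = (F.gelu(x) - gelu_tanh).abs().max()
```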

Handling Ambiguity

When the paper is unclear:

Check the code repository if available
Follow common practice for the architecture type
Document the assumption in a code comment
Add a TODO for verification

# TODO: Paper unclear on initialization. Using PyTorch default.
# See: https://github.com/paper/repo for reference implementation
self.linear = nn.Linear(in_dim, out_dim)
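
If common practice for the architecture suggests a specific scheme, the assumption can be made explicit in code instead of relying on the default. The sketch below uses Xavier (Glorot) uniform initialization purely as an illustrative choice. The sizes are hypothetical, and nothing here implies that any given paper uses this scheme.

```python
import math
import torch
import torch.nn as nn

in_dim, out_dim = 32, 64  # hypothetical sizes for illustration
linear = nn.Linear(in_dim, out_dim)

# ASSUMPTION: the paper does not state the initialization; Xavier uniform
# is used here as a common default for this kind of layer.
# TODO: verify against the reference implementation if one exists.
nn.init.xavier_uniform_(linear.weight)
nn.init.zeros_(linear.bias)

# Xavier uniform samples from U(-bound, bound) with this bound (gain = 1).
bound = math.sqrt(6.0 / (in_dim + out_dim))
```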

Verification Checklist

Before completing a module:

All equations implemented
Shapes documented and verified
Paper references in comments
Type hints complete
Example in docstring works
No hardcoded dimensions (use params)
Gradient flow verified (no in-place ops breaking autograd)
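
The shape and gradient-flow items can be checked mechanically. A minimal sketch, assuming a hypothetical helper named `check_module`: it runs one forward and backward pass and lists any trainable parameters that received no gradient. An in-place op that invalidates autograd typically raises a `RuntimeError` during `backward()`, so it would surface here as well.

```python
import torch
import torch.nn as nn


def check_module(module: nn.Module, x: torch.Tensor):
    """Hypothetical helper: one forward/backward pass, then report gradient coverage."""
    x = x.clone().requires_grad_(True)
    out = module(x)
    # Reduce to a scalar so backward() needs no explicit gradient argument.
    out.sum().backward()
    missing = [
        name
        for name, p in module.named_parameters()
        if p.requires_grad and p.grad is None
    ]
    return out.shape, missing, x.grad is not None


model = nn.Sequential(nn.Linear(16, 32), nn.GELU(), nn.Linear(32, 16))
out_shape, missing_grads, input_has_grad = check_module(model, torch.randn(4, 10, 16))
```

An empty `missing_grads` list means every trainable parameter took part in the computation. A non-empty list usually points to a layer that is defined in `__init__` but never used in `forward()`.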
