PaperTool/replication_plan.md at ced50ea2b0e49b322c0b1fa71eba53a775b67a4b

hc ced50ea2b0 feat(agent): add result-verifier for blind visual comparison

Root cause: test-runner was giving overly optimistic results due to:
1. Context bias - it knew the implementation and tended to defend it
2. No actual visual comparison - it wrote 'ACCEPTABLE' without inspecting the figures
3. No structural validation - it accepted 35x scale differences as 'acceptable'

Solution:
- New result-verifier agent that performs blind visual comparison
- Strict pass/fail criteria for structural validation
- Updated test-runner to use result-verifier for each figure
- Clear guidelines: structural mismatches = FAIL, not ACCEPTABLE

Test result: verifier correctly identified Fig3 as FAIL with 7 specific issues:
- Wrong X-axis variable (channels vs power)
- Wrong Y-axis scale (5x difference)
- Wrong curve count (5 vs 4)
- etc.
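
The structural rule above (any structural mismatch means FAIL, never ACCEPTABLE) can be sketched as a small check. This is only an illustration, not the result-verifier's actual code: `FigureSummary`, `structural_issues`, `verdict`, the `max_scale_ratio` threshold of 2.0, and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FigureSummary:
    """Hypothetical structural summary of one plot (not taken from the repository)."""
    x_variable: str
    y_max: float
    curve_count: int


def structural_issues(reference, candidate, max_scale_ratio=2.0):
    """Return a list of structural mismatches between two figure summaries."""
    issues = []
    if reference.x_variable != candidate.x_variable:
        issues.append(
            f"Wrong X-axis variable ({candidate.x_variable} vs {reference.x_variable})"
        )
    # Compare Y-axis extents as a ratio so the check is symmetric in both figures.
    hi = max(abs(reference.y_max), abs(candidate.y_max))
    lo = min(abs(reference.y_max), abs(candidate.y_max))
    if hi > 0 and (lo == 0 or hi / lo > max_scale_ratio):
        ratio = "unbounded" if lo == 0 else f"{hi / lo:g}x"
        issues.append(f"Wrong Y-axis scale ({ratio} difference)")
    if reference.curve_count != candidate.curve_count:
        issues.append(
            f"Wrong curve count ({candidate.curve_count} vs {reference.curve_count})"
        )
    return issues


def verdict(issues):
    """Any structural mismatch is a FAIL; there is no ACCEPTABLE middle ground."""
    return "FAIL" if issues else "PASS"


# Illustrative values only, loosely modelled on the mismatches listed above.
paper = FigureSummary(x_variable="power", y_max=10.0, curve_count=4)
ours = FigureSummary(x_variable="channels", y_max=50.0, curve_count=5)
found = structural_issues(paper, ours)
print(verdict(found), found)
```

Keeping the verdict binary and derived only from the summaries means the checker needs no knowledge of the implementation, which is the point of the blind comparison.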


Replication Plan

Scope

Implementation Order

Module 1: Environment & Channel Simulator

Module 2: Semantic Similarity Surrogate

Module 3: Resource Allocation Optimizer

Module 4: Transform Method & Baselines

Module 5: Evaluation & Plotting

Replication Targets

Figure 3: S-SE of the semantic-aware network with different models

Figure 4(a): S-SE versus the number of channels

Figure 4(b): S-SE versus the transmit power

Figure 4(c): S-SE versus the transforming factor

Environment Requirements

Estimated Effort

Known Challenges
