PaperTool/workspace/resource_allocation/analysis/replication_plan.md
2026-03-31 23:56:36 +08:00

Replication Plan

Scope

The core goal of this replication is to implement the semantic-aware resource allocation algorithm (Hungarian algorithm for channel assignment plus exhaustive search for the optimal k_n) and the transform method used for a fair comparison with conventional systems. Out of scope: DeepSC neural network training and NLP text processing. Instead, the behavior of the pre-trained DeepSC model will be simulated with a parameterized surrogate function or look-up table that maps SNR and k_n to semantic similarity \xi. The user explicitly requested that Figure 2 NOT be reproduced, so the focus is entirely on Figures 3, 4a, 4b, and 4c.

Implementation Order

Module 1: Environment & Channel Simulator

  • File: src/models/environment.py
  • Dependencies: None
  • Test file: tests/test_environment.py
  • Acceptance criteria:
    • Generate N users and M channels with specified bandwidth
    • Apply pathloss (128.1 + 37.6 log10(d) dB, with d in km) and shadow fading (6 dB)
    • Calculate SNR \gamma_{n,m} based on noise power and Rayleigh fading
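
The three criteria above can be sketched as a single function. This is a minimal sketch, not the final environment.py: the function name, the uniform user drop inside a cell, the default transmit power, bandwidth and noise density, and the reading of "6 dB" as the log-normal shadowing standard deviation are all assumptions that the plan does not fix.

```python
import numpy as np

def simulate_snr_db(n_users, n_channels, tx_power_dbm=10.0, bandwidth_hz=180e3,
                    noise_psd_dbm_hz=-174.0, cell_radius_km=0.5,
                    min_dist_km=0.01, shadow_std_db=6.0, rng=None):
    """Return an (n_users, n_channels) matrix of SNR values in dB.

    All default values are illustrative assumptions, not taken from the plan.
    """
    rng = np.random.default_rng(rng)
    # Assumed geometry: users dropped uniformly (in area) around the base station.
    d_km = np.sqrt(rng.uniform(min_dist_km**2, cell_radius_km**2, size=n_users))
    # Pathloss model from the acceptance criteria, d in km.
    pathloss_db = 128.1 + 37.6 * np.log10(d_km)
    # Log-normal shadowing, one draw per user (assumed 6 dB standard deviation).
    shadow_db = rng.normal(0.0, shadow_std_db, size=n_users)
    # Rayleigh fading: the power gain |h|^2 is exponential with unit mean,
    # drawn independently for every (user, channel) pair.
    fading_db = 10.0 * np.log10(rng.exponential(1.0, size=(n_users, n_channels)))
    # Noise power over one channel of the given bandwidth.
    noise_dbm = noise_psd_dbm_hz + 10.0 * np.log10(bandwidth_hz)
    rx_dbm = tx_power_dbm - pathloss_db[:, None] - shadow_db[:, None] + fading_db
    return rx_dbm - noise_dbm
```

Passing an integer as `rng` makes a draw reproducible, which is useful for tests/test_environment.py: two calls with the same seed should return identical matrices.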

Module 2: Semantic Similarity Surrogate

  • File: src/models/semantic_model.py
  • Dependencies: src/models/environment.py
  • Test file: tests/test_semantic_model.py
  • Acceptance criteria:
    • Given SNR and k_n, returns a simulated semantic similarity \xi \in [0, 1]
    • Higher SNR and higher k_n strictly increase \xi
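
One possible shape for the surrogate is a logistic function of a linear combination of SNR (in dB) and k_n. The sketch below is only an illustration of the two criteria above: the function name and the coefficients `a`, `b`, `c` are hypothetical placeholders, not values fitted to DeepSC.

```python
import numpy as np

def semantic_similarity(snr_db, k, a=0.16, b=0.8, c=4.8):
    """Hypothetical surrogate for xi(SNR, k_n), bounded in (0, 1).

    a, b, c are placeholder coefficients (assumptions); they would be
    replaced by values fitted to the reference curves.
    """
    snr_db = np.asarray(snr_db, dtype=float)
    k = np.asarray(k, dtype=float)
    # With a > 0 and b > 0 the logistic is strictly increasing in both inputs.
    return 1.0 / (1.0 + np.exp(-(a * snr_db + b * k - c)))
```

A test for the monotonicity criterion can evaluate the function on a grid and check that differences along both axes are positive. At very high SNR or k_n the output saturates numerically, so the test grid should stay within the operating range.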

Module 3: Resource Allocation Optimizer

  • File: src/models/allocator.py
  • Dependencies: src/models/semantic_model.py, src/models/environment.py
  • Test file: tests/test_allocator.py
  • Acceptance criteria:
    • Implement exhaustive search over integer k_n \in \{1, ..., K\} to find the optimal \widetilde{\Phi}_{n,m} for each user-channel pair
    • Implement Hungarian algorithm for bipartite channel assignment (\alpha_{n,m})
    • Compute overall S-SE for the proposed model and conventional/fixed models
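
The two-stage structure (exhaustive search per user-channel pair, then bipartite matching) can be sketched as follows. Several things here are assumptions rather than part of the plan: the per-pair S-SE is taken to be proportional to \xi / k_n, `toy_xi` is a placeholder surrogate, and the names `allocate` and `info_per_word` are hypothetical. Only `scipy.optimize.linear_sum_assignment` is the library call named in the requirements.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def toy_xi(snr_db, k):
    """Placeholder similarity surrogate (assumption, not fitted to DeepSC)."""
    return 1.0 / (1.0 + np.exp(-(0.16 * snr_db + 0.8 * k - 4.8)))

def allocate(snr_db, k_max, xi=toy_xi, info_per_word=1.0):
    """Return (alpha, k_sel, total) for an (N, M) SNR matrix in dB.

    alpha is the 0/1 assignment matrix, k_sel maps each assigned user to its
    chosen k_n, and total is the summed S-SE of the assigned pairs.
    """
    snr_db = np.asarray(snr_db, dtype=float)
    ks = np.arange(1, k_max + 1)
    # Assumed objective: phi[n, m, i] = info_per_word * xi(snr, k) / k.
    phi = info_per_word * xi(snr_db[:, :, None], ks) / ks
    # Stage 1: exhaustive search over k for every (user, channel) pair.
    best_idx = phi.argmax(axis=2)
    best_phi = phi.max(axis=2)
    # Stage 2: Hungarian-style matching that maximizes the summed S-SE.
    rows, cols = linear_sum_assignment(best_phi, maximize=True)
    alpha = np.zeros(snr_db.shape, dtype=int)
    alpha[rows, cols] = 1
    k_sel = dict(zip(rows.tolist(), ks[best_idx[rows, cols]].tolist()))
    return alpha, k_sel, float(best_phi[rows, cols].sum())
```

Because the matrix may be rectangular, `linear_sum_assignment` assigns only min(N, M) pairs; users left without a channel simply do not appear in `k_sel`. For small N and M, a brute-force enumeration of all assignments is a convenient cross-check in tests/test_allocator.py.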

Module 4: Transform Method & Baselines

  • File: src/models/baselines.py
  • Dependencies: src/models/environment.py
  • Test file: tests/test_baselines.py
  • Acceptance criteria:
    • Implement Ideal Shannon limit SE calculation
    • Implement 4G and 5G CQI to SE mapping lookup
    • Implement transform method: calculate equivalent S-SE given transforming factor \mu
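
For the ideal baseline and the transform step, a minimal sketch is given below. The Shannon formula is standard; the transform itself is an assumption about how the method works, namely that a conventional system spends \mu bits per word, so dividing bit-level SE by \mu gives words per second per hertz, which is then scaled by a hypothetical `info_per_word` factor (semantic units per word). The exact definition should be checked against the paper before it is used for the figures.

```python
import numpy as np

def shannon_se(snr_db):
    """Ideal spectral efficiency log2(1 + SNR) in bits/s/Hz, SNR given in dB."""
    snr_linear = 10.0 ** (np.asarray(snr_db, dtype=float) / 10.0)
    return np.log2(1.0 + snr_linear)

def equivalent_sse(se_bits, mu_bits_per_word, info_per_word=1.0):
    """Assumed transform from bit-level SE to an equivalent S-SE.

    bits/s/Hz divided by bits/word gives words/s/Hz; info_per_word
    (hypothetical parameter) converts words to semantic units.
    """
    return np.asarray(se_bits, dtype=float) * info_per_word / mu_bits_per_word
```

Under this assumed form, the equivalent S-SE of every conventional baseline falls as \mu grows while the semantic system is unaffected, which is consistent with the crossover points expected for Figure 4(c).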

Module 5: Evaluation & Plotting

  • File: src/evaluate.py
  • Dependencies: All of the above
  • Test file: None (creates final plots)
  • Acceptance criteria:
    • Generate outputs corresponding to target Figures 3, 4a, 4b, 4c.

Replication Targets

Figure 3: S-SE of the semantic-aware network with different models

  • Type: Line Plot
  • Data source: Resource allocation output (Module 3) vs fixed k_n baselines
  • Priority: High
  • Expected values: Proposed model S-SE > fixed k_n models. Plateau expected at roughly 1.2 S-SE. (REFERENCE ONLY)

Figure 4(a): S-SE versus the number of channels

  • Type: Line Plot
  • Data source: Evaluation loop varying channels M from 1 to 10
  • Priority: High
  • Expected values: Semantic > Ideal > 5G > 4G for M >= 5. (REFERENCE ONLY)

Figure 4(b): S-SE versus the transmit power

  • Type: Line Plot
  • Data source: Evaluation loop varying transmit power (-40 to 23 dBm)
  • Priority: High
  • Expected values: Semantic plateaus around 10 dBm, Ideal grows continuously and overtakes Semantic. (REFERENCE ONLY)

Figure 4(c): S-SE versus the transforming factor

  • Type: Line Plot
  • Data source: Evaluation loop varying \mu (bits/word) from 18 to 40
  • Priority: High
  • Expected values: Semantic outperforms 5G and 4G for \mu > 19, and outperforms Ideal for \mu > 27. (HIGH Reliability)

Environment Requirements

  • Python >= 3.10
  • NumPy >= 1.23.0
  • SciPy >= 1.9.0 (for linear_sum_assignment)
  • Matplotlib >= 3.6.0

Estimated Effort

  • Core model: 4 hours
  • Training pipeline (Optimization loop): 2 hours
  • Evaluation: 2 hours

Known Challenges

  1. DeepSC Simulator Approximation: The exact DeepSC performance curve is not given analytically. Mitigation: We will fit a parameterized logistic (sigmoid) curve that approximates the mapping from SNR and k_n to \xi, using values read visually from Figure 2.
  2. 3GPP Tables for 4G/5G: The 4G and 5G baselines need the CQI threshold tables from 3GPP TS 36.213 and TS 38.214. Mitigation: Implement an approximate step function that matches realistic SE/CQI curves for these specifications.
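
The step-function mitigation for the 4G baseline could look like the sketch below. The efficiency values are intended to follow the 4-bit CQI table of 3GPP TS 36.213 and should be checked against the specification before use. The SNR switching thresholds are purely hypothetical placeholders (evenly spaced), since the specification does not fix them, and the names `CQI_EFFICIENCY`, `CQI_SNR_THRESHOLDS_DB` and `cqi_se` are illustrative. A 5G variant would swap in the chosen TS 38.214 table.

```python
import numpy as np

# Efficiency (bits/s/Hz) for CQI 1..15; verify against TS 36.213 before use.
CQI_EFFICIENCY = np.array([
    0.1523, 0.2344, 0.3770, 0.6016, 0.8770,
    1.1758, 1.4766, 1.9141, 2.4063, 2.7305,
    3.3223, 3.9023, 4.5234, 5.1152, 5.5547,
])
# Hypothetical minimum SNR (dB) at which each CQI becomes usable.
CQI_SNR_THRESHOLDS_DB = np.linspace(-6.0, 20.0, 15)

def cqi_se(snr_db):
    """Step-function SE: efficiency of the highest CQI whose threshold is met.

    Below the first threshold the link is treated as out of range (SE = 0).
    """
    snr_db = np.asarray(snr_db, dtype=float)
    # idx = number of thresholds that are <= snr_db, i.e. the CQI index (0..15).
    idx = np.searchsorted(CQI_SNR_THRESHOLDS_DB, snr_db, side="right")
    table = np.concatenate(([0.0], CQI_EFFICIENCY))
    return table[idx]
```

The result is non-decreasing in SNR and saturates at the top CQI entry, unlike the ideal Shannon baseline, which keeps growing with SNR.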