Root cause: test-runner was giving overly optimistic results due to:
1. Context bias - knew the implementation, tended to defend it
2. No actual visual comparison - just wrote 'ACCEPTABLE' without looking
3. No structural validation - accepted 35x scale differences as 'acceptable'

Solution:
- New result-verifier agent that performs blind visual comparison
- Strict pass/fail criteria for structural validation
- Updated test-runner to use result-verifier for each figure
- Clear guidelines: structural mismatches = FAIL, not ACCEPTABLE

Test result: verifier correctly identified Fig3 as FAIL with 7 specific issues:
- Wrong X-axis variable (channels vs power)
- Wrong Y-axis scale (5x difference)
- Wrong curve count (5 vs 4)
- etc.

90 lines
4.3 KiB
Markdown
# Replication Plan

## Scope

The core goal of this replication is to implement the semantic-aware resource allocation algorithm (Hungarian algorithm for channel assignment plus exhaustive search for the optimal $k_n$) and the transform method for fair comparison.

**Out of scope:** DeepSC neural network training and NLP text processing. Instead, we will simulate the behavior of the pre-trained DeepSC with a parameterized surrogate function or look-up table that maps SNR and $k_n$ to semantic similarity ($\xi$). The user explicitly requested that Figure 2 NOT be reproduced, so the focus is entirely on Figures 3, 4a, 4b, and 4c.

## Implementation Order

### Module 1: Environment & Channel Simulator
- **File**: `src/models/environment.py`
- **Dependencies**: None
- **Test file**: `tests/test_environment.py`
- **Acceptance criteria**:
  - [ ] Generate N users and M channels with specified bandwidth
  - [ ] Apply pathloss ($128.1 + 37.6 \log_{10} d$ dB, with $d$ in km) and shadow fading (6 dB)
  - [ ] Calculate the per-user, per-channel SNR $\gamma_{n,m}$ from transmit power, noise power, and Rayleigh fading

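The Module 1 criteria can be sketched as follows. This is a minimal illustration, not the final `environment.py`: the function name `snr_matrix`, the uniform-in-area user placement, the cell radius, the noise density of -174 dBm/Hz, and the reading of "6 dB" as the standard deviation of log-normal shadowing are all assumptions made for the example.

```python
import numpy as np

def snr_matrix(n_users, n_channels, tx_power_dbm, bandwidth_hz,
               noise_psd_dbm_hz=-174.0, cell_radius_km=0.5,
               shadow_std_db=6.0, rng=None):
    """Return an (n_users, n_channels) array of linear SNR values.

    All default parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Assumed placement: users uniform over a disc, kept away from d = 0
    # so that log10(d) stays finite.
    d_km = cell_radius_km * np.sqrt(rng.uniform(0.01, 1.0, size=n_users))

    # Large-scale loss per user: pathloss plus log-normal shadowing (in dB).
    pathloss_db = 128.1 + 37.6 * np.log10(d_km)
    shadow_db = rng.normal(0.0, shadow_std_db, size=n_users)

    # Rayleigh fading: the power gain |h|^2 is exponential with unit mean,
    # drawn independently for every (user, channel) pair.
    fading_gain = rng.exponential(1.0, size=(n_users, n_channels))

    # Noise power over the channel bandwidth, in dBm.
    noise_dbm = noise_psd_dbm_hz + 10.0 * np.log10(bandwidth_hz)

    mean_snr_db = tx_power_dbm - pathloss_db - shadow_db - noise_dbm
    return 10.0 ** (mean_snr_db[:, None] / 10.0) * fading_gain
```

Passing a seeded generator, for example `rng=np.random.default_rng(0)`, makes the draw reproducible, which is convenient for the checks in `tests/test_environment.py`.
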
### Module 2: Semantic Similarity Surrogate
- **File**: `src/models/semantic_model.py`
- **Dependencies**: `src/models/environment.py`
- **Test file**: `tests/test_semantic_model.py`
- **Acceptance criteria**:
  - [ ] Given SNR and $k_n$, returns a simulated semantic similarity $\xi \in [0, 1]$
  - [ ] $\xi$ is strictly increasing in both SNR and $k_n$

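One possible shape for the surrogate is a sigmoid in SNR whose midpoint shifts with $k_n$. The sketch below is only an illustration of the two acceptance criteria: the function name and the coefficients `a`, `b`, and `c` are hypothetical placeholders, not values fitted to DeepSC.

```python
import numpy as np

def semantic_similarity(snr_db, k, a=0.5, b=2.0, c=12.0):
    """Hypothetical surrogate for xi(SNR, k).

    a, b, c are placeholder coefficients (assumptions), to be replaced
    by values fitted to the reference curves.
    """
    snr_db = np.asarray(snr_db, dtype=float)
    k = np.asarray(k, dtype=float)
    # The sigmoid argument grows with both SNR and k, so xi is strictly
    # increasing in each of them and always lies strictly between 0 and 1.
    return 1.0 / (1.0 + np.exp(-a * (snr_db + b * k - c)))
```

Because the inputs broadcast, one call such as `semantic_similarity(snr[:, None], ks[None, :])` evaluates the whole SNR-by-$k_n$ grid, which is what the monotonicity tests need.
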
### Module 3: Resource Allocation Optimizer
- **File**: `src/models/allocator.py`
- **Dependencies**: `src/models/semantic_model.py`, `src/models/environment.py`
- **Test file**: `tests/test_allocator.py`
- **Acceptance criteria**:
  - [ ] Implement exhaustive search over $k_n \in \{1, \dots, K\}$ to find the optimal $\widetilde{\Phi}_{n,m}$ for each user-channel pair
  - [ ] Implement Hungarian algorithm for bipartite channel assignment ($\alpha_{n,m}$)
  - [ ] Compute overall S-SE for the proposed model and conventional/fixed models

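The two optimization steps in Module 3 can be sketched together. The example assumes that the per-pair S-SE has the form `info_per_word * xi / k` (semantic information per word times similarity, divided by symbols per word); that form, the helper `toy_xi`, and the function name `allocate` are assumptions for illustration. The assignment step uses SciPy's `linear_sum_assignment`, which solves the same bipartite assignment problem as the Hungarian algorithm.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def toy_xi(snr_db, k):
    """Hypothetical similarity surrogate, increasing in SNR and k."""
    return 1.0 / (1.0 + np.exp(-0.5 * (snr_db + 2.0 * k - 12.0)))

def allocate(snr_db, xi_fn, k_max, info_per_word=1.0):
    """Pick k per (user, channel) pair, then assign channels to users.

    snr_db: (n_users, n_channels) array of SNR values in dB.
    Returns (users, channels, k_opt, total_sse).
    """
    ks = np.arange(1, k_max + 1)

    # Exhaustive search: evaluate the assumed S-SE for every
    # (user, channel, k) triple and keep the best k per pair.
    xi = xi_fn(snr_db[:, :, None], ks[None, None, :])
    sse = info_per_word * xi / ks
    best_k = ks[sse.argmax(axis=2)]
    best_sse = sse.max(axis=2)

    # Assignment: one distinct channel per user, maximizing the sum S-SE.
    users, channels = linear_sum_assignment(best_sse, maximize=True)
    return users, channels, best_k[users, channels], best_sse[users, channels].sum()

snr_db = np.array([[0.0, 10.0, 5.0],
                   [12.0, 3.0, 8.0]])
users, channels, k_opt, total = allocate(snr_db, toy_xi, k_max=8)
print(users, channels, k_opt, total)
```

Precomputing the best $k_n$ for each pair before the assignment keeps the two steps decoupled: the assignment only sees an N-by-M matrix of achievable S-SE values.
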
### Module 4: Transform Method & Baselines
- **File**: `src/models/baselines.py`
- **Dependencies**: `src/models/environment.py`
- **Test file**: `tests/test_baselines.py`
- **Acceptance criteria**:
  - [ ] Implement Ideal Shannon limit SE calculation
  - [ ] Implement 4G and 5G CQI to SE mapping lookup
  - [ ] Implement transform method: calculate equivalent S-SE given transforming factor $\mu$

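A minimal sketch of the Shannon baseline and the transform step follows. The Shannon formula is standard; the transform is written under the assumption that a conventional system carrying `se_bits` bits/s/Hz delivers `se_bits / mu` words/s/Hz, scaled by the semantic information per word and the achieved similarity. The function names and the `info_per_word` and `xi` parameters are hypothetical and should be checked against the reference paper before use.

```python
import numpy as np

def shannon_se(snr_linear):
    """Ideal spectral efficiency in bits/s/Hz for a linear SNR."""
    return np.log2(1.0 + np.asarray(snr_linear, dtype=float))

def equivalent_sse(se_bits, mu, info_per_word=1.0, xi=1.0):
    """Assumed transform: bit-level SE to equivalent S-SE.

    mu is the transforming factor in bits/word. info_per_word and xi
    are illustrative parameters, not values taken from the paper.
    """
    return np.asarray(se_bits, dtype=float) * info_per_word * xi / mu

mu_values = np.arange(18, 41)                      # sweep used for Figure 4(c)
ideal_curve = equivalent_sse(shannon_se(10.0), mu_values)
```

Under this assumed form the conventional curves fall as $\mu$ grows, which matches the qualitative expectation that the semantic system overtakes the baselines at larger $\mu$.
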
### Module 5: Evaluation & Plotting
- **File**: `src/evaluate.py`
- **Dependencies**: All of the above
- **Test file**: None (creates final plots)
- **Acceptance criteria**:
  - [ ] Generate outputs corresponding to target Figures 3, 4a, 4b, 4c.

## Replication Targets

### Figure 3: S-SE of the semantic-aware network with different models
- **Type**: Line Plot
- **Data source**: Resource allocation output (Module 3) vs fixed $k_n$ baselines
- **Priority**: High
- **Expected values**: Proposed model S-SE > fixed $k_n$ models. Plateau expected at an S-SE of roughly 1.2. (REFERENCE ONLY)

### Figure 4(a): S-SE versus the number of channels
- **Type**: Line Plot
- **Data source**: Evaluation loop varying the number of channels $M$ from 1 to 10
- **Priority**: High
- **Expected values**: Semantic > Ideal > 5G > 4G for $M \geq 5$. (REFERENCE ONLY)

### Figure 4(b): S-SE versus the transmit power
- **Type**: Line Plot
- **Data source**: Evaluation loop varying transmit power (-40 to 23 dBm)
- **Priority**: High
- **Expected values**: Semantic plateaus around 10 dBm, Ideal grows continuously and overtakes Semantic. (REFERENCE ONLY)

### Figure 4(c): S-SE versus the transforming factor
- **Type**: Line Plot
- **Data source**: Evaluation loop varying $\mu$ (bits/word) from 18 to 40
- **Priority**: High
- **Expected values**: Semantic outperforms 5G and 4G for $\mu > 19$, and outperforms Ideal for $\mu > 27$. (HIGH Reliability)

## Environment Requirements
- Python >= 3.10
- NumPy >= 1.23.0
- SciPy >= 1.9.0 (for `linear_sum_assignment`)
- Matplotlib >= 3.6.0

## Estimated Effort
- Core model: 4 hours
- Training pipeline (Optimization loop): 2 hours
- Evaluation: 2 hours

## Known Challenges
1. DeepSC Simulator Approximation: The exact DeepSC performance curve is not provided analytically. Mitigation: We will fit a parameterized logistic/sigmoid curve that approximates the mapping from SNR and $k_n$ to $\xi$, using values read visually from Figure 2.
2. 3GPP Tables for 4G/5G: The CQI mappings in 3GPP TS 36.213 and 38.214 require specific threshold tables. Mitigation: Implement an approximate step function matching realistic SE/CQI curves for these specifications.
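
The step-function mitigation for the CQI mapping could look like the sketch below. The threshold and efficiency arrays are HYPOTHETICAL placeholders chosen only to show the lookup mechanics; they are not values from 3GPP TS 36.213 or 38.214 and must be replaced with the real tables.

```python
import numpy as np

# HYPOTHETICAL placeholder values, not taken from any 3GPP table.
SNR_THRESHOLDS_DB = np.array([-6.0, 0.0, 6.0, 12.0, 18.0])
SE_LEVELS = np.array([0.0, 0.15, 0.9, 1.9, 3.3, 5.5])  # bits/s/Hz

def cqi_se(snr_db):
    """Map SNR in dB to spectral efficiency with a step function.

    An SNR below the first threshold maps to SE_LEVELS[0] (no transmission);
    each threshold crossed moves the result one level up.
    """
    snr_db = np.asarray(snr_db, dtype=float)
    # side="right" counts the thresholds that are <= snr_db.
    level = np.searchsorted(SNR_THRESHOLDS_DB, snr_db, side="right")
    return SE_LEVELS[level]
```

Keeping the tables as module-level arrays means the 4G and 5G variants can share the same lookup function and differ only in the data they pass in.
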