PaperTool/workspace/resource_allocation/analysis/image_understanding.md
hc ced50ea2b0 feat(agent): add result-verifier for blind visual comparison
Root cause: test-runner was giving overly optimistic results due to:
1. Context bias - knew the implementation, tended to defend it
2. No actual visual comparison - just wrote 'ACCEPTABLE' without looking
3. No structural validation - accepted 35x scale differences as 'acceptable'

Solution:
- New result-verifier agent that performs blind visual comparison
- Strict pass/fail criteria for structural validation
- Updated test-runner to use result-verifier for each figure
- Clear guidelines: structural mismatches = FAIL, not ACCEPTABLE

Test result: verifier correctly identified Fig3 as FAIL with 7 specific issues:
- Wrong X-axis variable (channels vs power)
- Wrong Y-axis scale (5x difference)
- Wrong curve count (5 vs 4)
- etc.
2026-03-31 23:56:36 +08:00
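
The rule stated in the commit message above (structural mismatches are FAIL, not ACCEPTABLE) could be sketched roughly as follows. This is an illustrative sketch only, not the actual result-verifier agent: `FigureStructure`, `verify_structure`, and the `max_scale_ratio` threshold of 1.5 are all hypothetical names and values, and the numbers in the usage lines are made up for demonstration.

```python
from dataclasses import dataclass


@dataclass
class FigureStructure:
    """Structural facts read off one figure (hypothetical field names)."""
    x_variable: str    # quantity on the X axis, e.g. "channels" or "power"
    y_max: float       # largest value on the Y axis
    curve_count: int   # number of plotted curves


def verify_structure(reference, reproduced, max_scale_ratio=1.5):
    """Return (verdict, issues); any structural mismatch yields "FAIL".

    max_scale_ratio is an assumed tolerance, not a value from the repository.
    """
    issues = []
    if reference.x_variable != reproduced.x_variable:
        issues.append(
            f"Wrong X-axis variable "
            f"({reproduced.x_variable} vs {reference.x_variable})"
        )
    # Ratio of the larger Y range to the smaller; 1.0 means identical scale.
    larger = max(reference.y_max, reproduced.y_max)
    smaller = min(reference.y_max, reproduced.y_max)
    ratio = larger / smaller if smaller > 0 else float("inf")
    if ratio > max_scale_ratio:
        issues.append(f"Wrong Y-axis scale ({ratio:.1f}x difference)")
    if reference.curve_count != reproduced.curve_count:
        issues.append(
            f"Wrong curve count "
            f"({reproduced.curve_count} vs {reference.curve_count})"
        )
    return ("FAIL" if issues else "PASS"), issues


# Made-up inputs for demonstration only.
reference = FigureStructure(x_variable="channels", y_max=1.0, curve_count=5)
reproduced = FigureStructure(x_variable="power", y_max=5.0, curve_count=4)
verdict, issues = verify_structure(reference, reproduced)
print(verdict)
for issue in issues:
    print("-", issue)
```

Because a single issue is enough to fail the figure, the function has no intermediate verdict for a structural mismatch, which matches the intent of the commit.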

# Image Understanding
## Summary
- Total images: 6
- Architecture diagrams: 1
- Experiment figures: 5
- Other: 0
---
## Figure 1: The structure of semantic-aware networks
**Type**: Architecture
**Priority**: LOW
**Key insight**: Shows a base station communicating with multiple users. Each user's device generates semantic symbols with a neural network model before transmission.
## Figure 2: The semantic similarity for DeepSC
**Type**: Plot
**Priority**: MEDIUM
**Key insight**: 3D surface plot showing how semantic similarity ($\xi_{n,m}$) depends on SNR (-10 to 20 dB) and the number of symbols per word ($k_n$, 0 to 20). Semantic similarity approaches 1.0 as SNR and $k_n$ increase.
## Figure 3: The S-SE of the semantic-aware network with different models
**Type**: Plot
**Priority**: HIGH
**Key insight**: Line plot showing S-SE ($\Phi$) vs Number of channels ($M$). The proposed model achieves the highest S-SE (plateauing at 1.2), significantly outperforming conventional models with various fixed $k_n$ values.
## Figure 4(a): The S-SE versus the number of channels
**Type**: Plot
**Priority**: HIGH
**Key insight**: Compares Semantic, Ideal, 5G, and 4G systems. Semantic achieves the highest S-SE (1.2 at $M \ge 5$), followed by Ideal, 5G, and 4G.
## Figure 4(b): The S-SE versus the transmit power
**Type**: Plot
**Priority**: HIGH
**Key insight**: S-SE vs Transmit power (-40 to 23 dBm). The Semantic system quickly rises and plateaus around 10 dBm, outperforming 4G and 5G. The Ideal system grows continuously and overtakes the Semantic system at very high transmit power (around 18-20 dBm).
## Figure 4(c): The S-SE versus the transforming factor
**Type**: Plot
**Priority**: HIGH
**Key insight**: S-SE vs Transforming factor $\mu$ (bits/word) from 18 to 40. Semantic performance is constant (~1.18). Ideal, 5G, and 4G S-SE decrease as $\mu$ increases. Semantic outperforms 5G and 4G for $\mu > 19$, and outperforms Ideal for $\mu > 27$.
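
The crossover points noted for Figure 4(c) can be encoded as a small lookup, which may be convenient when checking a reproduced plot against these notes. This is a sketch under the assumption that the thresholds ($\mu > 19$ and $\mu > 27$) are approximate values read off the figure; the function name `semantic_beats` is hypothetical.

```python
def semantic_beats(mu):
    """List the systems the Semantic curve outperforms at transforming factor mu.

    Thresholds come from the Figure 4(c) notes and are approximate readings,
    not exact values from the paper.
    """
    if not 18 <= mu <= 40:
        raise ValueError("mu is outside the plotted range of 18 to 40 bits/word")
    beaten = []
    if mu > 19:
        beaten += ["5G", "4G"]
    if mu > 27:
        beaten.append("Ideal")
    return beaten


for mu in (18, 20, 30):
    print(mu, semantic_beats(mu))
```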