Root cause: test-runner was giving overly optimistic results due to:
1. Context bias - knew the implementation, tended to defend it
2. No actual visual comparison - just wrote 'ACCEPTABLE' without looking
3. No structural validation - accepted 35x scale differences as 'acceptable'

Solution:
- New result-verifier agent that performs blind visual comparison
- Strict pass/fail criteria for structural validation
- Updated test-runner to use result-verifier for each figure
- Clear guidelines: structural mismatches = FAIL, not ACCEPTABLE

Test result: verifier correctly identified Fig3 as FAIL with 7 specific issues:
- Wrong X-axis variable (channels vs power)
- Wrong Y-axis scale (5x difference)
- Wrong curve count (5 vs 4)
- etc.
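The "structural mismatches = FAIL" rule could be sketched roughly as follows. This is a minimal illustration, not the result-verifier's actual code: the `FigureStructure` fields, the `verify_structure` name, and the `max_scale_ratio` threshold of 2.0 are all assumptions made for the example, and the values in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FigureStructure:
    """Hypothetical structural summary of one figure (not the real schema)."""
    x_variable: str   # name of the quantity on the X axis
    y_max: float      # upper end of the Y-axis range
    curve_count: int  # number of plotted curves


def verify_structure(reference, generated, max_scale_ratio=2.0):
    """Return (verdict, issues); any structural mismatch yields "FAIL"."""
    issues = []

    if reference.x_variable != generated.x_variable:
        issues.append(
            f"Wrong X-axis variable ({generated.x_variable} vs {reference.x_variable})"
        )

    # Compare scales as a ratio >= 1 so the check is symmetric.
    lo, hi = sorted([abs(reference.y_max), abs(generated.y_max)])
    if lo == 0:
        if hi != 0:
            issues.append("Wrong Y-axis scale (one axis range is zero)")
    elif hi / lo > max_scale_ratio:  # assumed threshold, tune per figure
        issues.append(f"Wrong Y-axis scale ({hi / lo:g}x difference)")

    if reference.curve_count != generated.curve_count:
        issues.append(
            f"Wrong curve count ({generated.curve_count} vs {reference.curve_count})"
        )

    # There is deliberately no "ACCEPTABLE" middle ground.
    return ("FAIL" if issues else "PASS"), issues
```

With made-up inputs such as `verify_structure(FigureStructure("power", 10.0, 4), FigureStructure("channels", 50.0, 5))`, every check trips and the verdict is "FAIL". Returning the issue list alongside the verdict keeps the report specific instead of a bare label.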
38 lines
1.5 KiB
Python
import os
import urllib.request

# (source URL, local file name) pairs for the paper's figures.
images = [
    (
        "https://cdn-mineru.openxlab.org.cn/result/2026-03-24/0d56f97d-c18c-44c9-aca6-ca0396c7d581/394f0e8c2f43987b4109d8842fa25e4c0385ca116ec0169de42f163621e39834.jpg",
        "fig1.jpg",
    ),
    (
        "https://cdn-mineru.openxlab.org.cn/result/2026-03-24/0d56f97d-c18c-44c9-aca6-ca0396c7d581/ce01b773d3b34678c8a12b896d8b0bcffcb7ea494c2bf19ff76b4e283cbfeaef.jpg",
        "fig2.jpg",
    ),
    (
        "https://cdn-mineru.openxlab.org.cn/result/2026-03-24/0d56f97d-c18c-44c9-aca6-ca0396c7d581/3204db8177d30d70838729ef95d84db1c8e7c75a18367c0cd6c13425c016690f.jpg",
        "fig3.jpg",
    ),
    (
        "https://cdn-mineru.openxlab.org.cn/result/2026-03-24/0d56f97d-c18c-44c9-aca6-ca0396c7d581/41c75c9a006cf5b6783405d99e1ae502a1dc6fe575f2cb897a4cf0e2aa02e733.jpg",
        "fig4a.jpg",
    ),
    (
        "https://cdn-mineru.openxlab.org.cn/result/2026-03-24/0d56f97d-c18c-44c9-aca6-ca0396c7d581/f1b5b1f978f2709f0479997e60f7010cca642327488a4d2eff6db3d5f68c4297.jpg",
        "fig4b.jpg",
    ),
    (
        "https://cdn-mineru.openxlab.org.cn/result/2026-03-24/0d56f97d-c18c-44c9-aca6-ca0396c7d581/419aa724b6768f034af9072caa4d8784e5d68a50c7aa83472f6c702d34d92df9.jpg",
        "fig4c.jpg",
    ),
]

# Ensure the output directories exist before downloading.
os.makedirs("workspace/resource_allocation/paper_images/", exist_ok=True)
os.makedirs("workspace/resource_allocation/analysis/reference_images/", exist_ok=True)

# Download each figure into paper_images/ under its short name.
for url, name in images:
    urllib.request.urlretrieve(
        url, f"workspace/resource_allocation/paper_images/{name}"
    )
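The download loop above aborts on the first network error and re-fetches files that are already present. A more tolerant variant might look like the sketch below. The `download_all` helper and its `fetch` parameter are hypothetical names introduced here for illustration, not part of the original script.

```python
import os
import urllib.request


def download_all(images, dest_dir, fetch=urllib.request.urlretrieve):
    """Download (url, name) pairs into dest_dir.

    Returns (saved, failed): lists of file names. `fetch` is injectable so the
    helper can be exercised without network access (an assumption of this sketch).
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved, failed = [], []
    for url, name in images:
        path = os.path.join(dest_dir, name)
        if os.path.exists(path):
            # Already downloaded on a previous run; skip the network call.
            saved.append(name)
            continue
        try:
            fetch(url, path)
        except OSError as exc:
            # urllib.error.URLError (and HTTPError) subclass OSError.
            print(f"failed to download {name}: {exc}")
            failed.append(name)
        else:
            saved.append(name)
    return saved, failed
```

Called as `download_all(images, "workspace/resource_allocation/paper_images/")`, it would fetch the same files as the loop above while reporting failures instead of raising.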