# How to reproduce the GAP-TV / CASSI KAIST 10-scene results

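This recipe lets any third party verify the numbers claimed in
[`../solution.md`](../solution.md) §3. Run every command from the same
directory, a sibling of `code/`, because the paths below (`../code/`, `data/`)
are relative to it.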

## 1. Fetch the dataset

```bash
mkdir -p data && cd data
curl -L -o cassi-kaist-10s.tar.gz \
  https://physicsworldmodel.org/datasets/cassi-kaist-10s.tar.gz
sha256sum cassi-kaist-10s.tar.gz   # must match benchmark.json:dataset.sha256
tar xzf cassi-kaist-10s.tar.gz
cd ..
```
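On systems without `sha256sum`, the same integrity check can be sketched in Python. This is a minimal sketch, assuming `benchmark.json` stores the digest as a hex string under `dataset` → `sha256`, as the comment above implies. The helper names are hypothetical and not part of the solver.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_matches_benchmark(archive_path, benchmark_json_path):
    """Compare the archive digest with benchmark.json's dataset.sha256.

    Assumes benchmark.json looks like {"dataset": {"sha256": "<hex>"}}.
    """
    spec = json.loads(Path(benchmark_json_path).read_text(encoding="utf-8"))
    expected = spec["dataset"]["sha256"].strip().lower()
    return sha256_of_file(archive_path) == expected
```

Call `archive_matches_benchmark` with the downloaded archive and the location of `benchmark.json` in your checkout. If it returns `False`, stop before extracting.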

## 2. Install the solver

```bash
python -m venv .venv && source .venv/bin/activate
pip install -r ../code/requirements.txt
```

## 3. Run the evaluation

```bash
python ../code/solver.py \
  --benchmark PWM-L3-cassi-kaist-10scenes \
  --data-dir data/cassi-kaist-10s \
  --report
```

The script emits per-scene PSNR / SSIM to stdout and writes
`reports/results.md` with the summary table.

## 4. Expected output (PSNR within ± 0.1 dB)

```
scene_01: psnr=33.21 dB, ssim=0.892
scene_02: psnr=31.84 dB, ssim=0.876
scene_03: psnr=32.55 dB, ssim=0.884
scene_04: psnr=33.02 dB, ssim=0.887
scene_05: psnr=31.69 dB, ssim=0.872
scene_06: psnr=32.41 dB, ssim=0.880
scene_07: psnr=33.10 dB, ssim=0.890
scene_08: psnr=31.95 dB, ssim=0.877
scene_09: psnr=32.27 dB, ssim=0.881
scene_10: psnr=31.96 dB, ssim=0.873
```

**Mean: PSNR ≈ 32.4 dB, SSIM ≈ 0.881**, which matches `solution.json:self_reported_metrics`.
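Comparing ten lines by eye is error-prone, so the check can be automated. The following is a minimal sketch, assuming the solver's stdout uses exactly the line format shown above. The function names are hypothetical, and the reference values are copied from this section.

```python
import re
from statistics import mean

# Reference PSNR values (dB) copied from the expected output in this section.
EXPECTED_PSNR = {
    "scene_01": 33.21, "scene_02": 31.84, "scene_03": 32.55,
    "scene_04": 33.02, "scene_05": 31.69, "scene_06": 32.41,
    "scene_07": 33.10, "scene_08": 31.95, "scene_09": 32.27,
    "scene_10": 31.96,
}

# Assumed line format: "scene_01: psnr=33.21 dB, ssim=0.892"
LINE_RE = re.compile(r"^(scene_\d+): psnr=([\d.]+) dB, ssim=([\d.]+)$")


def parse_scene_lines(text):
    """Map scene name -> (psnr, ssim) for lines in the assumed format."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            results[match.group(1)] = (float(match.group(2)), float(match.group(3)))
    return results


def psnr_mismatches(results, expected=EXPECTED_PSNR, tol_db=0.1):
    """Return scenes that are missing or whose PSNR is off by more than tol_db."""
    bad = []
    for scene, ref in expected.items():
        got = results.get(scene)
        # Small epsilon so a difference of exactly 0.1 dB is not rejected
        # because of floating-point rounding.
        if got is None or abs(got[0] - ref) > tol_db + 1e-9:
            bad.append(scene)
    return bad


def mean_psnr(results):
    """Average PSNR over all parsed scenes."""
    return mean(psnr for psnr, _ in results.values())
```

Capture the solver's stdout in a string, pass it to `parse_scene_lines`, and treat an empty list from `psnr_mismatches` as a successful reproduction.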

## 5. Determinism notes

- The evaluation noise is drawn from `np.random.default_rng(42)`, so the same seed gives the same numbers across runs.
- The GAP-TV solver itself is fully deterministic (no randomness inside `solve`).
- Bit-exact reproduction across hardware is **not** guaranteed; PSNR agreement within ± 0.1 dB is acceptable.
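The seeding behaviour described in the notes above can be illustrated with a short sketch. Only the seed `42` and the use of `np.random.default_rng` come from this document. The Gaussian noise model, the `sigma` value, the array shape, and the `make_noise` helper are assumptions made for illustration.

```python
import numpy as np


def make_noise(shape, sigma, seed=42):
    """Draw Gaussian noise from a freshly seeded generator (illustrative only)."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=0.0, scale=sigma, size=shape)


first = make_noise((4, 4), sigma=0.01)
second = make_noise((4, 4), sigma=0.01)

# Same seed, same shape, same call order: the arrays are identical.
assert np.array_equal(first, second)

# A different seed gives different noise, so the metrics would shift.
assert not np.array_equal(first, make_noise((4, 4), sigma=0.01, seed=43))
```

Creating the generator once per evaluation, rather than sharing a global one, keeps the noise independent of whatever other code ran earlier in the process.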
