Abstract
Current synthetic data pipelines for computer vision generate images without diagnosing what the downstream model actually needs. We propose Synthetic Designed Experiments for Representational Sufficiency (SDRS), a principled framework based on the statistical theory of Design of Experiments (DoE). SDRS treats the downstream model as a black-box system and the synthetic generator as an experimental apparatus. Using fractional factorial designs, SDRS efficiently audits a model's factor-sensitivity profile via ANOVA decomposition, identifying coverage failures (Type I gaps) and spurious dependencies (Type II gaps).
Theoretical Framework: ANOVA Decomposition
SDRS uses Analysis of Variance (ANOVA) to decompose the model's response (e.g., loss or accuracy) into contributions from individual scene factors and their interactions. For a set of factors \(\{F_1, F_2, \dots, F_n\}\), the total variance in model performance is partitioned as:
\[ SS_{\text{total}} = \sum_{i} SS_{F_i} + \sum_{i<j} SS_{F_i \times F_j} + \dots + SS_{\text{error}} \]
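The F-statistic for a factor is the ratio of that factor's mean square to the error mean square. A high F-statistic indicates that the model's response is highly sensitive to that factor, revealing potential representational gaps or biases. SDRS distinguishes two kinds of gap: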
- Type I Gaps: Coverage failures where the model lacks representational sufficiency for certain factor levels.
- Type II Gaps: Reliance on spurious nuisance dependencies (e.g., background shortcuts).
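As a rough illustration of the audit described above, the following Python sketch builds a two-level half-fraction design and computes main-effect F-statistics by least squares. The function names, the choice of four factors, the defining relation D = ABC, the replication count, and the simulated response are all assumptions made for illustration; they are not taken from the SDRS implementation.

```python
import itertools
import numpy as np


def half_fraction_design(n_factors):
    """Two-level 2^(n-1) design coded as -1/+1 (hypothetical helper).

    The first n-1 factors form a full factorial; the last factor is set to
    their product, so main effects are aliased only with higher-order
    interactions (for n = 4 this is the relation D = ABC).
    """
    base = np.array(list(itertools.product([-1, 1], repeat=n_factors - 1)))
    last = base.prod(axis=1, keepdims=True)
    return np.hstack([base, last])


def main_effect_f_stats(design, response):
    """F-statistic of each main effect in a main-effects-only linear model."""
    n_runs, n_factors = design.shape
    X = np.hstack([np.ones((n_runs, 1)), design])
    coef, *_ = np.linalg.lstsq(X, response, rcond=None)
    residual = response - X @ coef
    df_error = n_runs - (n_factors + 1)
    ms_error = residual @ residual / df_error
    # Columns are orthogonal with entries -1/+1, so SS_factor = N * coef^2,
    # and each main effect has one degree of freedom.
    ss_factor = n_runs * coef[1:] ** 2
    return ss_factor / ms_error


# Simulated audit: the "model response" depends only on factor B (index 1).
rng = np.random.default_rng(0)
design = np.tile(half_fraction_design(4), (3, 1))  # 8 runs x 3 replicates
response = 0.80 + 0.15 * design[:, 1] + rng.normal(0.0, 0.02, size=len(design))

f_stats = main_effect_f_stats(design, response)
for name, f in zip("ABCD", f_stats):
    print(f"factor {name}: F = {f:.1f}")
```

In this sketch the replicates supply the error estimate. Without replication, the leftover degrees of freedom in a half fraction are aliased with two-factor interactions, so the error term would only be meaningful if those interactions were assumed negligible.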
Experiment 1: Diagnostic on dSprites
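We planted specific biases in a dSprites-based dataset to test whether SDRS could detect them. The audit correctly identified both gap types, and adding targeted synthetic data raised accuracy from 47.4% (no synthetic data) to 79.0%, compared with 53.8% for random synthetic data and 53.5% for domain randomization.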
Accuracy Comparison
| Condition | Accuracy |
|---|---|
| No Synthetic Data (Baseline) | 47.4% |
| Random Synthetic Data | 53.8% |
| Domain Randomization | 53.5% |
| SDRS (Targeted) | 79.0% |
Experiment 2: Dense Segmentation
In a procedural scene segmentation task, SDRS detected background-complexity shortcuts that limited model generalization.
mIoU Performance
| Method | mIoU |
|---|---|
| Baseline | 0.332 |
| Random Sampling | 0.976 |
| SDRS (Targeted) | 0.998 |
Experiment 3: Entanglement Detection
SDRS can also be used to audit the generator itself, identifying cross-factor contamination in imperfect synthetic pipelines.
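One possible way to picture such a generator audit is sketched below in Python. It assumes that each generated image can be passed through some attribute estimator, so that every run has both a requested setting and a measured value per factor. The estimator, the function name `cross_factor_effects`, the simulated leakage matrix, and the 0.2 flagging threshold are all hypothetical and are not described in the source.

```python
import itertools
import numpy as np


def cross_factor_effects(requested, measured):
    """Effect of each requested factor on each measured attribute.

    requested: (n_runs, k) array of -1/+1 settings from an orthogonal design.
    measured:  (n_runs, k) array of attributes estimated from generated images.
    Entry [i, j] is the regression slope of measured attribute j on requested
    factor i; large off-diagonal entries suggest cross-factor contamination.
    """
    n_runs = requested.shape[0]
    centered = measured - measured.mean(axis=0)
    return requested.T @ centered / n_runs


# Simulated imperfect generator: requesting factor 0 also shifts attribute 2.
rng = np.random.default_rng(0)
full = np.array(list(itertools.product([-1, 1], repeat=3)))
requested = np.tile(full, (5, 1))  # 8 runs x 5 replicates
leak = np.eye(3)
leak[0, 2] = 0.4  # assumed contamination strength
measured = requested @ leak + rng.normal(0.0, 0.05, size=requested.shape)

effects = cross_factor_effects(requested, measured)
threshold = 0.2  # illustrative cutoff, not a calibrated test
flagged = [
    (i, j)
    for i in range(effects.shape[0])
    for j in range(effects.shape[1])
    if i != j and abs(effects[i, j]) > threshold
]
print(np.round(effects, 2))
print("entangled (requested, measured) pairs:", flagged)
```

A fixed threshold is used here only to keep the sketch short; the same F-test used for the model audit could be applied to each off-diagonal entry instead.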
Conclusion
SDRS transforms synthetic data generation from a "hit-or-miss" random process into a principled diagnostic tool. By applying Design of Experiments to vision models, we can systematically identify and fix representational failures, leading to more robust and reliable AI systems.