Safety Under Scaffolding

Interactive Model Safety Profiles — 6 Models × 4 Configs × 4 Benchmarks — N = 62,808
Model Safety Radar Charts
Each axis shows safety rate (0–100%) for a benchmark. Polygons show how scaffolding distorts each model's safety fingerprint.
Configurations shown: Direct (baseline), ReAct, Multi-agent, Map-reduce
Sycophancy Scaffold Effects
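Diverging bars show the change in non-sycophantic rate, in percentage points (pp), relative to the Direct baseline. Bars to the right mean the scaffold helps (less sycophancy); bars to the left mean it hurts (more sycophancy).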
Sign-level reversal: under map-reduce, Opus drops −16.8pp (sycophancy increases), while Llama 4 gains +18.8pp (sycophancy decreases). The same scaffold architecture has opposite safety consequences: model identity determines the direction of the scaffold's impact.
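As a rough illustration of how the pp deltas and the sign-level reversal check could be computed, here is a minimal Python sketch. The function names `delta_pp` and `sign_reversals` are hypothetical and not taken from the study's code. The only data used are the two map-reduce deltas quoted above; rates are assumed to be fractions in [0, 1].

```python
from typing import Dict, List, Tuple


def delta_pp(baseline_rate: float, scaffold_rate: float) -> float:
    """Change in non-sycophantic rate in percentage points (pp).

    Both rates are assumed to be fractions in [0, 1]. A positive result
    means the scaffold helps; a negative result means it hurts.
    """
    return round((scaffold_rate - baseline_rate) * 100.0, 1)


def sign_reversals(deltas: Dict[str, float]) -> List[Tuple[str, str]]:
    """Return model pairs whose deltas have strictly opposite signs."""
    names = sorted(deltas)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if deltas[a] * deltas[b] < 0  # opposite signs give a negative product
    ]


# Map-reduce deltas quoted in the text (pp change from the Direct baseline).
map_reduce_deltas = {"Opus": -16.8, "Llama 4": 18.8}

print(sign_reversals(map_reduce_deltas))
```

With only these two models the check reports a single reversing pair. Applying the same check to every scaffold configuration would show which scaffolds shift all models in the same direction and which ones depend on the model.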