How multi-sample reasoning and aggregation can stabilize outputs in math, logic, and policy-heavy prompts.
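As a rough illustration of the idea, the sketch below samples several answers to the same prompt and keeps the most common one. It assumes the aggregation is a simple majority vote over normalized final answers, which is only one possible choice. The names `aggregate_by_majority`, `self_consistent_answer`, and `sample_fn` are hypothetical and stand in for whatever model call and answer extraction are actually used.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple


def aggregate_by_majority(answers: List[str]) -> Tuple[Optional[str], float]:
    """Return the most common answer and the fraction of samples that agree.

    Answers are normalized by trimming whitespace and lowercasing, so
    " 42 " and "42" count as the same vote. Empty answers are ignored.
    """
    normalized = [a.strip().lower() for a in answers if a and a.strip()]
    if not normalized:
        return None, 0.0
    winner, count = Counter(normalized).most_common(1)[0]
    return winner, count / len(normalized)


def self_consistent_answer(
    sample_fn: Callable[[str], str],  # hypothetical: one model call -> final answer
    prompt: str,
    n: int = 5,
) -> Tuple[Optional[str], float]:
    """Draw n independent samples for the prompt and aggregate them by vote."""
    answers = [sample_fn(prompt) for _ in range(n)]
    return aggregate_by_majority(answers)


if __name__ == "__main__":
    # Stand-in for five sampled final answers to the same math prompt.
    sampled = ["42", " 42 ", "41", "42", "40"]
    answer, agreement = aggregate_by_majority(sampled)
    print(answer, agreement)
```

The agreement fraction can double as a crude confidence signal: a low value suggests the samples disagree and the prompt may need more samples or a different aggregation rule.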