How GAIA reframes assistant evaluation around realistic task completion that blends reasoning, retrieval, and action.