Agentic Reasoning for Robust Vision Systems via Increased Test-Time Compute
Abstract
We propose the Visual Reasoning Agent (VRA), a training-free, agentic reasoning framework that achieves absolute accuracy gains of up to 40% on challenging visual reasoning benchmarks. VRA spends additional test-time compute on multi-step reasoning and tool use to make vision systems more robust.
Publication
arXiv preprint arXiv:2509.16343