Target-directed attention: Sequential decision-making for gaze planning
J. Vogel and N. de Freitas
It is widely agreed that efficient visual search requires the integration of target-driven top-down information and image-driven bottom-up information. Yet the problem of gaze planning, that is, selecting the next best gaze location given the current observations, remains largely unsolved. We propose a probabilistic system that models the gaze sequence as a finite-horizon Bayesian sequential decision process. Direct policy search is used to reason about the next best gaze locations. The system integrates bottom-up saliency information, top-down target knowledge and additional context information through principled Bayesian priors. This results in proposal gaze locations that depend not only on the featural visual saliency, but also on prior knowledge and the spatial likelihood of locating the target. The system has been implemented using state-of-the-art object detectors and evaluated on a real-world dataset by comparing it to gaze sequences proposed by a pure bottom-up saliency-based process and to an object detection approach that analyzes the full image. The target-directed attention system is shown to result in higher object detection precision than both competitors, to attend to more relevant targets than the bottom-up attention system, and to require significantly less computation time than the exhaustive approach.