Testing entropy-based search strategies for a visual classification task
© Avdiyenko et al; licensee BioMed Central Ltd. 2012
Published: 16 July 2012
There is experimental evidence that saccades during visual search preferentially target locations that contain task-relevant information. However, the question remains what kind of strategy people use to decide what is relevant for a task. Do we use simple heuristics or complex algorithms based on ideas from information theory? For example, in a shape-learning and shape-matching task, sequential entropy minimization was successfully used to predict human fixations. Inspired by this finding, we test three entropy-based strategies for the sequential selection of locations (image patches) during a visual classification task.
Determining the task-relevant parts of a visual scene is closely related to the problem of feature selection in machine learning. Correspondingly, we tested to what extent human behavior can be explained in terms of feature selection criteria.
The first strategy (MI, Mutual Information) is treated as a heuristic: it simply ranks all locations by the mutual information they provide about digit identity. The second strategy (CMI, Conditional Mutual Information) takes into account higher-order dependencies between locations, i.e. it selects locations that are both informative and non-redundant with respect to those already selected. The third strategy (AMI, Adaptive conditional Mutual Information) is an adaptive version of CMI that also takes into account the observed values at the already attended locations, so that each decision depends on what has been seen in the previous steps. Thus, in contrast to MI and CMI, the optimal location sequence under AMI differs from image to image.
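The three selection criteria can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes discretized patch values stored in an array X (one row per image, one column per location), class labels y, and simple plug-in entropy estimates. All function names (rank_by_mi, greedy_cmi, adaptive_next) are hypothetical.

```python
import numpy as np

def entropy(labels):
    """Plug-in Shannon entropy (bits) of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def joint_code(columns):
    """Collapse discrete columns of shape (n, k) into one joint label per row."""
    if columns.shape[1] == 0:
        return np.zeros(columns.shape[0], dtype=int)
    _, codes = np.unique(columns, axis=0, return_inverse=True)
    return codes.ravel()

def cond_mutual_info(x, y, z):
    """I(X; Y | Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    xz = joint_code(np.column_stack([x, z]))
    yz = joint_code(np.column_stack([y, z]))
    xyz = joint_code(np.column_stack([x, y, z]))
    return entropy(xz) + entropy(yz) - entropy(xyz) - entropy(z)

def rank_by_mi(X, y):
    """MI: rank locations by I(X_j; Y), ignoring redundancy between locations."""
    z = np.zeros(len(y), dtype=int)  # constant variable, so no conditioning
    mi = np.array([cond_mutual_info(X[:, j], y, z) for j in range(X.shape[1])])
    return np.argsort(-mi, kind="stable")

def greedy_cmi(X, y, k):
    """CMI: greedily add the location maximizing I(X_j; Y | X_selected)."""
    selected = []
    for _ in range(k):
        z = joint_code(X[:, selected])
        scores = np.array([
            -np.inf if j in selected else cond_mutual_info(X[:, j], y, z)
            for j in range(X.shape[1])
        ])
        selected.append(int(np.argmax(scores)))
    return selected

def adaptive_next(X, y, image, selected):
    """AMI: condition on the values actually observed in this image.

    Assumption: conditioning is done by keeping only the training rows that
    agree with the observed values at the already selected locations.
    """
    image = np.asarray(image)
    consistent = np.all(X[:, selected] == image[selected], axis=1)
    Xs, ys = X[consistent], y[consistent]
    z = np.zeros(len(ys), dtype=int)
    scores = np.array([
        -np.inf if j in selected else cond_mutual_info(Xs[:, j], ys, z)
        for j in range(X.shape[1])
    ])
    return int(np.argmax(scores))
```

In this sketch, two locations that carry identical information both rank highly under MI, whereas the greedy CMI selection skips the duplicate once one of them has been chosen. The adaptive variant returns a location that depends on the values observed in the particular image.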
We compare these strategies with respect to how well they explain the observed behavioral data. To this end, we turn them into generative models of patch sequences. Each model assumes that the next patch is chosen by softly maximizing the strategy-specific information about the image class. The softmax function is parametrized by β. For low values of β all sequences are equally likely, i.e. the subject acts randomly, whereas a high β concentrates the probability mass on sequences that select the most informative patch, according to the given strategy, at each step. Preliminary results show that the behavior of most subjects is best explained by the AMI strategy.
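The role of β can be made concrete with a small sketch of a softmax choice model. It rests on assumptions not stated in the abstract: the next location is drawn with probability proportional to exp(β · score), already selected locations cannot be chosen again, and the log-likelihood of a sequence is the sum of the per-step log-probabilities. The names softmax_choice_probs, sequence_log_likelihood and score_fn are hypothetical.

```python
import numpy as np

def softmax_choice_probs(scores, beta, available):
    """P(next = j) proportional to exp(beta * scores[j]) over available locations.

    Assumes finite scores; unavailable locations get probability zero.
    """
    scores = np.asarray(scores, dtype=float)
    available = np.asarray(available, dtype=bool)
    logits = np.where(available, beta * scores, -np.inf)
    logits = logits - logits[available].max()  # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def sequence_log_likelihood(sequence, score_fn, n_locations, beta):
    """Log-probability of an observed click sequence under one strategy.

    score_fn(chosen) is a hypothetical callback returning the strategy-specific
    information score of every location, given the locations chosen so far.
    """
    available = np.ones(n_locations, dtype=bool)
    chosen = []
    log_lik = 0.0
    for j in sequence:
        probs = softmax_choice_probs(score_fn(chosen), beta, available)
        log_lik += float(np.log(probs[j]))
        chosen.append(j)
        available[j] = False  # assumption: no location is selected twice
    return log_lik
```

With β = 0 every available location is equally likely, which corresponds to random behavior. As β grows, the probability mass moves onto the highest-scoring location at each step. Under these assumptions, strategies could be compared per subject by the sequence log-likelihood each one attains at its best-fitting β.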
Our clicking experiment provides evidence that, in a visual classification task, people are able to employ quite complex entropy-based search strategies. In particular, we found that most people act adaptively, i.e. take image-specific information into account, even though this is computationally more demanding.
- Betz T, Kietzmann TC, Wilming N, Koenig P: Investigating task-dependent top-down effects on overt visual attention. J Vision. 2010, 10(3) (15): 1-14.
- Renninger LW, Verghese P, Coughlan J: Where to look next? Eye movements reduce local uncertainty. J Vision. 2007, 7(3) (6): 1-17.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.