Diagram of our model. In the first layer, an image is convolved with each group of filters; the filters are organized into a topographic array, and both the filters and the topography are learned beforehand by the PCICA algorithm. The filter outputs are rectified by the sigmoid and absolute-value functions. In the second layer, the rectified outputs within each group are pooled to produce an invariant representation, which is then subjected to inhibition, simulated by convolution with a difference-of-Gaussians (DoG) function. The resulting conspicuous maps from the different groups are combined to produce the final saliency map.
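The data flow in the diagram can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: random filters stand in for the learned PCICA filters, the rectification is taken to be tanh (a zero-centred sigmoid) followed by the absolute value, pooling within a group is a sum, and the per-group maps are combined by summation. The function names `conv2d_same`, `dog_kernel`, and `saliency_map`, as well as the kernel size and the two Gaussian widths, are hypothetical.

```python
import numpy as np


def conv2d_same(image, kernel):
    """2-D convolution with zero padding; output has the same shape as image.

    Assumes an odd-sized kernel so that it has a well-defined centre.
    """
    kh, kw = kernel.shape
    assert kh % 2 == 1 and kw % 2 == 1, "odd-sized kernels only"
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    # One (kh, kw) window per output pixel.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (kh, kw))
    # Flipping the kernel turns the windowed correlation into a convolution.
    return np.einsum("ijkl,kl->ij", windows, kernel[::-1, ::-1])


def dog_kernel(size, sigma_center, sigma_surround):
    """Difference of two unit-sum Gaussians (narrow centre minus wide surround)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2

    def gauss(sigma):
        g = np.exp(-r2 / (2.0 * sigma ** 2))
        return g / g.sum()

    return gauss(sigma_center) - gauss(sigma_surround)


def saliency_map(image, filter_groups, dog):
    """Two-layer pipeline: filter + rectify, then pool + inhibit, then combine."""
    saliency = np.zeros(image.shape, dtype=float)
    for group in filter_groups:
        # Layer 1: convolve with every filter in the group, then rectify.
        # ASSUMPTION: rectification = |tanh(response)|.
        rectified = [np.abs(np.tanh(conv2d_same(image, f))) for f in group]
        # Layer 2: pool within the group (ASSUMPTION: sum pooling) ...
        pooled = np.sum(rectified, axis=0)
        # ... then simulate inhibition by convolving with the DoG kernel.
        group_map = conv2d_same(pooled, dog)
        # ASSUMPTION: per-group maps are combined by simple summation.
        saliency += group_map
    return saliency


# Usage with placeholder data: 4 groups of 6 random 5x5 filters.
rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
filter_groups = [rng.standard_normal((6, 5, 5)) for _ in range(4)]
result = saliency_map(image, filter_groups, dog_kernel(7, 1.0, 2.0))
```

Because both Gaussians in `dog_kernel` are normalized to unit sum, the kernel sums to zero, so away from the image borders a uniformly active pooled map is suppressed and only locally distinctive responses survive. A real implementation would substitute the learned topographic filters and whatever pooling and combination rules the model actually specifies.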