
    Multi-scale keypoints in V1 and face detection

    End-stopped cells in cortical area V1, which combine outputs of complex cells tuned to different orientations, serve to detect line and edge crossings (junctions) and points with a large curvature. In this paper we study the importance of the multi-scale keypoint representation, i.e. retinotopic keypoint maps which are tuned to different spatial frequencies (scale or Level-of-Detail). We show that this representation provides important information for Focus-of-Attention (FoA) and object detection. In particular, we show that hierarchically-structured saliency maps for FoA can be obtained, and that combinations over scales in conjunction with spatial symmetries can lead to face detection through grouping operators that deal with keypoints at the eyes, nose and mouth, especially when non-classical receptive field inhibition is employed. Although a face detector can be based on feedforward and feedback loops within area V1, such an operator must be embedded into dorsal and ventral data streams to and from higher areas for obtaining translation-, rotation- and scale-invariant face (object) detection.

    Correlates of facial expressions in the primary visual cortex

    Face detection and recognition should be complemented by recognition of facial expressions, for example in social robots that must react to human emotions. Our framework is based on two multi-scale representations in cortical area V1: keypoints at eyes, nose and mouth are grouped for face detection [1]; lines and edges provide information for face recognition [2].

    Face normalization using multi-scale cortical keypoints

    Empirical studies concerning face recognition suggest that faces may be stored in memory by a few canonical representations. Models of visual perception are based on image representations in cortical area V1 and beyond, which contain many cell layers for feature extraction. Simple, complex and end-stopped cells tuned to different spatial frequencies (scales) and/or orientations provide input for line, edge and keypoint detection. This yields a rich, multi-scale object representation that can be stored in memory in order to identify objects. The multi-scale, keypoint-based saliency maps for Focus-of-Attention can be explored to obtain face detection and normalization, after which face recognition can be achieved using the line/edge representation. In this paper, we focus only on face normalization, showing that multi-scale keypoints can be used to construct canonical representations of faces in memory.

    Multi-scale keypoints in V1 and beyond: object segregation, scale selection, saliency maps and face detection

    End-stopped cells in cortical area V1, which combine outputs of complex cells tuned to different orientations, serve to detect line and edge crossings, singularities and points with large curvature. These cells can be used to construct retinotopic keypoint maps at different spatial scales (level-of-detail). This paper studies the importance of the multi-scale keypoint representation and shows that it provides important information for object recognition and face detection. Different grouping operators can be used for object segregation and automatic scale selection. Saliency maps for focus-of-attention can be constructed, and such maps can be employed for face detection by grouping facial landmarks at eyes, nose and mouth. Although a face detector can be based on processing within area V1, it is argued that such an operator must be embedded into dorsal and ventral data streams, to and from higher cortical areas, for obtaining translation-, rotation- and scale-invariant detection.
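
As a rough illustration of how keypoint maps at several scales might be combined into one saliency map, the sketch below lets every keypoint vote for a square neighbourhood whose size grows with its scale. The abstract does not give the construction, so the voting rule, the function name `saliency_map`, and the choice of radius equal to scale are all assumptions made for this example.

```python
def saliency_map(keypoints_by_scale, width, height):
    """Accumulate keypoint votes over scales into one 2-D map.

    keypoints_by_scale maps an integer scale to a list of (x, y) keypoints.
    Assumption (not from the abstract): a keypoint found at scale s votes for
    a square region of radius s, so coarse-scale keypoints influence a larger
    area and points that persist over many scales collect more votes.
    """
    saliency = [[0] * width for _ in range(height)]
    for scale, points in keypoints_by_scale.items():
        radius = scale  # hypothetical region of influence
        for x, y in points:
            # Clip the voting window to the image borders.
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    saliency[yy][xx] += 1
    return saliency


# Toy input: one keypoint is stable over two scales, another appears only
# at the coarser scale.
toy_keypoints = {1: [(2, 2)], 2: [(2, 2), (6, 6)]}
toy_saliency = saliency_map(toy_keypoints, width=8, height=8)
for row in toy_saliency:
    print(" ".join(str(v) for v in row))
```

In this toy setting the keypoint that is present at both scales ends up with a higher value than the one detected at a single scale, which is the kind of ordering a focus-of-attention mechanism could use to pick regions to inspect first.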

    Persistent Evidence of Local Image Properties in Generic ConvNets

    Supervised training of a convolutional network for object classification should make explicit any information related to the class of objects and disregard any auxiliary information associated with the capture of the image or the variation within the object class. Does this happen in practice? Although this seems to pertain to the very final layers in the network, if we look at earlier layers we find that this is not the case. Surprisingly, strong spatial information is implicit. This paper addresses this, in particular by exploiting the image representation at the first fully connected layer, i.e. the global image descriptor, which has recently been shown to be most effective in a range of visual recognition tasks. We empirically demonstrate evidence for this finding in the context of four different tasks: 2D landmark detection, 2D object keypoint prediction, estimation of the RGB values of the input image, and recovery of the semantic label of each pixel. We base our investigation on a simple framework with ridge regression, used in common across these tasks, and show results which all support our insight. Such spatial information can be used for computing the correspondence of landmarks to a good accuracy, and should also potentially be useful for improving the training of convolutional nets for classification purposes.
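
The ridge regression framework mentioned in the abstract can be sketched in a few lines: given one descriptor per image and the target coordinates, solve the regularised least-squares problem W = (XᵀX + λI)⁻¹XᵀY. The version below is a minimal standard-library sketch, not the authors' code; the helper names, the tiny three-dimensional "descriptor", and the value of `lam` are assumptions chosen only to keep the example small.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ridge_fit(features, targets, lam=1e-3):
    """Closed-form ridge regression: W = (X^T X + lam * I)^-1 X^T Y."""
    xt = transpose(features)
    gram = matmul(xt, features)
    for i in range(len(gram)):
        gram[i][i] += lam  # the regulariser keeps the system well conditioned
    return solve(gram, matmul(xt, targets))


def ridge_predict(weights, features):
    return matmul(features, weights)


# Hypothetical toy data: 3-D "descriptors" (last entry is a bias term) and
# 2-D landmark positions generated by x = 3*f0 + 1, y = 2*f1 - 1.
X = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 1.0], [2.0, 1.0, 1.0]]
Y = [[4.0, -1.0], [1.0, 1.0], [4.0, 1.0], [7.0, 1.0]]
W = ridge_fit(X, Y, lam=1e-9)
print(ridge_predict(W, [[3.0, 2.0, 1.0]]))  # close to [[10.0, 3.0]]
```

In the setting the abstract describes, the rows of `X` would be the first fully connected layer's activations and the rows of `Y` the landmark coordinates (or RGB values, or per-pixel labels); a real experiment would use a numerical library rather than this hand-written solver.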

    Face recognition by cortical multi-scale line and edge representations

    Empirical studies concerning face recognition suggest that faces may be stored in memory by a few canonical representations. Models of visual perception are based on image representations in cortical area V1 and beyond, which contain many cell layers for feature extraction. Simple, complex and end-stopped cells provide input for line, edge and keypoint detection. Detected events provide a rich, multi-scale object representation, and this representation can be stored in memory in order to identify objects. In this paper, the above context is applied to face recognition. The multi-scale line/edge representation is explored in conjunction with keypoint-based saliency maps for Focus-of-Attention. Recognition rates of up to 96% were achieved by combining frontal and 3/4 views, and recognition was quite robust against partial occlusions.

    UcoSLAM: Simultaneous Localization and Mapping by Fusion of KeyPoints and Squared Planar Markers

    This paper proposes a novel approach for Simultaneous Localization and Mapping by fusing natural and artificial landmarks. Most SLAM approaches use natural landmarks (such as keypoints). However, these are unstable over time, repetitive in many cases, or insufficient for robust tracking (e.g. inside buildings). On the other hand, other approaches have employed artificial landmarks (such as squared fiducial markers) placed in the environment to help tracking and relocalization. We propose a method that integrates both approaches in order to achieve long-term robust tracking in many scenarios. Our method has been compared to the state-of-the-art methods ORB-SLAM2 and LDSO on the public datasets Kitti, Euroc-MAV, TUM and SPM, obtaining better precision, robustness and speed. Our tests also show that the combination of markers and keypoints achieves better accuracy than either of them independently. Comment: Paper submitted to Pattern Recognitio