2,079 research outputs found

    Strengthening the Effectiveness of Pedestrian Detection with Spatially Pooled Features

    We propose a simple yet effective approach to the problem of pedestrian detection which outperforms the current state-of-the-art. Our new features are built on the basis of low-level visual features and spatial pooling. Incorporating spatial pooling improves the translational invariance and thus the robustness of the detection process. We then directly optimise the partial area under the ROC curve (pAUC) measure, which concentrates detection performance in the range of most practical importance. The combination of these factors leads to a pedestrian detector which outperforms all competitors on all of the standard benchmark datasets. We advance state-of-the-art results by lowering the average miss rate from 13% to 11% on the INRIA benchmark, 41% to 37% on the ETH benchmark, 51% to 42% on the TUD-Brussels benchmark and 36% to 29% on the Caltech-USA benchmark. Comment: 16 pages. Appearing in Proc. European Conf. Computer Vision (ECCV) 201
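
    To make the pAUC measure mentioned in this abstract concrete, the following is a minimal sketch that only evaluates a partial area under the ROC curve for a set of scores. It does not reproduce the paper's optimisation procedure. The function names `roc_points` and `partial_auc`, the `max_fpr` cut-off parameter, and the toy scores are assumptions introduced for illustration, and the sketch assumes all scores are distinct.

```python
import numpy as np

def roc_points(scores, labels):
    """Return (fpr, tpr) arrays tracing the ROC curve from (0, 0) to (1, 1).

    Assumes distinct scores and labels in {0, 1} (hypothetical helper).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")  # highest score first
    labels = labels[order]
    tps = np.cumsum(labels == 1)  # true positives as the threshold is lowered
    fps = np.cumsum(labels == 0)  # false positives as the threshold is lowered
    tpr = np.concatenate(([0.0], tps / tps[-1]))
    fpr = np.concatenate(([0.0], fps / fps[-1]))
    return fpr, tpr

def partial_auc(scores, labels, max_fpr=1.0):
    """Unnormalised area under the ROC curve restricted to FPR in [0, max_fpr]."""
    fpr, tpr = roc_points(scores, labels)
    # First index whose FPR lies strictly beyond the cut-off.
    stop = np.searchsorted(fpr, max_fpr, side="right")
    if stop == len(fpr):
        x, y = fpr, tpr
    else:
        # Linearly interpolate the TPR at the cut-off itself.
        x0, x1 = fpr[stop - 1], fpr[stop]
        y0, y1 = tpr[stop - 1], tpr[stop]
        y_cut = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
        x = np.concatenate((fpr[:stop], [max_fpr]))
        y = np.concatenate((tpr[:stop], [y_cut]))
    # Trapezoidal rule over the retained part of the curve.
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

# Toy data (made up for illustration): three positives, three negatives.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
full_area = partial_auc(scores, labels, max_fpr=1.0)
low_fpr_area = partial_auc(scores, labels, max_fpr=0.5)
```

    Restricting the area to a low false-positive range is one way to express "the range of most practical importance"; the cut-off value itself is a choice, not something stated in the abstract.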

    Learning Smooth Pooling Regions for Visual Recognition

    From the early HMAX model to Spatial Pyramid Matching, spatial pooling has played an important role in visual recognition pipelines. By aggregating local statistics, it equips the recognition pipelines with a certain degree of robustness to translation and deformation while preserving spatial information. Despite its predominance in current recognition systems, we have seen little progress in fully adapting the pooling strategy to the task at hand. In this paper, we propose a flexible parameterization of the spatial pooling step and learn the pooling regions together with the classifier. We investigate a smoothness regularization term that, in conjunction with an efficient learning scheme, makes learning scalable. Our framework can work with both popular pooling operators: sum-pooling and max-pooling. Finally, we show benefits of our approach for object recognition tasks based on visual words and higher level event recognition tasks based on object-bank features. In both cases, we improve over the hand-crafted spatial pooling step, showing the importance of its adaptation to the task.
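
    As a point of reference for the two operators named in this abstract, here is a minimal sketch of hand-crafted sum-pooling and max-pooling over a fixed grid, the kind of baseline the abstract contrasts with learned pooling regions. It does not implement the learned, smoothness-regularised regions. The function name `pool_grid`, its parameters, and the toy feature maps are assumptions made for illustration.

```python
import numpy as np

def pool_grid(feature_map, grid=(2, 2), op="max"):
    """Pool a 2-D feature map over a fixed grid of rectangular cells.

    `op` is "max" for max-pooling or "sum" for sum-pooling (hypothetical helper).
    """
    if op not in ("max", "sum"):
        raise ValueError("op must be 'max' or 'sum'")
    feature_map = np.asarray(feature_map, dtype=float)
    height, width = feature_map.shape
    rows, cols = grid
    # Integer cell boundaries that cover the whole map.
    row_edges = np.linspace(0, height, rows + 1).astype(int)
    col_edges = np.linspace(0, width, cols + 1).astype(int)
    reduce_fn = np.max if op == "max" else np.sum
    pooled = np.empty((rows, cols), dtype=float)
    for i in range(rows):
        for j in range(cols):
            cell = feature_map[row_edges[i]:row_edges[i + 1],
                               col_edges[j]:col_edges[j + 1]]
            pooled[i, j] = reduce_fn(cell)
    return pooled

# Toy 4x4 map: pooled values per 2x2 cell.
ramp = np.arange(16).reshape(4, 4)
ramp_max = pool_grid(ramp, op="max")
ramp_sum = pool_grid(ramp, op="sum")

# A single activation moved by one pixel inside the same cell
# produces the same max-pooled output.
original = np.zeros((4, 4))
original[0, 0] = 1.0
shifted = np.zeros((4, 4))
shifted[1, 1] = 1.0
same_after_shift = np.array_equal(pool_grid(original), pool_grid(shifted))
```

    The last comparison illustrates the robustness to small translations described above: the shift stays within one cell, so the pooled descriptor is unchanged, while the cell index still records coarse spatial position.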

    Pooling-Invariant Image Feature Learning

    Unsupervised dictionary learning has been a key component in state-of-the-art computer vision recognition architectures. While highly effective methods exist for patch-based dictionary learning, these methods may learn redundant features after the pooling stage in a given early vision architecture. In this paper, we offer a novel dictionary learning scheme to efficiently take into account the invariance of learned features after the spatial pooling stage. The algorithm is built on simple clustering, and thus enjoys efficiency and scalability. We discuss the underlying mechanism that justifies the use of clustering algorithms, and empirically show that the algorithm finds better dictionaries than patch-based methods with the same dictionary size.

    Binding of Object Representations by Synchronous Cortical Dynamics Explains Temporal Order and Spatial Pooling Data

    A key problem in cognitive science concerns how the brain binds together parts of an object into a coherent visual object representation. One difficulty that this binding process needs to overcome is that different parts of an object may be processed by the brain at different rates and may thus become desynchronized. Perceptual framing is a mechanism that resynchronizes cortical activities corresponding to the same retinal object. A neural network model based on cooperation between oscillators via feedback from a subsequent processing stage is presented that is able to rapidly resynchronize desynchronized featural activities. Model properties help to explain perceptual framing data, including psychophysical data about temporal order judgments. These cooperative model interactions also simulate data concerning the reduction of threshold contrast as a function of stimulus length. The model hereby provides a unified explanation of temporal order and threshold contrast data as manifestations of a cortical binding process that can rapidly resynchronize image parts which belong together in visual object representations. Air Force Office of Scientific Research (F49620-92-J-0225, F49620-92-J-0334, F49620-92-J-0499); Office of Naval Research (N00014-92-J-4015, N00014-91-J-4100

    Pre-saccadic perception: separate time courses for enhancement and spatial pooling at the saccade target

    We interact with complex scenes using eye movements to select targets of interest. Studies have shown that the future target of a saccadic eye movement is processed differently by the visual system. A number of effects have been reported, including a benefit for perceptual performance at the target (“enhancement”), reduced influences of backward masking (“unmasking”), reduced crowding (“un-crowding”) and spatial compression towards the saccade target. We investigated the time course of these effects by measuring orientation discrimination for targets that were spatially crowded or temporally masked. In four experiments, we varied the target-flanker distance, the presence of forward/backward masks, the orientation of the flankers and whether participants made a saccade. Masking and randomizing flanker orientation reduced performance in both fixation and saccade trials. We found a small improvement in performance on saccade trials, compared to fixation trials, with a time course that was consistent with a general enhancement at the saccade target. In addition, a decrement in performance (reporting the average flanker orientation, rather than the target) was found in the time bins nearest saccade onset when randomly oriented flankers were used, consistent with spatial pooling around the saccade target. We did not find strong evidence for un-crowding. Overall, our pattern of results was consistent with both an early, general enhancement at the saccade target and a later, peri-saccadic compression/pooling towards the saccade target.