Robust object representation by boosting-like deep learning architecture
This paper presents a new deep learning architecture for robust object representation that efficiently combines the proposed synchronized multi-stage feature (SMF) with a boosting-like algorithm. The SMF structure captures a variety of characteristics of the input object by fusing handcrafted features with deep learned features. With the proposed boosting-like algorithm, training a multi-layer network on the boosted samples converges more stably. We show the generality of our object representation architecture by applying it to two tasks, namely pedestrian detection and action recognition. Our approach achieves 15.89% and 3.85% reductions in the average miss rate compared with ACF and JointDeep, respectively, on the largest Caltech dataset, and obtains competitive results on the MSRAction3D dataset.
Learning Complexity-Aware Cascades for Deep Pedestrian Detection
The design of complexity-aware cascaded detectors, combining features of very
different complexities, is considered. A new cascade design procedure is
introduced, by formulating cascade learning as the Lagrangian optimization of a
risk that accounts for both accuracy and complexity. A boosting algorithm,
denoted as complexity aware cascade training (CompACT), is then derived to
solve this optimization. CompACT cascades are shown to seek an optimal
trade-off between accuracy and complexity by pushing features of higher
complexity to the later cascade stages, where only a few difficult candidate
patches remain to be classified. This enables the use of features of vastly
different complexities in a single detector. As a result, the feature pool can be
expanded to features previously impractical for cascade design, such as the
responses of a deep convolutional neural network (CNN). This is demonstrated
through the design of a pedestrian detector with a pool of features whose
complexities span orders of magnitude. The resulting cascade generalizes the
combination of a CNN with an object proposal mechanism: rather than a
pre-processing stage, CompACT cascades seamlessly integrate CNNs in their
stages. This enables state-of-the-art performance on the Caltech and KITTI
datasets, at fairly fast speeds.
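
The accuracy/complexity trade-off described in this abstract can be illustrated with a small sketch. The abstract does not give the exact risk, so the penalized form `loss + eta * complexity`, the function name `select_weak_learner`, and all numbers below are assumptions chosen for illustration, not the CompACT algorithm or any reported result.

```python
# Hypothetical illustration of a Lagrangian-style accuracy/complexity trade-off.
# Each candidate is (name, classification_loss, complexity); values are made up.

def select_weak_learner(candidates, eta):
    """Return the candidate minimizing loss + eta * complexity.

    eta plays the role of a Lagrange multiplier: a larger eta penalizes
    expensive features more heavily.
    """
    return min(candidates, key=lambda c: c[1] + eta * c[2])


candidates = [
    ("cheap_channel_feature", 0.30, 1.0),   # fast but less accurate
    ("cnn_response", 0.10, 1000.0),         # accurate but costly
]

# With a strong complexity penalty the cheap feature is preferred.
early_choice = select_weak_learner(candidates, eta=1e-3)
# With a weak complexity penalty the expensive feature becomes affordable.
late_choice = select_weak_learner(candidates, eta=1e-5)
print(early_choice[0], late_choice[0])
```

In this toy setting the penalty alone decides which feature is chosen; in the abstract, the comparable effect is that costly features end up in later stages, where few candidate patches remain.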
Enhancement of ELDA Tracker Based on CNN Features and Adaptive Model Update
Appearance representation and the observation model are the most important components in designing a robust visual tracking algorithm for video-based sensors. The exemplar-based linear discriminant analysis (ELDA) model has shown good performance in object tracking. Building on it, we improve the ELDA tracking algorithm with deep convolutional neural network (CNN) features and an adaptive model update. Deep CNN features have been used successfully in various computer vision tasks, but extracting them on all of the candidate windows is time consuming. To address this problem, a two-step CNN feature extraction method is proposed that computes the convolutional layers and the fully-connected layers separately. Given the strong discriminative ability of CNN features and the exemplar-based model, we update both the object and background models to improve their adaptivity and to handle the tradeoff between discriminative ability and adaptivity. An object updating method is proposed to select the “good” models (detectors), which are highly discriminative and uncorrelated with the other selected models. Meanwhile, we build the background model as a Gaussian mixture model (GMM) to adapt to complex scenes; it is initialized offline and updated online. The proposed tracker is evaluated on a benchmark dataset of 50 video sequences with various challenges. It achieves the best overall performance among the compared state-of-the-art trackers, which demonstrates the effectiveness and robustness of our tracking algorithm.
Fourier-based Rotation-invariant Feature Boosting: An Efficient Framework for Geospatial Object Detection
Geospatial object detection in remote sensing imagery has attracted
increasing interest in recent years, due to the rapid development of spaceborne
imaging. Most previously proposed object detectors are very sensitive to
object deformations, such as scaling and rotation. To address this, we propose a
novel and efficient framework for geospatial object detection in this letter,
called Fourier-based rotation-invariant feature boosting (FRIFB). A
Fourier-based rotation-invariant feature is first generated in polar
coordinates. Then, the extracted features are further structurally refined
using aggregate channel features. This leads to faster feature computation
and a more robust feature representation, which is well suited to the subsequent
boosting learning. Finally, in the test phase, we achieve a fast pyramid
feature extraction by estimating a scale factor instead of directly collecting
all features from the image pyramid. Extensive experiments are conducted on two
subsets of the NWPU VHR-10 dataset, demonstrating the superiority and effectiveness
of FRIFB compared to previous state-of-the-art methods.
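
The general principle behind a Fourier-based rotation-invariant feature in polar coordinates can be sketched as follows. On a polar grid, rotating the image by a whole number of angular bins circularly shifts the samples along the angular axis, and the magnitude of the discrete Fourier transform is unchanged by circular shifts. The function name `rotation_invariant_descriptor` and the random test patch are assumptions for illustration; this is not the FRIFB feature itself.

```python
import numpy as np


def rotation_invariant_descriptor(polar_patch):
    """Return DFT magnitudes along the angular axis.

    polar_patch: array of shape (n_radii, n_angles), assumed to be already
    sampled on a polar grid centered on the object.
    """
    spectrum = np.fft.fft(polar_patch, axis=1)
    # Discarding the phase removes the dependence on the angular offset.
    return np.abs(spectrum)


rng = np.random.default_rng(0)
patch = rng.random((4, 16))                 # 4 radii, 16 angular bins
rotated = np.roll(patch, shift=5, axis=1)   # rotation by 5 angular bins

descriptors_match = np.allclose(
    rotation_invariant_descriptor(patch),
    rotation_invariant_descriptor(rotated),
)
print(descriptors_match)
```

The invariance is exact only for rotations that align with the angular sampling; for other angles it holds approximately, depending on interpolation and the number of angular bins.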