127 research outputs found

    Object Detection from a Vehicle Using Deep Learning Network and Future Integration with Multi-Sensor Fusion Algorithm

    Accuracy in detecting a moving object is critical to autonomous driving and advanced driver assistance systems (ADAS). By including the object classification from multiple sensor detections, the model of the object or environment can be identified more accurately. The critical parameters involved in improving the accuracy are the size and the speed of the moving object. All sensor data are to be used in defining a composite object representation so that it can be used for the class information in the core object’s description. This composite data can then be used by a deep learning network for complete perception fusion in order to solve the problem of detecting and tracking moving objects. Camera image data from subsequent frames along the time axis, in conjunction with the speed and size of the object, will further contribute to developing better recognition algorithms. In this paper, we present preliminary results using only camera images for detecting various objects with a deep learning network, as a first step toward multi-sensor fusion algorithm development. The simulation experiments based on camera images show encouraging results: the proposed deep-learning-based detection algorithm was able to detect various objects with a certain degree of confidence. A laboratory experimental setup is being commissioned in which three different types of sensors (a digital camera with 8-megapixel resolution, a LIDAR with 40 m range, and ultrasonic distance transducer sensors) will be used for multi-sensor fusion to identify the object in real time.

    Robust object representation by boosting-like deep learning architecture

    This paper presents a new deep learning architecture for robust object representation, aiming at efficiently combining the proposed synchronized multi-stage feature (SMF) and a boosting-like algorithm. The SMF structure can capture a variety of characteristics from the input object based on the fusion of handcrafted features and deep learned features. With the proposed boosting-like algorithm, we obtain more stable convergence when training a multi-layer network by using the boosted samples. We show the generalization of our object representation architecture by applying it to various tasks, namely pedestrian detection and action recognition. Our approach achieves 15.89% and 3.85% reductions in the average miss rate compared with ACF and JointDeep, respectively, on the largest Caltech dataset, and obtains competitive results on the MSRAction3D dataset.

    PANDA: Pose Aligned Networks for Deep Attribute Modeling

    We propose a method for inferring human attributes (such as gender, hair style, clothes style, expression, action) from images of people under large variation of viewpoint, pose, appearance, articulation and occlusion. Convolutional Neural Nets (CNN) have been shown to perform very well on large-scale object recognition problems. In the context of attribute classification, however, the signal is often subtle and it may cover only a small part of the image, while the image is dominated by the effects of pose and viewpoint. Discounting for pose variation would require training on very large labeled datasets, which are not presently available. Part-based models, such as poselets and DPM, have been shown to perform well for this problem, but they are limited by shallow low-level features. We propose a new method which combines part-based models and deep learning by training pose-normalized CNNs. We show substantial improvement vs. state-of-the-art methods on challenging attribute classification tasks in unconstrained settings. Experiments confirm that our method outperforms both the best part-based methods on this problem and conventional CNNs trained on the full bounding box of the person.

    Deep Poselets for Human Detection

    We address the problem of detecting people in natural scenes using a part approach based on poselets. We propose a bootstrapping method that allows us to collect millions of weakly labeled examples for each poselet type. We use these examples to train a Convolutional Neural Net to discriminate different poselet types and separate them from the background class. We then use the trained CNN as a way to represent poselet patches with a Pose Discriminative Feature (PDF) vector, a compact 256-dimensional feature vector that is effective at discriminating pose from appearance. We train the poselet model on top of PDF features and combine them with object-level CNNs for detection and bounding box prediction. The resulting model leads to state-of-the-art performance for human detection on the PASCAL datasets.