
    Learning to Refine Human Pose Estimation

    Multi-person pose estimation in images and videos is an important yet challenging task with many applications. Despite the large improvements in human pose estimation enabled by convolutional neural networks, there remain many difficult cases in which even state-of-the-art models fail to localize all body joints correctly. This motivates an additional refinement step that addresses these challenging cases and can be easily applied on top of any existing method. In this work, we introduce a pose refinement network (PoseRefiner) that takes as input both the image and a given pose estimate and learns to directly predict a refined pose by jointly reasoning about the input-output space. For the network to learn to refine incorrect body joint predictions, we employ a novel data augmentation scheme for training in which we model "hard" human pose cases. We evaluate our approach on four popular large-scale pose estimation benchmarks, namely MPII Single-Person and Multi-Person Pose Estimation, PoseTrack Pose Estimation, and PoseTrack Pose Tracking, and report systematic improvement over the state of the art.
    Comment: To appear in CVPRW (2018). Workshop: Visual Understanding of Humans in Crowd Scene and the 2nd Look Into Person Challenge (VUHCS-LIP

    MSB R-CNN: A Multi-Stage Balanced Defect Detection Network

    Deep learning networks are widely applied to defect detection. Among them, Cascade R-CNN is a multi-stage object detection network that is state of the art in terms of accuracy and efficiency. However, complex and diverse defects remain a challenge for Cascade R-CNN, because the widely varying shapes of defects make feature extraction with traditional convolution filters inefficient. In addition, imbalance in features, losses, and samples lowers accuracy. To address these challenges, this paper proposes a multi-stage balanced R-CNN (MSB R-CNN) for defect detection based on Cascade R-CNN. First, deformable convolution is adopted in different stages of the backbone network to improve its adaptability to the varying shapes of defects. Then, the features obtained by the backbone network are refined and enhanced by a balanced feature pyramid. To correct the imbalance between classification and regression loss, the balanced L1 loss is applied at different stages. Finally, for sample selection, the intersection over union (IoU) balanced sampler and the online hard example mining (OHEM) sampler are combined at different stages to make the sampling more reasonable, which improves the accuracy and convergence of the model. Our experiments on the DAGM2007 dataset show that MSB R-CNN achieves a mean average precision (mAP) of 67.5%, an increase of 1.5% mAP over Cascade R-CNN.
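The sampling step in the abstract above depends on the intersection over union (IoU) between a candidate box and a ground-truth box. As a minimal sketch of that quantity only (not the authors' sampler), the function below computes IoU for two axis-aligned boxes. The function name `box_iou` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration; the abstract does not specify either.

```python
def box_iou(box_a, box_b):
    """Return the IoU of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) tuples with x1 <= x2 and y1 <= y2.
    This corner format is an assumption for illustration.
    """
    # Corners of the overlapping rectangle, if there is one.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes have no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0
```

For example, `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` overlaps in a unit square, so the result is 1 / (4 + 4 - 1) = 1/7. A sampler that balances by IoU would typically bin candidates by this value before drawing from each bin, though the exact binning used in the paper is not stated in the abstract.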