1,244 research outputs found

    Deep Residual Learning for Image Recognition

    Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers---8x deeper than VGG nets but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation. Comment: Tech report
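The residual reformulation in the abstract above can be illustrated with a minimal sketch. It assumes the common form y = relu(F(x) + x), where F is a small two-layer function; the helper names (`relu`, `linear`, `residual_block`) and the plain-list vectors are illustrative choices, not code from the paper.

```python
def relu(v):
    """Element-wise rectified linear unit on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(weights, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(x, w1, w2):
    """Return relu(F(x) + x) with F(x) = w2 * relu(w1 * x).

    The layers only learn the residual F; the input x is carried
    through unchanged by the shortcut connection.
    Assumption: w1 and w2 are square so F(x) has the same length as x.
    """
    f = linear(w2, relu(linear(w1, x)))
    return relu([fi + xi for fi, xi in zip(f, x)])


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    zeros = [[0.0] * 3 for _ in range(3)]
    # With all-zero weights the residual F(x) vanishes, so the block
    # passes a non-negative input through unchanged.
    print(residual_block(x, zeros, zeros))
```

With zero weights the block reduces to the identity on non-negative inputs, which is one way to see why stacking such blocks need not make optimization harder: a deeper network can fall back on the shortcut path.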

    Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

    State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features---using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model, our detection system has a frame rate of 5fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available. Comment: Extended tech report

    Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

    Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224x224) input image. This requirement is "artificial" and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale. In this work, we equip the networks with another pooling strategy, "spatial pyramid pooling", to eliminate the above requirement. The new network structure, called SPP-net, can generate a fixed-length representation regardless of image size/scale. Pyramid pooling is also robust to object deformations. With these advantages, SPP-net should in general improve all CNN-based image classification methods. On the ImageNet 2012 dataset, we demonstrate that SPP-net boosts the accuracy of a variety of CNN architectures despite their different designs. On the Pascal VOC 2007 and Caltech101 datasets, SPP-net achieves state-of-the-art classification results using a single full-image representation and no fine-tuning. The power of SPP-net is also significant in object detection. Using SPP-net, we compute the feature maps from the entire image only once, and then pool features in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors. This method avoids repeatedly computing the convolutional features. In processing test images, our method is 24-102x faster than the R-CNN method, while achieving better or comparable accuracy on Pascal VOC 2007. In ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, our methods rank #2 in object detection and #3 in image classification among all 38 teams. This manuscript also introduces the improvement made for this competition. Comment: This manuscript is the accepted version for IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2015. See Changelog
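The fixed-length property claimed in the abstract above can be sketched as follows. This is a minimal single-channel illustration, assuming max pooling over a pyramid of 1x1, 2x2 and 4x4 grids; the function name `spp_max_pool`, the pyramid levels, and the bin-boundary rule (floor for the start, ceiling for the end) are assumptions made for the example, not details taken from the listing.

```python
def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Pool a 2D feature map (list of rows) into a fixed-length vector.

    For each pyramid level n, the map is split into an n x n grid and
    the maximum of every cell is kept, so the output always has
    sum(n * n for n in levels) entries regardless of the input size.
    """
    h = len(feature_map)
    w = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            r0 = (i * h) // n                          # floor of bin start
            r1 = max(r0 + 1, ((i + 1) * h + n - 1) // n)  # ceil of bin end
            for j in range(n):
                c0 = (j * w) // n
                c1 = max(c0 + 1, ((j + 1) * w + n - 1) // n)
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


def make_map(h, w):
    """Hypothetical test input: an h x w map filled with 0, 1, 2, ..."""
    return [[r * w + c for c in range(w)] for r in range(h)]


if __name__ == "__main__":
    # Two differently sized inputs give vectors of the same length.
    print(len(spp_max_pool(make_map(5, 7))))
    print(len(spp_max_pool(make_map(13, 9))))
```

Both calls return 1 + 4 + 16 = 21 values per channel, which is what lets a fully connected layer follow convolutional features computed on inputs of arbitrary size.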

    Transformation, Identification, and Inversion of Goldberg-Coxeter Fullerenes

    It is difficult to identify a G-C fullerene directly from its dimensions because, in general, its lattice is not proportional to that of its archetype, although the two have the same three-dimensional shape. In this paper, the area scale factor of a G-C fullerene is proved to be an integer that can be calculated from its dimensions. Every G-C transformation is one of three kinds: a k-inflation, which can be easily identified and inverted; a primary transformation, whose area scale factor is a prime number; or a composite transformation, whose area scale factor is the product of those of its sub-transformations. As a result, a method is presented to identify any G-C fullerene from its area scale factor. Comment: 6 pages, 3 figures
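The integer area scale factor mentioned in the abstract above can be illustrated with a small sketch. It assumes that the "dimensions" are the usual Goldberg-Coxeter parameters (h, k) and that the area scale factor is the standard triangulation number T = h^2 + hk + k^2; the abstract does not state the formula, so both points, along with the function names and the classification rule below, are assumptions made for illustration.

```python
def area_scale_factor(h, k):
    """Triangulation number T = h^2 + h*k + k^2 (assumed formula)."""
    return h * h + h * k + k * k


def is_prime(n):
    """Trial-division primality test, adequate for small T."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def classify(h, k):
    """Label a transformation by its area scale factor.

    Hypothetical rule following the three kinds named in the abstract:
    one zero parameter -> k-inflation, prime T -> primary,
    anything else -> composite.
    """
    t = area_scale_factor(h, k)
    if h == 0 or k == 0:
        return t, "k-inflation"
    if is_prime(t):
        return t, "primary"
    return t, "composite"


if __name__ == "__main__":
    for h, k in [(2, 0), (1, 1), (2, 1), (4, 1)]:
        print((h, k), classify(h, k))
```

Under these assumptions (2, 0) gives T = 4, (1, 1) gives T = 3, (2, 1) gives T = 7, and (4, 1) gives T = 21 = 3 x 7, matching the idea that a composite factor is the product of the factors of its sub-transformations.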

    Trace Elements in Coal Gangue: A Review

    Coal gangue is one of the largest industrial residues. It has a high ash content, a low carbonaceous content, and a low heating value. It also contains some trace elements. Large quantities of coal gangue cause serious environmental problems by polluting the air, water, and soil as well as occupying a tremendous amount of land. Coal gangue utilization is now a matter of great concern and has attracted wide interest. However, some toxic trace elements in coal gangue deserve more attention during its utilization. In this article, the modes of occurrence and the leaching characteristics of trace elements in coal gangue are introduced according to the results of the sequential extraction method and the leaching method. The release characteristics of trace elements during combustion of coal gangue and the environmental implications of trace elements in coal gangue are also discussed. Sulfide-bound trace elements are the dominant form in coal gangue. The leaching behavior of trace elements from coal gangue is affected by many factors. Different trace elements show different transformation behaviors. Trace elements in coal gangue can be released and have environmental implications to varying degrees, depending on the type of trace element.