
    Optimizing Memory Efficiency for Convolution Kernels on Kepler GPUs

    Convolution is a fundamental operation in many applications, such as computer vision, natural language processing, and image processing. Recent successes of convolutional neural networks in various deep learning applications place even higher demands on fast convolution. The high computation throughput and memory bandwidth of graphics processing units (GPUs) make them a natural choice for accelerating convolution operations. However, fully exploiting the available memory bandwidth of GPUs for convolution is a challenging task. This paper introduces a general model to address the mismatch between the memory bank width of GPUs and the computation data width of threads. Based on this model, we develop two convolution kernels, one for the general case and the other for a special case with one input channel. By carefully optimizing memory access patterns and computation patterns, we design a communication-optimized kernel for the special case and a communication-reduced kernel for the general case. Experimental data based on implementations on Kepler GPUs show that our kernels achieve 5.16X and 35.5% average performance improvement over the latest cuDNN library, for the special case and the general case, respectively.
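For orientation, the special case with one input channel can be written as a plain reference loop. The sketch below is not the paper's GPU kernel and applies none of its memory optimizations. It is a minimal CPU baseline, assuming "valid" output size and the cross-correlation convention common in deep learning (no kernel flip). The function name `conv2d_single_channel` is hypothetical.

```python
def conv2d_single_channel(image, kernel):
    """Valid-mode 2D cross-correlation of one input channel with one filter.

    image and kernel are lists of equal-length rows of numbers.
    Illustrative reference only; not an optimized or GPU implementation.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1  # output shrinks by (kernel size - 1)
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            acc = 0.0
            # Accumulate the window of the image under the kernel.
            for ky in range(kh):
                for kx in range(kw):
                    acc += image[y + ky][x + kx] * kernel[ky][kx]
            out[y][x] = acc
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
result = conv2d_single_channel(image, kernel)
```

Every output element rereads a full kernel-sized window of the input, which is why memory access patterns dominate the cost of a naive implementation.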

    S³FD: Single Shot Scale-invariant Face Detector

    This paper presents a real-time face detector, named Single Shot Scale-invariant Face Detector (S³FD), which achieves superior performance across face scales with a single deep neural network, especially on small faces. Specifically, we try to solve the common problem that the performance of anchor-based detectors deteriorates dramatically as the objects become smaller. We make contributions in the following three aspects: 1) proposing a scale-equitable face detection framework to handle different scales of faces well. We tile anchors on a wide range of layers to ensure that all scales of faces have enough features for detection. In addition, we design anchor scales based on the effective receptive field and a proposed equal proportion interval principle; 2) improving the recall rate of small faces by a scale compensation anchor matching strategy; 3) reducing the false positive rate of small faces via a max-out background label. As a consequence, our method achieves state-of-the-art detection performance on all the common face detection benchmarks, including the AFW, PASCAL face, FDDB and WIDER FACE datasets, and can run at 36 FPS on an Nvidia Titan X (Pascal) for VGA-resolution images.
    Comment: Accepted by ICCV 2017 + its supplementary materials; Updated the latest results on WIDER FAC
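The idea of tiling anchors over several layers can be sketched as follows. The abstract does not give the strides or the anchor sizes, so the values here are illustrative assumptions: six strides from 4 to 128 pixels, and square anchors whose side is 4 times the stride, as one possible reading of an "equal proportion interval". The names `tile_square_anchors` and `ASSUMED_STRIDES` are hypothetical.

```python
import math


def tile_square_anchors(image_w, image_h, stride, scale_ratio=4):
    """Place one square anchor at the centre of every stride x stride cell.

    Returns boxes as (x_min, y_min, x_max, y_max) in pixels.
    scale_ratio (anchor side / stride) is an assumed value, not from the abstract.
    """
    side = stride * scale_ratio
    cols = math.ceil(image_w / stride)
    rows = math.ceil(image_h / stride)
    anchors = []
    for r in range(rows):
        for c in range(cols):
            cx = (c + 0.5) * stride  # cell centre, x
            cy = (r + 0.5) * stride  # cell centre, y
            half = side / 2
            anchors.append((cx - half, cy - half, cx + half, cy + half))
    return anchors


# Assumed strides of the detection layers (illustrative only).
ASSUMED_STRIDES = (4, 8, 16, 32, 64, 128)

# Anchors for a VGA-resolution (640x480) image, one list per stride.
pyramid = {s: tile_square_anchors(640, 480, s) for s in ASSUMED_STRIDES}
```

Under these assumptions the finest layer carries by far the most anchors, and they are the smallest ones, so small faces get dense coverage while coarse layers handle large faces with only a few boxes.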

    Three-Dimensional Numerical Modeling of Flow Hydrodynamics and Cohesive Sediment Transport in Enid Lake, Mississippi

    Enid Lake is one of the largest reservoirs located in the Yazoo River Basin, the largest basin in the state of Mississippi. The lake was impounded by Enid Dam on the Yocona River in Yalobusha County and covers an area of 30 square kilometers. It provides significant natural and recreational resources. The soils in this region are highly erodible, resulting in a large amount of fine-grained cohesive sediment discharged into the lake. In this study, a 3D numerical model was developed to simulate the free surface hydrodynamics and the transport of cohesive sediment with a median diameter of 0.0025 to 0.003 mm in Enid Lake. Flow fields in the lake are generally induced by wind and upstream river inflow, and the sediment is also introduced from the inflow during storm events. The general processes of sediment flocculation and settling were considered in the model, and the erosion rate and deposition rate of cohesive sediment were calculated. The sediment simulation was coupled with the flow simulation. Remote sensing technology was applied to estimate the sediment concentration at the lake surface and provide validation data for the numerical model simulation. The model results and remote sensing data help us to understand the transport, deposition and resuspension processes of cohesive sediment in large reservoirs due to wind-induced currents and upstream river flows.
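To give a sense of scale for the particle sizes quoted above, the settling velocity of a single unflocculated grain can be estimated with Stokes' law. This is not the settling formulation used in the study, which accounts for flocculation. It is a textbook estimate, and the grain density, water density and viscosity below are assumed typical values, not data from the abstract.

```python
def stokes_settling_velocity(diameter_m,
                             particle_density=2650.0,  # kg/m^3, assumed quartz-like grain
                             fluid_density=1000.0,     # kg/m^3, assumed fresh water
                             viscosity=1.0e-3,         # Pa*s, assumed water near 20 C
                             gravity=9.81):            # m/s^2
    """Stokes' law terminal velocity (m/s) of a small sphere in a viscous fluid.

    Valid only at very low particle Reynolds number, which holds for
    clay-sized grains. Ignores flocculation and hindered settling.
    """
    density_difference = particle_density - fluid_density
    return density_difference * gravity * diameter_m ** 2 / (18.0 * viscosity)


# Lower bound of the median diameter range: 0.0025 mm = 2.5e-6 m.
w_s = stokes_settling_velocity(2.5e-6)
metres_per_day = w_s * 86400.0
```

With these assumed properties a single grain sinks on the order of half a metre per day, which illustrates why such fine sediment stays in suspension for long periods and why flocculation matters for deposition.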

    Reforestation in southern China: revisiting soil N mineralization and nitrification after 8 years restoration

    Nitrogen availability and tree species selection play important roles in reforestation. However, long-term field studies on the effects and mechanisms of tree species composition on N transformation are very limited. Eight years after tree seedlings were planted in a field experiment, we revisited the site and tested how tree species composition affects the dynamics of N mineralization and nitrification. Both tree species composition and season significantly influenced the soil dissolved organic carbon (DOC) and nitrogen (DON). N-fixing Acacia crassicarpa monoculture had the highest DON, and the 10-mixed species plantation had the highest DOC. The lowest DOC and DON concentrations were both observed in Eucalyptus urophylla monoculture. The tree species composition also significantly affected net N mineralization rates. The highest rate of net N mineralization was found in A. crassicarpa monoculture, which was more than twice that in Castanopsis hystrix monoculture. The annual net N mineralization rates of the 10-mixed and 30-mixed plantations were similar to that of the N-fixing monoculture. Since mixed plantations perform well in increasing soil DOC, DON, N mineralization and plant biodiversity, we recommend that mixed species plantations be used as a sustainable approach for the restoration of degraded land in southern China.