Optimizing Memory Efficiency for Convolution Kernels on Kepler GPUs
Convolution is a fundamental operation in many applications, such as computer
vision, natural language processing, image processing, etc. Recent successes of
convolutional neural networks in various deep learning applications put even
higher demand on fast convolution. The high computation throughput and memory
bandwidth of graphics processing units (GPUs) make GPUs a natural choice for
accelerating convolution operations. However, maximally exploiting the
available memory bandwidth of GPUs for convolution is a challenging task. This
paper introduces a general model to address the mismatch between the memory
bank width of GPUs and computation data width of threads. Based on this model,
we develop two convolution kernels, one for the general case and the other for
a special case with one input channel. By carefully optimizing memory access
patterns and computation patterns, we design a communication-optimized kernel
for the special case and a communication-reduced kernel for the general case.
Experimental data based on implementations on Kepler GPUs show that our kernels
achieve 5.16X and 35.5% average performance improvement over the latest cuDNN
library, for the special case and the general case, respectively.
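For orientation, the operation that the special-case kernel accelerates (convolution with one input channel) can be written as a naive reference loop. This is only a minimal sketch of the arithmetic, assuming a "valid" output size and no stride or padding. It is not the paper's GPU kernel and includes none of its memory-access optimizations; the function name `conv2d_single_channel` is hypothetical.

```python
def conv2d_single_channel(image, kernel):
    """Naive 'valid' 2D convolution (cross-correlation form, as used in CNNs)
    for a single input channel. Illustrative reference only."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1  # output size without padding or stride
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            acc = 0.0
            # Each output element reads a kh x kw window of the input.
            for dy in range(kh):
                for dx in range(kw):
                    acc += image[y + dy][x + dx] * kernel[dy][dx]
            out[y][x] = acc
    return out


result = conv2d_single_channel(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0], [0, 1]],
)
```

Neighbouring outputs read overlapping input windows, which is why memory access patterns matter so much for an optimized implementation.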
SFD: Single Shot Scale-invariant Face Detector
This paper presents a real-time face detector, named Single Shot
Scale-invariant Face Detector (SFD), which performs well across a wide range
of face scales with a single deep neural network, especially for small faces.
Specifically, we try to solve the common problem that anchor-based detectors
deteriorate dramatically as the objects become smaller. We make contributions
in the following three aspects: 1) proposing a scale-equitable face detection
framework to handle different scales of faces well. We tile anchors on a wide
range of layers to ensure that all scales of faces have enough features for
detection. Besides, we design anchor scales based on the effective receptive
field and a proposed equal proportion interval principle; 2) improving the
recall rate of small faces by a scale compensation anchor matching strategy; 3)
reducing the false positive rate of small faces via a max-out background label.
As a consequence, our method achieves state-of-the-art detection performance on
all the common face detection benchmarks, including the AFW, PASCAL face, FDDB
and WIDER FACE datasets, and can run at 36 FPS on an Nvidia Titan X (Pascal) for
VGA-resolution images.
Comment: Accepted by ICCV 2017 + its supplementary materials; Updated the
latest results on WIDER FAC
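The idea of tiling anchors over several layers with a fixed proportion between anchor size and tiling interval can be sketched as follows. This is a rough illustration, not the authors' implementation: the abstract states no numbers, so the strides, the ratio of 4, and the function name `tile_anchors` are assumptions chosen for the example.

```python
def tile_anchors(image_size, strides, scale_ratio=4):
    """Tile square anchors on a grid for each detection layer.

    strides and scale_ratio are illustrative assumptions; the abstract
    does not specify them.
    Returns a list of (center_x, center_y, side) tuples.
    """
    anchors = []
    for stride in strides:
        # Equal proportion: anchor side / tiling interval is the same on
        # every layer, so all scales are tiled with the same relative density.
        side = stride * scale_ratio
        cells = image_size // stride
        for row in range(cells):
            for col in range(cells):
                cx = (col + 0.5) * stride
                cy = (row + 0.5) * stride
                anchors.append((cx, cy, side))
    return anchors


anchors = tile_anchors(image_size=64, strides=[4, 8])
```

With these assumed values, the finer layer contributes a 16 x 16 grid of small anchors and the coarser layer an 8 x 8 grid of anchors twice as large.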
Three-Dimensional Numerical Modeling of Flow Hydrodynamics and Cohesive Sediment Transport in Enid Lake, Mississippi
Enid Lake is one of the largest reservoirs in the Yazoo River Basin, the largest basin in the state of Mississippi. The lake was impounded by Enid Dam on the Yocona River in Yalobusha County and covers an area of 30 square kilometers. It provides significant natural and recreational resources. The soils in this region are highly erodible, resulting in a large amount of fine-grained cohesive sediment being discharged into the lake. In this study, a 3D numerical model was developed to simulate the free-surface hydrodynamics and the transport of cohesive sediment with a median diameter of 0.0025 to 0.003 mm in Enid Lake. Flow fields in the lake are generally induced by wind and upstream river inflow, and sediment is also introduced by the inflow during storm events. The general processes of sediment flocculation and settling were considered in the model, and the erosion and deposition rates of cohesive sediment were calculated. The sediment simulation was coupled with the flow simulation. Remote sensing technology was applied to estimate the sediment concentration at the lake surface and to provide validation data for the numerical model. The model results and remote sensing data help us to understand the transport, deposition and resuspension of cohesive sediment in large reservoirs driven by wind-induced currents and upstream river flows.
Reforestation in southern China: revisiting soil N mineralization and nitrification after 8 years restoration
Nitrogen availability and tree species selection play important roles in reforestation. However, long-term field studies on the effects and mechanisms of tree species composition on N transformation are very limited. Eight years after tree seedlings were planted in a field experiment, we revisited the site and tested how tree species composition affects the dynamics of N mineralization and nitrification. Both tree species composition and season significantly influenced soil dissolved organic carbon (DOC) and dissolved organic nitrogen (DON). The N-fixing Acacia crassicarpa monoculture had the highest DON, and the 10-mixed species plantation had the highest DOC. The lowest DOC and DON concentrations were both observed in the Eucalyptus urophylla monoculture. Tree species composition also significantly affected net N mineralization rates. The highest rate of net N mineralization was found in the A. crassicarpa monoculture, and was more than twice that in the Castanopsis hystrix monoculture. The annual net N mineralization rates of the 10-mixed and 30-mixed plantations were similar to that of the N-fixing monoculture. Because mixed plantations perform well in increasing soil DOC, DON, N mineralization and plant biodiversity, we recommend mixed species plantations as a sustainable approach for the restoration of degraded land in southern China.
- …