212 research outputs found

    3D Depthwise Convolution: Reducing Model Parameters in 3D Vision Tasks

    Full text link
    Standard 3D convolution operations require much larger amounts of memory and computation than 2D convolution operations. This fact has hindered the development of deep neural nets in many 3D vision tasks. In this paper, we investigate the possibility of applying depthwise separable convolutions in the 3D scenario and introduce the use of 3D depthwise convolution. A 3D depthwise convolution splits a single standard 3D convolution into two separate steps, which drastically reduces the number of parameters in 3D convolutions by more than one order of magnitude. We experiment with 3D depthwise convolution on popular CNN architectures and also compare it with a similar structure called pseudo-3D convolution. The results demonstrate that, with 3D depthwise convolutions, 3D vision tasks like classification and reconstruction can be carried out with lighter-weight neural networks while still delivering comparable performance. Comment: Work in progress
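The parameter reduction claimed above can be checked with simple arithmetic. The sketch below uses illustrative layer sizes (64 input channels, 128 output channels, a 3x3x3 kernel; these are assumptions, not figures from the paper) to compare a standard 3D convolution against the two-step depthwise-then-pointwise factorization, ignoring bias terms:

```python
# Parameter counts for a single 3D convolution layer (biases ignored).
# Illustrative sizes, not taken from the paper:
c_in, c_out, k = 64, 128, 3

# Standard 3D convolution: every output channel mixes all input channels
# through its own k x k x k kernel.
standard = c_in * c_out * k**3

# Depthwise separable 3D convolution: one per-channel k^3 spatial filter,
# followed by a 1x1x1 pointwise convolution that mixes channels.
depthwise = c_in * k**3           # one k^3 filter per input channel
pointwise = c_in * c_out          # 1x1x1 channel mixing
separable = depthwise + pointwise

print(standard, separable, round(standard / separable, 1))
# With these sizes the factorized layer is over 20x smaller,
# consistent with the "more than one order of magnitude" claim.
```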

    Application of Convolutional Neural Network in the Segmentation and Classification of High-Resolution Remote Sensing Images

    Get PDF
    Numerous convolutional neural networks increase the classification accuracy for remote sensing scene images at the expense of the model's space and time complexity. This causes the model to run slowly and prevents a trade-off between model accuracy and running time. The loss of deep features as the network gets deeper makes it impossible to retrieve the key aspects with a simple double-branching structure, which is detrimental to classifying remote sensing scene photos.

    DPSA: Dense pixelwise spatial attention network for hatching egg fertility detection

    Get PDF
    © 2020 SPIE and IS & T. Deep convolutional neural networks show good prospects in the fertility detection and classification of specific pathogen-free hatching egg embryos in the production of avian influenza vaccine, and our previous work has mainly investigated three factors of networks to push performance: depth, width, and cardinality. However, an important problem remains: feeble embryos with weak blood vessels interfere with the classification of resilient fertile ones. Inspired by fine-grained classification, we introduce the attention mechanism into our model by proposing a dense pixelwise spatial attention module combined with the existing channel attention through depthwise separable convolutions to further enhance the network's class-discriminative ability. In our fused attention module, depthwise convolutions are used for channel-specific feature learning, and dilated convolutions with different sampling rates are adopted to capture spatial multiscale context and preserve rich detail, which can maintain high resolution and increase receptive fields simultaneously. The attention mask with strong semantic information, generated by aggregating the outputs of the spatial pyramid dilated convolution, is broadcast to low-level features via elementwise multiplications, serving as a feature selector to emphasize informative features and suppress less useful ones. A series of experiments conducted on our hatching egg dataset shows that our attention network achieves a lower misjudgment rate on weak embryos and a more stable accuracy, which is up to 98.3% and 99.1% on 5-day and 9-day old eggs, respectively.
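The "attention mask as feature selector" idea described above can be sketched with a few lines of NumPy. The shapes and random inputs below are hypothetical, not the paper's actual network: a single-channel spatial mask in (0, 1) is broadcast over the channel axis and multiplied elementwise with low-level features, damping uninformative positions:

```python
import numpy as np

# Hypothetical shapes for illustration: 8 channels over a 5x5 spatial grid.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 5, 5))  # (channels, H, W) low-level features
logits = rng.standard_normal((1, 5, 5))    # aggregated spatial-attention logits

# Sigmoid squashes the logits into (0, 1), producing the attention mask.
mask = 1.0 / (1.0 + np.exp(-logits))

# Elementwise multiply; the (1, 5, 5) mask broadcasts across all 8 channels,
# so informative spatial positions are emphasized in every channel at once.
attended = features * mask
print(attended.shape)  # (8, 5, 5)
```

Because every mask value lies strictly between 0 and 1, the attended features can only shrink in magnitude, never grow, which is what makes the mask act as a selector rather than an amplifier.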

    IRX-1D: A Simple Deep Learning Architecture for Remote Sensing Classifications

    Full text link
    We proposes a simple deep learning architecture combining elements of Inception, ResNet and Xception networks. Four new datasets were used for classification with both small and large training samples. Results in terms of classification accuracy suggests improved performance by proposed architecture in comparison to Bayesian optimised 2D-CNN with small training samples. Comparison of results using small training sample with Indiana Pines hyperspectral dataset suggests comparable or better performance by proposed architecture than nine reported works using different deep learning architectures. In spite of achieving high classification accuracy with limited training samples, comparison of classified image suggests different land cover classes are assigned to same area when compared with the classified image provided by the model trained using large training samples with all datasets.Comment: 22 Page, 6 tables, 9 Figure

    Dynamic Convolution Self-Attention Network for Land-Cover Classification in VHR Remote-Sensing Images

    Get PDF
    The current deep convolutional neural networks for very-high-resolution (VHR) remote-sensing image land-cover classification often suffer from two challenges. First, the feature maps extracted by network encoders based on vanilla convolution usually contain a lot of redundant information, which easily causes misclassification of land cover. Moreover, these encoders usually require a large number of parameters and high computational costs. Second, as remote-sensing images are complex and contain many objects with large-scale variances, it is difficult to use the popular feature fusion modules to improve the representation ability of networks. To address the above issues, we propose a dynamic convolution self-attention network (DCSA-Net) for VHR remote-sensing image land-cover classification. The proposed network has two advantages. On one hand, we designed a lightweight dynamic convolution module (LDCM) by using dynamic convolution and a self-attention mechanism. This module can extract more useful image features than vanilla convolution, avoiding the negative effect of useless feature maps on land-cover classification. On the other hand, we designed a context information aggregation module (CIAM) with a ladder structure to enlarge the receptive field. This module can aggregate multi-scale contextual information from feature maps with different resolutions using a dense connection. Experimental results show that the proposed DCSA-Net is superior to state-of-the-art networks due to higher accuracy of land-cover classification, fewer parameters, and lower computational cost. The source code is made publicly available. National Natural Science Foundation of China (Program No. 61871259, 62271296, 61861024), in part by Natural Science Basic Research Program of Shaanxi (Program No. 2021JC-47), in part by Key Research and Development Program of Shaanxi (Program No. 2022GY-436, 2021ZDLGY08-07), in part by Natural Science Basic Research Program of Shaanxi (Program No. 2022JQ-634, 2022JQ-018), and in part by Shaanxi Joint Laboratory of Artificial Intelligence (No. 2020SS-03).
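The receptive-field enlargement that modules like CIAM rely on follows a standard identity: a k x k kernel with dilation rate r covers k + (k - 1)(r - 1) positions per axis without adding parameters. The rates below are illustrative only; the abstract does not list the network's actual dilation settings:

```python
# Effective kernel extent of a dilated convolution along one axis.
def effective_kernel(k: int, r: int) -> int:
    """A k-tap kernel with dilation r spans k + (k - 1) * (r - 1) positions."""
    return k + (k - 1) * (r - 1)

# A 3x3 kernel at increasing (hypothetical) dilation rates:
for rate in (1, 2, 4, 8):
    print(f"dilation {rate}: covers {effective_kernel(3, rate)} positions")
```

Stacking such layers at different rates, as a ladder or pyramid structure does, lets the same 3x3 kernels aggregate context at several scales at once.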

    DGCNet: An Efficient 3D-Densenet based on Dynamic Group Convolution for Hyperspectral Remote Sensing Image Classification

    Full text link
    Deep neural networks face many problems in the field of hyperspectral image classification: lack of effective utilization of spatial-spectral information, gradient vanishing, and overfitting as the model depth increases. In order to accelerate the deployment of the model on edge devices with strict latency requirements and limited computing power, we introduce a lightweight model based on the improved 3D-Densenet model and design DGCNet. It addresses the disadvantages of group convolution. Drawing on the idea of dynamic networks, dynamic group convolution (DGC) is designed on the 3D convolution kernel. DGC introduces small feature selectors for each group to dynamically decide which part of the input channels to connect, based on the activations of all input channels. Multiple groups can capture different and complementary visual and semantic features of input images, allowing the convolutional neural network (CNN) to learn rich features. 3D convolution extracts high-dimensional and redundant hyperspectral data, and there is also a lot of redundant information between convolution kernels. The DGC module allows 3D-Densenet to select channel information with richer semantic features and discard inactive regions. The 3D-CNN passing through the DGC module can be regarded as a pruned network. DGC not only allows the 3D-CNN to complete sufficient feature extraction, but also takes into account the requirements of speed and computation. The inference speed and accuracy have been improved, with outstanding performance on the IN, Pavia, and KSC datasets, ahead of mainstream hyperspectral image classification methods.
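The per-group channel selection described above can be illustrated with a toy gating rule. The selector here is hypothetical (the paper's selector is a small learned subnetwork, not a fixed top-k rule): each group keeps only its most strongly activated input channels and prunes the rest before convolution:

```python
import numpy as np

# Hypothetical setting: 16 input channels split into 4 groups of 4;
# a stand-in "activation strength" per channel replaces the learned selector.
rng = np.random.default_rng(1)
acts = rng.random(16)                 # mean activation per input channel
groups = np.split(np.arange(16), 4)   # channel indices, 4 groups of 4

selected = []
for g in groups:
    # Keep the 2 most active channels in this group (top-k stands in for
    # the paper's learned feature selector); inactive channels are dropped.
    top = g[np.argsort(acts[g])[-2:]]
    selected.append(np.sort(top))

print([s.tolist() for s in selected])
```

Because each group makes its own choice from its own channels, different groups end up attending to different, complementary subsets of the input, which is the intuition behind the "pruned network" view of DGC.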

    Slum image detection and localization using transfer learning: a case study in Northern Morocco

    Get PDF
    Developing countries face social and economic challenges, including the emergence and proliferation of slums. Slum detection and localization methods typically rely on regular topographic surveys, on visual identification in high-resolution satellite images, or on socio-environmental surveys from land surveys and general population censuses. However, these approaches consume a great deal of time and effort. To overcome these problems, this paper applies transfer learning to seven well-known pretrained models, namely MobileNets, InceptionV3, NASNetMobile, Xception, VGG16, EfficientNet, and ResNet50, on a smaller dataset of medium-resolution satellite imagery. The experiments demonstrate that the top three pretrained models achieve accuracies of 98.78%, 97.9%, and 97.56%, respectively. Moreover, MobileNets have the smallest memory size at 9.1 MB and the shortest latency at 17.01 s, and so can be deployed as needed. The results show the good performance of the top three pretrained models for detecting and localizing slum housing in northern Morocco.

    Neural Architecture Search for Image Segmentation and Classification

    Get PDF
    Deep learning (DL) is a class of machine learning algorithms that relies on deep neural networks (DNNs) for computations. Unlike traditional machine learning algorithms, DL can learn from raw data directly and effectively. Hence, DL has been successfully applied to tackle many real-world problems. When applying DL to a given problem, the primary task is designing the optimum DNN. This task relies heavily on human expertise, is time-consuming, and requires many trial-and-error experiments. This thesis aims to automate the laborious task of designing the optimum DNN by exploring the neural architecture search (NAS) approach. Here, we propose two new NAS algorithms for two real-world problems: pedestrian lane detection for assistive navigation and hyperspectral image segmentation for biosecurity scanning. Additionally, we also introduce a new dataset-agnostic predictor of neural network performance, which can be used to speed up NAS algorithms that require the evaluation of candidate DNNs.
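The role of a performance predictor in NAS can be sketched with a toy random search. Everything here is hypothetical (the two-parameter search space and the stand-in scoring function are not the thesis's actual algorithm): instead of fully training every candidate DNN, a cheap predictor ranks candidates so that only the most promising ones would be trained:

```python
import random

# Hypothetical two-dimensional search space; real NAS spaces are far larger.
random.seed(0)
search_space = {"depth": [4, 8, 16], "width": [32, 64, 128]}

def sample_architecture():
    """Draw a random candidate architecture from the search space."""
    return {k: random.choice(v) for k, v in search_space.items()}

def predicted_score(arch):
    """Stand-in predictor: favors deeper/wider nets with diminishing returns.
    In the thesis this would be a trained, dataset-agnostic performance model."""
    return arch["depth"] ** 0.5 + arch["width"] ** 0.5

# Rank 20 random candidates by predicted score without training any of them;
# only the top-ranked candidate(s) would then be trained in full.
candidates = [sample_architecture() for _ in range(20)]
best = max(candidates, key=predicted_score)
print(best)
```

The speed-up comes from the asymmetry in cost: one predictor evaluation takes microseconds, while fully training a candidate DNN can take hours.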