327 research outputs found

    Real-time Semantic Segmentation of Crop and Weed for Precision Agriculture Robots Leveraging Background Knowledge in CNNs

    Precision farming robots, which aim to reduce the amount of herbicide applied in the field, must be able to identify crops and weeds in real time in order to trigger weeding actions. In this paper, we address the problem of CNN-based semantic segmentation of crop fields, separating sugar beet plants, weeds, and background solely on the basis of RGB data. We propose a CNN that exploits existing vegetation indexes and provides a classification in real time. Furthermore, it can be effectively re-trained for previously unseen fields with a comparably small amount of training data. We implemented and thoroughly evaluated our system on a real agricultural robot operating in different fields in Germany and Switzerland. The results show that our system generalizes well, can operate at around 20 Hz, and is suitable for online operation in the field. Comment: Accepted for publication at the IEEE International Conference on Robotics and Automation 2018 (ICRA 2018).
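    The abstract mentions exploiting existing vegetation indexes on top of the RGB input. A minimal sketch of that idea, assuming a common index set (ExG, ExR, and a normalized difference index) and a simple per-channel rescaling, neither of which is confirmed by the paper:

```python
import numpy as np

def vegetation_index_channels(rgb: np.ndarray) -> np.ndarray:
    """Stack RGB with hand-crafted vegetation indexes as extra CNN input channels.

    rgb: H x W x 3 array with values in [0, 1].
    Returns an H x W x 6 array: R, G, B, ExG, ExR, NDI.
    The index set and the per-channel rescaling below are illustrative
    assumptions, not necessarily what the paper uses.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]

    exg = 2.0 * g - r - b              # excess green
    exr = 1.4 * r - g                  # excess red
    ndi = (g - r) / (g + r + 1e-8)     # normalized difference index

    def rescale(x):
        # Bring each index into [0, 1] so all channels share a comparable range.
        return (x - x.min()) / (x.max() - x.min() + 1e-8)

    return np.dstack([r, g, b, rescale(exg), rescale(exr), rescale(ndi)])
```

    Stacking such indexes as extra channels hands the network explicit vegetation cues derived from RGB alone, without changing the segmentation architecture itself.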

    A W-Shaped Convolutional Network for Robust Crop and Weed Classification in Agriculture

    Agricultural image and vision computing differs significantly from other object classification settings because the two base classes in agriculture, crops and weeds, share many traits. Efficient crop, weed, and soil classification is required to perform autonomous activities (spraying, harvesting, etc.) in agricultural fields. In a three-class (crop-weed-background) agricultural classification scenario, it is usually easier to classify the background class accurately because it looks significantly different, feature-wise, from the crop and weed classes. Robustly distinguishing between crop and weed, however, is challenging because their appearance features are generally very similar. To address this problem, we propose a framework based on a convolutional W-shaped network with two encoder-decoder structures of different sizes. The first encoder-decoder structure differentiates between background and vegetation (crop and weed), and the second encoder-decoder structure learns discriminating features to classify the crop and weed classes efficiently. The proposed W network generalizes across crop types. Its effectiveness is demonstrated on two crop datasets – a tobacco dataset and a sesame dataset, both collected in this study and made publicly available online for use by the community – by evaluating and comparing performance against existing related methods. The proposed method consistently outperforms these methods on both datasets.
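    A minimal PyTorch sketch of the cascaded two-stage idea described above; the layer widths, depth, and the way the second stage is conditioned on the first stage's output are illustrative assumptions, not the authors' architecture:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions with batch norm and ReLU.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )

class EncoderDecoder(nn.Module):
    """A small U-shaped encoder-decoder; depth and width are illustrative only."""
    def __init__(self, in_ch, out_ch, width=16):
        super().__init__()
        self.enc1 = conv_block(in_ch, width)
        self.enc2 = conv_block(width, width * 2)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec1 = conv_block(width * 3, width)   # upsampled width*2 + skip width
        self.head = nn.Conv2d(width, out_ch, 1)

    def forward(self, x):
        s1 = self.enc1(x)
        s2 = self.enc2(self.pool(s1))
        d1 = self.dec1(torch.cat([self.up(s2), s1], dim=1))
        return self.head(d1)

class WNet(nn.Module):
    """Two cascaded encoder-decoders forming a W shape: the first separates
    vegetation from background, the second splits vegetation into crop vs. weed.
    The exact coupling between the stages is an assumption of this sketch."""
    def __init__(self):
        super().__init__()
        self.stage1 = EncoderDecoder(in_ch=3, out_ch=2)                 # background vs. vegetation
        self.stage2 = EncoderDecoder(in_ch=3 + 2, out_ch=3, width=32)   # background / crop / weed

    def forward(self, x):
        veg_logits = self.stage1(x)
        veg_prob = torch.softmax(veg_logits, dim=1)
        # Condition the second stage on the image plus the first-stage output.
        return veg_logits, self.stage2(torch.cat([x, veg_prob], dim=1))

# Shape check on a dummy batch.
if __name__ == "__main__":
    veg, cls = WNet()(torch.randn(1, 3, 128, 128))
    print(veg.shape, cls.shape)  # (1, 2, 128, 128) (1, 3, 128, 128)
```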

    Advancing Land Cover Mapping in Remote Sensing with Deep Learning

    Automatic mapping of land cover in remote sensing data plays an increasingly significant role in several earth observation (EO) applications, such as sustainable development, autonomous agriculture, and urban planning. Due to the complexity of the real ground surface and environment, accurate classification of land cover types faces many challenges. This thesis provides novel deep learning-based solutions to land cover mapping challenges, such as how to deal with intricate objects and imbalanced classes in multi-spectral and high-spatial-resolution remote sensing data.

    The first work presents a novel model to learn richer multi-scale and global contextual representations in very high-resolution remote sensing images, namely the dense dilated convolutions' merging (DDCM) network. The proposed method is lightweight, flexible, and extendable, so it can be used as a simple yet effective encoder and decoder module to address different classification and semantic mapping challenges. Extensive experiments on different benchmark remote sensing datasets demonstrate that the proposed method achieves better performance while consuming far fewer computational resources than other published methods.

    Next, a novel graph model is developed for capturing long-range pixel dependencies in remote sensing images to improve land cover mapping. One key component of the method is the self-constructing graph (SCG) module, which can effectively construct global context relations (a latent graph structure) without requiring prior knowledge graphs. The proposed SCG-based models achieved competitive performance on different representative remote sensing datasets with faster training and lower computational cost than strong baseline models.

    The third work introduces a new framework, the multi-view self-constructing graph (MSCG) network, which extends the vanilla SCG model to capture multi-view context representations with rotation invariance and thereby improve segmentation performance. Meanwhile, a novel adaptive class weighting loss function is developed to alleviate the class imbalance commonly found in EO datasets for semantic segmentation. Experiments on benchmark data demonstrate that the proposed framework is computationally efficient and robust, producing improved segmentation results for imbalanced classes.

    To address the key challenges in multi-modal land cover mapping of remote sensing data, namely 'what', 'how', and 'where' to effectively fuse multi-source features and to efficiently learn optimal joint representations of different modalities, the last work presents a compact and scalable multi-modal deep learning framework (MultiModNet) based on two novel modules: the pyramid attention fusion module and the gated fusion unit. The proposed MultiModNet outperforms strong baselines on two representative remote sensing datasets with fewer parameters and at a lower computational cost. Extensive ablation studies also validate the effectiveness and flexibility of the framework.
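    The adaptive class weighting loss mentioned for the MSCG work can be illustrated with a small sketch in which per-class weights are recomputed from each training batch; the inverse-frequency rule used here is an assumption for illustration, not the weighting scheme defined in the thesis:

```python
import torch
import torch.nn.functional as F

def adaptive_class_weighted_ce(logits, target, num_classes, eps=1e-6):
    """Cross-entropy whose class weights are recomputed from the current batch.

    logits: N x C x H x W float tensor, target: N x H x W long tensor in [0, C).
    The weighting rule (inverse per-batch class frequency, normalized to sum
    to the number of classes) is an illustrative assumption only.
    """
    with torch.no_grad():
        counts = torch.bincount(target.flatten(), minlength=num_classes).float()
        freq = counts / counts.sum().clamp(min=1.0)
        weights = 1.0 / (freq + eps)
        weights = (weights * num_classes / weights.sum()).to(logits.device)
    return F.cross_entropy(logits, target, weight=weights)

# Usage on a dummy batch: rare classes receive proportionally larger weights.
loss = adaptive_class_weighted_ce(
    torch.randn(2, 3, 64, 64), torch.randint(0, 3, (2, 64, 64)), num_classes=3
)
```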