
    Weed Recognition in Agriculture: A Mask R-CNN Approach

    Recent interdisciplinary collaboration on deep learning has led to growing interest in its application in the agriculture domain. Weed control and management are among the crucial tasks in agriculture for maintaining high crop productivity. The first phase of weed control and management is to successfully recognize the weed plants, followed by a suitable management plan. Because of the complexities in agricultural images, such as similar colour and texture, we need a deep neural network that uses pixel-wise grouping to identify plant species. In this thesis, we analysed the performance of one of the most popular deep neural networks for instance segmentation (pixel-wise analysis) problems, Mask R-CNN, for weed plant recognition (detection and classification) using field images and aerial images. For the field image study, we used Mask R-CNN to recognize crop plants and weed plants on the Crop/Weed Field Image Dataset (CWFID). However, the CWFID's limitations are that it labels all weed plants as a single class and all of its crop plants come from a single organic carrot field. To tackle this problem and expand our study, we created a synthetic dataset with 80 weed plant species and tested it with Mask R-CNN. For our aerial image study, we predominantly focused on detecting one specific invasive weed, Persicaria perfoliata or Mile-A-Minute (MAM). In general, supervised models perform poorly on aerial images, primarily due to the large image sizes and the scarcity of well-annotated datasets, making it relatively harder to recognize the species from higher altitudes. To address this issue, we propose a three-level (leaves, trees, forest) hierarchy for recognizing the species using Unmanned Aerial Vehicles (UAVs). To create a dataset resembling weed clusters similar to MAM, we used a localized style transfer technique, based on the VGG-19 architecture, to transfer the style of the available MAM images onto portions of the aerial images' content. We also generated another dataset at a relatively low altitude and tested it with Mask R-CNN, reaching ~92% AP50 on these low-altitude resized images.
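    As a rough illustration of the kind of instance segmentation pipeline described above (not the thesis's actual code), the sketch below runs inference with torchvision's pre-trained Mask R-CNN; the image path, weights, and score threshold are assumptions for illustration.

        # Minimal sketch: instance segmentation inference with a pre-trained Mask R-CNN
        # (torchvision). A model fine-tuned on CWFID or a synthetic weed dataset would
        # replace the COCO-pretrained weights; the path and threshold are illustrative.
        import torch
        import torchvision
        from torchvision.io import read_image
        from torchvision.transforms.functional import convert_image_dtype

        model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
        model.eval()

        # Hypothetical input image, converted to a float tensor in [0, 1].
        image = convert_image_dtype(read_image("field_image.png"), torch.float)
        with torch.no_grad():
            predictions = model([image])[0]

        # Keep confident detections; each has a box, a label, and a pixel-wise mask.
        keep = predictions["scores"] > 0.5
        boxes = predictions["boxes"][keep]
        masks = predictions["masks"][keep] > 0.5  # binarize the soft masks
        print(f"{int(keep.sum())} plant instances detected")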

    Automated High-resolution Earth Observation Image Interpretation: Outcome of the 2020 Gaofen Challenge

    In this article, we introduce the 2020 Gaofen Challenge and its scientific outcomes. The 2020 Gaofen Challenge is an international competition organized by the China High-Resolution Earth Observation Conference Committee and the Aerospace Information Research Institute, Chinese Academy of Sciences, and technically cosponsored by the IEEE Geoscience and Remote Sensing Society and the International Society for Photogrammetry and Remote Sensing. It aims to promote the academic development of automated high-resolution Earth observation image interpretation. Six independent tracks were organized in this challenge, covering challenging problems in object detection and semantic segmentation. With the development of convolutional neural networks, deep-learning-based methods have achieved good performance on image interpretation. In this article, we report the details of the challenge and the best-performing methods presented so far within its scope.

    Synthetic Aperture Radar (SAR) Meets Deep Learning

    This reprint focuses on applications that combine synthetic aperture radar and deep learning, and aims to further promote the development of intelligent SAR image interpretation. A synthetic aperture radar (SAR) is an important active microwave imaging sensor whose day-and-night, all-weather imaging capability gives it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in the remote sensing community, e.g., in geological exploration, topographic mapping, disaster forecasting, and traffic monitoring. It is therefore valuable and meaningful to study SAR-based remote sensing applications. In recent years, deep learning, represented by convolutional neural networks, has driven significant progress in the computer vision community, e.g., in face recognition, autonomous driving, and the Internet of Things (IoT). Deep learning enables computational models with multiple processing layers to learn data representations at multiple levels of abstraction, which can greatly improve the performance of various applications. This reprint provides a platform for researchers to tackle these significant challenges and to present innovative, cutting-edge results on applying deep learning to SAR in various manuscript types, e.g., articles, letters, reviews, and technical reports.
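    As a minimal sketch of the "multiple processing layers" idea mentioned above, the following PyTorch network stacks convolutional layers to classify single-channel SAR patches; the architecture and the ten target classes are illustrative assumptions, not a model from the reprint.

        # Minimal sketch (PyTorch): a small CNN whose stacked convolutional layers
        # learn increasingly abstract representations of single-channel SAR patches.
        # Layer sizes and the ten target classes are illustrative assumptions.
        import torch
        import torch.nn as nn

        class SmallSARNet(nn.Module):
            def __init__(self, num_classes: int = 10):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1),  # global pooling to a 64-dim descriptor
                )
                self.classifier = nn.Linear(64, num_classes)

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                return self.classifier(self.features(x).flatten(1))

        logits = SmallSARNet()(torch.randn(4, 1, 64, 64))  # four 64x64 SAR patches
        print(logits.shape)  # torch.Size([4, 10])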

    Cereal grain and ear detection with convolutional neural networks

    High computing power and data availability have made it possible to combine traditional farming with modern machine learning methods. The profitability and environmental friendliness of agriculture can be improved through automatic data processing. For example, applications of computer vision are enabling ever more efficient automation of various tasks. Computer vision is a field of study centred on how computers gain understanding from digital images. A subfield of computer vision called object detection focuses on mathematical techniques to detect, localize, and classify semantic objects in digital images. This thesis studies object detection methods based on convolutional neural networks and how they can be applied in precision agriculture to detect cereal grains and ears. Cultivation of pure oats poses particular challenges for farmers. The fields need to be inspected regularly to ensure that the crop is not contaminated by other cereals. If the quantity of foreign, gluten-containing cereals exceeds a certain threshold per kilogram of weight, the crop cannot be used to produce gluten-free products. Detecting foreign grains and ears at the early stages of the growing season ensures the quality of the gluten-free crop.
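    As a hypothetical sketch of how detector output could feed the purity check described above, the snippet below counts confident detections of gluten-containing ears against a per-kilogram threshold; the class labels, limit, and detection format are all assumptions, not the thesis's actual rules.

        # Hypothetical post-processing sketch: count detected gluten-containing ears
        # in sampled imagery and compare against a per-kilogram purity threshold.
        # The detector interface, class labels, and limit are all assumptions.
        from typing import Dict, List

        GLUTEN_CLASSES = {"wheat_ear", "barley_ear", "rye_ear"}  # illustrative labels
        MAX_FOREIGN_PER_KG = 10  # illustrative limit, not an actual regulation

        def foreign_ear_count(detections: List[Dict], score_threshold: float = 0.5) -> int:
            """Count confident detections labelled as a gluten-containing cereal."""
            return sum(
                1
                for d in detections
                if d["label"] in GLUTEN_CLASSES and d["score"] >= score_threshold
            )

        def crop_is_gluten_free(detections: List[Dict], sampled_kg: float) -> bool:
            return foreign_ear_count(detections) / sampled_kg <= MAX_FOREIGN_PER_KG

        print(crop_is_gluten_free(
            [{"label": "wheat_ear", "score": 0.91}, {"label": "oat_ear", "score": 0.88}],
            sampled_kg=1.0,
        ))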

    Global Dynamics of the Offshore Wind Energy Sector Derived from Earth Observation Data - Deep Learning Based Object Detection Optimised with Synthetic Training Data for Offshore Wind Energy Infrastructure Extraction from Sentinel-1 Imagery

    The expansion of renewable energies is being driven by the gradual phase-out of fossil fuels to reduce greenhouse gas emissions, by the steadily increasing demand for energy and, more recently, by geopolitical events. The offshore wind energy sector is on the verge of a massive expansion in Europe, the United Kingdom and China, but also in the USA, South Korea and Vietnam. Accordingly, the largest marine infrastructure projects to date will be carried out in the coming decades, with thousands of offshore wind turbines being installed. To accompany this process globally and to provide a database for research, development and monitoring, this dissertation presents a deep learning-based object detection approach that enables the derivation of spatiotemporal developments of offshore wind energy infrastructure from satellite-based radar data of the Sentinel-1 mission.

    To train the deep learning models for offshore wind energy infrastructure detection, an approach is presented that makes it possible to synthetically generate remote sensing data and the annotation necessary for supervised deep learning. In this synthetic data generation process, expert knowledge about image content and sensor acquisition techniques is made machine-readable. Extensive and highly variable training datasets are then generated from this knowledge representation, with which deep learning models can learn to detect objects in real satellite data. This method for the synthetic generation of training data based on expert knowledge offers great potential for deep learning in Earth observation: applications can be developed and tested faster, and the synthetically generated, and thus controllable, training data make it possible to interpret the learning process of the optimised models.

    The method developed in this dissertation for creating synthetic remote sensing training data was finally used to optimise deep learning models for the global detection of offshore wind energy infrastructure. For this purpose, images of the entire global coastline from ESA's Sentinel-1 radar mission were evaluated. The derived dataset includes over 9,941 objects, distinguishing between offshore wind turbines, transformer stations and offshore wind energy infrastructure under construction. In addition to this spatial detection, a quarterly time series from July 2016 to June 2021 was derived for every object, revealing the start of construction, the construction phase and the time of completion with subsequent operation.

    This dataset provides the basis for an analysis of the development of the offshore wind energy sector from July 2016 to June 2021. For this analysis, further attributes of the detected offshore wind turbines were derived, most importantly the height and installed capacity of each turbine. The turbine height was calculated by a radargrammetric analysis of the previously detected Sentinel-1 signal and then used to statistically model the installed capacity. The results show that in June 2021, 8,885 offshore wind turbines with a total capacity of 40.6 GW were installed worldwide. The largest installed capacities are in the EU (15.2 GW), China (14.1 GW) and the United Kingdom (10.7 GW). From July 2016 to June 2021, China added 13 GW of offshore wind energy infrastructure, while the EU installed 8 GW and the UK 5.8 GW in the same period. This temporal analysis shows that China was the main driver of the expansion of the offshore wind energy sector in the period under investigation.

    The derived dataset describing the offshore wind energy sector was made publicly available and is thus freely accessible to all decision-makers and stakeholders involved in the development of offshore wind energy projects. In the scientific context especially, it serves as a database that enables a wide range of investigations, both of the offshore wind turbines themselves and of the influence of their expansion in the coming decades. This supports the imminent and urgently needed growth of offshore wind energy and promotes its sustainable expansion alongside the targets that have been set.
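    As an illustrative sketch of the final modelling step described above (statistically modelling installed capacity from radargrammetrically derived turbine height), the snippet below fits a simple least-squares regression; the sample heights, capacities, and linear form are made-up assumptions, not the dissertation's actual model or data.

        # Illustrative sketch: relate turbine height (from radargrammetric analysis
        # of Sentinel-1 signals) to installed capacity with a simple regression.
        # All values below are invented; the dissertation's model is not reproduced.
        import numpy as np

        heights_m = np.array([80.0, 90.0, 100.0, 110.0, 120.0])  # hypothetical heights
        capacity_mw = np.array([3.0, 3.6, 4.8, 6.0, 8.0])        # hypothetical capacities

        # Ordinary least-squares fit of capacity as a linear function of height.
        slope, intercept = np.polyfit(heights_m, capacity_mw, deg=1)

        def predict_capacity(height_m: float) -> float:
            return slope * height_m + intercept

        print(f"Estimated capacity at 105 m: {predict_capacity(105.0):.2f} MW")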

    NeRF: Neural Radiance Field in 3D Vision, A Comprehensive Review

    Neural Radiance Fields (NeRF), a novel view synthesis method with implicit scene representation, has taken the field of Computer Vision by storm. As a novel view synthesis and 3D reconstruction method, NeRF models find applications in robotics, urban mapping, autonomous navigation, virtual reality/augmented reality, and more. Since the original paper by Mildenhall et al., more than 250 preprints have been published, with more than 100 eventually accepted at tier-one Computer Vision conferences. Given NeRF's popularity and the current interest in this research area, we believe it necessary to compile a comprehensive survey of NeRF papers from the past two years, which we organize into both architecture- and application-based taxonomies. We also provide an introduction to the theory of NeRF-based novel view synthesis and a benchmark comparison of the performance and speed of key NeRF models. With this survey, we hope to introduce new researchers to NeRF, provide a helpful reference for influential works in this field, and motivate future research directions through our discussion section.
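    For reference, the core volume rendering formulation behind NeRF-based novel view synthesis, as introduced by Mildenhall et al., can be written as:

        % Expected colour of a camera ray r(t) = o + t d between near/far bounds t_n, t_f:
        C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t), \mathbf{d})\,dt,
        \qquad T(t) = \exp\!\Big(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,ds\Big)

    Here \sigma is the learned volume density, \mathbf{c} the view-dependent colour, and T(t) the accumulated transmittance; in practice the integral is approximated by stratified sampling along each ray.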

    Multi-task near-field perception for autonomous driving using surround-view fisheye cameras

    The formation of eyes led to the big bang of evolution. The dynamics changed from a primitive organism waiting for food to come into contact with it, to an organism that actively seeks food using visual sensors. The human eye is one of the most sophisticated developments of evolution, but it still has defects. Over millions of years, humans have evolved a biological perception algorithm capable of driving cars, operating machinery, piloting aircraft, and navigating ships. Automating these capabilities for computers is critical for various applications, including self-driving cars, augmented reality, and architectural surveying. Near-field visual perception in the context of self-driving cars covers the environment in a range of 0-10 meters with 360° coverage around the vehicle, and is a critical decision-making component in the development of safer automated driving. Recent advances in computer vision and deep learning, in conjunction with high-quality sensors such as cameras and LiDARs, have fueled mature visual perception solutions. Until now, far-field perception has been the primary focus. Another significant issue is the limited processing power available for developing real-time applications; because of this bottleneck, there is frequently a trade-off between performance and run-time efficiency. We concentrate on the following issues in order to address them: 1) developing near-field perception algorithms with high performance and low computational complexity for various visual perception tasks, such as geometric and semantic tasks, using convolutional neural networks; and 2) using multi-task learning to overcome computational bottlenecks by sharing initial convolutional layers between tasks and developing optimization strategies that balance the tasks.
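    As a minimal sketch of the multi-task design described in point 2 (not the dissertation's actual networks), the PyTorch snippet below shares initial convolutional layers between a semantic head and a geometric head and balances the two losses with fixed weights; all layer sizes, class counts, and weights are illustrative assumptions.

        # Minimal sketch (PyTorch): one shared convolutional encoder feeding a
        # semantic segmentation head and a depth (geometric) head, trained with a
        # weighted sum of per-task losses. All sizes and weights are illustrative.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class SharedEncoderMultiTask(nn.Module):
            def __init__(self, num_classes: int = 10):
                super().__init__()
                self.encoder = nn.Sequential(  # shared initial convolutional layers
                    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                )
                self.semantic_head = nn.Conv2d(64, num_classes, 1)  # per-pixel classes
                self.depth_head = nn.Conv2d(64, 1, 1)               # per-pixel depth

            def forward(self, x):
                features = self.encoder(x)
                return self.semantic_head(features), self.depth_head(features)

        model = SharedEncoderMultiTask()
        sem, depth = model(torch.randn(2, 3, 128, 128))  # two RGB frames

        # Fixed-weight task balancing on random placeholder targets.
        targets_sem = torch.randint(0, 10, (2, 32, 32))
        targets_depth = torch.rand(2, 1, 32, 32)
        loss = F.cross_entropy(sem, targets_sem) + 0.5 * F.l1_loss(depth, targets_depth)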

    Multimodal Adversarial Learning

    Deep Convolutional Neural Networks (DCNNs) have proven to be an exceptional tool for object recognition, generative modelling, and multi-modal learning in various computer vision applications. However, recent findings have shown that such state-of-the-art models can be easily deceived by inserting slight, imperceptible perturbations at key pixels in the input. A good target detection system can accurately identify targets by localizing their coordinates in the input image of interest, ideally by labeling each pixel in the image as either background or a potential target pixel. However, prior research confirms that such state-of-the-art target detection models are susceptible to adversarial attacks. In the case of generative models, facial sketches drawn by artists, mostly used by law enforcement agencies, depend on the ability of the artist to clearly replicate all the key facial features that aid in capturing the true identity of a subject. Recent works have attempted to synthesize these sketches into plausible visual images to improve visual recognition and identification. However, synthesizing photo-realistic images from sketches proves to be an even more challenging task, especially for sensitive applications such as suspect identification. The incorporation of hybrid discriminators that perform classification of multiple target attributes, together with a quality-guided encoder that minimizes the perceptual dissimilarity between the latent-space embeddings of the synthesized and real images at different layers of the network, has proven to be a powerful tool for better multi-modal learning. In general, our overall approach aimed to improve target detection systems and the visual appeal of synthesized images while incorporating multiple attribute assignments into the generator without compromising the identity of the synthesized image. We synthesized sketches using the XDoG filter for the CelebA, Multi-Modal, and CelebA-HQ datasets, and from an auxiliary generator trained on sketches from the CUHK, IIT-D, and FERET datasets. Overall, our results across different model applications compare favourably with the current state of the art.
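    As a hedged sketch of the kind of attack referenced above, the snippet below implements the classic Fast Gradient Sign Method (FGSM), which adds a small, nearly imperceptible perturbation in the direction of the loss gradient; the model, inputs, and epsilon are assumptions for illustration, not the attacks studied in this work.

        # Sketch of the Fast Gradient Sign Method (FGSM). `model` is any classifier,
        # `image` a batched float tensor (N, C, H, W) in [0, 1], `label` a LongTensor
        # of shape (N,). The epsilon value is an illustrative assumption.
        import torch
        import torch.nn.functional as F

        def fgsm_perturb(model, image, label, epsilon=0.01):
            """Return an adversarially perturbed copy of `image`, clipped to [0, 1]."""
            image = image.clone().detach().requires_grad_(True)
            loss = F.cross_entropy(model(image), label)
            loss.backward()
            # Step in the direction that maximally increases the loss.
            adversarial = image + epsilon * image.grad.sign()
            return adversarial.clamp(0.0, 1.0).detach()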