25 research outputs found

A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

Author: Ball John E.
Anderson Derek T.
Chan Chee Seng
Publication date: 01/01/2017

In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Although remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, RS inevitably draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.

Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensin

arXiv.org e-Print Archive

FigShare

The Application of Machine Learning to At-Risk Cultural Heritage Image Data

Author: ROBERTS MATTHEW,IAN
Publication date: 01/01/2020

This project investigates the application of Convolutional Neural Network (CNN) methods and technologies to problems related to at-risk cultural heritage object recognition. The primary aim of this work is to use developmental software that combines the disciplines of computer vision and artefact studies, developing applications in the field of heritage protection specifically related to the illegal antiquities market. To accomplish this, digital image data provided by the Durham University Oriental Museum was used in conjunction with several different implementations of pre-trained CNN software models for the purposes of artefact classification and identification. Testing focused on data capture using a variety of digital recording devices, guided by the developmental needs of a heritage programme seeking to create software solutions to heritage threats in the Middle East and North Africa (MENA) region. Quantitative results using information retrieval metrics are reported for all models and test sets, and have been used to evaluate the models' predictive results.

Durham e-Theses

Learning visual representations with neural networks for video captioning and image generation

Author: Yao Li
Publication date: 01/12/2017

The past decade has been marked as a golden era of neural network research. Not only have neural networks been successfully applied to solve more and more challenging real-world problems, but they have also become the dominant approach in many of the places where they have been tested. These include, for instance, language understanding, game playing, and computer vision, thanks to neural networks' superiority in computational efficiency and statistical capacity. This thesis applies neural networks to problems in computer vision where high-level and semantically meaningful representations play a fundamental role. It demonstrates, both in theory and in experiment, the ability to learn such representations from data with and without supervision. The main content of the thesis is divided into two parts. The first part studies neural networks in the context of learning visual representations for the task of video captioning. Models are developed to dynamically focus on different frames while generating a natural language description of a short video. Such a model is further improved by recurrent convolutional operations. The end of this part identifies fundamental challenges in video captioning and proposes a new type of evaluation metric that may be used experimentally as an oracle to benchmark performance. The second part studies the family of models that generate images. While the first part is supervised, this part is unsupervised. Its focus is the popular family of Neural Autoregressive Density Estimators (NADEs), a tractable probabilistic model for natural images. This work first makes a connection between NADEs and Generative Stochastic Networks (GSNs). The standard NADE is then improved by introducing multiple iterations in its inference without increasing the number of parameters, a variant dubbed iterative NADE. Beginning with a historical view, this work ends with a summary of recent developments related to the work discussed in the first two parts, around the central topic of learning visual representations for images and videos. A bright future is envisioned at the end.
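As a brief illustration of why an autoregressive density estimator is called tractable, the block below writes out the standard factorization and the usual binary NADE conditional. This follows the common textbook formulation, not necessarily the exact notation of the thesis; the symbols $W$, $V$, $b$, $c$, $D$ and the sigmoid $\sigma$ are assumed names.

```latex
% Autoregressive factorization of a D-dimensional input x
% (exact by the chain rule, so the likelihood needs no approximation):
\[
  p(\mathbf{x}) \;=\; \prod_{d=1}^{D} p\!\left(x_d \mid \mathbf{x}_{<d}\right),
  \qquad
  \log p(\mathbf{x}) \;=\; \sum_{d=1}^{D} \log p\!\left(x_d \mid \mathbf{x}_{<d}\right).
\]
% Standard binary NADE conditional with shared weights (assumed notation):
\[
  \mathbf{h}_d = \sigma\!\left(\mathbf{c} + W_{\cdot,<d}\,\mathbf{x}_{<d}\right),
  \qquad
  p\!\left(x_d = 1 \mid \mathbf{x}_{<d}\right) = \sigma\!\left(b_d + V_{d,\cdot}\,\mathbf{h}_d\right).
\]
```

Because every conditional is computed in closed form, the log-likelihood is a sum of $D$ terms and can be evaluated exactly.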

Modelling tree biomass using direct and additive methods with point cloud deep learning in a temperate mixed forest

Author: Coops Nicholas C.
Montwé David
Ragab Ahmed
Seely Harry
White Joanne C.
Winiwarter Lukas
Publication venue: Elsevier BV
Publication date: 18/11/2023

Airborne laser scanning (ALS) data has been widely used for total aboveground tree biomass (AGB) modelling; however, less research has focused on estimating specific tree biomass components (wood, branches, bark, and foliage). Knowledge about these biomass components is essential for carbon accounting, understanding forest nutrient cycling, and other applications. In this study, we compare additive AGB estimation (the sum of estimated components) with direct AGB estimation using deep neural network (DNN) and random forest (RF) models. We utilise two point cloud DNNs: the point-based Dynamic Graph Convolutional Neural Network (DGCNN) and the Octree-based Convolutional Neural Network (OCNN). DNN and RF models were trained using a dataset comprising 2336 sample plots from a mixed temperate forest in New Brunswick, Canada. Results indicate that additive AGB models perform similarly to direct models in terms of coefficient of determination (R²) and root-mean-square error (RMSE), and reduce the mean absolute percentage error (MAPE) by 22% on average. Compared to RF, the DNNs provided a small improvement in performance, with OCNN explaining 5% more variation in the data (R² = 0.76) and reducing MAPE by 20% on average. Overall, this study showcases the effectiveness of additive tree AGB models and highlights the potential of DNNs for enhanced AGB estimation. To further improve DNN performance, we recommend using larger training datasets, implementing hyperparameter optimization, and incorporating additional data such as multispectral imagery.
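The comparison described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the three-plot data set are hypothetical, and the metrics follow their usual textbook definitions. The additive estimate is the per-plot sum of the four component predictions, and it is scored against observed AGB with the same R², RMSE and MAPE as a direct estimate.

```python
import math


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error, in the units of the target."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (targets must be non-zero)."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


def additive_agb(component_preds):
    """Sum per-plot component predictions into a total AGB estimate."""
    return [sum(parts) for parts in zip(*component_preds.values())]


# Hypothetical values for three plots (not taken from the study).
observed = [100.0, 200.0, 300.0]
direct_preds = [110.0, 190.0, 310.0]
component_preds = {
    "wood": [60.0, 130.0, 200.0],
    "branches": [20.0, 40.0, 50.0],
    "bark": [10.0, 20.0, 30.0],
    "foliage": [10.0, 20.0, 10.0],
}

additive_preds = additive_agb(component_preds)
for name, preds in [("direct", direct_preds), ("additive", additive_preds)]:
    print(name, r2(observed, preds), rmse(observed, preds), mape(observed, preds))
```

One reason for the additive approach is that the component estimates are guaranteed to sum to the reported total, which a separately fitted direct model does not ensure.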