Neural Architecture Search for Image Segmentation and Classification
Deep learning (DL) is a class of machine learning algorithms that relies on deep neural networks (DNNs) for computations. Unlike traditional machine learning algorithms, DL can learn from raw data directly and effectively. Hence, DL has been successfully applied to tackle many real-world problems. When applying DL to a given problem, the primary task is designing the optimum DNN. This task relies heavily on human expertise, is time-consuming, and requires many trial-and-error experiments.
This thesis aims to automate the laborious task of designing the optimum DNN by exploring the neural architecture search (NAS) approach. Here, we propose two new NAS algorithms for two real-world problems: pedestrian lane detection for assistive navigation and hyperspectral image segmentation for biosecurity scanning. We also introduce a new dataset-agnostic predictor of neural network performance, which can be used to speed up NAS algorithms that require the evaluation of candidate DNNs.
Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5
This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions of researchers working in different fields of applications and in mathematics, and is available in open-access. The collected contributions of this volume have either been published or presented after disseminating the fourth volume in 2015 in international conferences, seminars, workshops and journals, or they are new. The contributions of each part of this volume are chronologically ordered.
The first part of this book presents theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution (PCR) rules of combination with degree of intersection, coarsening techniques, interval calculus for PCR using set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules, with their Matlab code, that preserve the (quasi-)neutrality of the (quasi-)vacuous belief assignment in the fusion of sources of evidence.
Because more applications of DSmT have emerged since the appearance of the fourth book of DSmT in 2015, the second part of this volume covers selected applications of DSmT, mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender systems, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), the Silx-Furtif RUST code library for information fusion including PCR rules, and a network for ship classification.
Finally, the third part presents contributions related to belief functions in general that have been published or presented since 2015. These contributions relate to decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, negator of belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions.
RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework
In recent years, remote sensing (RS) vision foundation models such as RingMo
have emerged and achieved excellent performance in various downstream tasks.
However, the high demand for computing resources limits the application of
these models on edge devices. It is necessary to design a more lightweight
foundation model to support on-orbit RS image interpretation. Existing methods
face challenges in achieving lightweight solutions while retaining
generalization in RS image interpretation. This is due to the complex high and
low-frequency spectral components in RS images, which make traditional single
CNN or Vision Transformer methods unsuitable for the task. Therefore, this
paper proposes RingMo-lite, an RS multi-task lightweight network with a
CNN-Transformer hybrid framework, which effectively exploits the
frequency-domain properties of RS to optimize the interpretation process.
Through a dual-branch structure, it combines a Transformer module that acts as
a low-pass filter to extract global features of RS images with a CNN module
that acts as a stacked high-pass filter to extract fine-grained details.
Furthermore, in the pretraining stage, the designed frequency-domain masked
image modeling (FD-MIM) combines each image patch's high-frequency and
low-frequency characteristics, effectively capturing the latent feature
representation in RS data. As shown in Fig. 1, compared with RingMo, the
proposed RingMo-lite reduces the parameters by over 60% in various RS image
interpretation tasks, while the average accuracy drops by less than 2% in most
scenes, and it achieves SOTA performance compared to models of similar size. In
addition, our work will be integrated into the MindSpore computing platform in
the near future.
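The low-pass/high-pass split described in the abstract above can be illustrated with a minimal sketch. This is not the RingMo-lite architecture: it assumes a simple 1-D moving average as the low-pass filter and takes the residual as the high-pass component, and the function names `low_pass` and `high_pass` are hypothetical.

```python
def low_pass(signal, window=3):
    """Moving-average low-pass filter; the window shrinks at the edges."""
    half = window // 2
    n = len(signal)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def high_pass(signal, window=3):
    """High-pass component, taken as the residual after low-pass filtering."""
    return [s - l for s, l in zip(signal, low_pass(signal, window))]


if __name__ == "__main__":
    # Hypothetical signal: a slow ramp plus an alternating fine-scale term.
    signal = [i * 0.5 + (1.0 if i % 2 else -1.0) for i in range(8)]
    print("low :", low_pass(signal))
    print("high:", high_pass(signal))
```

In this sketch the two components sum back to the original signal, so the smooth (global) and fine-grained (detail) parts can be processed separately, which is the general idea the abstract attributes to its two branches.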
Deep learning methods applied to digital elevation models: state of the art
Deep Learning (DL) has a wide variety of applications in various
thematic domains, including spatial information. Despite its
limitations, it is also starting to be considered in operations
related to Digital Elevation Models (DEMs). This study aims to
review the methods of DL applied in the field of altimetric spatial
information in general, and DEMs in particular. Void Filling (VF),
Super-Resolution (SR), landform classification and hydrography
extraction are just some of the operations where traditional methods
are being replaced by DL methods. Our review concludes
that although these methods have great potential, there are
aspects that need to be improved. More appropriate terrain information
or algorithm parameterisation are some of the challenges
that this methodology still needs to face.
Funding: ‘Functional Quality of Digital Elevation Models in Engineering’ of the State Agency Research of Spain, PID2019-106195RB-I00/AEI/10.13039/50110001103
Knowledge Distillation and Continual Learning for Optimized Deep Neural Networks
Over the past few years, deep learning (DL) has achieved state-of-the-art performance on various human tasks such as speech generation, language translation, image segmentation, and object detection. While traditional machine learning models require hand-crafted features, deep learning algorithms can automatically extract discriminative features and learn complex knowledge from large datasets. This powerful learning ability makes deep learning models attractive to both academia and big corporations.
Despite their popularity, deep learning methods still have two main limitations: large memory consumption and catastrophic knowledge forgetting. First, DL algorithms use very deep neural networks (DNNs) with billions of parameters, which results in a large model size and slow inference. This restricts the application of DNNs on resource-constrained devices such as mobile phones and autonomous vehicles. Second, DNNs are known to suffer from catastrophic forgetting: when a model incrementally learns new tasks, its performance on old tasks drops significantly. The ability to accommodate new knowledge while retaining previously learned knowledge is called continual learning. Since the real-world environments in which the model operates are always evolving, a robust neural network needs this continual learning ability to adapt to new changes.
Synthetic Aperture Radar (SAR) Meets Deep Learning
This reprint focuses on the combined application of synthetic aperture radar and deep learning technology. It aims to further promote the development of intelligent SAR image interpretation technology. A synthetic aperture radar (SAR) is an important active microwave imaging sensor whose all-day and all-weather working capacity gives it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in the remote sensing community, e.g., in geological exploration, topographic mapping, disaster forecasting, and traffic monitoring. It is valuable and meaningful, therefore, to study SAR-based remote sensing applications. In recent years, deep learning, represented by convolutional neural networks, has promoted significant progress in the computer vision community, e.g., in face recognition, autonomous driving, and the Internet of Things (IoT). Deep learning enables computational models with multiple processing layers to learn data representations with multiple levels of abstraction, which can greatly improve the performance of various applications. This reprint provides a platform for researchers to handle the above significant challenges and present their innovative and cutting-edge research results when applying deep learning to SAR in various manuscript types, e.g., articles, letters, reviews, and technical reports.
A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery
Semantic segmentation (classification) of Earth Observation imagery is a
crucial task in remote sensing. This paper presents a comprehensive review of
technical factors to consider when designing neural networks for this purpose.
The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural
Networks (RNNs), Generative Adversarial Networks (GANs), and transformer
models, discussing prominent design patterns for these ANN families and their
implications for semantic segmentation. Common pre-processing techniques for
ensuring optimal data preparation are also covered. These include methods for
image normalization and chipping, as well as strategies for addressing data
imbalance in training samples, and techniques for overcoming limited data,
including augmentation techniques, transfer learning, and domain adaptation. By
encompassing both the technical aspects of neural network design and the
data-related considerations, this review provides researchers and practitioners
with a comprehensive and up-to-date understanding of the factors involved in
designing effective neural networks for semantic segmentation of Earth
Observation imagery.
Comment: 145 pages with 32 figures
Machine Learning Approaches for Semantic Segmentation on Partly-Annotated Medical Images
Semantic segmentation of medical images plays a crucial role in assisting medical practitioners in providing accurate and swift diagnoses; nevertheless, deep neural networks require extensive labelled data to learn and generalise appropriately. This is a major issue in medical imagery because most of the datasets are not fully annotated. Training models with partly-annotated datasets generates many predictions that fall in correct but unannotated areas and are therefore counted as false positives; as a result, standard segmentation metrics and objective functions do not work correctly, affecting the overall performance of the models. In this thesis, the semantic segmentation of partly-annotated medical datasets is studied extensively. The general objective is to improve the segmentation results of medical images via innovative supervised and semi-supervised approaches. The main contributions of this work are the following. Firstly, a new metric, specifically designed for this kind of dataset, provides a reliable score for partly-annotated datasets, with positive expert feedback on the generated predictions, by exploiting all the confusion matrix values except the false positives. Secondly, an innovative approach generates better pseudo-labels when applying co-training with the disagreement selection strategy; this method expands the pixels in disagreement using the combined predictions as a guide. Thirdly, original attention mechanisms based on disagreement are designed for two cases: intra-model and inter-model. These attention modules leverage the disagreement between layers (from the same or different model instances) to enhance the overall learning process and generalisation of the models. Lastly, innovative deep supervision methods improve the segmentation results by training neural networks one subnetwork at a time, following the order of the supervision branches.
The methods are thoroughly evaluated on several histopathological datasets, showing significant improvements.
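To make the false-positive issue concrete, the sketch below compares the standard Dice score with a score that ignores false positives. The formula `(TP + TN) / (TP + TN + FN)` and the name `fp_agnostic_score` are assumptions chosen for illustration only; the abstract does not give the thesis's actual metric, beyond stating that it uses all confusion matrix values except the false positives.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for two equal-length flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def dice(pred, target):
    """Standard Dice score, which penalises every false positive."""
    tp, fp, fn, _tn = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def fp_agnostic_score(pred, target):
    """Hypothetical score built from TP, TN and FN only (FP is ignored)."""
    tp, _fp, fn, tn = confusion_counts(pred, target)
    denom = tp + tn + fn
    return (tp + tn) / denom if denom else 0.0


if __name__ == "__main__":
    # Partly-annotated target: only the first two positives were labelled.
    target = [1, 1, 0, 0, 0, 0]
    # The prediction also marks two unannotated pixels that may be correct.
    pred = [1, 1, 1, 1, 0, 0]
    print("Dice:", dice(pred, target))
    print("FP-agnostic:", fp_agnostic_score(pred, target))
```

In this toy case the Dice score is lowered by the two predictions in unannotated areas, whereas the false-positive-agnostic score is not, which mirrors the motivation given in the abstract.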
INoD: Injected Noise Discriminator for Self-Supervised Representation Learning in Agricultural Fields
Perception datasets for agriculture are limited both in quantity and
diversity, which hinders effective training of supervised learning approaches.
Self-supervised learning techniques alleviate this problem; however, existing
methods are not optimized for dense prediction tasks in agricultural domains,
which results in degraded performance. In this work, we address this limitation
with our proposed Injected Noise Discriminator (INoD) which exploits principles
of feature replacement and dataset discrimination for self-supervised
representation learning. INoD interleaves feature maps from two disjoint
datasets during their convolutional encoding and predicts the dataset
affiliation of the resultant feature map as a pretext task. Our approach
enables the network to learn unequivocal representations of objects seen in one
dataset while observing them in conjunction with similar features from the
disjoint dataset. This allows the network to reason about higher-level
semantics of the entailed objects, thus improving its performance on various
downstream tasks. Additionally, we introduce the novel Fraunhofer Potato 2022
dataset consisting of over 16,800 images for object detection in potato fields.
Extensive evaluations of our proposed INoD pretraining strategy for the tasks
of object detection, semantic segmentation, and instance segmentation on the
Sugar Beets 2016 and our potato dataset demonstrate that it achieves
state-of-the-art performance.
Comment: 8 pages, 7 figures
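The pretext task described in the abstract above (replacing features with ones from a disjoint dataset and predicting dataset affiliation) can be sketched as follows. This is a simplified illustration, not the INoD implementation: feature maps are reduced to flat lists of vectors, and the function name `inject_features` and the `ratio` parameter are hypothetical.

```python
import random


def inject_features(source, noise, ratio, rng):
    """Replace a fraction of `source` feature vectors with ones from `noise`.

    Returns the mixed feature list and one binary label per position
    (0 = kept from the source dataset, 1 = injected from the noise dataset).
    The labels are the target of the dataset-affiliation pretext task.
    """
    if len(source) != len(noise):
        raise ValueError("feature lists must have the same length")
    n = len(source)
    k = round(ratio * n)  # number of positions to replace
    injected = set(rng.sample(range(n), k))
    mixed = [noise[i] if i in injected else source[i] for i in range(n)]
    labels = [1 if i in injected else 0 for i in range(n)]
    return mixed, labels


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for a reproducible illustration
    source = [[float(i)] for i in range(8)]        # stand-in for dataset A
    noise = [[-float(i) - 1.0] for i in range(8)]  # stand-in for dataset B
    mixed, labels = inject_features(source, noise, 0.25, rng)
    print(mixed)
    print(labels)
```

A network trained on this task would take `mixed` as input and be supervised with `labels`; the encoder and loss are omitted here because the abstract does not specify them.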
H-RNet: hybrid relation network for few-shot learning-based hyperspectral image classification.
Deep network models rely on sufficient training samples to perform reasonably well, which has constrained their application to the classification of hyperspectral images (HSIs), where labeled data are scarce. To tackle this challenge, we propose a hybrid relation network, H-RNet, which combines three-dimensional (3-D) and two-dimensional (2-D) convolutional neural networks (CNNs) to extract spectral–spatial features whilst reducing the complexity of the network. In an end-to-end relation learning module, the sample pairing approach alleviates the problem of few labeled samples and learns correlations between samples more accurately for more effective classification. Experimental results on three publicly available datasets demonstrate the superior performance of the proposed model in comparison to a few state-of-the-art methods.