Self-supervised learning for transferable representations
Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that leverage raw data alone. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. From there, we focus on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models against many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalise to real-world transformations. This begins to explain the differing empirical performances achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation.
Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.
Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5
This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions of researchers working in different fields of applications and in mathematics, and is available in open access. The contributions collected in this volume have either been published or presented in international conferences, seminars, workshops and journals since the dissemination of the fourth volume in 2015, or they are new. The contributions of each part of this volume are chronologically ordered.
The first part of this book presents some theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution (PCR) rules of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of (quasi-)vacuous belief assignment in the fusion of sources of evidence, with their Matlab codes.
Because more applications of DSmT have emerged since the appearance of the fourth book of DSmT in 2015, the second part of this volume is about selected applications of DSmT, mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender systems, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), the Silx-Furtif RUST code library for information fusion including PCR rules, and a network for ship classification.
Finally, the third part presents interesting contributions related to belief functions in general, published or presented over the years since 2015. These contributions are related to decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, negator of belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions.
Synthetic Aperture Radar (SAR) Meets Deep Learning
This reprint focuses on applications that combine synthetic aperture radar and deep learning technology. It aims to further promote the development of intelligent SAR image interpretation technology. A synthetic aperture radar (SAR) is an important active microwave imaging sensor, whose all-day and all-weather working capacity gives it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in the remote sensing community, e.g., in geological exploration, topographic mapping, disaster forecasting, and traffic monitoring. It is valuable and meaningful, therefore, to study SAR-based remote sensing applications. In recent years, deep learning, represented by convolutional neural networks, has promoted significant progress in the computer vision community, e.g., in face recognition, driverless vehicles and the Internet of Things (IoT). Deep learning enables computational models with multiple processing layers to learn data representations with multiple levels of abstraction. This can greatly improve the performance of various applications. This reprint provides a platform for researchers to handle the above significant challenges and present their innovative and cutting-edge research results when applying deep learning to SAR in various manuscript types, e.g., articles, letters, reviews and technical reports.
A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery
Semantic segmentation (classification) of Earth Observation imagery is a
crucial task in remote sensing. This paper presents a comprehensive review of
technical factors to consider when designing neural networks for this purpose.
The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural
Networks (RNNs), Generative Adversarial Networks (GANs), and transformer
models, discussing prominent design patterns for these ANN families and their
implications for semantic segmentation. Common pre-processing techniques for
ensuring optimal data preparation are also covered. These include methods for
image normalization and chipping, as well as strategies for addressing data
imbalance in training samples, and techniques for overcoming limited data,
including augmentation techniques, transfer learning, and domain adaptation. By
encompassing both the technical aspects of neural network design and the
data-related considerations, this review provides researchers and practitioners
with a comprehensive and up-to-date understanding of the factors involved in
designing effective neural networks for semantic segmentation of Earth
Observation imagery.
Comment: 145 pages with 32 figures
Machine Learning Approaches for Semantic Segmentation on Partly-Annotated Medical Images
Semantic segmentation of medical images plays a crucial role in assisting medical practitioners in providing accurate and swift diagnoses; nevertheless, deep neural networks require extensive labelled data to learn and generalise appropriately. This is a major issue in medical imagery because most of the datasets are not fully annotated. Models trained on partly-annotated datasets generate many predictions that are correct but fall in unannotated areas and are therefore categorised as false positives; as a result, standard segmentation metrics and objective functions do not work correctly, affecting the overall performance of the models. In this thesis, the semantic segmentation of partly-annotated medical datasets is studied extensively. The general objective is to improve the segmentation results of medical images via innovative supervised and semi-supervised approaches. The main contributions of this work are the following. Firstly, a new metric, specifically designed for this kind of dataset, provides a reliable score for partly-annotated datasets whose generated predictions have received positive expert feedback, by exploiting all the confusion matrix values except the false positives. Secondly, an innovative approach generates better pseudo-labels when applying co-training with the disagreement selection strategy. This method expands the pixels in disagreement, utilising the combined predictions as a guide. Thirdly, original attention mechanisms based on disagreement are designed for two cases: intra-model and inter-model. These attention modules leverage the disagreement between layers (from the same or different model instances) to enhance the overall learning process and generalisation of the models. Lastly, innovative deep supervision methods improve the segmentation results by training neural networks one subnetwork at a time, following the order of the supervision branches.
The methods are thoroughly evaluated on several histopathological datasets, showing significant improvements.
Computational Imaging for Phase Retrieval and Biomedical Applications
In conventional imaging, optimizing hardware is prioritized to enhance image quality directly. Digital signal processing is viewed as supplementary. Computational imaging intentionally distorts images through modulation schemes in illumination or sensing. Then its reconstruction algorithms extract desired object information from raw data afterwards. Co-designing hardware and algorithms reduces demands on hardware and achieves the same or even better image quality. Algorithm design is at the heart of computational imaging, with model-based inverse problem or data-driven deep learning methods as approaches. This thesis presents research work from both perspectives, with a primary focus on the phase retrieval issue in computational microscopy and the application of deep learning techniques to address biomedical imaging challenges.
The first half of the thesis begins with Fourier ptychography, which was employed to overcome chromatic aberration problems in multispectral imaging. Then, we proposed a novel computational coherent imaging modality based on Kramers-Kronig relations, aiming to replace Fourier ptychography as a non-iterative method. While this approach showed promise, it lacks certain essential characteristics of the original Fourier ptychography. To address this limitation, we introduced two additional algorithms to form a whole package scheme. Through comprehensive evaluation, we demonstrated that the combined scheme outperforms Fourier ptychography in achieving high-resolution, large field-of-view, aberration-free coherent imaging.
The second half of the thesis shifts focus to deep-learning-based methods. In one project, we optimized the scanning strategy and image processing pipeline of an epifluorescence microscope to address focus issues. Additionally, we leveraged deep-learning-based object detection models to automate cell analysis tasks. In another project, we predicted the polarity status of mouse embryos from bright field images using adapted deep learning models. These findings highlight the capability of computational imaging to automate labor-intensive processes, and even outperform humans in challenging tasks.
Conditional Invertible Generative Models for Supervised Problems
Invertible neural networks (INNs), in the setting of normalizing flows, are a type of unconditional generative likelihood model. Despite various attractive properties compared to other common generative model types, they are rarely useful for supervised tasks or real applications due to their unguided outputs. In this work, we therefore present three new methods that extend the standard INN setting, falling under a broader category we term generative invertible models. These new methods allow leveraging the theoretical and practical benefits of INNs to solve supervised problems in new ways, including real-world applications from different branches of science. The key finding is that our approaches enhance many aspects of trustworthiness in comparison to conventional feed-forward networks, such as uncertainty estimation and quantification, explainability, and proper handling of outlier data.
A review of spatial enhancement of hyperspectral remote sensing imaging techniques
Remote sensing technology has undeniable importance in various industrial applications, such as mineral exploration, plant detection, defect detection in aerospace and shipbuilding, and optical gas imaging, to name a few. Remote sensing technology has been continuously evolving, offering a range of image modalities that can facilitate the aforementioned applications. One such modality is Hyperspectral Imaging (HSI). Unlike Multispectral Images (MSI) and natural images, HSI consist of hundreds of bands. Despite their high spectral resolution, HSI suffer from low spatial resolution in comparison to their MSI counterparts, which hinders the utilization of their full potential. Therefore, spatial enhancement, or Super Resolution (SR), of HSI is a classical problem that has been gaining rapid attention over the past two decades. The literature is rich with various SR algorithms that enhance the spatial resolution of HSI while preserving their spectral fidelity. This paper reviews and discusses the most important algorithms relevant to this area of research between 2002 and 2022, along with the most frequently used datasets, HSI sensors, and quality metrics. Meta-analyses are drawn based on the aforementioned information, which is used as a foundation that summarizes the state of the field in a way that bridges the past and the present, identifies the current gap in it, and recommends possible future directions.
Deep learning in food category recognition
Integrating artificial intelligence with food category recognition has been a field of interest for research for the
past few decades. It is potentially one of the next steps in revolutionizing human interaction with food. The
modern advent of big data and the development of data-oriented fields like deep learning have provided advancements
in food category recognition. With increasing computational power and ever-larger food datasets,
the approach’s potential has yet to be realized. This survey provides an overview of methods that can be applied
to various food category recognition tasks, including detecting type, ingredients, quality, and quantity. We
survey the core components for constructing a machine learning system for food category recognition, including
datasets, data augmentation, hand-crafted feature extraction, and machine learning algorithms. We place a
particular focus on the field of deep learning, including the utilization of convolutional neural networks, transfer
learning, and semi-supervised learning. We provide an overview of relevant studies to promote further developments
in food category recognition for research and industrial applications.
MRC (MC_PC_17171); Royal Society (RP202G0230); BHF (AA/18/3/34220); Hope Foundation for Cancer Research (RM60G0680); GCRF (P202PF11); Sino-UK Industrial Fund (RP202G0289); LIAS (P202ED10); Data Science Enhancement Fund (P202RE237); Fight for Sight (24NN201); Sino-UK Education Fund (OP202006); BBSRC (RM32G0178B8
Tensor singular spectral analysis for 3D feature extraction in hyperspectral images.
Due to the cubic structure of a hyperspectral image (HSI), characterizing its spectral and spatial properties in three dimensions is challenging. Conventional spectral-spatial methods usually extract spectral and spatial information separately, ignoring their intrinsic correlations. Recently, some 3D feature extraction methods have been developed to extract spectral and spatial features simultaneously, although they rely on local spatial-spectral regions and thus ignore the global spectral similarity and spatial consistency. Meanwhile, some of these methods have a very large number of model parameters and therefore require a large number of training samples. In this paper, a novel Tensor Singular Spectral Analysis (TensorSSA) method is proposed to extract global and low-rank features of HSI. In TensorSSA, an adaptive embedding operation is first proposed to construct a trajectory tensor corresponding to the entire HSI, which takes full advantage of the spatial similarity and improves the representation of the global low-rank properties of the HSI. Moreover, the obtained trajectory tensor, which contains the global and local spatial and spectral information of the HSI, is decomposed by the tensor singular value decomposition (t-SVD) to explore its low-rank intrinsic features. Finally, the efficacy of the extracted features is evaluated using the accuracy of image classification with a support vector machine (SVM) classifier. Experimental results on three publicly available datasets demonstrate the superiority of the proposed TensorSSA over a few state-of-the-art 2D/3D feature extraction and deep learning algorithms, even with a limited number of training samples.
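The t-SVD step mentioned in this abstract can be illustrated with a minimal sketch. It is not the TensorSSA method itself: the adaptive embedding that builds the trajectory tensor is not reproduced, and the function name `tsvd_truncate` and its `rank` parameter are hypothetical names chosen for this illustration. The sketch follows the standard t-SVD construction, assuming a real-valued third-order tensor: a Fourier transform along the third axis, an ordinary SVD of each frontal slice, truncation, and an inverse transform.

```python
import numpy as np


def tsvd_truncate(tensor, rank):
    """Low-rank approximation of a real 3D tensor via a truncated t-SVD.

    Hypothetical helper for illustration only; it is not the TensorSSA
    pipeline described in the abstract.
    """
    n3 = tensor.shape[2]
    # Move to the Fourier domain along the third (tube) axis.
    t_hat = np.fft.fft(tensor, axis=2)
    out_hat = np.zeros_like(t_hat)
    for k in range(n3):
        # Ordinary matrix SVD of each frontal slice in the Fourier domain.
        u, s, vh = np.linalg.svd(t_hat[:, :, k], full_matrices=False)
        # Keep only the leading `rank` singular triplets of this slice.
        out_hat[:, :, k] = (u[:, :rank] * s[:rank]) @ vh[:rank, :]
    # Back to the original domain; the imaginary part is rounding noise
    # for real input, so only the real part is returned.
    return np.fft.ifft(out_hat, axis=2).real
```

Calling `tsvd_truncate(cube, r)` on an array of shape `(height, width, bands)` returns an array of the same shape. With `r` equal to `min(height, width)` it reproduces the input up to rounding, and smaller values of `r` give progressively coarser low-rank approximations.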