228,918 research outputs found
Machine Learning and Quantum Devices
These brief lecture notes cover the basics of neural networks and deep learning as well as their applications in the quantum domain, for physicists without prior knowledge. In the first part, we describe training using back-propagation, image classification, convolutional networks and autoencoders.The second part is about advanced techniques like reinforcement learning (for discovering control strategies), recurrent neural networks (for analyzing timetraces), and Boltzmann machines (for learning probability distributions). In the third lecture, we discuss first recent applications to quantum physics, with an emphasis on quantum information processing machines. Finally, the fourth lecture is devoted to the promise of using quantum effects to accelerate machine learning
Simple but Effective Unsupervised Classification for Specified Domain Images: A Case Study on Fungi Images
High-quality labeled datasets are essential for deep learning. Traditional
manual annotation methods are not only costly and inefficient but also pose
challenges in specialized domains where expert knowledge is needed.
Self-supervised methods, despite leveraging unlabeled data for feature
extraction, still require hundreds or thousands of labeled instances to guide
the model for effective specialized image classification. Current unsupervised
learning methods offer automatic classification without prior annotation but
often compromise on accuracy. As a result, efficiently procuring high-quality
labeled datasets remains a pressing challenge for specialized domain images
devoid of annotated data. Addressing this, an unsupervised classification
method with three key ideas is introduced: 1) dual-step feature dimensionality
reduction using a pre-trained model and manifold learning, 2) a voting
mechanism from multiple clustering algorithms, and 3) post-hoc instead of prior
manual annotation. This approach outperforms supervised methods in
classification accuracy, as demonstrated with fungal image data, achieving
94.1% and 96.7% on public and private datasets respectively. The proposed
unsupervised classification method reduces dependency on pre-annotated
datasets, enabling a closed-loop for data classification. The simplicity and
ease of use of this method will also bring convenience to researchers in
various fields in building datasets, promoting AI applications for images in
specialized domains
Label-similarity Curriculum Learning
Curriculum learning can improve neural network training by guiding the
optimization to desirable optima. We propose a novel curriculum learning
approach for image classification that adapts the loss function by changing the
label representation. The idea is to use a probability distribution over
classes as target label, where the class probabilities reflect the similarity
to the true class. Gradually, this label representation is shifted towards the
standard one-hot-encoding. That is, in the beginning minor mistakes are
corrected less than large mistakes, resembling a teaching process in which
broad concepts are explained first before subtle differences are taught.
The class similarity can be based on prior knowledge. For the special case of
the labels being natural words, we propose a generic way to automatically
compute the similarities. The natural words are embedded into Euclidean space
using a standard word embedding. The probability of each class is then a
function of the cosine similarity between the vector representations of the
class and the true label. The proposed label-similarity curriculum learning
(LCL) approach was empirically evaluated using several popular deep learning
architectures for image classification tasks applied to five datasets including
ImageNet, CIFAR100, and AWA2. In all scenarios, LCL was able to improve the
classification accuracy on the test data compared to standard training.Comment: Accepted as a conference paper at ECCV 202
PatchMixer: Rethinking network design to boost generalization for 3D point cloud understanding
The recent trend in deep learning methods for 3D point cloud understanding is
to propose increasingly sophisticated architectures either to better capture 3D
geometries or by introducing possibly undesired inductive biases. Moreover,
prior works introducing novel architectures compared their performance on the
same domain, devoting less attention to their generalization to other domains.
We argue that the ability of a model to transfer the learnt knowledge to
different domains is an important feature that should be evaluated to
exhaustively assess the quality of a deep network architecture. In this work we
propose PatchMixer, a simple yet effective architecture that extends the ideas
behind the recent MLP-Mixer paper to 3D point clouds. The novelties of our
approach are the processing of local patches instead of the whole shape to
promote robustness to partial point clouds, and the aggregation of patch-wise
features using an MLP as a simpler alternative to the graph convolutions or the
attention mechanisms that are used in prior works. We evaluated our method on
the shape classification and part segmentation tasks, achieving superior
generalization performance compared to a selection of the most relevant deep
architectures.Comment: Published in the Image and Vision Computing journa
A Biologically Interpretable Two-Stage Deep Neural Network (BIT-DNN) for Vegetation Recognition From Hyperspectral Imagery
Spectral-spatial-based deep learning models have recently proven to be effective in hyper-spectral image (HSI) classification for various earth monitoring applications such as land cover classification and agricultural monitoring. However, due to the nature of ``black-box'' model representation, how to explain and interpret the learning process and the model decision, especially for vegetation classification, remains an open challenge. This study proposes a novel interpretable deep learning model--a biologically interpretable two-stage deep neural network (BIT-DNN), by incorporating the prior-knowledge (i.e., biophysical and biochemical attributes and their hierarchical structures of target entities)-based spectral-spatial feature transformation into the proposed framework, capable of achieving both high accuracy and interpretability on HSI-based classification tasks. The proposed model introduces a two-stage feature learning process: in the first stage, an enhanced interpretable feature block extracts the low-level spectral features associated with the biophysical and biochemical attributes of target entities; and in the second stage, an interpretable capsule block extracts and encapsulates the high-level joint spectral-spatial features representing the hierarchical structure of biophysical and biochemical attributes of these target entities, which provides the model an improved performance on classification and intrinsic interpretability with reduced computational complexity. We have tested and evaluated the model using four real HSI data sets for four separate tasks (i.e., plant species classification, land cover classification, urban scene recognition, and crop disease recognition tasks). The proposed model has been compared with five state-of-the-art deep learning models. The results demonstrate that the proposed model has competitive advantages in terms of both classification accuracy and model interpretability, especially for vegetation classification
Continual deep learning via progressive learning
Machine learning is one of several approaches to artificial intelligence. It allows us to build machines that can learn from experience as opposed to being explicitly programmed. Current machine learning formulations are mostly designed for learning and performing a particular task from a tabula rasa using data available for that task. For machine learning to converge to artificial intelligence, in addition to other desiderata, it must be in a state of continual learning, i.e., have the ability to be in a continuous learning process, such that when a new task is presented, the system can leverage prior knowledge from prior tasks, in learning and performing this new task, and augment the prior knowledge with the newly acquired knowledge without having a significant adverse effect on the prior knowledge. Continual learning is key to advancing machine learning and artificial intelligence. Deep learning is a powerful general-purpose approach to machine learning that is able to solve numerous and various tasks with minimal modification. Deep learning extends machine learning, and specially neural networks, to learn multiple levels of distributed representations together with the required mapping function into a single composite function. The emergence of deep learning and neural networks as a generic approach to machine learning, coupled with their ability to learn versatile hierarchical representations, has paved the way for continual learning. The main aim of this thesis is the study and development of a structured approach to continual learning, leveraging the success of deep learning and neural networks. This thesis studies the application of deep learning to a number of supervised learning tasks, and in particular, classification tasks in machine perception, e.g., image recognition, automatic speech recognition, and speech emotion recognition. The relation between the systems developed for these tasks is investigated to illuminate the layer-wise relevance of features in deep networks trained for these tasks via transfer learning, and these independent systems are unified into continual learning systems. The main contribution of this thesis is the construction and formulation of a deep learning framework, denoted progressive learning, that allows a holistic and systematic approach to continual learning. Progressive learning comprises a number of procedures that address the continual learning desiderata. It is shown that, when tasks are related, progressive learning leads to faster learning that converges to better generalization performance using less amounts of data and a smaller number of dedicated parameters, for the tasks studied in this thesis, by accumulating and leveraging knowledge learned across tasks in a continuous manner. It is envisioned that progressive learning is a step towards a fully general continual learning framework
Deep invariant feature learning for remote sensing scene classification
Image classification, as the core task in the computer vision field, has proceeded at a breakneck pace. It largely attributes to the recent growth of deep learning techniques which have blown the conventional statistical methods on a plethora of benchmarks and even can outperform humans in specific image classification tasks. Despite deep learning exceeding alternative techniques, they have many apparent disadvantages that prevent them from being deployed for the general-purpose. Specifically, deep learning always requires a considerable amount of well-annotated data to circumvent the problems of over-fitting and the lacking of prior knowledge. However, manually labelled data is expensive to acquire and is impossible to incorporate the variations as much as the real world. Consequently, deep learning models usually fail when they confront with the underrepresented variations in the training data. This is the main reason why the deep learning model is barely satisfactory in the challenging image recognition task that contains nuisance variations such as, Remote Sensing Scene Classification (RSSC).
The classification of remote sensing scene image is a procedure of assigning the semantic meaning labels for the given satellite images that contain the complicated variations, such as texture and appearances. The algorithms for effectively understanding and recognising remote sensing scene images have the potential to be employed in a broad range of applications, such as urban planning, Land Use and Land Cover (LULC) determination, natural hazards detection, vegetation mapping, environmental monitoring. This inspires us to design the frameworks that can automatically predict the precise label for satellite images. In our research project, we mine and define the challenges in RSSC community compared with general scene image recognition tasks. Specifically, we summarise the problems into the following perspectives. 1) Visual-semantic ambiguity: the discrepancy between visual features and semantic concepts; 2) Variations: the intra-class diversity and inter-class similarity; 3) Clutter background; 4) The small size of the training set; 5) Unsatisfactory classification accuracy in large-scale datasets.
To address the aforementioned challenges, we explore a way to dynamically expand the capabilities of incorporating the prior knowledge by transforming the input data so that we can learn the globally invariant second-order features from the transformed data for improving the performance of RSSC tasks. First, we devise a recurrent transformer network (RTN) to progressively discover the discriminative regions of input images and learn the corresponding second-order features. The model is optimised using pairwise ranking loss to achieve localising discriminative parts and learning the corresponding features in a mutually reinforced way. Second, we observed that existing remote sensing image datasets lack the provision of ontological structures. Therefore, a multi-granularity canonical appearance pooling (MG-CAP) model is proposed to automatically seek the implied hierarchical structures of datasets and produced covariance features contained the multi-grained information. Third, we explore a way to improve the discriminative power of the second-order features. To accomplish this target, we present a covariance feature embedding (CFE) model to improve the distinctive power of covariance pooling by using suitable matrix normalisation methods and a low-norm cosine similarity loss to accurately metric the distances of highdimensional features. Finally, we improved the performance of RSSC while using fewer model parameters. An invariant deep compressible covariance pooling (IDCCP) model is presented to boost the classification accuracy for RSSC tasks. Meanwhile, we proofed the generalisability of our IDCCP model using group theory and manifold optimisation techniques. All of the proposed frameworks allow being optimised in an end-to-end manner and are well-supported by GPU acceleration. We conduct extensive experiments on the well-known remote sensing scene image datasets to demonstrate the great promotions of our proposed methods in comparison with state-of-the-art approaches
- …