Off the Radar: Uncertainty-Aware Radar Place Recognition with Introspective Querying and Map Maintenance
Localisation with Frequency-Modulated Continuous-Wave (FMCW) radar has
attracted increasing interest due to its inherent robustness in challenging
environments.
However, complex artefacts of the radar measurement process require appropriate
uncertainty estimation to ensure the safe and reliable application of this
promising sensor modality. In this work, we propose a multi-session map
management system which constructs the best maps for further localisation based
on learned variance properties in an embedding space. Using the same variance
properties, we also propose a new way to introspectively reject localisation
queries that are likely to be incorrect. For this, we apply robust noise-aware
metric learning, which both leverages the short-timescale variability of radar
data along a driven path (for data augmentation) and predicts the downstream
uncertainty in metric-space-based place recognition. We demonstrate the
effectiveness of our method in extensive cross-validated tests on the Oxford
Radar RobotCar and MulRan datasets. In these, we outperform the current
state-of-the-art in
radar place recognition and other uncertainty-aware methods when using only
single nearest-neighbour queries. We also show consistent performance increases
when rejecting queries based on uncertainty in a difficult test environment,
which we did not observe for a competing uncertainty-aware place recognition
system.

Comment: 8 pages, 6 figures
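The introspective-rejection idea above can be sketched as a single nearest-neighbour query gated by a predicted variance. Everything below (map size, threshold, variance values) is illustrative, not the paper's actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical map: one embedding per previously visited place. In the paper,
# a learned variance is predicted alongside each descriptor.
map_embeddings = rng.normal(size=(100, 16))

def localise(query_embedding, query_variance, reject_threshold=0.3):
    """Single nearest-neighbour place recognition with introspective rejection.

    Returns the index of the best-matching map place, or None when the
    query's predicted embedding variance marks the match as unreliable.
    """
    if query_variance > reject_threshold:
        return None  # introspective rejection: query deemed too uncertain
    dists = np.linalg.norm(map_embeddings - query_embedding, axis=1)
    return int(np.argmin(dists))

# A confident query near a known place matches; an uncertain one is rejected.
idx = localise(map_embeddings[42] + 0.01, query_variance=0.05)
rejected = localise(rng.normal(size=16), query_variance=0.9)
```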
On information captured by neural networks: connections with memorization and generalization
Despite the popularity and success of deep learning, there is limited
understanding of when, how, and why neural networks generalize to unseen
examples. Since learning can be seen as extracting information from data, we
formally study information captured by neural networks during training.
Specifically, we start with viewing learning in presence of noisy labels from
an information-theoretic perspective and derive a learning algorithm that
limits label noise information in weights. We then define a notion of unique
information that an individual sample provides to the training of a deep
network, shedding some light on the behavior of neural networks on examples
that are atypical, ambiguous, or belong to underrepresented subpopulations. We
relate example informativeness to generalization by deriving nonvacuous
generalization gap bounds. Finally, by studying knowledge distillation, we
highlight the important role of data and label complexity in generalization.
Overall, our findings contribute to a deeper understanding of the mechanisms
underlying neural network generalization.

Comment: PhD thesis
Machine learning in solar physics
The application of machine learning in solar physics has the potential to
greatly enhance our understanding of the complex processes that take place in
the atmosphere of the Sun. By using techniques such as deep learning, we are
now in a position to analyze large amounts of data from solar observations
and identify patterns and trends that may not have been apparent using
traditional methods. This can improve our understanding of explosive events
like solar flares, which can strongly affect Earth's environment, and
predicting such hazardous events is crucial for our technological society.
Machine learning can also improve our understanding of
the inner workings of the Sun itself by allowing us to go deeper into the data
and to propose more complex models to explain them. Additionally, the use of
machine learning can help to automate the analysis of solar data, reducing the
need for manual labor and increasing the efficiency of research in this field.

Comment: 100 pages, 13 figures, 286 references, accepted for publication as a
Living Review in Solar Physics (LRSP)
Modular lifelong machine learning
Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.
Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to only reuse the subset of modules which are useful for the task at hand.
This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
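The library-reuse idea behind HOUDINI can be illustrated with a brute-force enumeration over a toy module library. HOUDINI itself performs typed program synthesis over pre-trained neural modules; the functions, names, and target behaviour below are purely illustrative stand-ins:

```python
import itertools

# Hypothetical module library: each entry maps a module name to a function.
# In HOUDINI these would be pre-trained neural sub-networks; here, toy
# list transformations stand in for them.
library = {
    "scale": lambda x: [2 * v for v in x],
    "shift": lambda x: [v + 1 for v in x],
    "neg":   lambda x: [-v for v in x],
}

def compose(names):
    """Chain the named modules into a single pipeline function."""
    def pipeline(x):
        for name in names:
            x = library[name](x)
        return x
    return pipeline

def search(examples, max_depth=2):
    """Enumerate module combinations, returning the first that fits all examples.

    A crude stand-in for program synthesis over a module library: try every
    pipeline up to max_depth and keep the one matching the input-output pairs.
    """
    for depth in range(1, max_depth + 1):
        for names in itertools.product(library, repeat=depth):
            f = compose(names)
            if all(f(x) == y for x, y in examples):
                return names
    return None

# Target behaviour: y = 2x + 1, i.e. "scale" followed by "shift".
best = search([([1, 2], [3, 5]), ([0], [1])])
```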
Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.
Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.
Representation Learning With Hidden Unit Clustering For Low Resource Speech Applications
The representation learning of speech, without textual resources, is an area
of significant interest for many low resource speech applications. In this
paper, we describe an approach to self-supervised representation learning from
raw audio using a hidden unit clustering (HUC) framework. The input to the
model consists of audio samples that are windowed and processed with 1-D
convolutional layers. The learned "time-frequency" representations from the
convolutional neural network (CNN) module are further processed with long
short-term memory (LSTM) layers, which generate a contextual vector
representation for
every windowed segment. The HUC framework, allowing the categorization of the
representations into a small number of phoneme-like units, is used to train the
model for learning semantically rich speech representations. The targets
consist of phoneme-like pseudo labels for each audio segment and these are
generated with an iterative k-means algorithm. We explore techniques that
improve the speaker invariance of the learned representations and illustrate
the effectiveness of the proposed approach on two settings, i) completely
unsupervised speech applications on the sub-tasks described as part of the
ZeroSpeech 2021 challenge and ii) semi-supervised automatic speech recognition
(ASR) applications on the TIMIT dataset and on the GramVaani challenge Hindi
dataset. In these experiments, we achieve state-of-the-art results for various
ZeroSpeech tasks. Further, on the ASR experiments, the HUC representations are
shown to improve significantly over other established benchmarks based on
Wav2vec, HuBERT and Best-RQ.
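The pseudo-label generation step can be sketched with plain k-means over toy "contextual vectors". The real pipeline clusters the CNN+LSTM representations into phoneme-like units and iterates as the representations improve; the data, cluster count, and deterministic initialisation below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "contextual vectors" from two well-separated clusters, standing in
# for the per-segment representations produced by the CNN+LSTM front end.
frames = np.vstack([rng.normal(0.0, 0.1, (50, 8)),
                    rng.normal(3.0, 0.1, (50, 8))])

def kmeans_pseudo_labels(x, k=2, iters=10):
    """Assign each frame a phoneme-like pseudo label via plain k-means.

    These labels would serve as classification targets for the HUC model.
    Centroids are initialised from evenly spaced frames for determinism.
    """
    centroids = x[np.linspace(0, len(x) - 1, k).astype(int)].astype(float)
    for _ in range(iters):
        # Distance of every frame to every centroid, then hard assignment.
        d = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean(axis=0)
    return labels

labels = kmeans_pseudo_labels(frames)
```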
Automatic Caption Generation for Aerial Images: A Survey
Aerial images have long attracted attention from the research community. Generating a caption for an aerial image that describes its content comprehensively is a less-studied but important task, with applications in agriculture, defence, disaster management and many other areas. Though various approaches have been followed for natural image caption generation, generating a caption for an aerial image remains challenging due to its special nature. The use of emerging techniques from the Artificial Intelligence (AI) and Natural Language Processing (NLP) domains has resulted in captions of acceptable quality for aerial images. However, much remains to be done to fully realize the potential of the aerial image caption generation task. This paper presents a detailed survey of the various approaches researchers have followed for aerial image caption generation. The datasets available for experimentation, the criteria used for performance evaluation and future directions are also discussed.
Search-time Efficient Device Constraints-Aware Neural Architecture Search
Edge computing aims to enable edge devices, such as IoT devices, to process
data locally instead of relying on the cloud. However, deep learning techniques
like computer vision and natural language processing can be computationally
expensive and memory-intensive. Creating manual architectures specialized for
each device is infeasible due to their varying memory and computational
constraints. To address these concerns, we automate the construction of
task-specific deep learning architectures optimized for device constraints
through Neural Architecture Search (NAS). We present DCA-NAS, a principled
method of fast neural network architecture search that incorporates edge-device
constraints such as model size and floating-point operations. It incorporates
weight sharing and channel bottleneck techniques to speed up the search time.
Based on our experiments, we see that DCA-NAS outperforms manual architectures
for similarly sized models and is comparable to popular mobile architectures on
various image classification datasets like CIFAR-10, CIFAR-100, and
ImageNet-1k. Experiments with the DARTS and NAS-Bench-201 search spaces show the
generalization capabilities of DCA-NAS. On further evaluating our approach on
Hardware-NAS-Bench, device-specific architectures with low inference latency
and state-of-the-art performance were discovered.

Comment: Accepted to the 10th International Conference on Pattern Recognition
and Machine Intelligence (PReMI) 202
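The device-constraint idea can be sketched as filtering candidates by a parameter and FLOP budget before picking the best performer. DCA-NAS folds such constraints into the search itself (with weight sharing and channel bottlenecks) rather than filtering a fixed list; all candidate names, sizes and scores below are hypothetical:

```python
# Hypothetical candidate architectures with size, FLOP and accuracy estimates.
candidates = [
    {"name": "tiny",  "params_m": 1.2,  "flops_m": 40,  "accuracy": 0.88},
    {"name": "small", "params_m": 3.5,  "flops_m": 120, "accuracy": 0.91},
    {"name": "base",  "params_m": 11.0, "flops_m": 550, "accuracy": 0.93},
]

def constrained_search(candidates, max_params_m, max_flops_m):
    """Return the best-scoring architecture that satisfies the device budget."""
    feasible = [c for c in candidates
                if c["params_m"] <= max_params_m and c["flops_m"] <= max_flops_m]
    if not feasible:
        return None  # no architecture fits on this device
    return max(feasible, key=lambda c: c["accuracy"])["name"]

# A mid-range budget excludes "base" and prefers "small" over "tiny".
best = constrained_search(candidates, max_params_m=5.0, max_flops_m=200)
```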
A generative flow for conditional sampling via optimal transport
Sampling conditional distributions is a fundamental task for Bayesian
inference and density estimation. Generative models, such as normalizing flows
and generative adversarial networks, characterize conditional distributions by
learning a transport map that pushes forward a simple reference (e.g., a
standard Gaussian) to a target distribution. While these approaches
successfully describe many non-Gaussian problems, their performance is often
limited by parametric bias and the reliability of gradient-based (adversarial)
optimizers to learn these transformations. This work proposes a non-parametric
generative model that iteratively maps reference samples to the target. The
model uses block-triangular transport maps, whose components are shown to
characterize conditionals of the target distribution. These maps arise from
solving an optimal transport problem with a weighted cost function,
thereby extending the data-driven approach in [Trigila and Tabak, 2016] for
conditional sampling. The proposed approach is demonstrated on a
two-dimensional example and on a parameter inference problem involving
nonlinear ODEs.

Comment: 18 pages, 5 figures
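For a jointly Gaussian toy problem the second block of such a block-triangular map is available in closed form, which makes the conditional-sampling idea concrete. The paper learns these maps from samples by solving a weighted optimal transport problem; the linear-Gaussian model below is only an illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy joint model: y = a*x + Gaussian noise. The block-triangular map
# (x, z) -> (x, T(x, z)) pushing a standard-normal reference z forward to
# the conditional y | x is then known exactly.
a, noise_std = 2.0, 0.5

def conditional_map(x, z):
    """Second block of the triangular map: reference z -> sample of y | x."""
    return a * x + noise_std * z

# Sample the conditional y | x = 1.5 by pushing reference samples through.
z = rng.normal(size=100_000)
y_given_x = conditional_map(1.5, z)
```

For x = 1.5 the pushed-forward samples should concentrate around mean a*x = 3.0 with standard deviation 0.5, matching the analytic conditional.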
Gravitational wave memory beyond general relativity
Gravitational wave memory is a nonoscillatory correction to the gravitational
wave strain predicted by general relativity, which has yet to be detected.
Within general relativity, its dominant component, known as the null memory,
can be understood as arising from the backreaction of the energy carried by
gravitational waves, and therefore it corresponds to a direct manifestation of
the nonlinearity of the theory. In this paper, we investigate the null-memory
prediction in a broad class of modified gravity theories, with the aim of
exploring potential lessons to be learned from future measurements of the
memory effect. Based on Isaacson's approach to the leading-order field
equations, we in particular compute the null memory for the most general
scalar-vector-tensor theory with second-order equations of motion and vanishing
field potentials. We find that the functional form of the null memory is only
modified through the potential presence of additional radiative null energy
sources in the theory. We subsequently generalize this result by proving a
theorem that states that the simple structure of the tensor null-memory
equation remains unaltered in any metric theory whose massless gravitational
fields satisfy decoupled wave equations to first order in perturbation theory,
which encompasses a large class of viable extensions to general relativity.

Comment: 39 pages
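For orientation, the general-relativity null-memory expression (schematically, in Thorne's form and geometric units) has the structure the theorem shows persists, with the gravitational-wave energy flux replaced by the total radiative null-energy flux in the modified theories; this standard form is quoted from the literature, not from the paper itself:

```latex
\Delta h^{\mathrm{TT}}_{jk}
  = \frac{4}{r} \int_{-\infty}^{u} \! du' \oint d\Omega' \,
    \frac{dE}{du'\, d\Omega'}
    \left[ \frac{n'_j\, n'_k}{1 - \mathbf{n}' \cdot \mathbf{N}} \right]^{\mathrm{TT}}
```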