Interpretable Machine Learning Architectures for Efficient Signal Detection with Applications to Gravitational Wave Astronomy
Deep learning has seen rapid evolution in the past decade, accomplishing tasks that were previously unimaginable. At the same time, researchers strive to better understand and interpret the underlying mechanisms of the deep models, which are often justifiably regarded as "black boxes". Overcoming this deficiency will not only serve to suggest better learning architectures and training methods, but also extend deep learning to scenarios where interpretability is key to the application. One such scenario is signal detection and estimation, with gravitational wave detection as a specific example, where classic methods are often preferred for their interpretability. Nonetheless, while classic statistical detection methods such as matched filtering excel in their simplicity and intuitiveness, they can be suboptimal in terms of both accuracy and computational efficiency. Therefore, it is appealing to have methods that achieve "the best of both worlds", namely enjoying simultaneously excellent performance and interpretability.
In this thesis, we aim to bridge this gap between modern deep learning and classic statistical detection, by revisiting the signal detection problem from a new perspective. First, to address the perceived distinction in interpretability between classic matched filtering and deep learning, we establish the intrinsic connections between the two families of methods, and identify how trainable networks can address the structural limitations of matched filtering. Based on these ideas, we propose two trainable architectures that are constructed based on matched filtering, but with learnable templates and adaptivity to unknown noise distributions, and therefore higher detection accuracy. We next turn our attention toward improving the computational efficiency of detection, where we aim to design architectures that leverage structures within the problem for efficiency gains. By leveraging the statistical structure of class imbalance, we integrate hierarchical detection into trainable networks, and use a novel loss function which explicitly encodes both detection accuracy and efficiency. Furthermore, by leveraging the geometric structure of the signal set, we consider using signal space optimization as an alternative computational primitive for detection, which is intuitively more efficient than covering with a template bank. We theoretically prove the efficiency gain by analyzing Riemannian gradient descent on the signal manifold, which reveals an exponential improvement in efficiency over matched filtering. We also propose a practical trainable architecture for template optimization, which makes use of signal embedding and kernel interpolation.
We demonstrate the performance of all proposed architectures on the task of gravitational wave detection in astrophysics, where matched filtering is the current method of choice. The architectures are also widely applicable to general signal or pattern detection tasks, which we exemplify with the handwritten digit recognition task using the template optimization architecture. Together, we hope this work proves useful to scientists and engineers seeking machine learning architectures with high performance and interpretability, and contributes to our understanding of deep learning as a whole.
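To make the classic baseline the thesis builds on concrete, here is a minimal matched-filtering detector sketched in NumPy. The chirp-like template, amplitude, and white Gaussian noise model are illustrative assumptions, not the thesis's architecture or the actual gravitational-wave pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: a short chirp-like template buried in white
# Gaussian noise; waveform and noise level are illustrative choices.
t = np.linspace(0.0, 1.0, 512)
template = np.sin(2 * np.pi * (5 + 20 * t) * t) * np.exp(-3 * t)
template /= np.linalg.norm(template)            # unit-norm template

signal_present = 6.0 * template + rng.normal(0.0, 1.0, t.size)
noise_only = rng.normal(0.0, 1.0, t.size)

def matched_filter_statistic(x, h):
    """Correlate the data with the template and return the peak value,
    which serves as the detection statistic."""
    return float(np.max(np.correlate(x, h, mode="full")))

stat_signal = matched_filter_statistic(signal_present, template)
stat_noise = matched_filter_statistic(noise_only, template)
```

Thresholding the peak correlation then decides "signal present" versus "noise only"; the learnable-template architectures described above replace the fixed `template` with trained ones.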
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Semantic Security with Infinite Dimensional Quantum Eavesdropping Channel
We propose a new proof method for direct coding theorems for wiretap channels
where the eavesdropper has access to a quantum version of the transmitted
signal on an infinite-dimensional Hilbert space and the legitimate parties
communicate through a classical channel or a classical input, quantum output
(cq) channel. The transmitter input can be subject to an additive cost
constraint, which specializes to the case of an average energy constraint. This
method yields errors that decay exponentially with increasing block lengths.
Moreover, it provides a guarantee of a quantum version of semantic security,
which is an established concept in classical cryptography and physical layer
security. Therefore, it complements existing works which either do not prove
the exponential error decay or use weaker notions of security. The main part of
this proof method is a direct coding result on channel resolvability which
states that there is only a doubly exponentially small probability that a
standard random codebook does not solve the channel resolvability problem for
the cq channel. Semantic security has strong operational implications, meaning
essentially that the eavesdropper cannot use its quantum observation to gather
any meaningful information about the transmitted signal. We also discuss the
connections between semantic security and various other established notions of
secrecy.
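For orientation, semantic security in this information-theoretic setting is commonly stated in its mutual-information form; this is the standard textbook criterion, not necessarily the exact formulation used in the paper:

```latex
\[
  \max_{P_M} \, I(M ; E^n) \;\xrightarrow{\;n \to \infty\;}\; 0,
\]
where the maximum runs over all distributions $P_M$ of the message $M$ and
$E^n$ denotes the eavesdropper's (here: quantum) observation of the
length-$n$ codeword.
```

Intuitively, no matter how the message is distributed, the eavesdropper's observation carries vanishing information about it, which is the operational statement made in the abstract.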
Random block coordinate methods for inconsistent convex optimisation problems
We develop a novel randomised block coordinate primal-dual algorithm for a
class of non-smooth ill-posed convex programs. Lying midway between the
celebrated Chambolle-Pock primal-dual algorithm and Tseng's accelerated
proximal gradient method, we establish global convergence of the last iterate
as well as optimal $\mathcal{O}(1/k)$ and $\mathcal{O}(1/k^2)$ complexity rates in the convex and
strongly convex case, respectively, with $k$ being the iteration count. Motivated by
the increased complexity in the control of distribution level electric power
systems, we test the performance of our method on a second-order cone
relaxation of an AC-OPF problem. Distributed control is achieved via the
distributed locational marginal prices (DLMPs), which are obtained as
dual variables in our optimisation framework.
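For readers unfamiliar with the deterministic starting point, a minimal Chambolle-Pock primal-dual iteration can be sketched on a toy problem. The instance below (min over x of ||Kx - b||_1 + (lam/2)||x||^2 with K, b, lam chosen for illustration) is an assumption; the paper's randomised block-coordinate updates and inconsistency handling are not reproduced:

```python
import numpy as np

# Toy instance of  min_x ||K x - b||_1 + (lam/2) ||x||^2,  written as the
# saddle problem  min_x max_{||y||_inf <= 1} <K x - b, y> + (lam/2)||x||^2.
K = np.eye(2)
b = np.array([1.0, -2.0])
lam = 0.1

tau = sigma = 0.9          # valid step sizes: tau * sigma * ||K||^2 < 1
x = np.zeros(2)
y = np.zeros(2)
x_bar = x.copy()

for _ in range(5000):
    # Dual step: prox of the conjugate of ||. - b||_1 is a shifted clipping.
    y = np.clip(y + sigma * (K @ x_bar - b), -1.0, 1.0)
    # Primal step: prox of (lam/2)||x||^2 is a simple shrinkage.
    x_new = (x - tau * (K.T @ y)) / (1.0 + tau * lam)
    # Extrapolation of the primal iterate.
    x_bar = 2.0 * x_new - x
    x = x_new
```

For this instance the minimizer is (1, -2), since the quadratic term's gradient stays inside the subdifferential of the l1 kinks there; the iteration above converges to it.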
Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5
This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions of researchers working in different fields of applications and in mathematics, and is available in open-access. The collected contributions of this volume have either been published or presented after disseminating the fourth volume in 2015 in international conferences, seminars, workshops and journals, or they are new. The contributions of each part of this volume are chronologically ordered.
The first part of this book presents some theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution Rules (PCR) of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of (quasi-)vacuous belief assignment in the fusion of sources of evidence with their Matlab codes.
Because more applications of DSmT have emerged in the past years since the appearance of the fourth book of DSmT in 2015, the second part of this volume is about selected applications of DSmT mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender system, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), Silx-Furtif RUST code library for information fusion including PCR rules, and network for ship classification.
Finally, the third part presents interesting contributions related to belief functions in general published or presented over the years since 2015. These contributions are related to decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, negator of belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions as well.
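To make the PCR5 rule mentioned throughout this volume concrete, here is a small two-source sketch with focal elements encoded as frozensets. This is the basic two-source PCR5 formula only, under an illustrative frame of discernment; the fast and multi-source variants discussed in the book are not reproduced:

```python
from itertools import product

def pcr5(m1, m2):
    """Combine two basic belief assignments (dicts mapping frozensets of
    hypotheses to masses) with PCR5: conjunctive combination, then
    proportional redistribution of each partial conflict back to the two
    focal elements that caused it."""
    out = {}
    for (x, a), (y, b) in product(m1.items(), m2.items()):
        inter = x & y
        if inter:
            out[inter] = out.get(inter, 0.0) + a * b   # conjunctive part
        elif a + b > 0:
            # Conflicting pair: split the mass a*b proportionally to a and b.
            out[x] = out.get(x, 0.0) + a ** 2 * b / (a + b)
            out[y] = out.get(y, 0.0) + b ** 2 * a / (a + b)
    return out

A, B = frozenset({"A"}), frozenset({"B"})
m = pcr5({A: 0.6, B: 0.4}, {A: 0.7, B: 0.3})
```

Unlike Dempster's rule, no conflicting mass is discarded by normalisation: the combined masses still sum to one, and the hypothesis favoured by both sources receives most of the redistributed conflict.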
Structured Semidefinite Programming for Recovering Structured Preconditioners
We develop a general framework for finding approximately-optimal
preconditioners for solving linear systems. Leveraging this framework we obtain
improved runtimes for fundamental preconditioning and linear system solving
problems including the following. We give an algorithm which, given positive
definite $\mathbf{K} \in \mathbb{R}^{d \times d}$ with $\mathrm{nnz}(\mathbf{K})$
nonzero entries, computes an $\epsilon$-optimal
diagonal preconditioner in time $\widetilde{O}(\mathrm{nnz}(\mathbf{K}) \cdot \mathrm{poly}(\kappa^\star, \epsilon^{-1}))$, where $\kappa^\star$ is the
optimal condition number of the rescaled matrix. We give an algorithm which,
given $\mathbf{M} \in \mathbb{R}^{d \times d}$ that is either the pseudoinverse
of a graph Laplacian matrix or a constant spectral approximation of one, solves
linear systems in $\mathbf{M}$ in $\widetilde{O}(d^2)$ time. Our diagonal
preconditioning results improve state-of-the-art runtimes of $\Omega(d^{3.5})$
attained by general-purpose semidefinite programming, and our solvers improve
state-of-the-art runtimes of $\Omega(d^{\omega})$ where $\omega$ is the
current matrix multiplication constant. We attain our results via new
algorithms for a class of semidefinite programs (SDPs) we call
matrix-dictionary approximation SDPs, which we leverage to solve an associated
problem we call matrix-dictionary recovery.
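The effect of diagonal preconditioning is easy to see with the simple Jacobi rescaling K -> D^{-1/2} K D^{-1/2} with D = diag(K). This heuristic is not the paper's epsilon-optimal algorithm, only a demonstration on a synthetic matrix that rescaling can collapse the condition number:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical instance: a well-conditioned SPD core whose rows and
# columns are then corrupted by wildly different scales.
d = 50
G = rng.normal(size=(d, d))
core = G @ G.T + d * np.eye(d)
scales = 10.0 ** rng.uniform(-3.0, 3.0, d)
K = scales[:, None] * core * scales[None, :]    # badly scaled SPD matrix

def cond(M):
    """Condition number of a symmetric positive definite matrix."""
    w = np.linalg.eigvalsh(M)
    return w[-1] / w[0]

# Jacobi (diagonal) rescaling: D^{-1/2} K D^{-1/2} with D = diag(K).
d_inv_sqrt = 1.0 / np.sqrt(np.diag(K))
K_jacobi = d_inv_sqrt[:, None] * K * d_inv_sqrt[None, :]
```

Here the Jacobi choice happens to undo the synthetic scaling exactly; in general it is only within a bounded factor of the best diagonal preconditioner, and finding a provably near-optimal one is the harder problem the abstract addresses.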
A qualitative difference between gradient flows of convex functions in finite- and infinite-dimensional Hilbert spaces
We consider gradient flow/gradient descent and heavy ball/accelerated
gradient descent optimization for convex objective functions. In the gradient
flow case, we prove the following:
1. If $f$ does not have a minimizer, the convergence $f(x_t) \to \inf f$ can
be arbitrarily slow.
2. If $f$ does have a minimizer, the excess energy $f(x_t) - \inf f$ is
integrable/summable in time. In particular, $f(x_t) - \inf f = o(1/t)$ as
$t \to \infty$.
3. In Hilbert spaces, this is optimal: $f(x_t) - \inf f$ can decay to $0$ as
slowly as any given function which is monotone decreasing and integrable at
$\infty$, even for a fixed quadratic objective.
4. In finite dimension (or more generally, for all gradient flow curves of
finite length), this is not optimal: We prove that there are convex monotone
decreasing integrable functions which decrease to zero slower than
$f(x_t) - \inf f$ for the gradient flow of any convex function on $\mathbb{R}^d$.
For instance, we show that any gradient flow $x_t$ of a convex function $f$ in
finite dimension satisfies $\liminf_{t \to \infty} t \cdot \log^2(t) \cdot (f(x_t) - \inf f) = 0$.
This improves on the commonly reported $O(1/t)$ rate and provides a sharp
characterization of the energy decay law. We also note that it is impossible to
establish a rate $O(g(t))$ for any function $g$ which satisfies
$\int^\infty g(t)\,dt < \infty$, even asymptotically.
Similar results are obtained in related settings for (1) discrete time
gradient descent, (2) stochastic gradient descent with multiplicative noise and
(3) the heavy ball ODE. In the case of stochastic gradient descent, the
summability of $f(x_n) - \inf f$ is used to prove that $f(x_n) \to \inf f$
almost surely - an improvement on the convergence almost surely up to a
subsequence which follows from the decay estimate.
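The integrability statement in item 2 can be observed numerically: for discrete-time gradient descent on a convex quadratic, the partial sums of the excess energies stay bounded and t times the excess energy tends to zero. This is an illustrative experiment on an assumed objective, not one of the paper's constructions:

```python
import numpy as np

# Convex quadratic f(x) = 0.5 * x^T Q x  with  inf f = 0  at x = 0.
Q = np.diag([1.0, 0.1])       # smooth (L = 1), moderately ill-conditioned
f = lambda x: 0.5 * x @ Q @ x

x = np.array([5.0, 5.0])
step = 0.5                    # step < 2/L guarantees monotone decrease here
excess = []
for _ in range(5000):
    excess.append(f(x))
    x = x - step * (Q @ x)    # gradient descent update

excess = np.array(excess)
partial_sums = np.cumsum(excess)          # bounded: excess is summable
t_times_excess = np.arange(1, excess.size + 1) * excess   # tends to 0
```

On a quadratic the decay is in fact geometric, so the summability and the o(1/t) behaviour are far from tight; the abstract's point is that summability, and hence o(1/t), holds for every convex objective with a minimizer.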
On the latent dimension of deep autoencoders for reduced order modeling of PDEs parametrized by random fields
Deep Learning is having a remarkable impact on the design of Reduced Order
Models (ROMs) for Partial Differential Equations (PDEs), where it is exploited
as a powerful tool for tackling complex problems for which classical methods
might fail. In this respect, deep autoencoders play a fundamental role, as they
provide an extremely flexible tool for reducing the dimensionality of a given
problem by leveraging the nonlinear capabilities of neural networks. Indeed,
starting from this paradigm, several successful approaches have already been
developed, which are here referred to as Deep Learning-based ROMs (DL-ROMs).
Nevertheless, when it comes to stochastic problems parameterized by random
fields, the current understanding of DL-ROMs is mostly based on empirical
evidence: in fact, their theoretical analysis is currently limited to the case
of PDEs depending on a finite number of (deterministic) parameters. The purpose
of this work is to extend the existing literature by providing some theoretical
insights about the use of DL-ROMs in the presence of stochasticity generated by
random fields. In particular, we derive explicit error bounds that can guide
domain practitioners when choosing the latent dimension of deep autoencoders.
We evaluate the practical usefulness of our theory by means of numerical
experiments, showing how our analysis can significantly impact the performance
of DL-ROMs.
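The role of the latent dimension can be previewed in the linear special case: with linear activations, the optimal autoencoder of latent dimension r coincides with rank-r POD/PCA (by the Eckart-Young theorem), and the reconstruction error is the energy of the discarded singular values. The snapshot matrix below is a synthetic stand-in for PDE solution data, not the deep autoencoders analysed in the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical snapshot matrix: 200 solution snapshots of dimension 100
# with quickly decaying spectrum, mimicking a parametrized PDE.
U0, _ = np.linalg.qr(rng.normal(size=(100, 100)))
sing = 2.0 ** -np.arange(100.0)          # fast singular value decay
S = U0 @ np.diag(sing) @ rng.normal(size=(100, 200))

def pod_error(S, r):
    """Relative reconstruction error of the best rank-r linear autoencoder,
    i.e. projection onto the leading r left singular vectors (POD)."""
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    S_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]
    return float(np.linalg.norm(S - S_r) / np.linalg.norm(S))

errors = [pod_error(S, r) for r in (1, 5, 10, 20)]
```

The error decreases monotonically with the latent dimension r, and how fast it decreases depends on the spectral decay of the snapshot set; the error bounds in the abstract play the analogous role for nonlinear deep autoencoders driven by random fields.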