Guaranteed bounds on the Kullback-Leibler divergence of univariate mixtures using piecewise log-sum-exp inequalities
Information-theoretic measures such as the entropy, the cross-entropy, and the
Kullback-Leibler divergence between two mixture models are core primitives in
many signal processing tasks. Since the Kullback-Leibler divergence of mixtures
provably does not admit a closed-form formula, it is in practice either
estimated using costly Monte-Carlo stochastic integration, approximated, or
bounded using various techniques. We present a fast and generic method that
algorithmically builds closed-form lower and upper bounds on the entropy, the
cross-entropy, and the Kullback-Leibler divergence of mixtures. We illustrate
the versatility of the method by reporting on our experiments approximating the
Kullback-Leibler divergence between univariate exponential mixtures, Gaussian
mixtures, Rayleigh mixtures, and Gamma mixtures.
Comment: 20 pages, 3 figures
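For context, the costly Monte-Carlo estimation that the paper's closed-form bounds are designed to avoid can be sketched in pure Python for univariate Gaussian mixtures. This is a minimal illustration of the baseline with our own function names, not the paper's piecewise log-sum-exp construction:

```python
import math
import random

def mixture_logpdf(x, weights, mus, sigmas):
    # log p(x) = log sum_i w_i N(x; mu_i, sigma_i), via a stable log-sum-exp
    logs = [math.log(w) - 0.5 * math.log(2 * math.pi * s * s)
            - (x - m) ** 2 / (2 * s * s)
            for w, m, s in zip(weights, mus, sigmas)]
    mx = max(logs)
    return mx + math.log(sum(math.exp(l - mx) for l in logs))

def sample_mixture(weights, mus, sigmas, rng):
    # pick a component by its weight, then sample from that Gaussian
    i = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(mus[i], sigmas[i])

def kl_monte_carlo(p, q, n=100_000, seed=0):
    # KL(p || q) ~ (1/n) sum_j [log p(x_j) - log q(x_j)] with x_j ~ p;
    # p and q are (weights, mus, sigmas) triples
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = sample_mixture(*p, rng)
        total += mixture_logpdf(x, *p) - mixture_logpdf(x, *q)
    return total / n
```

The estimator is unbiased but its error shrinks only as O(1/sqrt(n)), which is why deterministic closed-form bounds are attractive.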
MU-MIMO Communications with MIMO Radar: From Co-existence to Joint Transmission
Beamforming techniques are proposed for a joint multi-input-multi-output
(MIMO) radar-communication (RadCom) system, where a single device acts both as
a radar and a communication base station (BS) by simultaneously communicating
with downlink users and detecting radar targets. Two operational options are
considered, where we first split the antennas into two groups, one for radar
and the other for communication. Under this deployment, the radar signal is
designed to fall into the null-space of the downlink channel. The communication
beamformer is optimized such that the beampattern obtained matches the radar's
beampattern while satisfying the communication performance requirements. To
relax the constraints of these optimizations, we consider a second operational
option in which all the antennas transmit a joint waveform shared by both
radar and communications. In this case, we design an appropriate probing
beampattern while guaranteeing the performance of the downlink communications.
By incorporating the SINR constraints into objective functions as penalty
terms, we further simplify the original beamforming designs to weighted
optimizations, and solve them by efficient manifold algorithms. Numerical
results show that the shared deployment outperforms the separated case
significantly, and the proposed weighted optimizations achieve a similar
performance to the original optimizations, despite their significantly lower
computational complexity.
Comment: 15 pages, 15 figures. This work has been submitted to the IEEE for
possible publication. Copyright may be transferred without notice, after
which this version may no longer be accessible.
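The null-space idea in the first option can be illustrated concretely: project the radar waveform onto the null space of the downlink channel matrix so that it causes no interference at the communication users. A minimal numpy sketch with assumed dimensions (8 antennas, 2 users), our own illustration rather than the paper's optimized beamformer:

```python
import numpy as np

def nullspace_projector(H):
    # Orthogonal projector onto the null space of H (rows = downlink users,
    # columns = transmit antennas): P = I - H^H (H H^H)^-1 H, so H P = 0.
    n = H.shape[1]
    return np.eye(n) - H.conj().T @ np.linalg.inv(H @ H.conj().T) @ H

rng = np.random.default_rng(0)
H = rng.standard_normal((2, 8)) + 1j * rng.standard_normal((2, 8))  # 2 users, 8 antennas
P = nullspace_projector(H)
s = rng.standard_normal((8, 1)) + 1j * rng.standard_normal((8, 1))  # radar snapshot
x = P @ s  # projected radar signal: H @ x is (numerically) zero
```

The price of this separation is that the radar loses the degrees of freedom spanned by the channel's row space, which is what motivates the shared-waveform option.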
LSEMINK: A Modified Newton-Krylov Method for Log-Sum-Exp Minimization
This paper introduces LSEMINK, an effective modified Newton-Krylov algorithm
geared toward minimizing the log-sum-exp function for a linear model. Problems
of this kind arise commonly, for example, in geometric programming and
multinomial logistic regression. Although the log-sum-exp function is smooth
and convex, standard line search Newton-type methods can become inefficient
because the quadratic approximation of the objective function can be unbounded
from below. To circumvent this, LSEMINK modifies the Hessian by adding a shift
in the row space of the linear model. We show that the shift renders the
quadratic approximation to be bounded from below and that the overall scheme
converges to a global minimizer under mild assumptions. Our convergence proof
also shows that all iterates are in the row space of the linear model, which
can be attractive when the model parameters do not have an intuitive meaning,
as is common in machine learning. Since LSEMINK uses a Krylov subspace method
to compute the search direction, it only requires matrix-vector products with
the linear model, which is critical for large-scale problems. Our numerical
experiments on image classification and geometric programming illustrate that
LSEMINK considerably reduces the time-to-solution and increases the scalability
compared to geometric programming and natural gradient descent approaches. It
has significantly faster initial convergence than standard Newton-Krylov
methods, which is particularly attractive in applications like machine
learning. In addition, LSEMINK is more robust to ill-conditioning arising from
the nonsmoothness of the problem. We share our MATLAB implementation at
https://github.com/KelvinKan/LSEMINK
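The central idea, a Newton-type step whose Hessian is shifted within the row space of the linear model so that the quadratic model stays bounded from below, can be sketched on a small dense problem. This is our own toy numpy illustration with an Armijo line search, not the authors' Krylov-based MATLAB implementation:

```python
import numpy as np

def lse(z):
    # numerically stable log-sum-exp
    m = z.max()
    return m + np.log(np.exp(z - m).sum())

def shifted_newton_step(A, b, x, mu=1e-2):
    # One step for f(x) = log-sum-exp(Ax + b). The true Hessian
    # A^T (diag(p) - p p^T) A is only positive semidefinite, so, in the
    # spirit of LSEMINK, we add a shift mu * A^T A that lies in the row
    # space of A and makes the quadratic model bounded from below.
    z = A @ x + b
    p = np.exp(z - lse(z))                         # softmax weights
    g = A.T @ p                                    # gradient
    H = A.T @ ((np.diag(p) - np.outer(p, p) + mu * np.eye(len(p))) @ A)
    d = -np.linalg.solve(H, g)                     # descent direction (H is PD)
    t = 1.0
    while lse(A @ (x + t * d) + b) > lse(z) + 1e-4 * t * (g @ d) and t > 1e-8:
        t *= 0.5                                   # Armijo backtracking
    return x + t * d

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
x = np.zeros(5)
for _ in range(15):
    x = shifted_newton_step(A, b, x)
```

In the actual method, the Newton system is solved only approximately with a Krylov subspace method, so the dense solve above would be replaced by matrix-vector products with A and A^T.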
Developing an app to interpret chest X-rays to support the diagnosis of respiratory pathology with artificial intelligence
Background: Medical images, including results from X-rays, are an integral part of medical diagnosis.
Their interpretation requires an experienced radiologist. One of the main problems in developing countries
is access to timely medical diagnosis. Lack of investment in health care infrastructure, geographical isolation
and shortage of trained specialists are common obstacles to providing adequate health care in many areas of
the world. In this work we show how to build and deploy a Deep Learning computer vision application for
the classification of 14 common thorax diseases using X-ray images.
Methods: We make use of the fastai and PyTorch frameworks to create and train a DenseNet-121
model to classify the X-ray images from the ChestX-ray14 data set, which contains 112,120 frontal-view X-ray
images of 30,805 unique patients. After training and validating our model, we deploy a web app on Heroku;
this web app can be accessed from any mobile device with an internet connection.
Results: We obtained 70% for detecting pneumothorax in the one-vs-all task. Meanwhile, for the
multilabel, multiclass task we achieve state-of-the-art accuracy with fewer epochs, drastically
reducing the training time of the model. We also demonstrate the feature-localization ability of our model
using the Grad-CAM methodology, a feature that can be useful for early diagnosis of dangerous illnesses.
Conclusions: In this work we present our study of the use of machine learning techniques to identify
diseases from X-ray images. We have used the new fastai framework and exported the resulting
model to an app that can be tested by any user. The app has an intuitive interface where the user can
upload an image and obtain the likelihood of that image being classified as one of the 14 labeled diseases.
This classification could assist diagnosis by medical providers and broaden access to medical services in
remote areas.
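The Grad-CAM localization mentioned in the results has a simple core: weight each feature map of a convolutional layer by the spatially averaged gradient of the class score with respect to that map, sum over channels, and apply a ReLU. A minimal numpy sketch of that core computation (in a real run the two tensors would come from the trained DenseNet-121; here they are generic arrays):

```python
import numpy as np

def grad_cam(feature_maps, grads):
    # feature_maps, grads: arrays of shape (channels, H, W) holding a conv
    # layer's activations and d(class score)/d(activations) for one class.
    weights = grads.mean(axis=(1, 2))                  # global-average-pool the gradients
    cam = np.tensordot(weights, feature_maps, axes=1)  # weighted sum over channels -> (H, W)
    cam = np.maximum(cam, 0)                           # ReLU: keep only positive evidence
    return cam / cam.max() if cam.max() > 0 else cam   # normalize to [0, 1] for display
```

The resulting (H, W) heatmap is upsampled to the input resolution and overlaid on the X-ray to show which regions drove the prediction.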
On the Properties of Kullback-Leibler Divergence Between Multivariate Gaussian Distributions
Kullback-Leibler (KL) divergence is one of the most important divergence
measures between probability distributions. In this paper, we prove several
properties of KL divergence between multivariate Gaussian distributions. First,
for any two $n$-dimensional Gaussian distributions $\mathcal{N}_1$ and
$\mathcal{N}_2$, we give the supremum of $KL(\mathcal{N}_1\|\mathcal{N}_2)$
when $KL(\mathcal{N}_2\|\mathcal{N}_1)\leq\varepsilon$ for $\varepsilon>0$. For
small $\varepsilon$, we show that the supremum is $\varepsilon+o(\varepsilon)$.
This quantifies the approximate symmetry of small KL divergence between
Gaussians. We also find the infimum of $KL(\mathcal{N}_1\|\mathcal{N}_2)$ when
$KL(\mathcal{N}_2\|\mathcal{N}_1)\geq M$ for $M>0$. We give the conditions
under which the supremum and infimum can be attained. Second, for any three
$n$-dimensional Gaussians $\mathcal{N}_1$, $\mathcal{N}_2$, and
$\mathcal{N}_3$, we find an upper bound of $KL(\mathcal{N}_1\|\mathcal{N}_3)$
if $KL(\mathcal{N}_1\|\mathcal{N}_2)\leq\varepsilon_1$ and
$KL(\mathcal{N}_2\|\mathcal{N}_3)\leq\varepsilon_2$ for
$\varepsilon_1,\varepsilon_2\geq 0$. For small $\varepsilon_1$ and
$\varepsilon_2$, we show that the upper bound is $O(\varepsilon_1+\varepsilon_2)$.
This reveals that KL divergence between Gaussians follows a relaxed triangle
inequality. Importantly, all the bounds in the theorems presented in this paper
are independent of the dimension $n$. Finally, we discuss the applications of
our theorems in explaining counterintuitive phenomena of flow-based models,
deriving a deep anomaly detection algorithm, and extending a one-step
robustness guarantee to multiple steps in safe reinforcement learning.
Comment: arXiv admin note: text overlap with arXiv:2002.0332
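Unlike the mixture case above, the KL divergence between two multivariate Gaussians has a well-known closed form, which makes the paper's approximate-symmetry result easy to probe numerically. A small numpy sketch of the standard formula together with such a check (our illustration, not code from the paper):

```python
import numpy as np

def kl_gauss(mu1, S1, mu2, S2):
    # Closed-form KL(N1 || N2) between multivariate Gaussians:
    # 0.5 * [ tr(S2^-1 S1) + (mu2-mu1)^T S2^-1 (mu2-mu1) - n + ln(det S2 / det S1) ]
    n = len(mu1)
    S2inv = np.linalg.inv(S2)
    d = mu2 - mu1
    _, ld1 = np.linalg.slogdet(S1)
    _, ld2 = np.linalg.slogdet(S2)
    return 0.5 * (np.trace(S2inv @ S1) + d @ S2inv @ d - n + ld2 - ld1)

# Approximate symmetry at small divergence: a slight covariance scaling
mu = np.zeros(3)
S1, S2 = np.eye(3), 1.01 * np.eye(3)
kl12 = kl_gauss(mu, S1, mu, S2)
kl21 = kl_gauss(mu, S2, mu, S1)
# kl12 and kl21 agree to leading order, as the symmetry result predicts
```

Note that the two divergences coincide only in the small-divergence limit; for well-separated Gaussians the asymmetry of KL can be arbitrarily large.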