12 research outputs found

    Guaranteed bounds on the Kullback-Leibler divergence of univariate mixtures using piecewise log-sum-exp inequalities

    Full text link
    Information-theoretic measures such as the entropy, cross-entropy and the Kullback-Leibler divergence between two mixture models are core primitives in many signal processing tasks. Since the Kullback-Leibler divergence of mixtures provably does not admit a closed-form formula, it is in practice either estimated using costly Monte-Carlo stochastic integration, approximated, or bounded using various techniques. We present a fast and generic method that builds algorithmically closed-form lower and upper bounds on the entropy, the cross-entropy and the Kullback-Leibler divergence of mixtures. We illustrate the versatile method by reporting on our experiments for approximating the Kullback-Leibler divergence between univariate exponential mixtures, Gaussian mixtures, Rayleigh mixtures, and Gamma mixtures. Comment: 20 pages, 3 figures
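    A minimal sketch of the Monte-Carlo baseline mentioned above (not the paper's piecewise log-sum-exp bounding algorithm): estimating KL(p || q) between two univariate Gaussian mixtures by sampling from p, with the mixture log-densities evaluated stably via log-sum-exp. The mixture parameters below are illustrative assumptions.

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)

def log_mixture_pdf(x, weights, means, stds):
    """Log-density of a univariate Gaussian mixture, evaluated via log-sum-exp."""
    x = np.asarray(x)[:, None]
    log_comp = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi * stds**2)
        - 0.5 * ((x - means) / stds) ** 2
    )
    return logsumexp(log_comp, axis=1)

def sample_mixture(n, weights, means, stds):
    """Draw n samples: pick a component per sample, then sample that Gaussian."""
    comp = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[comp], stds[comp])

# Two illustrative univariate Gaussian mixtures p and q.
wp, mp, sp = np.array([0.4, 0.6]), np.array([-1.0, 2.0]), np.array([0.8, 1.2])
wq, mq, sq = np.array([0.5, 0.5]), np.array([0.0, 2.5]), np.array([1.0, 1.0])

# Monte-Carlo estimate: KL(p || q) ~ mean of log p(x) - log q(x) for x ~ p.
x = sample_mixture(100_000, wp, mp, sp)
kl_mc = np.mean(log_mixture_pdf(x, wp, mp, sp) - log_mixture_pdf(x, wq, mq, sq))
print(f"Monte Carlo estimate of KL(p || q): {kl_mc:.4f}")
```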

    MU-MIMO Communications with MIMO Radar: From Co-existence to Joint Transmission

    Get PDF
    Beamforming techniques are proposed for a joint multi-input-multi-output (MIMO) radar-communication (RadCom) system, where a single device acts both as a radar and a communication base station (BS) by simultaneously communicating with downlink users and detecting radar targets. Two operational options are considered. We first split the antennas into two groups, one for radar and the other for communication. Under this deployment, the radar signal is designed to fall into the null space of the downlink channel, and the communication beamformer is optimized such that the obtained beampattern matches the radar's beampattern while satisfying the communication performance requirements. To reduce the constraints on the optimization, we consider a second operational option, where all the antennas transmit a joint waveform that is shared by both radar and communications. In this case, we formulate an appropriate probing beampattern while guaranteeing the performance of the downlink communications. By incorporating the signal-to-interference-plus-noise ratio (SINR) constraints into the objective functions as penalty terms, we further simplify the original beamforming designs to weighted optimizations, and solve them by efficient manifold algorithms. Numerical results show that the shared deployment outperforms the separated case significantly, and the proposed weighted optimizations achieve a similar performance to the original optimizations, despite their significantly lower computational complexity. Comment: 15 pages, 15 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
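    A minimal sketch of the null-space idea from the separated-antenna option above, not the paper's full beamforming optimization: a radar precoder is projected onto the null space of the downlink channel H, so that it causes (ideally) no interference at the communication users. The channel, array size and steering direction are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

n_radar_antennas = 8   # antennas assigned to the radar sub-array
n_users = 3            # single-antenna downlink users

# Illustrative Rayleigh-fading downlink channel H (n_users x n_radar_antennas).
H = (rng.standard_normal((n_users, n_radar_antennas))
     + 1j * rng.standard_normal((n_users, n_radar_antennas))) / np.sqrt(2)

# Orthonormal basis of the null space of H: right singular vectors associated
# with (numerically) zero singular values.
_, s, Vh = np.linalg.svd(H)
rank = int(np.sum(s > 1e-10))
null_basis = Vh[rank:].conj().T   # (n_radar_antennas, n_radar_antennas - rank)

# Desired radar precoder before projection: an illustrative steering vector
# pointing the beam toward 30 degrees for a half-wavelength-spaced array.
theta = np.deg2rad(30.0)
steering = np.exp(1j * np.pi * np.arange(n_radar_antennas) * np.sin(theta))

# Project onto the null space of H: the users then see no radar leakage.
radar_precoder = null_basis @ (null_basis.conj().T @ steering)
print("Residual interference at users:", np.linalg.norm(H @ radar_precoder))
```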

    LSEMINK: A Modified Newton-Krylov Method for Log-Sum-Exp Minimization

    Full text link
    This paper introduces LSEMINK, an effective modified Newton-Krylov algorithm geared toward minimizing the log-sum-exp function for a linear model. Problems of this kind arise commonly, for example, in geometric programming and multinomial logistic regression. Although the log-sum-exp function is smooth and convex, standard line search Newton-type methods can become inefficient because the quadratic approximation of the objective function can be unbounded from below. To circumvent this, LSEMINK modifies the Hessian by adding a shift in the row space of the linear model. We show that the shift renders the quadratic approximation bounded from below and that the overall scheme converges to a global minimizer under mild assumptions. Our convergence proof also shows that all iterates are in the row space of the linear model, which can be attractive when the model parameters do not have an intuitive meaning, as is common in machine learning. Since LSEMINK uses a Krylov subspace method to compute the search direction, it only requires matrix-vector products with the linear model, which is critical for large-scale problems. Our numerical experiments on image classification and geometric programming illustrate that LSEMINK considerably reduces the time-to-solution and increases the scalability compared to geometric programming and natural gradient descent approaches. It has significantly faster initial convergence than standard Newton-Krylov methods, which is particularly attractive in applications like machine learning. In addition, LSEMINK is more robust to ill-conditioning arising from the nonsmoothness of the problem. We share our MATLAB implementation at https://github.com/KelvinKan/LSEMINK
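    A minimal sketch of the idea described above, under assumptions the abstract does not fix (the exact form and size of the shift, the line search, and stopping rules), and not the authors' MATLAB implementation: a Newton-Krylov step for f(x) = log-sum-exp(Ax + b) in which the Hessian is shifted by mu * AᵀA, a term whose curvature lies in the row space of the linear model, and the search direction is obtained by conjugate gradients using only Hessian-vector products.

```python
import numpy as np
from scipy.special import logsumexp, softmax
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(2)
m, n = 200, 50
A = rng.standard_normal((m, n))   # illustrative linear model
b = rng.standard_normal(m)        # illustrative linear term

def objective(x):
    return logsumexp(A @ x + b)

def gradient(x):
    # grad f(x) = A^T softmax(Ax + b)
    return A.T @ softmax(A @ x + b)

def newton_krylov_step(x, mu=1e-2):
    """One shifted Newton step: solve (H + mu * A^T A) d = -grad f with CG."""
    p = softmax(A @ x + b)

    def hess_vec(v):
        v = np.ravel(v)
        Av = A @ v
        # Exact Hessian-vector product A^T (diag(p) - p p^T) A v, plus the shift.
        return A.T @ (p * Av - p * (p @ Av)) + mu * (A.T @ (A @ v))

    H = LinearOperator((n, n), matvec=hess_vec)
    d, _ = cg(H, -gradient(x))

    # Simple backtracking so the sketch does not overshoot on a full step.
    t = 1.0
    while objective(x + t * d) > objective(x) and t > 1e-8:
        t *= 0.5
    return x + t * d

x = np.zeros(n)
for _ in range(15):
    x = newton_krylov_step(x)
print("final objective:", objective(x))
```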

    Developing an app to interpret chest X-rays to support the diagnosis of respiratory pathology with artificial intelligence

    Get PDF
    Background: Medical images, including X-rays, are an integral part of medical diagnosis, and their interpretation requires an experienced radiologist. One of the main problems in developing countries is access to timely medical diagnosis: lack of investment in health-care infrastructure, geographical isolation and a shortage of trained specialists are common obstacles to providing adequate health care in many areas of the world. In this work we show how to build and deploy a deep-learning computer-vision application for the classification of 14 common thorax diseases from X-ray images. Methods: We use the fast.ai and PyTorch frameworks to create and train a DenseNet-121 model that classifies X-ray images from the ChestX-ray14 data set, which contains 112,120 frontal-view X-ray images of 30,805 unique patients. After training and validating the model we create a web app using Heroku; this web app can be accessed by any mobile device with an internet connection. Results: We obtained 70% accuracy for detecting pneumothorax in the one-vs-all task. For the multilabel, multiclass task we achieve state-of-the-art accuracy with fewer epochs, drastically reducing the training time of the model. We also demonstrate the feature localization of our model using the Grad-CAM methodology, a feature which can be useful for the early diagnosis of dangerous illnesses. Conclusions: We present our study of the use of machine-learning techniques to identify diseases from X-ray information. We used the new fast.ai framework and deployed the resulting model in an app which can be tested by any user. The app has an intuitive interface where the user can upload an image and obtain the likelihood that the image is classified as each of the 14 labeled diseases. This classification could assist diagnosis by medical providers and broaden access to medical services in remote areas.
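    A minimal sketch of the modelling setup described above, written in plain PyTorch/torchvision rather than fast.ai, with an untrained DenseNet-121 and a dummy batch standing in for the preprocessed ChestX-ray14 images: 14 sigmoid outputs, one per disease label, trained with a multi-label binary cross-entropy loss.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_LABELS = 14  # the 14 ChestX-ray14 thorax disease labels

# DenseNet-121 backbone with its classifier replaced by a 14-way linear head.
model = models.densenet121(weights=None)  # pretrained weights are an option
model.classifier = nn.Linear(model.classifier.in_features, NUM_LABELS)

criterion = nn.BCEWithLogitsLoss()        # multi-label: one independent sigmoid per disease
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Dummy batch standing in for preprocessed frontal-view X-rays (3 x 224 x 224).
images = torch.randn(8, 3, 224, 224)
targets = torch.randint(0, 2, (8, NUM_LABELS)).float()

# One training step.
optimizer.zero_grad()
logits = model(images)                    # shape (8, 14)
loss = criterion(logits, targets)
loss.backward()
optimizer.step()

# At inference time, per-disease likelihoods come from a sigmoid on the logits.
probs = torch.sigmoid(logits.detach())
```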

    On the Properties of Kullback-Leibler Divergence Between Multivariate Gaussian Distributions

    Full text link
    Kullback-Leibler (KL) divergence is one of the most important divergence measures between probability distributions. In this paper, we prove several properties of the KL divergence between multivariate Gaussian distributions. First, for any two n-dimensional Gaussian distributions N_1 and N_2, we give the supremum of KL(N_1||N_2) when KL(N_2||N_1) ≤ ε (ε > 0). For small ε, we show that the supremum is ε + 2ε^1.5 + O(ε^2). This quantifies the approximate symmetry of small KL divergence between Gaussians. We also find the infimum of KL(N_1||N_2) when KL(N_2||N_1) ≥ M (M > 0). We give the conditions under which the supremum and infimum can be attained. Second, for any three n-dimensional Gaussians N_1, N_2 and N_3, we find an upper bound of KL(N_1||N_3) if KL(N_1||N_2) ≤ ε_1 and KL(N_2||N_3) ≤ ε_2 for ε_1, ε_2 ≥ 0. For small ε_1 and ε_2, we show the upper bound is 3ε_1 + 3ε_2 + 2√(ε_1 ε_2) + o(ε_1) + o(ε_2). This reveals that the KL divergence between Gaussians follows a relaxed triangle inequality. Importantly, all the bounds in the theorems presented in this paper are independent of the dimension n. Finally, we discuss applications of our theorems in explaining counterintuitive phenomena of flow-based models, deriving a deep anomaly-detection algorithm, and extending a one-step robustness guarantee to multiple steps in safe reinforcement learning. Comment: arXiv admin note: text overlap with arXiv:2002.0332
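    For reference, the closed-form KL divergence between multivariate Gaussians that these bounds concern, together with a quick numerical check of the approximate-symmetry bound ε + 2ε^1.5 for two nearby Gaussians. The example distributions are illustrative assumptions.

```python
import numpy as np

def gaussian_kl(mu1, cov1, mu2, cov2):
    """KL(N1||N2) = 0.5 * [tr(S2^-1 S1) + (m2-m1)^T S2^-1 (m2-m1) - n + ln(det S2 / det S1)]."""
    n = mu1.shape[0]
    cov2_inv = np.linalg.inv(cov2)
    diff = mu2 - mu1
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    return 0.5 * (np.trace(cov2_inv @ cov1) + diff @ cov2_inv @ diff - n
                  + logdet2 - logdet1)

# Two nearby 5-dimensional Gaussians: with eps = KL(N2||N1) small, the reverse
# divergence should stay below roughly eps + 2 * eps**1.5 by the result above.
rng = np.random.default_rng(3)
n = 5
mu1, cov1 = np.zeros(n), np.eye(n)
mu2, cov2 = mu1 + 0.05 * rng.standard_normal(n), cov1 + 0.01 * np.eye(n)

eps = gaussian_kl(mu2, cov2, mu1, cov1)
print("KL(N2||N1)            =", eps)
print("KL(N1||N2)            =", gaussian_kl(mu1, cov1, mu2, cov2))
print("bound eps + 2*eps^1.5 =", eps + 2 * eps**1.5)
```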

    Guaranteed Bounds on the Kullback–Leibler Divergence of Univariate Mixtures

    No full text