12 research outputs found

    Guaranteed bounds on the Kullback-Leibler divergence of univariate mixtures using piecewise log-sum-exp inequalities

    Full text link
    Information-theoretic measures such as the entropy, cross-entropy and the Kullback-Leibler divergence between two mixture models are core primitives in many signal processing tasks. Since the Kullback-Leibler divergence of mixtures provably does not admit a closed-form formula, it is in practice either estimated using costly Monte-Carlo stochastic integration, approximated, or bounded using various techniques. We present a fast and generic method that builds algorithmically closed-form lower and upper bounds on the entropy, the cross-entropy and the Kullback-Leibler divergence of mixtures. We illustrate the versatile method by reporting on our experiments for approximating the Kullback-Leibler divergence between univariate exponential mixtures, Gaussian mixtures, Rayleigh mixtures, and Gamma mixtures. Comment: 20 pages, 3 figures
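    A minimal sketch of the Monte-Carlo baseline mentioned above (not the paper's piecewise log-sum-exp bounding algorithm): estimating KL(p || q) between two univariate Gaussian mixtures by sampling from p, with the mixture log-densities evaluated stably via log-sum-exp. The mixture parameters below are illustrative assumptions.

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)

def log_mixture_pdf(x, weights, means, stds):
    """Log-density of a univariate Gaussian mixture, evaluated via log-sum-exp."""
    x = np.asarray(x)[:, None]
    log_comp = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi * stds**2)
        - 0.5 * ((x - means) / stds) ** 2
    )
    return logsumexp(log_comp, axis=1)

def sample_mixture(n, weights, means, stds):
    """Draw n samples: pick a component per sample, then sample that Gaussian."""
    comp = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[comp], stds[comp])

# Two illustrative univariate Gaussian mixtures p and q.
wp, mp, sp = np.array([0.4, 0.6]), np.array([-1.0, 2.0]), np.array([0.8, 1.2])
wq, mq, sq = np.array([0.5, 0.5]), np.array([0.0, 2.5]), np.array([1.0, 1.0])

# Monte-Carlo estimate: KL(p || q) ~ mean of log p(x) - log q(x) for x ~ p.
x = sample_mixture(100_000, wp, mp, sp)
kl_mc = np.mean(log_mixture_pdf(x, wp, mp, sp) - log_mixture_pdf(x, wq, mq, sq))
print(f"Monte Carlo estimate of KL(p || q): {kl_mc:.4f}")
```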

    MU-MIMO Communications with MIMO Radar: From Co-existence to Joint Transmission

    Get PDF
    Beamforming techniques are proposed for a joint multi-input-multi-output (MIMO) radar-communication (RadCom) system, where a single device acts both as a radar and a communication base station (BS) by simultaneously communicating with downlink users and detecting radar targets. Two operational options are considered. We first split the antennas into two groups, one for radar and the other for communication. Under this deployment, the radar signal is designed to fall into the null space of the downlink channel, and the communication beamformer is optimized such that the obtained beampattern matches the radar's beampattern while satisfying the communication performance requirements. To reduce the constraints on the optimization, we consider a second operational option, where all the antennas transmit a joint waveform that is shared by both radar and communications. In this case, we formulate an appropriate probing beampattern while guaranteeing the performance of the downlink communications. By incorporating the signal-to-interference-plus-noise ratio (SINR) constraints into the objective functions as penalty terms, we further simplify the original beamforming designs to weighted optimizations, and solve them by efficient manifold algorithms. Numerical results show that the shared deployment outperforms the separated case significantly, and the proposed weighted optimizations achieve a similar performance to the original optimizations, despite their significantly lower computational complexity. Comment: 15 pages, 15 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
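    A minimal sketch of the null-space idea from the separated-antenna option above, not the paper's full beamforming optimization: a radar precoder is projected onto the null space of the downlink channel H, so that it causes (ideally) no interference at the communication users. The channel, array size and steering direction are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

n_radar_antennas = 8   # antennas assigned to the radar sub-array
n_users = 3            # single-antenna downlink users

# Illustrative Rayleigh-fading downlink channel H (n_users x n_radar_antennas).
H = (rng.standard_normal((n_users, n_radar_antennas))
     + 1j * rng.standard_normal((n_users, n_radar_antennas))) / np.sqrt(2)

# Orthonormal basis of the null space of H: right singular vectors associated
# with (numerically) zero singular values.
_, s, Vh = np.linalg.svd(H)
rank = int(np.sum(s > 1e-10))
null_basis = Vh[rank:].conj().T   # (n_radar_antennas, n_radar_antennas - rank)

# Desired radar precoder before projection: an illustrative steering vector
# pointing the beam toward 30 degrees for a half-wavelength-spaced array.
theta = np.deg2rad(30.0)
steering = np.exp(1j * np.pi * np.arange(n_radar_antennas) * np.sin(theta))

# Project onto the null space of H: the users then see no radar leakage.
radar_precoder = null_basis @ (null_basis.conj().T @ steering)
print("Residual interference at users:", np.linalg.norm(H @ radar_precoder))
```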

    LSEMINK: A Modified Newton-Krylov Method for Log-Sum-Exp Minimization

    Full text link
    This paper introduces LSEMINK, an effective modified Newton-Krylov algorithm geared toward minimizing the log-sum-exp function for a linear model. Problems of this kind arise commonly, for example, in geometric programming and multinomial logistic regression. Although the log-sum-exp function is smooth and convex, standard line search Newton-type methods can become inefficient because the quadratic approximation of the objective function can be unbounded from below. To circumvent this, LSEMINK modifies the Hessian by adding a shift in the row space of the linear model. We show that the shift renders the quadratic approximation bounded from below and that the overall scheme converges to a global minimizer under mild assumptions. Our convergence proof also shows that all iterates are in the row space of the linear model, which can be attractive when the model parameters do not have an intuitive meaning, as is common in machine learning. Since LSEMINK uses a Krylov subspace method to compute the search direction, it only requires matrix-vector products with the linear model, which is critical for large-scale problems. Our numerical experiments on image classification and geometric programming illustrate that LSEMINK considerably reduces the time-to-solution and increases the scalability compared to geometric programming and natural gradient descent approaches. It has significantly faster initial convergence than standard Newton-Krylov methods, which is particularly attractive in applications like machine learning. In addition, LSEMINK is more robust to ill-conditioning arising from the nonsmoothness of the problem. We share our MATLAB implementation at https://github.com/KelvinKan/LSEMINK
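    A minimal sketch of the idea described above, under assumptions the abstract does not fix (the exact form and size of the shift, the line search, and stopping rules), and not the authors' MATLAB implementation: a Newton-Krylov step for f(x) = log-sum-exp(Ax + b) in which the Hessian is shifted by mu * AᵀA, a term whose curvature lies in the row space of the linear model, and the search direction is obtained by conjugate gradients using only Hessian-vector products.

```python
import numpy as np
from scipy.special import logsumexp, softmax
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(2)
m, n = 200, 50
A = rng.standard_normal((m, n))   # illustrative linear model
b = rng.standard_normal(m)        # illustrative linear term

def objective(x):
    return logsumexp(A @ x + b)

def gradient(x):
    # grad f(x) = A^T softmax(Ax + b)
    return A.T @ softmax(A @ x + b)

def newton_krylov_step(x, mu=1e-2):
    """One shifted Newton step: solve (H + mu * A^T A) d = -grad f with CG."""
    p = softmax(A @ x + b)

    def hess_vec(v):
        v = np.ravel(v)
        Av = A @ v
        # Exact Hessian-vector product A^T (diag(p) - p p^T) A v, plus the shift.
        return A.T @ (p * Av - p * (p @ Av)) + mu * (A.T @ (A @ v))

    H = LinearOperator((n, n), matvec=hess_vec)
    d, _ = cg(H, -gradient(x))

    # Simple backtracking so the sketch does not overshoot on a full step.
    t = 1.0
    while objective(x + t * d) > objective(x) and t > 1e-8:
        t *= 0.5
    return x + t * d

x = np.zeros(n)
for _ in range(15):
    x = newton_krylov_step(x)
print("final objective:", objective(x))
```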

    Developing an app to interpret chest X-rays to support the diagnosis of respiratory pathology with artificial intelligence

    Get PDF
    Background: Medical images, including X-rays, are an integral part of medical diagnosis, and their interpretation requires an experienced radiologist. One of the main problems in developing countries is access to timely medical diagnosis: lack of investment in health-care infrastructure, geographical isolation and a shortage of trained specialists are common obstacles to providing adequate health care in many areas of the world. In this work we show how to build and deploy a deep-learning computer-vision application for the classification of 14 common thorax diseases from X-ray images. Methods: We use the fast.ai and PyTorch frameworks to create and train a DenseNet-121 model that classifies X-ray images from the ChestX-ray14 data set, which contains 112,120 frontal-view X-ray images of 30,805 unique patients. After training and validating the model we create a web app using Heroku; this web app can be accessed by any mobile device with an internet connection. Results: We obtained 70% accuracy for detecting pneumothorax in the one-vs-all task. For the multilabel, multiclass task we achieve state-of-the-art accuracy with fewer epochs, drastically reducing the training time of the model. We also demonstrate the feature localization of our model using the Grad-CAM methodology, a feature which can be useful for the early diagnosis of dangerous illnesses. Conclusions: We present our study of the use of machine-learning techniques to identify diseases from X-ray information. We used the new fast.ai framework and deployed the resulting model in an app which can be tested by any user. The app has an intuitive interface where the user can upload an image and obtain the likelihood that the image is classified as each of the 14 labeled diseases. This classification could assist diagnosis by medical providers and broaden access to medical services in remote areas.
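    A minimal sketch of the modelling setup described above, written in plain PyTorch/torchvision rather than fast.ai, with an untrained DenseNet-121 and a dummy batch standing in for the preprocessed ChestX-ray14 images: 14 sigmoid outputs, one per disease label, trained with a multi-label binary cross-entropy loss.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_LABELS = 14  # the 14 ChestX-ray14 thorax disease labels

# DenseNet-121 backbone with its classifier replaced by a 14-way linear head.
model = models.densenet121(weights=None)  # pretrained weights are an option
model.classifier = nn.Linear(model.classifier.in_features, NUM_LABELS)

criterion = nn.BCEWithLogitsLoss()        # multi-label: one independent sigmoid per disease
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Dummy batch standing in for preprocessed frontal-view X-rays (3 x 224 x 224).
images = torch.randn(8, 3, 224, 224)
targets = torch.randint(0, 2, (8, NUM_LABELS)).float()

# One training step.
optimizer.zero_grad()
logits = model(images)                    # shape (8, 14)
loss = criterion(logits, targets)
loss.backward()
optimizer.step()

# At inference time, per-disease likelihoods come from a sigmoid on the logits.
probs = torch.sigmoid(logits.detach())
```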

    On the Properties of Kullback-Leibler Divergence Between Multivariate Gaussian Distributions

    Full text link
    Kullback-Leibler (KL) divergence is one of the most important divergence measures between probability distributions. In this paper, we prove several properties of the KL divergence between multivariate Gaussian distributions. First, for any two n-dimensional Gaussian distributions N_1 and N_2, we give the supremum of KL(N_1||N_2) when KL(N_2||N_1) ≤ ε (ε > 0). For small ε, we show that the supremum is ε + 2ε^1.5 + O(ε^2). This quantifies the approximate symmetry of small KL divergence between Gaussians. We also find the infimum of KL(N_1||N_2) when KL(N_2||N_1) ≥ M (M > 0). We give the conditions under which the supremum and infimum can be attained. Second, for any three n-dimensional Gaussians N_1, N_2 and N_3, we find an upper bound of KL(N_1||N_3) if KL(N_1||N_2) ≤ ε_1 and KL(N_2||N_3) ≤ ε_2 for ε_1, ε_2 ≥ 0. For small ε_1 and ε_2, we show the upper bound is 3ε_1 + 3ε_2 + 2√(ε_1 ε_2) + o(ε_1) + o(ε_2). This reveals that the KL divergence between Gaussians follows a relaxed triangle inequality. Importantly, all the bounds in the theorems presented in this paper are independent of the dimension n. Finally, we discuss applications of our theorems in explaining counterintuitive phenomena of flow-based models, deriving a deep anomaly-detection algorithm, and extending a one-step robustness guarantee to multiple steps in safe reinforcement learning. Comment: arXiv admin note: text overlap with arXiv:2002.0332
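    For reference, the closed-form KL divergence between multivariate Gaussians that these bounds concern, together with a quick numerical check of the approximate-symmetry bound ε + 2ε^1.5 for two nearby Gaussians. The example distributions are illustrative assumptions.

```python
import numpy as np

def gaussian_kl(mu1, cov1, mu2, cov2):
    """KL(N1||N2) = 0.5 * [tr(S2^-1 S1) + (m2-m1)^T S2^-1 (m2-m1) - n + ln(det S2 / det S1)]."""
    n = mu1.shape[0]
    cov2_inv = np.linalg.inv(cov2)
    diff = mu2 - mu1
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    return 0.5 * (np.trace(cov2_inv @ cov1) + diff @ cov2_inv @ diff - n
                  + logdet2 - logdet1)

# Two nearby 5-dimensional Gaussians: with eps = KL(N2||N1) small, the reverse
# divergence should stay below roughly eps + 2 * eps**1.5 by the result above.
rng = np.random.default_rng(3)
n = 5
mu1, cov1 = np.zeros(n), np.eye(n)
mu2, cov2 = mu1 + 0.05 * rng.standard_normal(n), cov1 + 0.01 * np.eye(n)

eps = gaussian_kl(mu2, cov2, mu1, cov1)
print("KL(N2||N1)            =", eps)
print("KL(N1||N2)            =", gaussian_kl(mu1, cov1, mu2, cov2))
print("bound eps + 2*eps^1.5 =", eps + 2 * eps**1.5)
```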

    Guaranteed Bounds on the Kullback–Leibler Divergence of Univariate Mixtures

    No full text