457 research outputs found

    Mod-phi convergence I: Normality zones and precise deviations

    In this paper, we use the framework of mod-$\phi$ convergence to prove precise large or moderate deviations for quite general sequences of real-valued random variables $(X_{n})_{n \in \mathbb{N}}$, which can be lattice or non-lattice distributed. We establish precise estimates of the fluctuations $P[X_{n} \in t_{n}B]$, instead of the usual estimates for the rate of exponential decay $\log(P[X_{n} \in t_{n}B])$. Our approach provides us with a systematic way to characterise the normality zone, that is the zone in which the Gaussian approximation for the tails is still valid. Besides, the residue function measures the extent to which this approximation fails to hold at the edge of the normality zone. The first sections of the article are devoted to a proof of these abstract results and comparisons with existing results. We then propose new examples covered by this theory and coming from various areas of mathematics: classical probability theory, number theory (statistics of additive arithmetic functions), combinatorics (statistics of random permutations), random matrix theory (characteristic polynomials of random matrices in compact Lie groups), graph theory (number of subgraphs in a random Erdős-Rényi graph), and non-commutative probability theory (asymptotics of random character values of symmetric groups). In particular, we complete our theory of precise deviations by a concrete method of cumulants and dependency graphs, which applies to many examples of sums of "weakly dependent" random variables. The large number as well as the variety of examples hint at a universality class for second order fluctuations.
    Comment: 103 pages. New (final) version: multiple small improvements; a new section on mod-Gaussian convergence coming from the factorization of the generating function; the multi-dimensional results have been moved to a forthcoming paper; and the introduction has been reworked.

    Four lectures on probabilistic methods for data science

    Methods of high-dimensional probability play a central role in applications for statistics, signal processing, theoretical computer science and related fields. These lectures present a sample of particularly useful tools of high-dimensional probability, focusing on the classical and matrix Bernstein inequalities and the uniform matrix deviation inequality. We illustrate these tools with applications for dimension reduction, network analysis, covariance estimation, matrix completion and sparse signal recovery. The lectures are geared towards beginning graduate students who have taken a rigorous course in probability but may not have any experience in data science applications.
    Comment: Lectures given at 2016 PCMI Graduate Summer School in Mathematics of Data. Some typos, inaccuracies fixed.
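For orientation, one standard textbook form of the classical Bernstein inequality mentioned in this abstract is sketched below. The symbols $X_i$, $N$, $K$, $\sigma^2$ and $t$ are notation chosen here for illustration; the lectures themselves may state the result with different constants or normalisation.

```latex
% Classical Bernstein inequality (bounded case), one common formulation.
% Assumptions: X_1, ..., X_N are independent, mean-zero random variables
% with |X_i| \le K almost surely for every i.
\[
  \mathbb{P}\Bigl\{ \Bigl| \sum_{i=1}^{N} X_i \Bigr| \ge t \Bigr\}
  \;\le\;
  2 \exp\!\Bigl( - \frac{t^{2}/2}{\sigma^{2} + K t / 3} \Bigr)
  \qquad \text{for every } t \ge 0,
  \quad \text{where } \sigma^{2} = \sum_{i=1}^{N} \mathbb{E}\, X_i^{2}.
\]
```

For small $t$ the bound behaves like a Gaussian tail with variance $\sigma^{2}$, and for large $t$ like an exponential tail governed by $K$.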

    Graph ambiguity

    In this paper, we propose a rigorous way to define the concept of ambiguity in the domain of graphs. In past studies, the classical definition of ambiguity has been derived starting from fuzzy set and fuzzy information theories. Our aim is to show that in the domain of graphs, too, it is possible to derive a formulation able to capture the same semantic and mathematical concept. To strengthen the theoretical results, we discuss the application of the graph ambiguity concept to the graph classification setting, conceiving a new kind of inexact graph matching procedure. The results prove that the graph ambiguity concept is a characterizing and discriminative property of graphs. (C) 2013 Elsevier B.V. All rights reserved.

    Automatic Delineation of Water Bodies in SAR Images with a Novel Stochastic Distance Approach

    Coastal regions and surface waters are among the fundamental biological and social development resources worldwide. For this reason, it is essential to thoroughly monitor these regions to determine and characterize their geographical features and environmental health. These geographical regions, however, present several monitoring challenges when using remotely sensed imagery. Small water bodies tend to be surrounded by swamps, marshes, or vegetation, making accurate border detection difficult. Coastal waters, in turn, experience several phenomena due to winds, undercurrents, and waves, which also hamper the detection of environmental hazards like oil spills. In this work, we propose an automated segmentation algorithm that can be applied to these targets in airborne and spaceborne SAR images. The method is based on pointwise detection in fuzzy borders using a parameter estimation of the (Formula presented.) distribution, which has been successfully used in similar contexts. The underlying assumption is that the sought-for border separates regions with different textures, each having different distribution parameters. Then, stochastic distances can identify the most likely point where this parameter change occurs. A curve interpolation algorithm then estimates the actual contour of the body given the detected points. We assess the adequacy of eight stochastic distances that are mostly applied in the literature. We evaluate the performance of our method in terms of similarity between true and detected boundaries on simulated and actual SAR images, achieving promising results. The performance of our proposal is assessed by Hausdorff distance and Intersection over Union. In the case of synthetic data, the selection of the best stochastic distance depends on the parameters of the (Formula presented.) distribution. In contrast, the harmonic-mean and triangular distances produced the best results in detecting borders in three actual SAR images of lagoons. 
Finally, we present the results of our proposal applied to an image with oil spills using Bhattacharyya, Hellinger, and Jensen–Shannon distances.
    Affiliation: Rey, Andrea Alejandra. Universidad Tecnológica Nacional; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: Revollo Sarmiento, Natalia Veronica. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
    Affiliation: Frery, Alejandro César. Victoria University Of Wellington; New Zealand
    Affiliation: Delrieux, Claudio Augusto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
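As a rough illustration of the three distances named at the end of this abstract, the sketch below evaluates them for two discrete histograms. This is an assumption made for simplicity: the paper applies stochastic distances to fitted distributions of SAR texture, and the function names and the histogram setting here are hypothetical, not the authors' implementation.

```python
import math


def _bhattacharyya_coefficient(p, q):
    """Sum of sqrt(p_i * q_i); equals 1 for identical distributions."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance, -log of the Bhattacharyya coefficient."""
    bc = _bhattacharyya_coefficient(p, q)
    # Disjoint supports give a coefficient of zero, i.e. infinite distance.
    return math.inf if bc == 0.0 else -math.log(bc)


def hellinger_distance(p, q):
    """Hellinger distance, bounded in [0, 1]."""
    bc = _bhattacharyya_coefficient(p, q)
    # Clamp to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 1.0 - bc))


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence in nats, bounded by log(2)."""
    m = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(u, v):
        # Terms with u_i == 0 contribute nothing by convention.
        return sum(a * math.log(a / b) for a, b in zip(u, v) if a > 0.0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


if __name__ == "__main__":
    # Two hypothetical normalised histograms (each sums to 1).
    p = [0.1, 0.4, 0.5]
    q = [0.3, 0.3, 0.4]
    print(bhattacharyya_distance(p, q))
    print(hellinger_distance(p, q))
    print(jensen_shannon_divergence(p, q))
```

All three functions expect inputs of equal length that are already normalised to sum to one. In a border-detection setting such as the one described, a distance like these would be evaluated between the samples on either side of each candidate point, with the largest value marking the most likely transition.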