203 research outputs found

    PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures

    Persistence diagrams, the most common descriptors of Topological Data Analysis, encode topological properties of data and have already proved pivotal in many different applications of data science. However, since the (metric) space of persistence diagrams is not Hilbert, they end up being difficult inputs for most Machine Learning techniques. To address this concern, several vectorization methods have been put forward that embed persistence diagrams into either finite-dimensional Euclidean space or (implicit) infinite dimensional Hilbert space with kernels. In this work, we focus on persistence diagrams built on top of graphs. Relying on extended persistence theory and the so-called heat kernel signature, we show how graphs can be encoded by (extended) persistence diagrams in a provably stable way. We then propose a general and versatile framework for learning vectorizations of persistence diagrams, which encompasses most of the vectorization techniques used in the literature. We finally showcase the experimental strength of our setup by achieving competitive scores on classification tasks on real-life graph datasets.

    Optimal quantization of the mean measure and applications to statistical learning

    This paper addresses the case where data come as point sets, or more generally as discrete measures. Our motivation is twofold: first we intend to approximate with a compactly supported measure the mean of the measure generating process, that coincides with the intensity measure in the point process framework, or with the expected persistence diagram in the framework of persistence-based topological data analysis. To this aim we provide two algorithms that we prove almost minimax optimal. Second we build from the estimator of the mean measure a vectorization map, that sends every measure into a finite-dimensional Euclidean space, and investigate its properties through a clustering-oriented lens. In a nutshell, we show that in a mixture of measure generating process, our technique yields a representation in $\mathbb{R}^k$, for $k \in \mathbb{N}^*$, that guarantees a good clustering of the data points with high probability. Interestingly, our results apply in the framework of persistence-based shape classification via the ATOL procedure described in \cite{Royer19}.

    Optimal quantization of the mean measure and application to clustering of measures

    This paper addresses the case where data come as point sets, or more generally as discrete measures. Our motivation is twofold: first we intend to approximate with a compactly supported measure the mean of the measure generating process, that coincides with the intensity measure in the point process framework, or with the expected persistence diagram in the framework of persistence-based topological data analysis. To this aim we provide two algorithms that we prove almost minimax optimal. Second we build from the estimator of the mean measure a vectorization map, that sends every measure into a finite-dimensional Euclidean space, and investigate its properties through a clustering-oriented lens. In a nutshell, we show that in a mixture of measure generating process, our technique yields a representation in $\mathbb{R}^k$, for $k \in \mathbb{N}^*$, that guarantees a good clustering of the data points with high probability. Interestingly, our results apply in the framework of persistence-based shape classification via the ATOL procedure described in \cite{Royer19}.

    New distance and depth estimates from observations of eclipsing binaries in the SMC

    A sample of 33 eclipsing binaries observed in a field of the SMC with FLAMES@VLT is presented. The radial velocity curves obtained, together with existing OGLE light curves, allowed the determination of all stellar and orbital parameters of these binary systems. The mean distance modulus of the observed part of the SMC is 19.05 mag, based on the 26 most reliable systems. Assuming an average error of 0.1 mag on the distance modulus to an individual system, and a gaussian distribution of the distance moduli, we obtain a 2-σ depth of 0.36 mag or 10.6 kpc. Some results on the kinematics of the binary stars and of the H ii gas are also given.

    ATOL: Measure Vectorisation for Automatic Topologically-Oriented Learning

    Robust topological information commonly comes in the form of a set of persistence diagrams, finite measures that are in nature uneasy to affix to generic machine learning frameworks. We introduce a learnt, unsupervised measure vectorisation method and use it for reflecting underlying changes in topological behaviour in machine learning contexts. Relying on optimal measure quantisation results, the method is tailored to efficiently discriminate important plane regions where meaningful differences arise. We showcase the strength and robustness of our approach on a number of applications, from emulous and modern graph collections where the method reaches state-of-the-art performance to a geometric synthetic dynamical orbits problem. The proposed methodology comes with only high level tuning parameters such as the total measure encoding budget, and we provide a completely open access software.

    Pea Albumin 1 Subunit b (PA1b), a Promising Bioinsecticide of Plant Origin

    PA1b (Pea Albumin 1, subunit b) is a peptide extract from pea seeds showing significant insecticidal activity against certain insects, such as cereal weevils (genus Sitophilus), the mosquitoes Culex pipiens and Aedes aegypti, and certain species of aphids. PA1b has great potential for use on an industrial scale and for use in organic farming: it is extracted from a common plant; it is a peptide (and therefore suitable for transgenic applications); it can withstand many steps of extraction and purification without losing its activity; and it is present in a seed regularly consumed by humans and mammals without any known toxicity or allergenicity. The potential of this peptide to limit pest damage has stimulated research concerning its host range, its mechanism of action, its three-dimensional structure, the natural diversity of PA1b and its structure-function relationships.

    Stragglers of the thick disc

    Young alpha-rich (YAR) stars have been detected in the past as outliers to the local age-[$\alpha$/Fe] relation. These objects are enhanced in $\alpha$-elements but apparently younger than typical thick disc stars. We study the global kinematics and chemical properties of YAR giant stars in the APOGEE DR17 survey and show that they have properties similar to those of the standard thick disc stellar population. This leads us to conclude that YAR are rejuvenated thick disc objects, most probably evolved blue stragglers. This is confirmed by their position in the Hertzsprung-Russell diagram (HRD). Extending our selection to dwarfs allows us to obtain the first general straggler distribution in an HRD of field stars. We also compare the elemental abundances of our sample with those of standard thick disc stars, and find that our YAR stars are shifted in oxygen, magnesium, sodium, and the slow neutron-capture element cerium. Although we detect no sign of binarity for most objects, the enhancement in cerium may be the signature of a mass transfer from an asymptotic giant branch companion. The most massive YAR stars suggest that mass transfer from an evolved star may not be the only formation pathway, and that other scenarios, such as collision or coalescence, should be considered. Comment: 18 Pages, 20 Figures, 1 Table; accepted for publication in Astronomy & Astrophysics

    New distance and depth estimates from observations of eclipsing binaries in the SMC

    A sample of 33 eclipsing binaries observed in a field of the SMC with FLAMES@VLT is presented. The radial velocity curves obtained, together with existing OGLE light curves, allowed the determination of all stellar and orbital parameters of these binary systems. The mean distance modulus of the observed part of the SMC is 19.05, based on the 26 most reliable systems. Assuming an average error of 0.1 mag on the distance modulus to an individual system, and a gaussian distribution of the distance moduli, we obtain a 2-sigma depth of 0.36 mag or 10.6 kpc. Some results on the kinematics of the binary stars and of the H II gas are also given. Comment: 6 pages, 4 figures, Proc. IAU Symp. No 256, The Magellanic System: Stars, Gas and Galaxies, eds. Jacco Th. van Loon & Joana M. Oliveira
