10 research outputs found

    The {\alpha}-divergences associated with a pair of strictly comparable quasi-arithmetic means

    Full text link
    We generalize the family of α\alpha-divergences using a pair of strictly comparable weighted means. In particular, we obtain the 11-divergence in the limit case α→1\alpha\rightarrow 1 (a generalization of the Kullback-Leibler divergence) and the 00-divergence in the limit case α→0\alpha\rightarrow 0 (a generalization of the reverse Kullback-Leibler divergence). We state the condition for a pair of quasi-arithmetic means to be strictly comparable, and report the formula for the quasi-arithmetic α\alpha-divergences and its subfamily of bipower homogeneous α\alpha-divergences which belong to the Csis\'ar's ff-divergences. Finally, we show that these generalized quasi-arithmetic 11-divergences and 00-divergences can be decomposed as the sum of generalized cross-entropies minus entropies, and rewritten as conformal Bregman divergences using monotone embeddings.Comment: 18 page

    New Directions for Contact Integrators

    Get PDF
    Contact integrators are a family of geometric numerical schemes which guarantee the conservation of the contact structure. In this work we review the construction of both the variational and Hamiltonian versions of these methods. We illustrate some of the advantages of geometric integration in the dissipative setting by focusing on models inspired by recent studies in celestial mechanics and cosmology.Comment: To appear as Chapter 24 in GSI 2021, Springer LNCS 1282

    Similarity measures for clustering sequences and sets of data

    Get PDF
    The main object of this PhD. Thesis is the definition of new similarity measures for data sequences, with the final purpose of clustering those sequences. Clustering consists in the partitioning of a dataset into isolated subsets or clusters. Data within a given cluster should be similar, and at the same different from data in other clusters. The relevance of data sequences clustering is ever-increasing, due to the abundance of this kind of data (multimedia sequences, movement analysis, stock market evolution, etc.) and the usefulness of clustering as an unsupervised exploratory analysis method. It is this lack of supervision that makes similarity measures extremely important for clustering, since it is the only guide of the learning process. The first part of the Thesis focuses on the development of similarity measures leveraging dynamical models, which can capture relationships between the elements of a given sequence. Following this idea, two lines are explored: • Likelihood-based measures: Based on the popular framework of likelihood-matrix-based similarity measures, we present a novel method based on a re-interpretation of such a matrix. That interpretations stems from the assumption of a latent model space, so models used to build the likelihood matrix are seen as samples from that space. The method is extremely flexible since it allows for the use of any probabilistic model for representing the individual sequences. • State-space trajectories based measures: We introduce a new way of defining affinities between sequences, addressing the main drawbacks of the likelihood-based methods. Working with state-space models makes it possible to identify sequences with the trajectories that they induce in the state-space. This way, comparisons between sequences amounts to comparisons between the corresponding trajectories. Using a common hidden Markov model for all the sequences in the dataset makes those comparisons extremely simple, since trajectories can be identified with transition matrices. This new paradigm improves the scalability of the affinity measures with respect to the dataset size, as well as the performance of those measures when the sequences are short. The second part of the Thesis deals with the case where the dynamics of the sequences are discarded, so the sequences become sets of vectors or points. This step to be taken, without harming the learning process, when the statical features (probability densities) of the different sets are informative enough for the task at hand, which is true for many real scenarios. Work along this line can be further subdivided in two areas: • Sets-of-vectors clustering based on the support of the distributions in a feature space: We propose clustering the sets using a notion of similarity related to the intersection of the supports of their underlying distributions in a Hilbert space. Such a clustering can be efficiently carried out in a hierarchical fashion, in spite of the potentially infinite dimensionality of the feature space. To this end, we propose an algorithm based on simple geometrical arguments. Support estimation is inherently a simpler problem than density estimation, which is the usual starting step for obtaining similarities between probability distributions. • Classifer-based affinity and divergence measures: It is quite natural to link the notion of similarity between sets with the separability between those sets. That separability can be quantified using binary classifiers. This intuitive idea is then extended via generalizations of the family of f-divergences, which originally contains many of the best-known divergences in statistics and machine learning. The proposed generalizations present interesting theoretical properties, and at the same time they have promising practical applications, such as the development of new estimators for standard divergences. -----------------------------------------------------------------------------------------------------------------------------------------------------------------------El objetivo de esta Tesis Doctoral es la definición de nuevas medidas de similitud para secuencias y conjuntos de datos, con la finalidad de servir de entrada a un algoritmo de agrupamiento o clustering [Xu and Wunsch-II, 2005]. El agrupamiento es una de las tareas más habituales dentro del ámbito del aprendizaje máquina (machine learning) [Mitchell, 1997]. Dicha tarea consiste en la partición de un conjunto de datos en subconjuntos aislados (clusters), de tal forma que los datos asignados a un mismo subconjunto sean parecidos entre sí, y distintos a los datos pertenecientes a otros subconjuntos. Una de sus principales particularidades es que se trata de una tarea no supervisada, lo cual implica que no requiere de un conjunto de ejemplos etiquetados. De esta forma se reduce la interacción humana necesaria para el aprendizaje, haciendo del agrupamiento una herramienta ideal para el análisis exploratorio de datos complejos. Por otro lado, es precisamente esta falta de supervisión la que hace fundamental el disponer de una medida adecuada de similitud entre elementos, ya que es la única guía durante el proceso de aprendizaje. El agrupamiento de secuencias es una tarea cada día más importante debido al reciente auge de este tipo de datos. Cabe destacar el ámbito multimedia, en el que muchos contenidos presentan características secuenciales: señales de voz, audio, vídeo, etc. No es un ejemplo aislado, ya que en muchos otros ámbitos se producen casuísticas similares: desde los datos de bolsa y mercados financieros diversos al problema del análisis de movimiento. En la mayoría de estos casos la complejidad de los datos de entrada se une a la dificultad y elevado coste del etiquetado manual de dichos datos. Es precisamente en este tipo de escenarios en los que el agrupamiento es especialmente útil, debido a que no requiere de un etiquetado previo. En muchos casos es posible prescindir de la dinámica de las secuencias sin perjudicar el proceso de aprendizaje. Son aquellos casos en los que las características estáticas de los datos de entrada son suficientemente discriminativas. Al obviar la dinámica, las secuencias se transforman en conjuntos de datos, que se interpretan como muestras (no necesariamente independientes) de unas determinadas distribuciones de probabilidad subyacentes. Ejemplos prácticos de ámbitos en los que se trabaja con conjuntos de datos incluyen el agrupamiento de locutores [Campbell, 1997], los modelos de bolsa de palabras (bag of words) para texto/imagen [Dance et al., 2004], etc. En la presente Tesis propondremos métodos y, sobre todo, puntos de vista innovadores para la definición de similitudes entre secuencias o conjuntos de datos. Todos los métodos propuestos han sido analizados desde un punto de vista tanto teórico como empírico. Desde la perspectiva experimental se ha tratado de trabajar con la mayor cantidad de datos reales posibles, haciendo especial hincapié en las tareas de agrupamiento de locutores y reconocimiento de género musical

    Information Geometry

    Get PDF
    This Special Issue of the journal Entropy, titled “Information Geometry I”, contains a collection of 17 papers concerning the foundations and applications of information geometry. Based on a geometrical interpretation of probability, information geometry has become a rich mathematical field employing the methods of differential geometry. It has numerous applications to data science, physics, and neuroscience. Presenting original research, yet written in an accessible, tutorial style, this collection of papers will be useful for scientists who are new to the field, while providing an excellent reference for the more experienced researcher. Several papers are written by authorities in the field, and topics cover the foundations of information geometry, as well as applications to statistics, Bayesian inference, machine learning, complex systems, physics, and neuroscience

    Information geometry

    Get PDF
    This Special Issue of the journal Entropy, titled “Information Geometry I”, contains a collection of 17 papers concerning the foundations and applications of information geometry. Based on a geometrical interpretation of probability, information geometry has become a rich mathematical field employing the methods of differential geometry. It has numerous applications to data science, physics, and neuroscience. Presenting original research, yet written in an accessible, tutorial style, this collection of papers will be useful for scientists who are new to the field, while providing an excellent reference for the more experienced researcher. Several papers are written by authorities in the field, and topics cover the foundations of information geometry, as well as applications to statistics, Bayesian inference, machine learning, complex systems, physics, and neuroscience

    Robust Procedures for Estimating and Testing in the Framework of Divergence Measures

    Get PDF
    The scope of the contributions to this book will be to present new and original research papers based on MPHIE, MHD, and MDPDE, as well as test statistics based on these estimators from a theoretical and applied point of view in different statistical problems with special emphasis on robustness. Manuscripts given solutions to different statistical problems as model selection criteria based on divergence measures or in statistics for high-dimensional data with divergence measures as loss function are considered. Reviews making emphasis in the most recent state-of-the art in relation to the solution of statistical problems base on divergence measures are also presented

    Advances in knowledge discovery and data mining Part II

    Get PDF
    19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II</p

    MS FT-2-2 7 Orthogonal polynomials and quadrature: Theory, computation, and applications

    Get PDF
    Quadrature rules find many applications in science and engineering. Their analysis is a classical area of applied mathematics and continues to attract considerable attention. This seminar brings together speakers with expertise in a large variety of quadrature rules. It is the aim of the seminar to provide an overview of recent developments in the analysis of quadrature rules. The computation of error estimates and novel applications also are described

    Generalized averaged Gaussian quadrature and applications

    Get PDF
    A simple numerical method for constructing the optimal generalized averaged Gaussian quadrature formulas will be presented. These formulas exist in many cases in which real positive GaussKronrod formulas do not exist, and can be used as an adequate alternative in order to estimate the error of a Gaussian rule. We also investigate the conditions under which the optimal averaged Gaussian quadrature formulas and their truncated variants are internal
    corecore