
264 research outputs found

Statistical distances for model validation and clustering. Applications to flow cytometry and fair learning.

Author: Inouzhe Valdés Hristo
Publication venue: 'Universidad de Valladolid'
Publication date: 01/01/2020
Field of study

This thesis has been developed at the University of Valladolid and IMUVA within the framework of the project Sampling, trimming, and probabilistic metric techniques. Statistical applications, whose main researchers are Carlos Matrán Bea and Eustasio del Barrio Tellado. Among the lines of research associated with the project are: model validation, Wasserstein distances and robust cluster analysis. It is precisely the work carried out in these fields that gives rise to Chapters 1, 2 and 4 of this report. The work done in the field of fair learning with Professor Jean-Michel Loubes, a frequent collaborator of Valladolid's team, during the international stay at the Paul Sabatier University of Toulouse, is the basis of Chapter 3 of this report. Therefore, this thesis is an exposition of the problems and results obtained in the different fields previously mentioned. Due to the diversity of topics, we have decided to base chapters on the works published or submitted to the present date, and therefore each chapter has a structure relatively independent of the others. In this way Chapter 1 is based on the works [del Barrio et al., 2019e, del Barrio et al., 2019d], Chapter 2 is based on the work [del Barrio et al., 2019c], Chapter 3 on the work [del Barrio et al., 2019b] and Chapter 4 shows results of a work in progress. In this introduction our objective is to present the main challenges we have faced, as well as to briefly present our most relevant results. On the other hand, each chapter will have its own introduction where we will delve into the topics discussed below. With this in mind, our intention is that the reader will have a general idea of what he or she will find in each chapter and in this way will have the necessary information to face the more technical discussions that will be found there. Due to the diversity of topics dealt with in this report, we propose a non-linear reading.
We suggest that the reader, after reading a section of the Introduction, moves to the corresponding chapter. In this way the reader will have the relevant information more at hand and will be able to follow better the exposition in each chapter. If on the other hand there is a sequential reading of the document, we apologize in advance for some repetitions and reiterations, which nevertheless seem to us to contribute positively to the understanding of this work.

Departamento de Estadística e Investigación Operativa
Doctorado en Matemática

Repositorio Documental de la Universidad de Valladolid

The triangulation of manifolds

Author: Quinn Frank
Publication venue
Publication date: 12/11/2013
Field of study

A mostly expository account of old questions about the relationship between polyhedra and topological manifolds. Topics are old topological results, new gauge theory results (with speculations about next directions), and history of the questions. Comment: 26 pages, 2 figures. Version 2: spellings corrected, analytic speculations in 4.8.2 sharpened.

arXiv.org e-Print Archive

CiteSeerX

Drawing Binary Tanglegrams: An Experimental Evaluation

Author: Holten Danny
Nöllenburg Martin
Völker Markus
Wolff Alexander
Publication venue
Publication date: 01/01/2008
Field of study

A binary tanglegram is a pair of binary trees whose leaf sets are in one-to-one correspondence; matching leaves are connected by inter-tree edges. For applications, for example in phylogenetics or software engineering, it is required that the individual trees are drawn crossing-free. A natural optimization problem, called the tanglegram layout problem, is thus to minimize the number of crossings between inter-tree edges. The tanglegram layout problem is NP-hard and is currently considered both in application domains and theory. In this paper we present an experimental comparison of a recursive algorithm of Buchin et al., our variant of their algorithm, the algorithm hierarchy sort of Holten and van Wijk, and an integer quadratic program that yields optimal solutions. Comment: see http://www.siam.org/proceedings/alenex/2009/alx09_011_nollenburgm.pd
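
To make the objective concrete, the following Python sketch counts the crossings between inter-tree edges for one fixed pair of leaf orders. It only illustrates the quantity being minimized and is not any of the algorithms compared in the paper; the function name `count_crossings` and the list-based input format are assumptions made for this illustration.

```python
from itertools import combinations


def count_crossings(left_order, right_order):
    """Count crossings between inter-tree edges of a tanglegram drawing.

    left_order and right_order list the same leaf labels in the order in
    which the two trees place them (hypothetical input format). Two edges
    cross exactly when their leaves appear in opposite relative order on
    the two sides.
    """
    position_right = {leaf: i for i, leaf in enumerate(right_order)}
    # Express the left order in terms of positions on the right side.
    mapped = [position_right[leaf] for leaf in left_order]
    # Each inverted pair in `mapped` corresponds to exactly one crossing.
    return sum(1 for a, b in combinations(mapped, 2) if a > b)
```

This brute-force count takes quadratic time in the number of leaves, which is enough for checking small layouts by hand. The hard part of the layout problem is choosing leaf orders that are compatible with both trees, which this sketch does not attempt.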

arXiv.org e-Print Archive

CiteSeerX

Pure OAI Repository

Euclidean distance geometry and applications

Author: Lavor Carlile
Liberti Leo
Maculan Nelson
Mucherino Antonio
Publication venue
Publication date: 02/05/2012
Field of study

Euclidean distance geometry is the study of Euclidean geometry based on the concept of distance. This is useful in several applications where the input data consists of an incomplete set of distances, and the output is a set of points in Euclidean space that realizes the given distances. We survey some of the theory of Euclidean distance geometry and some of the most important applications: molecular conformation, localization of sensor networks and statics. Comment: 64 pages, 21 figures
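
As a minimal illustration of what "realizing" distances means, the Python sketch below places three points in the plane from their three pairwise distances. This is the smallest complete case, not the incomplete-distance setting the survey addresses; the function name `realize_triangle` and the choice of coordinate frame are assumptions made here.

```python
import math


def realize_triangle(d01, d02, d12):
    """Place three points in the plane so that they realize the given distances.

    Point 0 is fixed at the origin and point 1 on the positive x-axis;
    point 2 is then determined, up to reflection, by the law of cosines.
    """
    if d01 <= 0:
        raise ValueError("d01 must be positive")
    x = (d01**2 + d02**2 - d12**2) / (2 * d01)
    y_squared = d02**2 - x**2
    if y_squared < -1e-12:
        # No planar realization exists for these three distances.
        raise ValueError("distances violate the triangle inequality")
    y = math.sqrt(max(y_squared, 0.0))
    return [(0.0, 0.0), (float(d01), 0.0), (x, y)]
```

Fixing the first point and the direction of the second removes the translation and rotation freedom of the solution; the remaining reflection is resolved by taking the non-negative root.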

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

INRIA a CCSD electronic archive server

Repositorio da Producao Cientifica e Intelectual da Unicamp

HAL-Polytechnique

HAL-Rennes 1

Times series averaging from a probabilistic interpretation of time-elastic kernel

Author: Marteau Pierre-François
Publication venue
Publication date: 01/05/2015
Field of study

In the light of regularized dynamic time warping kernels, this paper reconsiders the concept of time elastic centroid (TEC) for a set of time series. From this perspective, we first show how TEC can easily be addressed as a preimage problem. Unfortunately, this preimage problem is ill-posed, may suffer from over-fitting, especially for long time series, and obtaining even a sub-optimal solution involves heavy computational costs. We then derive two new algorithms based on a probabilistic interpretation of kernel alignment matrices, expressed in terms of probability distributions over sets of alignment paths. The first algorithm is an iterative agglomerative heuristic inspired by the state-of-the-art DTW barycenter averaging (DBA) algorithm proposed specifically for the Dynamic Time Warping measure. The second proposed algorithm performs a classical averaging of the aligned samples but also averages the times of occurrence of the aligned samples. It exploits a straightforward progressive agglomerative heuristic. An experiment on 45 time series datasets compares the classification error rates obtained by first-nearest-neighbor classifiers that use a single medoid or centroid estimate to represent each category. It shows that: i) centroid-based approaches significantly outperform medoid-based approaches, ii) in the considered experiment, the two proposed algorithms outperform the state-of-the-art DBA algorithm, and iii) the second proposed algorithm, which averages jointly in the sample space and along the time axis, emerges as the most significantly robust time elastic averaging heuristic, with an interesting noise reduction capability. Index Terms: time series averaging, time elastic kernel, Dynamic Time Warping, time series clustering and classification.
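
For readers unfamiliar with the Dynamic Time Warping measure that the abstract builds on, here is a minimal Python sketch of the classic DTW cost between two numeric sequences. It is plain DTW with an absolute-difference local cost, which is an assumption of this sketch; it is neither the regularized kernel nor either of the averaging algorithms proposed in the paper.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences.

    The local cost is the absolute difference between aligned samples.
    cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

Because one sample may be aligned with several consecutive samples of the other series, a sequence and a locally stretched copy of it have cost zero, which is the "time elastic" behaviour the abstract refers to.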

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Asymptotics of Some Plancherel Averages via Polynomiality Results

Author: Schachinger Werner
Publication venue
Publication date: 30/08/2023
Field of study

Consider Young diagrams of $n$ boxes distributed according to the Plancherel measure. So those diagrams could be the output of the RSK algorithm, when applied to random permutations of the set $\{1,\ldots,n\}$. Here we are interested in asymptotics, as $n\to \infty$, of expectations of certain functions of random Young diagrams, such as the number of bumping steps of the RSK algorithm that leads to that diagram, the side length of its Durfee square, or the logarithm of its probability. We can express these functions in terms of hook lengths or contents of the boxes of the diagram, which opens the door for application of known polynomiality results for Plancherel averages. We thus obtain representations of expectations as binomial convolutions, that can be further analyzed with the help of Rice's integral or Poisson generating functions. Among our results is a very explicit expression for the constant appearing in the almost equipartition property of the Plancherel measure.
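
The RSK row insertion mentioned in the abstract can be sketched in a few lines of Python. The sketch below returns the shape of the resulting Young diagram together with a count of bumping steps, where one step is counted each time an entry is displaced into the next row. That counting convention and the function name `rsk_shape_and_bumps` are assumptions of this sketch, and the paper may define the statistic differently.

```python
from bisect import bisect_right


def rsk_shape_and_bumps(permutation):
    """Row-insert a permutation and return (shape, number of bumping steps).

    rows is the insertion tableau; each row is kept sorted in increasing
    order. One bump is counted whenever an entry is displaced to the next
    row (assumed convention).
    """
    rows = []
    bumps = 0
    for value in permutation:
        current = value
        for row in rows:
            # Leftmost entry strictly greater than `current`, if any.
            k = bisect_right(row, current)
            if k == len(row):
                row.append(current)
                break
            # Displace that entry and carry it to the next row.
            row[k], current = current, row[k]
            bumps += 1
        else:
            # Every existing row bumped an entry: start a new row.
            rows.append([current])
    return [len(row) for row in rows], bumps
```

Applying this function to uniformly random permutations of $\{1,\ldots,n\}$ yields shapes distributed according to the Plancherel measure, so averaging the bump count over many samples gives a numerical view of the kind of expectation studied in the paper.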

arXiv.org e-Print Archive

Spectral inequalities in quantitative form

Author: Brasco Lorenzo
De Philippis Guido
Publication venue
Publication date: 18/04/2016
Field of study

We review some results about quantitative improvements of sharp inequalities for eigenvalues of the Laplacian. Comment: 71 pages, 4 figures, 6 open problems, 76 references. This is a chapter of the forthcoming book "Shape Optimization and Spectral Theory", edited by Antoine Henrot and published by De Gruyter

arXiv.org e-Print Archive

HAL AMU

HAL Descartes

Archivio istituzionale della ricerca - Università di Ferrara

The Bounded Confidence Model Of Opinion Dynamics

Author: Boudec Jean-Yves Le
Graham Carl
Gómez-Serrano Javier
Publication venue
Publication date: 01/01/2010
Field of study

The bounded confidence model of opinion dynamics, introduced by Deffuant et al., is a stochastic model for the evolution of continuous-valued opinions within a finite group of peers. We prove that, as time goes to infinity, the opinions evolve globally into a random set of clusters too far apart to interact, and thereafter all opinions in every cluster converge to their barycenter. We then prove a mean-field limit result, propagation of chaos: as the number of peers goes to infinity in adequately started systems and time is rescaled accordingly, the opinion processes converge to i.i.d. nonlinear Markov (or McKean-Vlasov) processes; each limit opinion process evolves as if under the influence of opinions drawn from its own instantaneous law, which is the unique solution of a nonlinear integro-differential equation of Kac type. This implies that the (random) empirical distribution processes converge to this (deterministic) solution. We then prove that, as time goes to infinity, this solution converges to a law concentrated on isolated opinions too far apart to interact, and identify sufficient conditions for the limit not to depend on the initial condition, and to be concentrated at a single opinion. Finally, we prove that if the equation has an initial condition with a density, then its solution has a density at all times, develop a numerical scheme for the corresponding functional equation, and show numerically that bifurcations may occur. Comment: 43 pages, 7 figures
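
The finite-population dynamics described above can be illustrated with a short Python simulation. The update rule used here (two distinct peers chosen uniformly at random, each moving to a weighted average when their opinions differ by at most a threshold) is a common formulation of the Deffuant model and is an assumption of this sketch, as are the parameter names `threshold` and `weight`; the abstract itself does not spell out the rule.

```python
import random


def simulate_bounded_confidence(opinions, threshold, weight, steps, seed=0):
    """Simulate a Deffuant-style bounded confidence model (assumed rule).

    At each step two distinct peers are drawn uniformly at random. If their
    opinions differ by at most `threshold`, each moves to a weighted average
    of the two opinions; otherwise nothing changes.
    """
    rng = random.Random(seed)
    x = list(opinions)
    for _ in range(steps):
        i, j = rng.sample(range(len(x)), 2)
        if abs(x[i] - x[j]) <= threshold:
            xi, xj = x[i], x[j]
            # Symmetric update: the sum of the two opinions is preserved.
            x[i] = weight * xi + (1 - weight) * xj
            x[j] = weight * xj + (1 - weight) * xi
    return x
```

Starting from opinions `[0.0, 0.1, 0.9, 1.0]` with `threshold=0.2`, the two low opinions and the two high opinions can each interact among themselves but never across the gap, so the population settles into two clusters too far apart to interact, which is the qualitative behaviour the first result of the paper describes.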

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne