176 research outputs found

    SPOT: Sliced Partial Optimal Transport

    Optimal transport research has surged in the last decade with wide applications in computer graphics. In most cases, however, it has focused on the special case of the so-called "balanced" optimal transport problem, that is, the problem of optimally matching positive measures of equal total mass. While this approach is suitable for handling probability distributions, as their total mass is always equal to one, it precludes other applications manipulating disparate measures. Our paper proposes a fast approach to the optimal transport of constant distributions supported on point sets of different cardinality via one-dimensional slices. This leads to one-dimensional partial assignment problems akin to alignment problems encountered in genomics or text comparison. Contrary to one-dimensional balanced optimal transport that leads to a trivial linear-time algorithm, such partial optimal transport, even in 1-d, has not seen any closed-form solution nor very efficient algorithms to date. We provide the first efficient 1-d partial optimal transport solver. Along with a quasilinear time problem decomposition algorithm, it solves 1-d assignment problems consisting of up to millions of Dirac distributions within fractions of a second in parallel. We handle higher dimensional problems via a slicing approach, and further extend the popular iterative closest point algorithm using optimal transport, an algorithm we call Fast Iterative Sliced Transport. We illustrate our method on computer graphics applications such as color transfer and point cloud registration.
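
    For orientation, the balanced 1-d case that the abstract calls trivial can be sketched as follows. This is only an illustration of the equal-cardinality, uniform-weight case combined with random slicing; it is not the partial solver the paper contributes. The function names `ot_1d_balanced` and `sliced_cost`, the squared-distance cost, and the number of slices are assumptions made for this sketch. Once both point sets are sorted, the optimal matching is a single linear pass that pairs points rank by rank.

```python
import numpy as np

def ot_1d_balanced(x, y):
    """Balanced 1-d transport between two equal-size point sets.

    Assumes uniform weights and a squared-distance cost (both are
    assumptions of this sketch). Returns, for each x[i], the index of
    its partner in y, together with the total matching cost.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-d arrays of the same length")
    ix = np.argsort(x)            # ranks of the source points
    iy = np.argsort(y)            # ranks of the target points
    assignment = np.empty(len(x), dtype=int)
    assignment[ix] = iy           # k-th smallest x goes to k-th smallest y
    cost = float(np.sum((x[ix] - y[iy]) ** 2))
    return assignment, cost

def sliced_cost(X, Y, n_slices=64, seed=0):
    """Average per-point 1-d cost over random directions (slices)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_slices):
        theta = rng.normal(size=X.shape[1])
        theta /= np.linalg.norm(theta)     # random unit direction
        _, c = ot_1d_balanced(X @ theta, Y @ theta)
        total += c / len(X)
    return total / n_slices
```

    The partial problem described in the abstract, where the two point sets have different cardinality, cannot be solved by this rank-by-rank pairing, which is the gap the paper addresses.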

    Exploring Crosslingual Word Embeddings for Semantic Classification in Text and Dialogue

    Current approaches to learning crosslingual word embeddings perform well when trained on a large amount of parallel data. Because most languages are under-resourced and lack structured lexical materials, they are difficult to integrate into such methods and, consequently, into human language technologies. In this thesis we explore whether a crosslingual mapping between two sets of monolingual word embeddings obtained separately is strong enough to give competitive results on semantic classification tasks. Our experiment involves learning crosslingual transfer between German and French word vectors using a combination of an adversarial approach and the Procrustes algorithm. We evaluate the embeddings on topic classification, sentiment analysis and humour detection tasks. We use the German subset of a multilingual data set for training and the French subset for testing our models. Results across German and French show that word vectors mapped into a shared vector space can capture and transfer semantic information from one language to another successfully. We also show that crosslingual mapping does not weaken the monolingual connections between words in one language.
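
    The Procrustes step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a seed dictionary is already available as two row-aligned matrices (row i of `X` and row i of `Y` are embeddings of a translation pair); how the thesis obtains such pairs, and the adversarial part of its pipeline, are not shown here. The function name `procrustes_map` is hypothetical. The orthogonal matrix W that minimizes the Frobenius norm of XW - Y is obtained from the singular value decomposition of X^T Y.

```python
import numpy as np

def procrustes_map(X, Y):
    """Orthogonal Procrustes: the orthogonal W minimizing ||X W - Y||_F.

    X, Y: (n_pairs, dim) arrays of row-aligned source and target
    embeddings (the aligned seed dictionary is an assumption of this
    sketch).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape != Y.shape or X.ndim != 2:
        raise ValueError("X and Y must be 2-d arrays of the same shape")
    U, _, Vt = np.linalg.svd(X.T @ Y)   # SVD of the cross-covariance
    return U @ Vt                       # closest orthogonal map

# Usage: map every source-language vector into the target space.
# mapped = source_vectors @ procrustes_map(X_seed, Y_seed)
```

    Because W is orthogonal, distances and angles between source-language vectors are unchanged by the mapping, which is consistent with the abstract's observation that monolingual connections are not weakened.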

    Arabidopsis phenotyping through geometric morphometrics

    Background: Recently, great technical progress has been achieved in the field of plant phenotyping. High-throughput platforms and the development of improved algorithms for rosette image segmentation make it possible to extract shape and size parameters for genetic, physiological, and environmental studies on a large scale. The development of low-cost phenotyping platforms and freeware resources make it possible to widely expand phenotypic analysis tools for Arabidopsis. However, objective descriptors of shape parameters that could be used independently of the platform and segmentation software used are still lacking, and shape descriptions still rely on ad hoc or even contradictory descriptors, which could make comparisons difficult and perhaps inaccurate. Modern geometric morphometrics is a family of methods in quantitative biology proposed to be the main source of data and analytical tools in the emerging field of phenomics studies. Based on the location of landmarks (corresponding points) over imaged specimens and by combining geometry, multivariate analysis, and powerful statistical techniques, these tools offer the possibility to reproducibly and accurately account for shape variations among groups and measure them in shape distance units. Results: Here, a particular scheme of landmark placement on Arabidopsis rosette images is proposed to study shape variation in viral infection processes. Shape differences between controls and infected plants are quantified throughout the infectious process and visualized. Quantitative comparisons between two unrelated ssRNA+ viruses are shown, and reproducibility issues are assessed. Conclusions: Combined with the newest automated platforms and plant segmentation procedures, geometric morphometric tools could boost phenotypic features extraction and processing in an objective, reproducible manner.
    Instituto de Biotecnología
    Affiliation: Manacorda, Carlos Augusto. Instituto Nacional de Tecnología Agropecuaria (INTA). Instituto de Biotecnología; Argentina
    Affiliation: Asurmendi, Sebastian. Instituto Nacional de Tecnología Agropecuaria (INTA). Instituto de Biotecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentin

    Arabidopsis phenotyping through geometric morphometrics

    Background: Recently, great technical progress has been achieved in the field of plant phenotyping. High-throughput platforms and the development of improved algorithms for rosette image segmentation make it possible to extract shape and size parameters for genetic, physiological, and environmental studies on a large scale. The development of low-cost phenotyping platforms and freeware resources make it possible to widely expand phenotypic analysis tools for Arabidopsis. However, objective descriptors of shape parameters that could be used independently of the platform and segmentation software used are still lacking, and shape descriptions still rely on ad hoc or even contradictory descriptors, which could make comparisons difficult and perhaps inaccurate. Modern geometric morphometrics is a family of methods in quantitative biology proposed to be the main source of data and analytical tools in the emerging field of phenomics studies. Based on the location of landmarks (corresponding points) over imaged specimens and by combining geometry, multivariate analysis, and powerful statistical techniques, these tools offer the possibility to reproducibly and accurately account for shape variations among groups and measure them in shape distance units. Results: Here, a particular scheme of landmark placement on Arabidopsis rosette images is proposed to study shape variation in viral infection processes. Shape differences between controls and infected plants are quantified throughout the infectious process and visualized. Quantitative comparisons between two unrelated ssRNA+ viruses are shown, and reproducibility issues are assessed. Conclusions: Combined with the newest automated platforms and plant segmentation procedures, geometric morphometric tools could boost phenotypic features extraction and processing in an objective, reproducible manner.
    Affiliation: Manacorda, Carlos Augusto. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina
    Affiliation: Asurmendi, Sebastian. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentin

    The Unbalanced Gromov Wasserstein Distance: Conic Formulation and Relaxation

    Comparing metric measure spaces (i.e. a metric space endowed with a probability distribution) is at the heart of many machine learning problems. The most popular distance between such metric measure spaces is the Gromov-Wasserstein (GW) distance, which is the solution of a quadratic assignment problem. The GW distance is however limited to the comparison of metric measure spaces endowed with a probability distribution. To alleviate this issue, we introduce two Unbalanced Gromov-Wasserstein formulations: a distance and a more tractable upper-bounding relaxation. They both allow the comparison of metric spaces equipped with arbitrary positive measures up to isometries. The first formulation is a positive and definite divergence based on a relaxation of the mass conservation constraint using a novel type of quadratically-homogeneous divergence. This divergence works hand in hand with the entropic regularization approach which is popular to solve large scale optimal transport problems. We show that the underlying non-convex optimization problem can be efficiently tackled using a highly parallelizable and GPU-friendly iterative scheme. The second formulation is a distance between mm-spaces up to isometries based on a conic lifting. Lastly, we provide numerical experiments on synthetic examples and domain adaptation data with a Positive-Unlabeled learning task to highlight the salient features of the unbalanced divergence and its potential applications in ML.