6,824 research outputs found

Regression analysis with compositional data containing zero values

Author: Tsagris Michail
Publication date: 08/08/2015

Comment: The paper has been accepted for publication in the Chilean Journal of Statistics. It consists of 12 pages with 4 figures.

arXiv.org e-Print Archive

Munich RePEc Personal Archive

Stochastic Attraction-Repulsion Embedding for Large Scale Image Localization

Authors: Dai Yuchao, Li Hongdong, Liu Liu
Publication date: 06/08/2019

This paper tackles the problem of large-scale image-based localization (IBL), where the spatial location of a query image is determined by finding the most similar reference images in a large database. A critical task in solving this problem is to learn a discriminative image representation that captures information relevant for localization. We propose a novel representation learning method with higher location-discriminating power. Our contributions are: 1) we represent a place (location) as a set of exemplar images depicting the same landmarks and aim to maximize similarities among intra-place images while minimizing similarities among inter-place images; 2) we model a similarity measure as a probability distribution on L_2-metric distances between intra-place and inter-place image representations; 3) we propose a new Stochastic Attraction and Repulsion Embedding (SARE) loss function that minimizes the KL divergence between the learned and the actual probability distributions; 4) we give theoretical comparisons between SARE, triplet ranking and contrastive losses, which provide insight into why SARE is better by analyzing gradients. Our SARE loss is easy to implement and can be plugged into any CNN. Experiments show that our proposed method improves localization performance on standard benchmarks by a large margin. Demonstrating the broad applicability of our method, we obtained third place out of 209 teams in the 2018 Google Landmark Retrieval Challenge. Our code and model are available at https://github.com/Liumouliu/deepIBL. Comment: ICC
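As a rough illustration of contributions 2 and 3 above, the sketch below turns squared L_2 distances between a query embedding and its positive and negative embeddings into a probability distribution, then measures the KL divergence to a target that puts all mass on the positive. With a one-hot target, that divergence reduces to the negative log-probability of the positive. The Gaussian-style kernel `exp(-d^2)` and the names `squared_l2` and `sare_style_loss` are assumptions made for this example; this is not the authors' released implementation.

```python
import math


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sare_style_loss(query, positive, negatives):
    """KL divergence between a one-hot target (all mass on the positive)
    and a softmax over negative squared distances to the query.

    Assumption: similarity is modelled as exp(-squared distance).
    """
    # Index 0 holds the positive; the rest are negatives.
    logits = [-squared_l2(query, positive)]
    logits += [-squared_l2(query, n) for n in negatives]

    # Numerically stable log-sum-exp for the normalising constant.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))

    # -log p(positive | query)
    return log_norm - logits[0]
```

The loss shrinks as the positive is pulled towards the query (attraction) and as the negatives are pushed away from it (repulsion). When the positive and a single negative are equally distant from the query, the value is log 2.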

arXiv.org e-Print Archive

Crossref

Extracting Biomolecular Interactions Using Semantic Parsing of Biomedical Text

Authors: Galstyan Aram, Garg Sahil, Hermjakob Ulf, Marcu Daniel
Publication date: 04/12/2015

We advance the state of the art in biomolecular interaction extraction with three contributions: (i) We show that deep Abstract Meaning Representations (AMR) significantly improve the accuracy of a biomolecular interaction extraction system when compared to a baseline that relies solely on surface- and syntax-based features; (ii) In contrast with previous approaches that infer relations on a sentence-by-sentence basis, we expand our framework to enable consistent predictions over sets of sentences (documents); (iii) We further modify and expand a graph kernel learning framework to enable concurrent exploitation of automatically induced AMR (semantic) and dependency structure (syntactic) representations. Our experiments show that our approach yields interaction extraction systems that are more robust in environments where there is a significant mismatch between training and test conditions. Comment: Appearing in Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16)

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Approximate Bayesian computation via the energy statistic

Authors: Arbel Julyan, Forbes Florence, Lü Hongliang, Nguyen Hien D.
Publication date: 30/06/2020

Approximate Bayesian computation (ABC) has become an essential part of the Bayesian toolbox for addressing problems in which the likelihood is prohibitively expensive or entirely unknown, making it intractable. ABC defines a pseudo-posterior by comparing observed data with simulated data, traditionally based on some summary statistics, the elicitation of which is regarded as a key difficulty. Recently, using data discrepancy measures has been proposed in order to bypass the construction of summary statistics. Here we propose to use the importance-sampling ABC (IS-ABC) algorithm relying on the so-called two-sample energy statistic. We establish a new asymptotic result for the case where both the observed sample size and the simulated data sample size increase to infinity, which highlights to what extent the data discrepancy measure impacts the asymptotic pseudo-posterior. The result holds in the broad setting of IS-ABC methodologies, thus generalizing previous results that have been established only for rejection ABC algorithms. Furthermore, we propose a consistent V-statistic estimator of the energy statistic, under which we show that the large sample result holds, and prove that the rejection ABC algorithm, based on the energy statistic, generates pseudo-posterior distributions that achieve convergence to the correct limits, when implemented with rejection thresholds that converge to zero, in the finite sample setting. Our proposed energy statistic based ABC algorithm is demonstrated on a variety of models, including a Gaussian mixture, a moving-average model of order two, a bivariate beta and a multivariate g-and-k distribution. We find that our proposed method compares well with alternative discrepancy measures. Comment: 25 pages, 6 figures, 5 tables
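To make the idea of an energy-statistic discrepancy concrete, here is a minimal sketch of the standard V-statistic form of the two-sample energy distance, 2 E|X - Y| - E|X - X'| - E|Y - Y'|, used inside a plain rejection ABC loop. It illustrates the rejection variant mentioned in the abstract rather than the IS-ABC algorithm the paper proposes. The function names and the `sample_prior` and `simulate` callables are hypothetical placeholders, not taken from the paper.

```python
import math
import random


def _as_points(sample):
    """Treat scalars as 1-D points so math.dist handles both cases."""
    return [tuple(p) if hasattr(p, "__iter__") else (float(p),) for p in sample]


def _mean_distance(a, b):
    """Average Euclidean distance over all pairs (including i == j)."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))


def energy_statistic(x, y):
    """V-statistic estimate of the two-sample energy distance."""
    x, y = _as_points(x), _as_points(y)
    return 2.0 * _mean_distance(x, y) - _mean_distance(x, x) - _mean_distance(y, y)


def rejection_abc(observed, sample_prior, simulate, n_draws, threshold, rng):
    """Keep prior draws whose simulated data lie within `threshold`
    of the observed data, measured by the energy statistic."""
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)            # hypothetical prior sampler
        simulated = simulate(theta, rng)     # hypothetical data simulator
        if energy_statistic(observed, simulated) <= threshold:
            accepted.append(theta)
    return accepted
```

A call might look like `rejection_abc(data, lambda r: r.uniform(-5, 5), lambda t, r: [r.gauss(t, 1) for _ in range(50)], 1000, 0.1, random.Random(0))`, where the uniform prior and the Gaussian simulator are again illustrative choices. The pairwise sums cost O(nm) per comparison, so this form suits small to moderate sample sizes.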

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Information theoretic novelty detection

Authors: Maurizio Filippone, Guido Sanguinetti
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

We present a novel approach to online change detection problems when the training sample size is small. The proposed approach is based on estimating the expected information content of a new data point and allows accurate control of the false positive rate even for small data sets. In the case of the Gaussian distribution, our approach is analytically tractable and closely related to classical statistical tests. We then propose an approximation scheme to extend our approach to the case of the mixture of Gaussians. We extensively evaluate our approach on synthetic data and on three real benchmark data sets. The experimental validation shows that our method maintains good overall accuracy while significantly improving control over the false positive rate.

CiteSeerX

Crossref

Enlighten

White Rose Research Online