Large Margin Nearest Neighbor Embedding for Knowledge Representation
The traditional way of storing facts as triplets ({\it head\_entity, relation,
tail\_entity}), abbreviated as ({\it h, r, t}), makes knowledge easy for
humans to display and acquire, but hard for AI machines to compute with or
reason over. Inspired by the success of {\it Distributed Representations} in
AI-related fields, recent studies represent each entity and relation with a
unique low-dimensional embedding, departing from the symbolic and atomic
framework of displaying knowledge as triplets. In this way, knowledge
computing and reasoning can be carried out by means of a simple {\it vector
calculation}, i.e. ${\bf h} + {\bf r} \approx {\bf t}$. We thus contribute an
effective model that learns better embeddings satisfying this formula by
pulling the positive tail entities together and close to {\bf h} + {\bf r}
({\it Nearest Neighbor}), while simultaneously pushing the negatives away from
the positives by keeping a {\it Large Margin}. We also design a corresponding
learning algorithm that efficiently finds the optimal solution in iterative
fashion, based on {\it Stochastic Gradient Descent}. Quantitative experiments
show that our approach achieves state-of-the-art performance compared with
several recent methods on benchmark datasets for two classical applications,
i.e. {\it Link prediction} and {\it Triplet classification}. Moreover, we
analyze the parameter complexity of all the evaluated models; the analysis
indicates that our model needs fewer computational resources while
outperforming the other methods.
Comment: arXiv admin note: text overlap with arXiv:1503.0815
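Since the abstract only sketches the objective, a minimal illustration may help. The following is a hedged sketch of one SGD step on a large-margin translational loss of the kind described (pull {\bf t} toward {\bf h} + {\bf r}, push a corrupted triplet away by a margin); all names, dimensions, and hyperparameters here are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of a large-margin translational embedding update (TransE-style).
# All sizes and hyperparameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, dim = 100, 10, 20
E = rng.normal(scale=0.1, size=(n_entities, dim))   # entity embeddings
R = rng.normal(scale=0.1, size=(n_relations, dim))  # relation embeddings

def score(h, r, t):
    """Distance of t from h + r; small for plausible triplets."""
    return np.linalg.norm(E[h] + R[r] - E[t])

def sgd_step(pos, neg, margin=1.0, lr=0.01):
    """One SGD step on the hinge loss max(0, margin + d(h+r,t) - d(h'+r,t'))."""
    (h, r, t), (h2, _, t2) = pos, neg
    if margin + score(h, r, t) - score(h2, r, t2) <= 0:
        return  # margin already satisfied, no update needed
    # gradients of the L2 distances w.r.t. the embeddings
    g_pos = (E[h] + R[r] - E[t]) / max(score(h, r, t), 1e-12)
    g_neg = (E[h2] + R[r] - E[t2]) / max(score(h2, r, t2), 1e-12)
    E[h] -= lr * g_pos; E[t] += lr * g_pos
    R[r] -= lr * (g_pos - g_neg)
    E[h2] += lr * g_neg; E[t2] -= lr * g_neg

# corrupt the tail of a positive triplet to create a negative
pos = (0, 3, 7)
neg = (0, 3, int(rng.integers(n_entities)))
sgd_step(pos, neg)
```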
Approximated Computation of Belief Functions for Robust Design Optimization
This paper presents some ideas to reduce the computational cost of
evidence-based robust design optimization. Evidence Theory captures both the
aleatory and epistemic uncertainties in the design parameters, providing two
quantitative measures, Belief and Plausibility, of the credibility of the
computed value of the design budgets. The paper proposes some techniques to
compute an approximation of Belief and Plausibility at a fraction of the cost
required for an accurate calculation of the two values. Some simple test cases
show how the proposed techniques scale with the dimension of the problem.
Finally, a simple example of spacecraft system design is presented.
Comment: AIAA-2012-1932, 14th AIAA Non-Deterministic Approaches Conference,
23-26 April 2012, Sheraton Waikiki, Honolulu, Hawaii
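For readers unfamiliar with the two measures: given a basic probability assignment over focal elements, Bel(A) sums the masses of focal elements contained in A, and Pl(A) sums the masses of those that intersect A. The sketch below shows the exact (unapproximated) computation on a made-up discrete example; in the continuous design-parameter setting of the paper, the focal elements are products of intervals whose number grows exponentially with dimension, which is what motivates the approximation.

```python
# Exact Belief/Plausibility for a discrete body of evidence.
# The focal elements and masses below are made-up examples.

def belief(bpa, A):
    """Bel(A): total mass of focal elements contained in A."""
    return sum(m for (F, m) in bpa if F <= A)

def plausibility(bpa, A):
    """Pl(A): total mass of focal elements intersecting A."""
    return sum(m for (F, m) in bpa if F & A)

# basic probability assignment: (focal set, mass), masses summing to 1
bpa = [({1, 2}, 0.5), ({2, 3}, 0.3), ({1, 2, 3, 4}, 0.2)]
A = {1, 2}
print(belief(bpa, A), plausibility(bpa, A))  # 0.5 and 1.0
```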
Infinite factorization of multiple non-parametric views
Combined analysis of multiple data sources is of increasing interest in
applications, in particular for distinguishing shared and source-specific
aspects. We extend the rationale of classical canonical correlation analysis
to a flexible, generative and non-parametric clustering setting by introducing
a novel non-parametric hierarchical mixture model. The lower level of the
model describes each source with a flexible non-parametric mixture, and the
top level combines these to describe commonalities of the sources. The
lower-level clusters arise from hierarchical Dirichlet Processes, inducing an
infinite-dimensional contingency table between the views. The commonalities
between the sources are modeled by an infinite block model of the contingency
table, interpretable as a non-negative factorization of infinite matrices, or
as a prior for infinite contingency tables. With Gaussian mixture components
plugged in for continuous measurements, the model is applied to two views of
genes, mRNA expression and abundance of the produced proteins, to expose
groups of genes that are co-regulated in either or both of the views. Cluster
analysis of co-expression is a standard, simple way of screening for
co-regulation, and the two-view analysis extends the approach to
distinguishing between pre- and post-translational regulation.
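As a concrete handle on the non-parametric ingredient, the sketch below draws mixture weights from a truncated stick-breaking construction of a Dirichlet Process and samples one-dimensional Gaussian observations. It is an illustrative toy of a single DP mixture, not the paper's hierarchical two-view model; the truncation level and all parameters are assumptions.

```python
# Toy sketch: truncated stick-breaking draw from a Dirichlet Process,
# with Gaussian components. Truncation level and parameters are assumed.
import numpy as np

def stick_breaking_weights(alpha, truncation, rng):
    """Truncated GEM(alpha) weights: v_k ~ Beta(1, alpha),
    pi_k = v_k * prod_{j<k} (1 - v_j)."""
    v = rng.beta(1.0, alpha, size=truncation)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - v)[:-1]])
    return v * remaining

rng = np.random.default_rng(0)
pi = stick_breaking_weights(alpha=2.0, truncation=50, rng=rng)
mu = rng.normal(size=50)                         # component means
z = rng.choice(50, size=1000, p=pi / pi.sum())   # cluster assignments
x = rng.normal(loc=mu[z], scale=0.5)             # continuous observations
```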