Hyperspherical Prototype Networks
This paper introduces hyperspherical prototype networks, which unify
classification and regression with prototypes on hyperspherical output spaces.
For classification, a common approach is to define prototypes as the mean
output vector over training examples per class. Here, we propose to use
hyperspheres as output spaces, with class prototypes defined a priori with
large margin separation. We position prototypes through data-independent
optimization, with an extension to incorporate priors from class semantics. By
doing so, we do not require any prototype updating, we can handle any training
size, and the output dimensionality is no longer constrained to the number of
classes. Furthermore, we generalize to regression, by optimizing outputs as an
interpolation between two prototypes on the hypersphere. Since both tasks are
now defined by the same loss function, they can be jointly trained for
multi-task problems. Experimentally, we show the benefit of hyperspherical
prototype networks for classification, regression, and their combination over
other prototype methods, softmax cross-entropy, and mean squared error
approaches.
Comment: NeurIPS 201
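The data-independent placement step described above can be sketched roughly as follows. This is an illustrative toy, not the authors' code: function and parameter names are invented, and the repulsion loss is just one plausible way to push prototypes toward large-margin separation on the unit hypersphere.

```python
import numpy as np

def place_prototypes(num_classes, dim, steps=2000, lr=0.05, seed=0):
    """Illustrative sketch: spread class prototypes on the unit
    hypersphere by gradient descent on a pairwise repulsion loss
    sum_{i != j} exp(p_i . p_j), then renormalize each step.
    Data-independent: only num_classes and dim are needed."""
    rng = np.random.default_rng(seed)
    P = rng.standard_normal((num_classes, dim))
    P /= np.linalg.norm(P, axis=1, keepdims=True)
    for _ in range(steps):
        S = P @ P.T                     # pairwise cosine similarities
        np.fill_diagonal(S, -np.inf)    # ignore self-similarity
        W = np.exp(S)                   # emphasize the closest pairs
        P -= lr * (W @ P)               # repulsion gradient step
        P /= np.linalg.norm(P, axis=1, keepdims=True)
    return P
```

For the regression extension, an output would then be compared against a normalized interpolation between two such prototypes; only the placement step is sketched here.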
Neural Metric Learning for Fast End-to-End Relation Extraction
Relation extraction (RE) is an indispensable information extraction task in
several disciplines. RE models typically assume that named entity recognition
(NER) is already performed in a previous step by another independent model.
Several recent efforts, under the theme of end-to-end RE, seek to exploit
inter-task correlations by modeling both NER and RE tasks jointly. Earlier work
in this area commonly reduces the task to a table-filling problem wherein an
additional expensive decoding step involving beam search is applied to obtain
globally consistent cell labels. In efforts that do not employ table-filling,
global optimization in the form of CRFs with Viterbi decoding for the NER
component is still necessary for competitive performance. We introduce a novel
neural architecture utilizing the table structure, based on repeated
applications of 2D convolutions for pooling local dependency and metric-based
features, that improves on the state-of-the-art without the need for global
optimization. We validate our model on the ADE and CoNLL04 datasets for
end-to-end RE and demonstrate a gain (in F-score) over prior best
results, with training and testing times that are seven to ten times faster;
the latter is highly advantageous for time-sensitive end-user applications.
Word, graph and manifold embedding from Markov processes
Continuous vector representations of words and objects appear to carry
surprisingly rich semantic content. In this paper, we advance both the
conceptual and theoretical understanding of word embeddings in three ways.
First, we ground embeddings in semantic spaces studied in
cognitive-psychometric literature and introduce new evaluation tasks. Second,
in contrast to prior work, we take metric recovery as the key object of study,
unify existing algorithms as consistent metric recovery methods based on
co-occurrence counts from simple Markov random walks, and propose a new
recovery algorithm. Third, we generalize metric recovery to graphs and
manifolds, relating co-occurrence counts on random walks in graphs and random
processes on manifolds to the underlying metric to be recovered, thereby
reconciling manifold estimation and embedding algorithms. We compare embedding
algorithms across a range of tasks, from nonlinear dimensionality reduction to
three semantic language tasks, including analogies, sequence completion, and
classification.
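The connection between random-walk co-occurrence counts and the underlying metric can be illustrated on a toy graph. This is a hypothetical sketch, not the paper's recovery algorithm: nodes that are close in the graph co-occur more often within a short window of a simple random walk, so co-occurrence counts carry metric information.

```python
import numpy as np

def cooccurrence_from_walk(adj, walk_len=50000, window=3, seed=0):
    """Count co-occurrences of node pairs appearing within `window`
    steps of each other along one simple random walk on the graph
    given by 0/1 adjacency matrix `adj` (with +1 smoothing)."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    neighbors = [np.flatnonzero(adj[i]) for i in range(n)]
    walk = np.empty(walk_len, dtype=int)
    walk[0] = 0
    for t in range(1, walk_len):
        walk[t] = rng.choice(neighbors[walk[t - 1]])
    counts = np.ones((n, n))            # +1 smoothing
    for t in range(walk_len - window):
        for d in range(1, window + 1):
            a, b = walk[t], walk[t + d]
            counts[a, b] += 1
            counts[b, a] += 1
    return counts
```

On an 8-node cycle, adjacent nodes co-occur far more often than nodes four steps apart, which a metric recovery method can exploit.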
On Approximation Guarantees for Greedy Low Rank Optimization
We provide new approximation guarantees for greedy low rank matrix estimation
under standard assumptions of restricted strong convexity and smoothness. Our
novel analysis also uncovers previously unknown connections between the low
rank estimation and combinatorial optimization, so much so that our bounds are
reminiscent of corresponding approximation bounds in submodular maximization.
Additionally, we also provide statistical recovery guarantees. Finally, we
present empirical comparison of greedy estimation with established baselines on
two important real-world problems.
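A minimal sketch of greedy low rank estimation in the spirit described above (names and setup are illustrative; the paper's guarantees concern restricted strongly convex and smooth losses): each greedy step adds the top singular pair of the negative gradient, with an exact line search for the step size.

```python
import numpy as np

def greedy_low_rank(Y, mask, rank):
    """Greedy rank-one pursuit for masked matrix estimation:
    approximately minimize 0.5 * || mask * (X - Y) ||_F^2 by adding
    one rank-one term per iteration."""
    X = np.zeros_like(Y)
    for _ in range(rank):
        G = mask * (X - Y)               # gradient of the loss
        U, s, Vt = np.linalg.svd(-G)
        D = np.outer(U[:, 0], Vt[0])     # top rank-one direction
        num = -np.sum(G * D)
        den = np.sum(mask * D * D) + 1e-12
        X = X + (num / den) * D          # exact line search
    return X
```

With a fully observed mask this reduces to truncated SVD, so a rank-2 matrix is recovered exactly in two greedy steps.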
Learning Cross-lingual Embeddings from Twitter via Distant Supervision
Cross-lingual embeddings represent the meaning of words from different
languages in the same vector space. Recent work has shown that it is possible
to construct such representations by aligning independently learned monolingual
embedding spaces, and that accurate alignments can be obtained even without
external bilingual data. In this paper we explore a research direction that has
been surprisingly neglected in the literature: leveraging noisy user-generated
text to learn cross-lingual embeddings particularly tailored towards social
media applications. While the noisiness and informal nature of the social media
genre poses additional challenges to cross-lingual embedding methods, we find
that it also provides key opportunities due to the abundance of code-switching
and the existence of a shared vocabulary of emoji and named entities. Our
contribution consists of a very simple post-processing step that exploits these
phenomena to significantly improve the performance of state-of-the-art
alignment methods.
Comment: Accepted to ICWSM 2020. 11 pages, 1 appendix. Pre-trained embeddings
available at https://github.com/pedrada88/crossembeddings-twitte
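A standard way to align independently learned monolingual spaces, in the family of alignment methods this paper post-processes, is orthogonal Procrustes over anchor pairs (e.g. shared emoji or named entities). A minimal sketch, assuming the anchor embeddings are already row-aligned across the two spaces:

```python
import numpy as np

def procrustes_align(X_src, X_tgt):
    """Learn an orthogonal map W minimizing ||X_src @ W - X_tgt||_F
    over row-aligned anchor embeddings, via SVD (orthogonal
    Procrustes). Rows of X_src and X_tgt are paired anchors."""
    U, _, Vt = np.linalg.svd(X_src.T @ X_tgt)
    return U @ Vt
```

Because W is constrained to be orthogonal, the map preserves distances within the source space; when the target space is an exact rotation of the source, it is recovered perfectly.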
Adversarial Gain
Adversarial examples can be defined as inputs to a model which induce a
mistake, where the model output differs from that of an oracle, perhaps
in surprising or malicious ways. Original models of adversarial attacks are
primarily studied in the context of classification and computer vision tasks.
While several attacks have been proposed in natural language processing (NLP)
settings, they often vary in defining the parameters of an attack and what a
successful attack would look like. The goal of this work is to propose a
unifying model of adversarial examples suitable for NLP tasks in both
generative and classification settings. We define the notion of adversarial
gain: based in control theory, it is a measure of the change in the output of a
system relative to the perturbation of the input (caused by the so-called
adversary) presented to the learner. This definition, as we show, can be used
under different feature spaces and distance conditions to determine attack or
defense effectiveness across different intuitive manifolds. This notion of
adversarial gain not only provides a useful way for evaluating adversaries and
defenses, but can act as a building block for future work in robustness under
adversaries due to its rooted nature in stability and manifold theory.
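The definition above suggests a direct computation. The following is a hedged sketch, not the paper's implementation: the model and the input/output distance functions are hypothetical placeholders supplied by the evaluator.

```python
import numpy as np

def adversarial_gain(model, x, x_adv, d_in, d_out):
    """Gain of a perturbation: change in the model's output relative
    to the size of the input perturbation, under evaluator-chosen
    input and output distances d_in and d_out."""
    num = d_out(model(x), model(x_adv))
    den = d_in(x, x_adv) + 1e-12        # guard against zero perturbation
    return num / den
```

For a linear model the gain under Euclidean distances is bounded by the model's operator norm, which matches the control-theoretic intuition of gain as a stability measure.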
Unsupervised Inductive Graph-Level Representation Learning via Graph-Graph Proximity
We introduce a novel approach to graph-level representation learning, which
is to embed an entire graph into a vector space where the embeddings of two
graphs preserve their graph-graph proximity. Our approach, UGRAPHEMB, is a
general framework that provides a novel means of performing graph-level
embedding in a completely unsupervised and inductive manner. The learned neural
network can be considered as a function that receives any graph as input,
either seen or unseen in the training set, and transforms it into an embedding.
A novel graph-level embedding generation mechanism called Multi-Scale Node
Attention (MSNA) is proposed. Experiments on five real graph datasets show
that UGRAPHEMB achieves competitive accuracy in the tasks of graph
classification, similarity ranking, and graph visualization.
Comment: IJCAI 2019 camera-ready version with supplementary material
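A hypothetical sketch of multi-scale attentive pooling in the spirit of MSNA (this is not the paper's exact mechanism; the function name, the scale set, and the random context vector are placeholders for learned components): node features are smoothed to several neighborhood scales, attention-pooled at each scale, and concatenated into one graph-level embedding.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def multiscale_attention_pool(H, A, scales=(0, 1, 2), seed=0):
    """Pool node features H (n x d) of a graph with adjacency A into
    a single vector: at each scale, attention over nodes against a
    context vector q produces a weighted sum; summaries are
    concatenated. q is random here; it would be learned in practice."""
    rng = np.random.default_rng(seed)
    n, d = H.shape
    P = A / np.maximum(A.sum(axis=1, keepdims=True), 1)  # random-walk matrix
    q = rng.standard_normal(d)          # placeholder context vector
    parts = []
    Hs = H.copy()
    for k in range(max(scales) + 1):
        if k in scales:
            w = softmax(Hs @ q)         # attention weights over nodes
            parts.append(w @ Hs)        # scale-k graph summary
        Hs = P @ Hs                     # smooth features to next scale
    return np.concatenate(parts)
```

Because the pooling depends only on the graph passed in, the same function maps any graph, seen or unseen, to a fixed-length embedding, matching the inductive setting described above.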
Representing Sets as Summed Semantic Vectors
Representing meaning in the form of high dimensional vectors is a common and
powerful tool in biologically inspired architectures. While the meaning of a
set of concepts can be summarized by taking a (possibly weighted) sum of their
associated vectors, this has generally been treated as a one-way operation. In
this paper we show how a technique built to aid sparse vector decomposition
allows in many cases the exact recovery of the inputs and weights to such a
sum, allowing a single vector to represent an entire set of vectors from a
dictionary. We characterize the number of vectors that can be recovered under
various conditions, and explore several ways such a tool can be used for
vector-based reasoning.
Comment: In Biologically Inspired Cognitive Architectures 201
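When the dictionary vectors are linearly independent, recovering the inputs and weights of such a sum reduces to least squares. A minimal illustration of the one-way-operation point above (the paper's sparse-decomposition technique addresses harder regimes; names here are invented):

```python
import numpy as np

def recover_weights(dictionary, summed, tol=1e-8):
    """Recover the weights of a weighted sum of dictionary vectors
    from the single summed vector via least squares; exact when the
    dictionary rows are linearly independent. Weights below `tol`
    are zeroed to expose the recovered support."""
    w, *_ = np.linalg.lstsq(dictionary.T, summed, rcond=None)
    w[np.abs(w) < tol] = 0.0
    return w
```

With high-dimensional random vectors, a single summed vector thus determines both which dictionary items were added and with what weights.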
Interactions of Computational Complexity Theory and Mathematics
[This paper is a (self contained) chapter in a new book, Mathematics and
Computation, whose draft is available on my homepage at
https://www.math.ias.edu/avi/book ].
We survey some concrete interaction areas between computational complexity
theory and different fields of mathematics. We hope to demonstrate here that
hardly any area of modern mathematics is untouched by the computational
connection (which in some cases is completely natural and in others may seem
quite surprising). In my view, the breadth, depth, beauty and novelty of these
connections are inspiring, and speak to a great potential of future
interactions (which indeed, are quickly expanding). We aim for variety. We give
short, simple descriptions (without proofs or much technical detail) of ideas,
motivations, results and connections; this will hopefully entice the reader to
dig deeper. Each vignette focuses only on a single topic within a large
mathematical field. We cover the following:
Number Theory: Primality testing
Combinatorial Geometry: Point-line incidences
Operator Theory: The Kadison-Singer problem
Metric Geometry: Distortion of embeddings
Group Theory: Generation and random generation
Statistical Physics: Monte-Carlo Markov chains
Analysis and Probability: Noise stability
Lattice Theory: Short vectors
Invariant Theory: Actions on matrix tuples
Comment: 27 pages
The Order Dimension of the Poset of Regions in a Hyperplane Arrangement
We show that the order dimension of the weak order on a Coxeter group of type
A, B or D is equal to the rank of the Coxeter group, and give bounds on the
order dimensions for the other finite types. This result arises from a unified
approach which, in particular, leads to a simpler treatment of the previously
known cases, types A and B. The result for weak orders follows from an upper
bound on the dimension of the poset of regions of an arbitrary hyperplane
arrangement. In some cases, including the weak orders, the upper bound is the
chromatic number of a certain graph. For the weak orders, this graph has the
positive roots as its vertex set, and the edges are related to the pairwise
inner products of the roots.
Comment: Minor changes, including a correction and an added figure in the
proof of Proposition 2.2. 19 pages, 6 figures