724 research outputs found
Non-deterministic Behavior of Ranking-based Metrics when Evaluating Embeddings
Embedding data into vector spaces is a very popular strategy in pattern
recognition. When distances between embeddings are quantized, performance
metrics become ambiguous. In this paper, we present an analysis of the
ambiguity that quantized distances introduce and provide bounds on its effect.
We demonstrate that it has a measurable effect on empirical data in
state-of-the-art systems. We also approach the phenomenon from a computer
security perspective and demonstrate how someone being evaluated by a third
party can exploit this ambiguity and greatly outperform a random predictor
without even having access to the input data. We also suggest a simple
solution that makes ranking-based performance metrics fully deterministic and
impervious to such exploits.
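A minimal sketch of the ambiguity (with invented toy distances, not the paper's experiments or its exact fix): when several candidates tie at the same quantized distance, the target's rank depends on how the sort happens to break the tie, whereas assigning every tied item the mean rank of its tie group is order-independent.

```python
import random

random.seed(0)

# Quantized distances for four candidates; the target is index 2 and three
# candidates tie at distance 1 (toy numbers).
dists = [1, 1, 1, 5]
target = 2

# Depending on how a sort breaks ties, the target's rank varies from 1 to 3.
possible = set()
for _ in range(100):
    idx = list(range(len(dists)))
    random.shuffle(idx)                           # random tie-breaking order
    order = sorted(idx, key=lambda i: dists[i])   # stable sort keeps that order
    possible.add(order.index(target) + 1)
print(possible)  # with high probability {1, 2, 3}: the metric is ambiguous

def average_rank(dists, target_idx):
    # Deterministic alternative: give every tied item the mean rank of its
    # tie group, so the result no longer depends on input ordering.
    d = dists[target_idx]
    below = sum(1 for x in dists if x < d)
    tied = sum(1 for x in dists if x == d)
    return below + (tied + 1) / 2

print(average_rank(dists, target))  # 2.0
```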
Acquiring Common Sense Spatial Knowledge through Implicit Spatial Templates
Spatial understanding is a fundamental problem with wide-reaching real-world
applications. The representation of spatial knowledge is often modeled with
spatial templates, i.e., regions of acceptability of two objects under an
explicit spatial relationship (e.g., "on", "below", etc.). In contrast with
prior work that restricts spatial templates to explicit spatial prepositions
(e.g., "glass on table"), here we extend this concept to implicit spatial
language, i.e., those relationships (generally actions) for which the spatial
arrangement of the objects is only implicitly implied (e.g., "man riding
horse"). In contrast with explicit relationships, predicting spatial
arrangements from implicit spatial language requires significant common sense
spatial understanding. Here, we introduce the task of predicting spatial
templates for two objects under a relationship, which can be seen as a spatial
question-answering task with a (2D) continuous output ("where is the man w.r.t.
a horse when the man is walking the horse?"). We present two simple
neural-based models that leverage annotated images and structured text to learn
this task. The good performance of these models reveals that spatial locations
are to a large extent predictable from implicit spatial language. Crucially,
the models attain similar performance in a challenging generalized setting,
where the object-relation-object combinations (e.g.,"man walking dog") have
never been seen before. Next, we go one step further by presenting the models
with unseen objects (e.g., "dog"). In this scenario, we show that leveraging
word embeddings enables the models to output accurate spatial predictions,
proving that the models acquire solid common sense spatial knowledge allowing
for such generalization.
Comment: To appear at the AAAI 2018 conference
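The input/output shape of the task can be illustrated with a toy stand-in (all embeddings, triples, and offsets below are invented; the paper uses neural models trained on annotated images, not this linear map): fit a map from concatenated word vectors to a 2D offset of the subject relative to the object.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented toy embeddings for a handful of words.
vocab = ["man", "horse", "riding", "walking", "dog"]
dim = 8
emb = {w: rng.normal(size=dim) for w in vocab}

def features(s, r, o):
    # Concatenate subject, relation, and object vectors.
    return np.concatenate([emb[s], emb[r], emb[o]])

# (subject, relation, object) -> 2D offset of the subject w.r.t. the object.
train = [
    (("man", "riding", "horse"), (0.0, 1.0)),    # rider sits above
    (("man", "walking", "horse"), (-1.0, 0.0)),  # walker stands beside
    (("man", "walking", "dog"), (-1.0, 0.0)),
]
X = np.stack([features(*t) for t, _ in train])
Y = np.array([y for _, y in train])
W = np.linalg.lstsq(X, Y, rcond=None)[0]  # least-squares linear map

# The same map applies to any (s, r, o) combination of known words.
pred = features("man", "riding", "horse") @ W
print(pred.round(2))
```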
Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning
Knowledge bases (KB), both automatically and manually constructed, are often
incomplete --- many valid facts can be inferred from the KB by synthesizing
existing information. A popular approach to KB completion is to infer new
relations by combinatory reasoning over the information found along other paths
connecting a pair of entities. Given the enormous size of KBs and the
exponential number of paths, previous path-based models have considered only
the problem of predicting a missing relation given two entities or evaluating
the truth of a proposed triple. Additionally, these methods have traditionally
used random paths between fixed entity pairs or more recently learned to pick
paths between them. We propose a new algorithm, MINERVA, which addresses the
much more difficult and practical task of answering questions where the
relation and one entity are known and the other entity must be inferred. Since
random walks are impractical in a
setting with combinatorially many destinations from a start node, we present a
neural reinforcement learning approach which learns how to navigate the graph
conditioned on the input query to find predictive paths. Empirically, this
approach obtains state-of-the-art results on several datasets, significantly
outperforming prior methods.
Comment: ICLR 2018
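The navigation idea can be caricatured on a toy graph (invented entities and relations; MINERVA's learned neural policy is replaced here by uniform random action sampling, which is only a placeholder):

```python
import random

random.seed(0)

# Toy knowledge graph: per-entity list of (relation, target) edges (invented).
kg = {
    "Paris": [("capital_of", "France"), ("located_in", "France")],
    "France": [("part_of", "Europe"), ("has_capital", "Paris")],
    "Europe": [("contains", "France")],
}

def walk(start, query_relation, max_hops=3, episodes=200):
    # Monte-Carlo stand-in for the learned policy: sample short walks from the
    # query entity and count endpoints reached via an edge labelled with the
    # query relation. MINERVA instead scores each action with a neural policy
    # conditioned on the input query.
    counts = {}
    for _ in range(episodes):
        node = start
        for _ in range(max_hops):
            rel, nxt = random.choice(kg[node])
            if rel == query_relation:
                counts[nxt] = counts.get(nxt, 0) + 1
            node = nxt
    return max(counts, key=counts.get) if counts else None

# Query (France, has_capital, ?): walking the graph recovers the answer.
print(walk("France", "has_capital"))  # Paris
```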
Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions
Large language models (LLMs), such as OpenAI's Codex, have demonstrated their
potential to generate code from natural language descriptions across a wide
range of programming tasks. Several benchmarks have recently emerged to
evaluate the ability of LLMs to generate functionally correct code from natural
language intent with respect to a set of hidden test cases. This has enabled
the research community to identify significant and reproducible advancements in
LLM capabilities. However, there is currently a lack of benchmark datasets for
assessing the ability of LLMs to generate functionally correct code edits based
on natural language descriptions of intended changes. This paper aims to
address this gap by motivating NL2Fix, the problem of translating natural
language descriptions of code changes (namely, bug fixes described in issue
reports in repositories) into correct code fixes. To this end, we introduce
Defects4J-NL2Fix, a dataset of 283 Java programs from the popular Defects4J
dataset augmented with high-level descriptions of bug fixes, and empirically
evaluate the performance of several state-of-the-art LLMs for this task.
Results show that these LLMs collectively are capable of generating plausible
fixes for 64.6% of the bugs, and the best LLM-based technique achieves up to
21.20% top-1 and 35.68% top-5 accuracy on this benchmark.
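The top-k numbers can be read as follows (a generic helper with made-up pass/fail flags, not the paper's evaluation harness): a bug counts as solved at k if any of the model's k highest-ranked candidate fixes passes the hidden tests.

```python
def top_k_accuracy(results, k):
    # results: one list per bug of booleans ("candidate passes the hidden
    # tests"), ordered by the model's ranking of its candidate fixes.
    hits = sum(any(r[:k]) for r in results)
    return hits / len(results)

# Invented flags for three bugs, purely to show the metric's mechanics.
results = [
    [True, False, False],   # fixed by the top-ranked candidate
    [False, False, True],   # fixed only by the 3rd candidate
    [False, False, False],  # never fixed
]
print(top_k_accuracy(results, 1))  # 0.333...
print(top_k_accuracy(results, 5))  # 0.666...
```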
Bio-inspired Structure Identification in Language Embeddings
Word embeddings are a popular way to improve downstream performances in
contemporary language modeling. However, the underlying geometric structure of
the embedding space is not well understood. We present a series of explorations
using bio-inspired methodology to traverse and visualize word embeddings,
demonstrating evidence of discernible structure. Moreover, our model also
produces word similarity rankings that are plausible yet very different from
common similarity metrics, mainly cosine similarity and Euclidean distance. We
show that our bio-inspired model can be used to investigate how different word
embedding techniques result in different semantic outputs, which can emphasize
or obscure particular interpretations in textual data.
Comment: 7 pages, 8 figures, 2 tables; Visualisation for the Digital
Humanities 2020
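That different metrics induce different nearest-neighbour rankings over the same embeddings is easy to see on toy vectors (invented, two-dimensional for readability; the paper's bio-inspired model is a third, distinct ranking):

```python
import numpy as np

# Toy embeddings chosen so the two standard metrics disagree.
emb = {
    "a": np.array([1.0, 0.0]),
    "b": np.array([10.0, 1.0]),   # nearly the same direction as "a", but far away
    "c": np.array([0.5, 0.5]),    # close to "a", but a different direction
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def euclid(u, v):
    return float(np.linalg.norm(u - v))

query = emb["a"]
by_cosine = max(["b", "c"], key=lambda w: cosine(query, emb[w]))
by_euclid = min(["b", "c"], key=lambda w: euclid(query, emb[w]))
print(by_cosine, by_euclid)  # b c: the nearest neighbour depends on the metric
```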
Skill Rating for Generative Models
We explore a new way to evaluate generative models using insights from
evaluation of competitive games between human players. We show experimentally
that tournaments between generators and discriminators provide an effective way
to evaluate generative models. We introduce two methods for summarizing
tournament outcomes: tournament win rate and skill rating. Evaluations are
useful in different contexts, including monitoring the progress of a single
model as it learns during the training process, and comparing the capabilities
of two different fully trained models. We show that a tournament consisting of
a single model playing against past and future versions of itself produces a
useful measure of training progress. A tournament containing multiple separate
models (using different seeds, hyperparameters, and architectures) provides a
useful relative comparison between different trained GANs. Tournament-based
rating methods are conceptually distinct from numerous previous categories of
approaches to evaluation of generative models, and have complementary
advantages and disadvantages.
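Skill rating belongs to the family of chess-style rating systems; a minimal Elo-style sketch follows (the paper's exact rating update may differ, and the win/loss outcomes below are invented). Each round between a generator and a discriminator moves both ratings toward the observed result.

```python
def expected(r_a, r_b):
    # Probability that A beats B under the Elo logistic model.
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a, r_b, score_a, k=32):
    # score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    e_a = expected(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * ((1 - score_a) - (1 - e_a))

gen, disc = 1500.0, 1500.0
# Invented outcomes: the generator fools the discriminator in 7 of 10 rounds.
for s in [1, 1, 0, 1, 1, 1, 0, 1, 0, 1]:
    gen, disc = update(gen, disc, float(s))
print(round(gen), round(disc))  # the generator ends rated above the discriminator
```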
RoboCSE: Robot Common Sense Embedding
Autonomous service robots require computational frameworks that allow them to
generalize knowledge to new situations in a manner that models uncertainty
while scaling to real-world problem sizes. The Robot Common Sense Embedding
(RoboCSE) showcases a class of computational frameworks, multi-relational
embeddings, that have not been leveraged in robotics to model semantic
knowledge. We validate RoboCSE on a realistic home environment simulator
(AI2Thor) to measure how well it generalizes learned knowledge about object
affordances, locations, and materials. Our experiments show that RoboCSE makes
more accurate predictions than a baseline that uses pre-trained embeddings
such as Word2Vec, achieving statistically significant improvements while using
orders of magnitude less memory than our Bayesian Logic Network baseline. In
addition, we show that predictions made by RoboCSE are robust to significant
reductions in data available for training as well as domain transfer to
MatterPort3D, achieving statistically significant improvements over a baseline
that memorizes training data.
Comment: 7 pages, 7 figures, accepted for publication in the 2019 IEEE ICRA
proceedings
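Multi-relational embeddings score (head, relation, tail) triples in a shared vector space; the sketch below uses a TransE-style translation score with invented entities (RoboCSE's actual model and training procedure are not reproduced, only the scoring idea):

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented toy entities and relation.
dim = 4
entities = {"mug": rng.normal(size=dim), "kitchen": rng.normal(size=dim)}
relations = {"atLocation": rng.normal(size=dim)}

def score(h, r, t):
    # Lower is better: distance between the translated head and the tail.
    return float(np.linalg.norm(entities[h] + relations[r] - entities[t]))

# Stand-in for training: set the relation vector to the exact offset, so the
# known triple (mug, atLocation, kitchen) scores (near) zero.
relations["atLocation"] = entities["kitchen"] - entities["mug"]
print(round(score("mug", "atLocation", "kitchen"), 6))  # 0.0
```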
Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian Nonparametric Approach
A significant fraction of search queries originates from a real-world
information need or task. In order to improve the search experience of end
users, it is important to have accurate representations of tasks. As a result,
a significant amount of research has been devoted to extracting proper
representations of tasks in order to enable search systems to help users
complete their tasks, as well as to provide better query suggestions,
recommendations, satisfaction prediction, and task-level personalization.
Most existing task extraction
methodologies focus on representing tasks as flat structures. However, tasks
often tend to have multiple subtasks associated with them and a more
naturalistic representation of tasks would be in terms of a hierarchy, where
each task can be composed of multiple (sub)tasks. To this end, we propose an
efficient Bayesian nonparametric model for extracting hierarchies of such tasks
and subtasks. We evaluate our method on real-world query-log data through both
quantitative and crowdsourced experiments and highlight the importance of
considering task/subtask hierarchies.
Comment: 10 pages. Accepted at SIGIR 2017 as a full paper
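The flat-versus-hierarchical contrast can be made concrete with a toy tree (the task tree and query are invented, and the naive term-overlap assignment below stands in for the paper's Bayesian nonparametric inference):

```python
# A hierarchy: each task is composed of (sub)tasks, unlike a flat task list.
task_tree = {
    "plan wedding": {
        "wedding venue": {},
        "wedding dress": {},
    },
    "book travel": {
        "flight tickets": {},
        "hotel booking": {},
    },
}

def assign(query, tree, path=()):
    # Return (overlap, path) of the deepest node sharing the most query terms.
    tokens = set(query.split())
    best = (0, path)
    for name, children in tree.items():
        overlap = len(tokens & set(name.split()))
        candidate = (overlap, path + (name,))
        deeper = assign(query, children, path + (name,))
        best = max(best, candidate, deeper, key=lambda x: (x[0], len(x[1])))
    return best

# The query lands on a subtask, not just a top-level task.
print(assign("cheap wedding dress shops", task_tree)[1])
```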
On Inductive Abilities of Latent Factor Models for Relational Learning
Latent factor models are increasingly popular for modeling multi-relational
knowledge graphs. By their vectorial nature, it is not only hard to interpret
why this class of models works so well, but also to understand where they fail
and how they might be improved. We conduct an experimental survey of
state-of-the-art models, not towards a purely comparative end, but as a means
to get insight about their inductive abilities. To assess the strengths and
weaknesses of each model, we create simple tasks that exhibit first, atomic
properties of binary relations, and then, common inter-relational inference
through synthetic genealogies. Based on these experimental results, we propose
new research directions to improve on existing models.
Comment: 30+3 pages, submitted to the Journal of Artificial Intelligence
Research
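One atomic-property probe of the kind the survey describes, in miniature (random toy vectors; this is the well-known symmetry property of the DistMult scoring function, not the paper's experimental setup): DistMult scores a triple as the trilinear product of head, relation, and tail, which is symmetric in head and tail, so it cannot represent anti-symmetric relations.

```python
import numpy as np

rng = np.random.default_rng(2)
# Random toy vectors for one head, one relation, one tail.
h, r, t = (rng.normal(size=5) for _ in range(3))

def distmult(h, r, t):
    # DistMult score: sum_i h_i * r_i * t_i.
    return float(np.sum(h * r * t))

# Swapping head and tail leaves the score unchanged: forced symmetry.
print(np.isclose(distmult(h, r, t), distmult(t, r, h)))  # True
```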
Knowledge Graphs for Multilingual Language Translation and Generation
The Natural Language Processing (NLP) community has recently seen outstanding
progress, catalysed by the release of different Neural Network (NN)
architectures. Neural-based approaches have proven effective by significantly
increasing the output quality of a large number of automated solutions for NLP
tasks (Belinkov and Glass, 2019). Despite these notable advancements, dealing
with entities still poses a difficult challenge as they are rarely seen in
training data. Entities can be classified into two groups, i.e., proper nouns
and common nouns. Proper nouns are also known as Named Entities (NE) and
correspond to the name of people, organizations, or locations, e.g., John, WHO,
or Canada. Common nouns describe classes of objects, e.g., spoon or cancer.
Both types of entities can be found in a Knowledge Graph (KG). Recent work has
successfully exploited the contribution of KGs in NLP tasks, such as Natural
Language Inference (NLI) (KM et al.,2018) and Question Answering (QA) (Sorokin
and Gurevych, 2018). Only a few works had exploited the benefits of KGs in
Neural Machine Translation (NMT) when the work presented herein began.
Additionally, few works had studied the contribution of KGs to Natural Language
Generation (NLG) tasks. Moreover, multilinguality also remained an open
research area in these tasks (Young et al., 2018). In this thesis,
we focus on the use of KGs for machine translation and text generation to deal
with the problems caused by entities and consequently enhance the quality of
automatically generated texts.