Search CORE

2,996 research outputs found

Open Vocabulary Learning on Source Code with a Graph-Structured Cache

Author: Anandkumar Anima
Cvitkovic Milan
Singh Badal
Publication venue
Publication date: 19/05/2019
Field of study

Machine learning models that take computer program source code as input typically use Natural Language Processing (NLP) techniques. However, a major challenge is that code is written using an open, rapidly changing vocabulary due to, e.g., the coinage of new variable and method names. Reasoning over such a vocabulary is not something for which most NLP methods are designed. We introduce a Graph-Structured Cache to address this problem; this cache contains a node for each new word the model encounters with edges connecting each word to its occurrences in the code. We find that combining this graph-structured cache strategy with recent Graph-Neural-Network-based models for supervised learning on code improves the models' performance on a code completion task and a variable naming task --- with over

100\%

relative improvement on the latter --- at the cost of a moderate increase in computation time.Comment: Published in the International Conference on Machine Learning (ICML 2019), 13 page

arXiv.org e-Print Archive

Caltech Authors

Inference by Minimizing Size, Divergence, or their Sum

Author: McCallum Andrew
Riedel Sebastian
Smith David A.
Publication venue
Publication date: 01/01/2012
Field of study

We speed up marginal inference by ignoring factors that do not significantly contribute to overall accuracy. In order to pick a suitable subset of factors to ignore, we propose three schemes: minimizing the number of model factors under a bound on the KL divergence between pruned and full models; minimizing the KL divergence under a bound on factor count; and minimizing the weighted sum of KL divergence and factor count. All three problems are solved using an approximation of the KL divergence than can be calculated in terms of marginals computed on a simple seed graph. Applied to synthetic image denoising and to three different types of NLP parsing models, this technique performs marginal inference up to 11 times faster than loopy BP, with graph sizes reduced up to 98%-at comparable error in marginals and parsing accuracy. We also show that minimizing the weighted sum of divergence and size is substantially faster than minimizing either of the other objectives based on the approximation to divergence presented here.Comment: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010

arXiv.org e-Print Archive

CiteSeerX

ScholarWorks@UMass Amherst

Coarse-to-Fine Lifted MAP Inference in Computer Vision

Author: Anand Ankit
Habeeb Haroun
Mausam
Singla Parag
Publication venue: 'International Joint Conferences on Artificial Intelligence'
Publication date: 22/07/2017
Field of study

There is a vast body of theoretical research on lifted inference in probabilistic graphical models (PGMs). However, few demonstrations exist where lifting is applied in conjunction with top of the line applied algorithms. We pursue the applicability of lifted inference for computer vision (CV), with the insight that a globally optimal (MAP) labeling will likely have the same label for two symmetric pixels. The success of our approach lies in efficiently handling a distinct unary potential on every node (pixel), typical of CV applications. This allows us to lift the large class of algorithms that model a CV problem via PGM inference. We propose a generic template for coarse-to-fine (C2F) inference in CV, which progressively refines an initial coarsely lifted PGM for varying quality-time trade-offs. We demonstrate the performance of C2F inference by developing lifted versions of two near state-of-the-art CV algorithms for stereo vision and interactive image segmentation. We find that, against flat algorithms, the lifted versions have a much superior anytime performance, without any loss in final solution quality.Comment: Published in IJCAI 201

arXiv.org e-Print Archive

Crossref

When Are Tree Structures Necessary for Deep Learning of Representations?

Author: Hovy Eudard
Jurafsky Dan
Li Jiwei
Luong Minh-Thang
Publication venue
Publication date: 01/01/2015
Field of study

Recursive neural models, which use syntactic parse trees to recursively generate representations bottom-up, are a popular architecture. But there have not been rigorous evaluations showing for exactly which tasks this syntax-based method is appropriate. In this paper we benchmark {\bf recursive} neural models against sequential {\bf recurrent} neural models (simple recurrent and LSTM models), enforcing apples-to-apples comparison as much as possible. We investigate 4 tasks: (1) sentiment classification at the sentence level and phrase level; (2) matching questions to answer-phrases; (3) discourse parsing; (4) semantic relation extraction (e.g., {\em component-whole} between nouns). Our goal is to understand better when, and why, recursive models can outperform simpler models. We find that recursive models help mainly on tasks (like semantic relation extraction) that require associating headwords across a long distance, particularly on very long sequences. We then introduce a method for allowing recurrent models to achieve similar performance: breaking long sentences into clause-like units at punctuation and processing them separately before combining. Our results thus help understand the limitations of both classes of models, and suggest directions for improving recurrent models

arXiv.org e-Print Archive

CiteSeerX

Crossref

Integrative Use of Information Extraction, Semantic Matchmaking and Adaptive Coupling Techniques in Support of Distributed Information Processing and Decision-Making

Author: Braines Dave
Kao Anne
Poteet Stephen
Preece Alun
Smart Paul R
Xue Ping
Publication venue
Publication date: 17/08/2011
Field of study

In order to press maximal cognitive benefit from their social, technological and informational environments, military coalitions need to understand how best to exploit available information assets as well as how best to organize their socially-distributed information processing activities. The International Technology Alliance (ITA) program is beginning to address the challenges associated with enhanced cognition in military coalition environments by integrating a variety of research and development efforts. In particular, research in one component of the ITA ('Project 4: Shared Understanding and Information Exploitation') is seeking to develop capabilities that enable military coalitions to better exploit and distribute networked information assets in the service of collective cognitive outcomes (e.g. improved decision-making). In this paper, we provide an overview of the various research activities in Project 4. We also show how these research activities complement one another in terms of supporting coalition-based collective cognition

Southampton (e-Prints Soton)