Information-gain computation
Despite large incentives, correctness in software remains an elusive goal.
Declarative programming techniques, where algorithms are derived from a
specification of the desired behavior, offer hope to address this problem,
since programming in terms of specifications rather than algorithms yields a
combinatorial reduction in complexity, and arbitrary desired properties can
be expressed and enforced in specifications directly. However, limitations on
performance have prevented programming with declarative specifications from
becoming a mainstream technique for general-purpose programming. To address the
performance bottleneck in deriving an algorithm from a specification, I propose
information-gain computation, a framework where an adaptive evaluation strategy
is used to efficiently perform a search which derives algorithms that provide
information about a query most directly. Within this framework, opportunities
to compress the search space present themselves, which suggest that
information-theoretic bounds on the performance of such a system might be
articulated and a system designed to achieve them. In a preliminary empirical
study of adaptive evaluation for a simple test program, the evaluation strategy
adapts successfully to evaluate a query efficiently.
Comment: Accepted at PLP 2017 (http://www.stoics.org.uk/plp/plp2017/). Revised version of a project for Phys 256B @ Davis (http://csc.ucdavis.edu/~chaos/courses/ncaso/). Reduces to practice ideas previously presented in arXiv:1505.00002 and at https://www.meetup.com/SF-Types-Theorems-and-Programming-Languages/events/232908199
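The abstract leaves the adaptive evaluation strategy itself abstract. Purely as an illustration of the general idea of a search ordered by expected information gain about the query, here is a minimal Python sketch; the candidate interface (`answers`), `expand`, and `info_gain` are hypothetical stand-ins, not the paper's actual framework.

```python
import heapq

def information_gain_search(query, initial_candidates, expand, info_gain):
    """Best-first search that expands the derivation step estimated to be
    most informative about the query (hypothetical sketch, not the paper's
    actual algorithm)."""
    # Max-heap via negated scores: most informative candidate comes first.
    frontier = [(-info_gain(query, c), i, c)
                for i, c in enumerate(initial_candidates)]
    heapq.heapify(frontier)
    counter = len(frontier)
    while frontier:
        _, _, cand = heapq.heappop(frontier)
        if cand.answers(query):          # derivation resolves the query
            return cand
        for nxt in expand(cand):         # alternative evaluation steps
            heapq.heappush(frontier, (-info_gain(query, nxt), counter, nxt))
            counter += 1
    return None                          # query not derivable
```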
Variational Particle Approximations
Approximate inference in high-dimensional, discrete probabilistic models is a
central problem in computational statistics and machine learning. This paper
describes discrete particle variational inference (DPVI), a new approach that
combines key strengths of Monte Carlo, variational and search-based techniques.
DPVI is based on a novel family of particle-based variational approximations
that can be fit using simple, fast, deterministic search techniques. Like Monte
Carlo, DPVI can handle multiple modes, and yields exact results in a
well-defined limit. Like unstructured mean-field, DPVI is based on optimizing a
lower bound on the partition function; when this quantity is not of intrinsic
interest, it facilitates convergence assessment and debugging. Like both Monte
Carlo and combinatorial search, DPVI can take advantage of factorization,
sequential structure, and custom search operators. This paper defines the DPVI
particle-based approximation family and its partition function lower bounds, along
with the sequential DPVI and local DPVI algorithm templates for optimizing
them. DPVI is illustrated and evaluated via experiments on lattice Markov
Random Fields, nonparametric Bayesian mixtures and block-models, and parametric
as well as non-parametric hidden Markov models. Results include applications to
real-world spike-sorting and relational modeling problems, and show that DPVI
can offer appealing time/accuracy trade-offs as compared to multiple
alternatives.
Comment: First two authors contributed equally to this work.
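For intuition about the bound being optimized: for a discrete model with unnormalized probability p~ and partition function Z, any set S of distinct assignments gives log sum_{x in S} p~(x) <= log Z, so searching for high-probability particles directly tightens the bound. A minimal Python sketch for a toy, fully enumerable model follows; the brute-force enumeration and all names are ours, not the paper's sequential or local templates.

```python
import math
from itertools import product

def dpvi_greedy(log_p_tilde, domains, k):
    """Pick k distinct assignments maximizing the particle-based lower bound
    log sum_{x in S} p~(x) <= log Z (toy sketch for an enumerable model)."""
    candidates = list(product(*domains))        # all joint assignments
    scored = sorted(candidates, key=log_p_tilde, reverse=True)
    particles = scored[:k]                      # top-k maximizes the sum
    bound = math.log(sum(math.exp(log_p_tilde(x)) for x in particles))
    # Particle weights proportional to unnormalized probability.
    weights = [math.exp(log_p_tilde(x) - bound) for x in particles]
    return particles, weights, bound
```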
GMNN: Graph Markov Neural Networks
This paper studies semi-supervised object classification in relational data,
which is a fundamental problem in relational data modeling. The problem has
been extensively studied in the literature of both statistical relational
learning (e.g. relational Markov networks) and graph neural networks (e.g.
graph convolutional networks). Statistical relational learning methods can
effectively model the dependency of object labels through conditional random
fields for collective classification, whereas graph neural networks learn
effective object representations for classification through end-to-end
training. In this paper, we propose the Graph Markov Neural Network (GMNN) that
combines the advantages of both worlds. A GMNN models the joint distribution of
object labels with a conditional random field, which can be effectively trained
with the variational EM algorithm. In the E-step, one graph neural network
learns effective object representations for approximating the posterior
distributions of object labels. In the M-step, another graph neural network is
used to model the local label dependency. Experiments on object classification,
link classification, and unsupervised node representation learning show that
GMNN achieves state-of-the-art results.
Comment: ICML 2019
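A rough Python sketch of that EM alternation is below, with gnn_q (features-to-labels) and gnn_p (neighbor-labels-to-labels) as opaque trainable models; the .fit/.predict_proba interface and all names are hypothetical, not from the paper.

```python
def train_gmnn(gnn_q, gnn_p, graph, features, labels, labeled_idx, epochs=10):
    """Variational-EM sketch for GMNN. gnn_q and gnn_p are opaque trainable
    models with hypothetical .fit / .predict_proba methods; idx=None means
    "train on all nodes"."""
    # Warm-start q with supervised training on the labeled nodes.
    gnn_q.fit(graph, inputs=features, targets=labels, idx=labeled_idx)
    for _ in range(epochs):
        # q infers label beliefs for every node from node features.
        beliefs = gnn_q.predict_proba(graph, inputs=features)
        beliefs[labeled_idx] = labels[labeled_idx]   # clamp observed labels
        # M-step: p learns local label dependency -- predict a node's label
        # from its neighbors' (believed) labels.
        gnn_p.fit(graph, inputs=beliefs, targets=beliefs, idx=None)
        # E-step: refit q toward p's smoothed predictions plus observations.
        targets = gnn_p.predict_proba(graph, inputs=beliefs)
        targets[labeled_idx] = labels[labeled_idx]
        gnn_q.fit(graph, inputs=features, targets=targets, idx=None)
    return gnn_q, gnn_p
```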
A Credit Assignment Compiler for Joint Prediction
Many machine learning applications involve jointly predicting multiple
mutually dependent output variables. Learning to search is a family of methods
where the complex decision problem is cast into a sequence of decisions via a
search space. Although these methods have shown promise both in theory and in
practice, implementing them has been cumbersome. In this paper, we
show the search space can be defined by an arbitrary imperative program,
turning learning to search into a credit assignment compiler. Together with
algorithmic improvements to the compiler, we radically reduce the
complexity of programming and the running time. We demonstrate the feasibility
of our approach on multiple joint prediction tasks. In all cases, we obtain
accuracies as high as alternative approaches, at drastically reduced execution
and programming time.
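The central idea, that the search space is just an ordinary program whose predict() calls are the decisions, can be illustrated with a short Python sketch; the names are ours, not the paper's library API.

```python
def pos_tag(sentence, predict):
    """The search space as an ordinary imperative program: each call to
    predict() is one decision the learner must get right (sketch in the
    learning-to-search style)."""
    tags = []
    for word in sentence:
        prev = tags[-1] if tags else "<s>"
        tags.append(predict({"word": word, "prev_tag": prev}))
    return tags

# At training time, a credit assignment compiler re-executes this same
# program while swapping reference (oracle) and learned policies into
# `predict`, charging each decision for the loss it causes; at test time
# `predict` is simply the learned classifier.
```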
Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering
We propose a new class of probabilistic neural-symbolic models, that have
symbolic functional programs as a latent, stochastic variable. Instantiated in
the context of visual question answering, our probabilistic formulation offers
two key conceptual advantages over prior neural-symbolic models for VQA.
Firstly, the programs generated by our model are more understandable while
requiring fewer teaching examples. Secondly, we show that one can
pose counterfactual scenarios to the model, to probe its beliefs on the
programs that could lead to a specified answer given an image. Our results on
the CLEVR and SHAPES datasets verify our hypotheses, showing that the model
gets better program (and answer) prediction accuracy even in the low data
regime, and allows one to probe the coherence and consistency of reasoning
performed.
Comment: ICML 2019 Camera Ready + Appendix
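As a loose sketch of the "program as latent stochastic variable" formulation: sample candidate functional programs conditioned on the question, execute each symbolically on the image, and aggregate the answers. Here `q_model`, `executor`, and the voting scheme are hypothetical stand-ins, not the paper's inference procedure.

```python
def vqa_answer(question, image, q_model, executor, n_samples=10):
    """Programs as latent stochastic variables (hypothetical sketch)."""
    votes = {}
    for _ in range(n_samples):
        program = q_model.sample(question)     # latent, stochastic program
        answer = executor.run(program, image)  # symbolic execution on image
        votes[answer] = votes.get(answer, 0) + 1
    return max(votes, key=votes.get)           # most frequent answer
```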
Hinge-Loss Markov Random Fields and Probabilistic Soft Logic
A fundamental challenge in developing high-impact machine learning
technologies is balancing the need to model rich, structured domains with the
ability to scale to big data. Many important problem areas are both richly
structured and large scale, from social and biological networks, to knowledge
graphs and the Web, to images, video, and natural language. In this paper, we
introduce two new formalisms for modeling structured data, and show that they
can both capture rich structure and scale to big data. The first, hinge-loss
Markov random fields (HL-MRFs), is a new kind of probabilistic graphical model
that generalizes different approaches to convex inference. We unite three
approaches from the randomized algorithms, probabilistic graphical models, and
fuzzy logic communities, showing that all three lead to the same inference
objective. We then define HL-MRFs by generalizing this unified objective. The
second new formalism, probabilistic soft logic (PSL), is a probabilistic
programming language that makes HL-MRFs easy to define using a syntax based on
first-order logic. We introduce an algorithm for inferring most-probable
variable assignments (MAP inference) that is much more scalable than
general-purpose convex optimization methods, because it uses message passing to
take advantage of sparse dependency structures. We then show how to learn the
parameters of HL-MRFs. The learned HL-MRFs are as accurate as analogous
discrete models, but much more scalable. Together, these algorithms enable
HL-MRFs and PSL to model rich, structured data at scales not previously
possible.
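For reference, the hinge-loss potentials the abstract alludes to give a density of the following form over variables relaxed to [0, 1]; to the best of our reading this matches the standard HL-MRF definition, where each l_j is a linear function of the variables, lambda_j >= 0 is a weight, and p_j in {1, 2}.

```latex
P(\mathbf{Y} \mid \mathbf{X}) \;\propto\;
\exp\!\Big( -\sum_{j=1}^{m} \lambda_j \,
\max\{\ell_j(\mathbf{Y}, \mathbf{X}),\, 0\}^{\,p_j} \Big),
\qquad p_j \in \{1, 2\}
```

In PSL, a weighted first-order rule such as Friend(A,B) & Votes(A,P) -> Votes(B,P) is relaxed via Lukasiewicz logic into exactly such a hinge potential, which is why MAP inference remains a convex problem.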
Adaptive Neighborhood Graph Construction for Inference in Multi-Relational Networks
A neighborhood graph, which represents the instances as vertices and their
relations as weighted edges, is the basis of many semi-supervised and
relational models for node labeling and link prediction. Most methods employ a
sequential process to construct the neighborhood graph. This process often
consists of generating a candidate graph, pruning the candidate graph to make a
neighborhood graph, and then performing inference on the variables (i.e.,
nodes) in the neighborhood graph. In this paper, we propose a framework that
can dynamically adapt the neighborhood graph based on the states of variables
from intermediate inference results, as well as structural properties of the
relations connecting them. A key strength of our framework is its ability to
handle multi-relational data and employ varying amounts of relations for each
instance based on the intermediate inference results. We formulate the link
prediction task as inference on neighborhood graphs, and include preliminary
results illustrating the effects of different strategies in our proposed
framework.
Comment: Presented at the SIGKDD 12th International Workshop on Mining and Learning with Graphs (MLG'16).
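The adapt-while-inferring loop described above can be sketched in a few lines of Python; `infer` and `adapt` are our hypothetical stand-ins for the framework's components, not the paper's API.

```python
def adaptive_inference(candidate_graph, relations, infer, adapt, max_iters=10):
    """Alternate between inference on the current neighborhood graph and
    re-pruning/re-weighting its edges using intermediate beliefs (sketch)."""
    graph = adapt(candidate_graph, relations, beliefs=None)  # initial pruning
    for _ in range(max_iters):
        beliefs = infer(graph)                   # states of the variables
        new_graph = adapt(candidate_graph, relations, beliefs)
        if new_graph == graph:                   # neighborhood graph stabilized
            return beliefs, graph
        graph = new_graph
    return infer(graph), graph
```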
Cognitive Database: A Step towards Endowing Relational Databases with Artificial Intelligence Capabilities
We propose Cognitive Databases, an approach for transparently enabling
Artificial Intelligence (AI) capabilities in relational databases. A novel
aspect of our design is to first view the structured data source as meaningful
unstructured text, and then use the text to build an unsupervised neural
network model using a Natural Language Processing (NLP) technique called word
embedding. This model captures the hidden inter-/intra-column relationships
between database tokens of different types. For each database token, the model
includes a vector that encodes contextual semantic relationships. We seamlessly
integrate the word embedding model into existing SQL query infrastructure and
use it to enable a new class of SQL-based analytics queries called cognitive
intelligence (CI) queries. CI queries use the model vectors to enable complex
queries such as semantic matching, inductive reasoning queries such as
analogies, predictive queries using entities not present in a database, and,
more generally, using knowledge from external sources. We demonstrate unique
capabilities of Cognitive Databases using an Apache Spark based prototype to
execute inductive reasoning CI queries over a multi-modal database containing
text and images. We believe our first-of-a-kind system exemplifies using AI
functionality to endow relational databases with capabilities that were
previously very hard to realize in practice.
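The paper integrates this inside the SQL engine on Apache Spark; purely as an outside illustration of the textification-plus-embedding idea, one could approximate the pipeline in Python with gensim. This is a substitution for exposition, not the paper's system; the column-tagged token format is our own convention.

```python
from gensim.models import Word2Vec

# "Textify" each row: every cell becomes a token tagged with its column,
# so the model learns inter- and intra-column co-occurrence structure.
rows = [
    {"name": "alice", "dept": "sales",   "city": "nyc"},
    {"name": "bob",   "dept": "sales",   "city": "sf"},
    {"name": "carol", "dept": "finance", "city": "nyc"},
]
sentences = [[f"{col}:{val}" for col, val in row.items()] for row in rows]

model = Word2Vec(sentences, vector_size=32, window=5, min_count=1, epochs=200)

# A "cognitive" similarity query: which tokens appear in contexts most
# similar to alice's row context?
print(model.wv.most_similar("name:alice", topn=2))
```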
Janossy Pooling: Learning Deep Permutation-Invariant Functions for Variable-Size Inputs
We consider a simple and overarching representation for permutation-invariant
functions of sequences (or multiset functions). Our approach, which we call
Janossy pooling, expresses a permutation-invariant function as the average of a
permutation-sensitive function applied to all reorderings of the input
sequence. This allows us to leverage the rich and mature literature on
permutation-sensitive functions to construct novel and flexible
permutation-invariant functions. If carried out naively, Janossy pooling can be
computationally prohibitive. To allow computational tractability, we consider
three kinds of approximations: canonical orderings of sequences, functions with
k-order interactions, and stochastic optimization algorithms with random
permutations. Our framework unifies a variety of existing work in the
literature, and suggests possible modeling and algorithmic extensions. We
explore a few in our experiments, which demonstrate improved performance over
current state-of-the-art methods.
Comment: This version clarifies and adds detail to some of the arguments.
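The definition is simple enough to state in code. A minimal Python sketch of both the exact (factorially expensive) form and the random-permutation approximation mentioned in the abstract; the function names are ours.

```python
import itertools
import random
import numpy as np

def janossy_pool(x, f, num_perms=None, rng=random):
    """Janossy pooling: average a permutation-sensitive function f over
    orderings of the multiset x. num_perms=None gives the exact definition
    (all |x|! orderings); otherwise a random-permutation approximation."""
    x = list(x)
    if num_perms is None:
        perms = itertools.permutations(x)
    else:
        perms = (rng.sample(x, len(x)) for _ in range(num_perms))
    return np.mean([f(p) for p in perms], axis=0)

# A deliberately order-sensitive f: exact pooling is still invariant.
f = lambda seq: sum(10 * a + b for a, b in zip(seq, seq[1:]))
print(janossy_pool([1, 2, 3], f), janossy_pool([3, 1, 2], f))  # equal (44.0)
```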
Ensemble Relational Learning based on Selective Propositionalization
Dealing with structured data requires expressive representation formalisms,
which, however, raises the problem of the computational complexity of the
machine learning process. Furthermore, real-world domains require tools able
to manage their typical uncertainty. Many statistical relational learning
approaches address these problems by combining the construction of relevant
relational features with a probabilistic tool. When the combination is static
(static propositionalization), the constructed features are treated as boolean
features and used offline as input to a statistical learner; when the
combination is dynamic (dynamic propositionalization), feature construction
and the probabilistic tool are combined into a single process. In this paper
we propose a selective propositionalization method that searches for the
optimal set of relational features to be used by a probabilistic learner in
order to minimize a loss function. The new propositionalization approach has
been combined with the random subspace ensemble method. Experiments on
real-world datasets show the validity of the proposed method.
Comment: 10 pages. arXiv admin note: text overlap with arXiv:1006.518
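The random subspace ensemble mentioned above trains each base learner on a random subset of the (boolean, propositionalized) features. A generic scikit-learn sketch of that ensemble step, with a synthetic boolean matrix standing in for the paper's actual relational feature construction:

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Stand-in for propositionalization: X is a boolean matrix whose columns
# indicate whether each relational feature (clause) covers each example.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 50)).astype(bool)
y = (X[:, 0] & ~X[:, 3]) | X[:, 7]        # synthetic target over the features

# Random subspace method: each tree sees a random half of the features,
# with no bootstrap resampling of the examples.
ensemble = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=25,
    max_features=0.5,
    bootstrap=False,
)
ensemble.fit(X, y)
print(ensemble.score(X, y))
```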