Learning parameters for the sequence constraint from solutions
This paper studies the problem of learning parameters for
global constraints such as Sequence from a small set of positive examples.
The proposed technique computes the probability of observing a
given constraint in a random solution. This probability is used to select
the most likely constraint from a list of candidates. The learning method
can be applied to both soft and hard constraints.
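To make the selection criterion concrete, here is a minimal Python sketch of the ranking idea: among candidates consistent with every positive example, prefer the constraint least likely to hold in a random solution. The helpers `sample_solution` and `sequence_at_most` are illustrative assumptions of this sketch, not the paper's implementation.

```python
import random

def satisfaction_probability(constraint, sample_solution, n_samples=10_000):
    """Monte Carlo estimate of the probability that a uniformly random
    solution satisfies `constraint`."""
    hits = sum(constraint(sample_solution()) for _ in range(n_samples))
    return hits / n_samples

def rank_candidates(candidates, examples, sample_solution):
    """Keep candidates consistent with every positive example, then prefer
    the one least likely to hold by chance (the most specific one)."""
    consistent = [c for c in candidates if all(c(e) for e in examples)]
    return sorted(consistent,
                  key=lambda c: satisfaction_probability(c, sample_solution))

# Toy usage: learn the capacity parameter of a Sequence-like constraint
# ("every window of width 3 contains at most k ones") over binary strings.
def sample_solution():
    return [random.randint(0, 1) for _ in range(8)]

def sequence_at_most(k, width=3):
    return lambda s: all(sum(s[i:i + width]) <= k
                         for i in range(len(s) - width + 1))

candidates = [sequence_at_most(k) for k in (1, 2, 3)]
examples = [[0, 1, 0, 0, 1, 0, 0, 1]]
best = rank_candidates(candidates, examples, sample_solution)[0]
```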
Exterior-point Optimization for Nonconvex Learning
In this paper we present the nonconvex exterior-point optimization solver
(NExOS), a novel first-order algorithm tailored to constrained nonconvex
learning problems. We consider the problem of minimizing a convex function
over a nonconvex constraint set, where the projection onto the set is
single-valued around local minima. A wide range of nonconvex learning
problems has this structure, including (but not limited to) sparse and
low-rank
optimization problems. By exploiting the underlying geometry of the constraint
set, NExOS finds a locally optimal point by solving a sequence of penalized
problems with strictly decreasing penalty parameters. NExOS solves each
penalized problem by applying a first-order algorithm, which converges linearly
to a local minimum of the corresponding penalized formulation under regularity
conditions. Furthermore, the local minima of the penalized problems converge to
a local minimum of the original problem as the penalty parameter goes to zero.
We implement NExOS in the open-source Julia package NExOS.jl, which has been
extensively tested on many instances from a wide variety of learning problems.
We demonstrate that our algorithm, in spite of being general purpose,
outperforms specialized methods on several examples of well-known nonconvex
learning problems involving sparse and low-rank optimization. For sparse
regression problems, NExOS finds locally optimal solutions that dominate
glmnet in terms of support recovery while achieving a training loss that is
smaller by an order of magnitude. For low-rank optimization with real-world
data, NExOS recovers solutions with a three-fold reduction in training loss
and a proportion of explained variance that is twice as good as that of the
nuclear norm heuristic. Comment: 40 pages, 6 figures.
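As a rough illustration of the exterior-point scheme described above, the sketch below tackles a sparse regression instance: each stage minimizes the loss plus a squared-distance penalty to the sparsity set, the penalty parameter shrinks between stages, and each stage warm-starts from the previous point. The step size and schedule are illustrative choices of this sketch, not NExOS.jl defaults.

```python
import numpy as np

def project_sparse(x, k):
    """Euclidean projection onto {x : ||x||_0 <= k}: keep the k entries of
    largest magnitude (single-valued away from ties, as the method assumes)."""
    z = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    z[idx] = x[idx]
    return z

def exterior_point(A, b, k, mus=(1.0, 0.1, 0.01, 1e-3), iters=500):
    """Illustrative exterior-point loop: minimize
    0.5*||Ax - b||^2 + (1/(2*mu)) * dist^2(x, S) over a strictly decreasing
    penalty schedule `mus`, warm-starting each stage at the previous point."""
    x = np.zeros(A.shape[1])
    L = np.linalg.norm(A, 2) ** 2     # Lipschitz constant of the loss gradient
    for mu in mus:
        step = 1.0 / (L + 1.0 / mu)   # safe step size for the penalized problem
        for _ in range(iters):
            # gradient of dist^2/2 is x - projection(x) where single-valued
            grad = A.T @ (A @ x - b) + (x - project_sparse(x, k)) / mu
            x -= step * grad
    return project_sparse(x, k)       # land on the constraint set

# Toy usage: recover a 3-sparse signal from noisy linear measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20); x_true[:3] = (1.0, -2.0, 1.5)
b = A @ x_true + 0.01 * rng.standard_normal(50)
x_hat = exterior_point(A, b, k=3)
```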
Inference with Constrained Hidden Markov Models in PRISM
A Hidden Markov Model (HMM) is a common statistical model that is widely
used for the analysis of biological sequence data and other sequential phenomena.
In the present paper we show how HMMs can be extended with side-constraints and
present constraint solving techniques for efficient inference. Defining HMMs
with side-constraints in Constraint Logic Programming has advantages in terms
of more compact expression and pruning opportunities during inference.
We present a PRISM-based framework for extending HMMs with side-constraints
and show how well-known constraints such as cardinality and all-different are
integrated. We experimentally validate our approach on the biologically
motivated problem of global pairwise alignment.
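To show how a side-constraint can be folded into HMM inference, here is a generic Python sketch, not the paper's PRISM framework: Viterbi decoding under a cardinality constraint, obtained by augmenting the dynamic-programming state with a counter for how often a designated state has been visited. Probabilities are assumed strictly positive.

```python
import numpy as np

def constrained_viterbi(obs, pi, T, E, max_visits, counted_state=1):
    """Most likely state path for observations `obs`, subject to the
    side-constraint that `counted_state` is entered at most `max_visits`
    times.  The DP table is indexed by (time, state, visit count)."""
    n, n_states = len(obs), len(pi)
    delta = np.full((n, n_states, max_visits + 1), -np.inf)
    back = np.zeros((n, n_states, max_visits + 1, 2), dtype=int)
    for s in range(n_states):
        c = int(s == counted_state)
        if c <= max_visits:
            delta[0, s, c] = np.log(pi[s]) + np.log(E[s, obs[0]])
    for t in range(1, n):
        for s in range(n_states):
            inc = int(s == counted_state)
            for c in range(inc, max_visits + 1):
                prev = delta[t - 1, :, c - inc] + np.log(T[:, s])
                p = int(np.argmax(prev))
                delta[t, s, c] = prev[p] + np.log(E[s, obs[t]])
                back[t, s, c] = (p, c - inc)
    # Trace back from the best feasible terminal cell.
    s, c = np.unravel_index(int(np.argmax(delta[-1])), delta[-1].shape)
    path = [int(s)]
    for t in range(n - 1, 0, -1):
        s, c = back[t, s, c]
        path.append(int(s))
    return path[::-1]
```

The same state-augmentation idea extends to other automaton-representable constraints; a constraint solver can additionally prune (state, count) cells that cannot lead to any feasible completion, which is the pruning opportunity the abstract alludes to.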
Solution Path Clustering with Adaptive Concave Penalty
Fast accumulation of large amounts of complex data has created a need for
more sophisticated statistical methodologies to discover interesting patterns
and better extract information from these data. The large scale of the data
often results in challenging high-dimensional estimation problems where only a
minority of the data shows specific grouping patterns. To address these
emerging challenges, we develop a new clustering methodology that introduces
the idea of a regularization path into unsupervised learning. A regularization
path for a clustering problem is created by varying the degree of the
sparsity constraint imposed on the differences between objects via the minimax
concave penalty with adaptive tuning parameters. Instead of providing a single
solution represented by a cluster assignment for each object, the method
produces a short sequence of solutions that determines not only the cluster
assignment but also a corresponding number of clusters for each solution. The
optimization of the penalized loss function is carried out through an MM
algorithm with block coordinate descent. The advantages of this clustering
algorithm compared to other existing methods are as follows: it does not
require the number of clusters as input; it is capable of simultaneously
separating irrelevant or noisy observations that show no grouping pattern,
which can greatly improve data interpretation; it is a general methodology that
can be applied to many clustering problems. We test this method on various
simulated datasets and on gene expression data, where it shows better or
competitive performance compared with several existing clustering methods.
Comment: 36 pages.
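The sketch below illustrates one point on such a solution path, under simplifying assumptions: the minimax concave penalty on pairwise centroid differences is majorized by a weighted quadratic (the MM step), and the centroids are then updated one block at a time. The adaptive tuning of the penalty parameters described in the paper is omitted.

```python
import numpy as np

def mcp_deriv(t, lam, gamma):
    """Derivative of the minimax concave penalty: lam - t/gamma on
    [0, gamma*lam] and zero beyond, where the penalty flattens out."""
    return np.maximum(lam - t / gamma, 0.0)

def solution_path_point(X, lam, gamma=3.0, sweeps=50, eps=1e-8):
    """One point on the clustering path: approximately minimize
        0.5 * sum_i ||x_i - u_i||^2 + sum_{i<j} MCP(||u_i - u_j||)
    by majorizing the concave penalty with a weighted quadratic and then
    updating each centroid u_i in turn (MM with block coordinate descent)."""
    n = X.shape[0]
    U = X.astype(float).copy()
    for _ in range(sweeps):
        D = np.linalg.norm(U[:, None, :] - U[None, :, :], axis=2)
        W = mcp_deriv(D, lam, gamma) / (2.0 * np.maximum(D, eps))  # MM weights
        np.fill_diagonal(W, 0.0)
        for i in range(n):     # closed-form block update of centroid i
            U[i] = (X[i] + 2.0 * W[i] @ U) / (1.0 + 2.0 * W[i].sum())
    return U   # rows that (numerically) coincide share a cluster

# Sweeping lam over a grid traces the path from many singleton clusters
# (small lam) toward coarser groupings (larger lam); noisy points keep
# their own centroids once the penalty flattens, setting outliers aside.
```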
Combination Strategies for Semantic Role Labeling
This paper introduces and analyzes a battery of inference models for the
problem of semantic role labeling: one based on constraint satisfaction, and
several strategies that model the inference as a meta-learning problem using
discriminative classifiers. These classifiers are developed with a rich set of
novel features that encode proposition and sentence-level information. To our
knowledge, this is the first work that: (a) performs a thorough analysis of
learning-based inference models for semantic role labeling, and (b) compares
several inference strategies in this context. We evaluate the proposed
inference strategies in the framework of the CoNLL-2005 shared task using only
automatically generated syntactic information. The extensive experimental
evaluation and analysis indicate that all the proposed inference strategies
are successful: they all outperform the current best results reported in the
CoNLL-2005 evaluation exercise, but each of the proposed approaches has its
advantages and disadvantages. Several important traits of a state-of-the-art
SRL combination strategy emerge from this analysis: (i) individual models
should be combined at the granularity of candidate arguments rather than at the
granularity of complete solutions; (ii) the best combination strategy uses an
inference model based on learning; and (iii) the learning-based inference
benefits from max-margin classifiers and global feedback.
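As an illustration of trait (i), here is a minimal sketch of combination at candidate-argument granularity: the arguments proposed by the individual models are pooled and a high-scoring, non-overlapping subset is selected. Greedy selection stands in for the learning-based inference model the paper advocates; the `Candidate` structure and scores are assumptions of this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    start: int      # token span of the proposed argument (inclusive)
    end: int
    label: str      # e.g. "A0" or "AM-TMP"
    score: float    # meta-classifier confidence for this argument

def combine_arguments(candidates):
    """Pool candidate arguments from all individual models and keep a
    high-scoring subset whose spans do not overlap (greedy stand-in for
    a learned inference model)."""
    chosen = []
    for c in sorted(candidates, key=lambda c: c.score, reverse=True):
        if all(c.end < s.start or c.start > s.end for s in chosen):
            chosen.append(c)
    return sorted(chosen, key=lambda c: c.start)

# Toy usage: two models proposed overlapping A0 spans for the same verb;
# the higher-scoring span wins and the rest of the labeling is kept.
pool = [Candidate(0, 1, "A0", 0.91), Candidate(0, 2, "A0", 0.55),
        Candidate(4, 6, "A1", 0.87), Candidate(5, 6, "AM-TMP", 0.40)]
labeling = combine_arguments(pool)   # -> spans (0,1,"A0") and (4,6,"A1")
```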