Integrated Inference and Learning of Neural Factors in Structural Support Vector Machines
Tackling pattern recognition problems in areas such as computer vision,
bioinformatics, speech or text recognition is often done best by taking into
account task-specific statistical relations between output variables. In
structured prediction, this internal structure is used to predict multiple
outputs simultaneously, leading to more accurate and coherent predictions.
Structural support vector machines (SSVMs) are nonprobabilistic models that
optimize a joint input-output function through margin-based learning. Because
SSVMs generally disregard the interplay between unary and interaction factors
during the training phase, final parameters are suboptimal. Moreover, their
factors are often restricted to linear combinations of input features, limiting
their generalization power. To improve prediction accuracy, this paper proposes:
(i) Joint inference and learning by integration of back-propagation and
loss-augmented inference in SSVM subgradient descent; (ii) Extending SSVM
factors to neural networks that form highly nonlinear functions of input
features. Image segmentation benchmark results demonstrate improvements over
conventional SSVM training methods in terms of accuracy, highlighting the
feasibility of end-to-end SSVM training with neural factors.
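
As a rough illustration of the conventional training loop this abstract builds on, the sketch below runs SSVM subgradient descent with loss-augmented inference on a toy labelling task. It is not the paper's method: the factors here are linear rather than neural, inference is brute-force enumeration, and the feature map, Hamming loss, step size, and toy data are all assumptions made for the example.

```python
from itertools import product

LABELS = (0, 1)

def features(x, y):
    """Hypothetical joint feature map: two unary weights plus one interaction weight."""
    phi = [0.0, 0.0, 0.0]
    for xi, yi in zip(x, y):
        phi[yi] += xi                                   # unary factor, one weight per label
    phi[2] = sum(1.0 for a, b in zip(y, y[1:]) if a == b)  # neighbours that agree
    return phi

def score(w, x, y):
    return sum(wi * fi for wi, fi in zip(w, features(x, y)))

def hamming(y, y_true):
    return sum(a != b for a, b in zip(y, y_true))

def loss_augmented_inference(w, x, y_true):
    """Most violating labelling: argmax over y of score(y) + loss(y, y_true)."""
    candidates = product(LABELS, repeat=len(x))
    return max(candidates, key=lambda y: score(w, x, y) + hamming(y, y_true))

def subgradient_step(w, x, y_true, lr=0.1, reg=0.01):
    """One subgradient step on the regularised structured hinge loss."""
    y_hat = loss_augmented_inference(w, x, y_true)
    phi_hat, phi_true = features(x, y_hat), features(x, y_true)
    return [wi - lr * (reg * wi + fh - ft)
            for wi, fh, ft in zip(w, phi_hat, phi_true)]

def predict(w, x):
    return max(product(LABELS, repeat=len(x)), key=lambda y: score(w, x, y))

# Toy data (assumed): positive inputs should receive label 1.
data = [((1.0, 1.0, -1.0), (1, 1, 0)), ((-1.0, -1.0, 1.0), (0, 0, 1))]
w = [0.0, 0.0, 0.0]
for _ in range(20):
    for x, y_true in data:
        w = subgradient_step(w, x, y_true)
```

In the approach the abstract describes, the linear `score` would be replaced by neural factors, and the same subgradient signal would be back-propagated through them.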
Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs
In this work we propose a structured prediction technique that combines the
virtues of Gaussian Conditional Random Fields (G-CRF) with Deep Learning: (a)
our structured prediction task has a unique global optimum that is obtained
exactly from the solution of a linear system, (b) the gradients of our model
parameters are analytically computed using closed form expressions, in contrast
to the memory-demanding contemporary deep structured prediction approaches that
rely on back-propagation-through-time, (c) our pairwise terms do not have to be
simple hand-crafted expressions, as in the line of works building on the
DenseCRF, but can rather be 'discovered' from data through deep architectures,
and (d) our system can be trained in an end-to-end manner. Building on standard
tools from numerical analysis we develop very efficient algorithms for
inference and learning, as well as a customized technique adapted to the
semantic segmentation task. This efficiency allows us to explore more
sophisticated architectures for structured prediction in deep learning: we
introduce multi-resolution architectures to couple information across scales in
a joint optimization framework, yielding systematic improvements. We
demonstrate the utility of our approach on the challenging VOC PASCAL 2012
image segmentation benchmark, showing substantial improvements over strong
baselines. We make all of our code and experiments available at
https://github.com/siddharthachandra/gcrf.
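
Point (a) can be made concrete with a small sketch. Assuming the energy is a quadratic of the form E(x) = 0.5 x^T (A + lam I) x - b^T x with A + lam I positive definite (this form, the names `A`, `b`, `lam`, and the toy values are assumptions for illustration), the unique minimiser solves (A + lam I) x = b. The abstract only mentions "standard tools from numerical analysis"; conjugate gradient is used here as one such tool, not as a claim about the authors' implementation.

```python
def matvec(H, v):
    return [sum(h * x for h, x in zip(row, v)) for row in H]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(H, b, tol=1e-10, max_iter=100):
    """Solve H x = b for a symmetric positive definite H."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - H x, with x = 0 initially
    p = list(r)              # search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        Hp = matvec(H, p)
        alpha = rs / dot(p, Hp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hpi for ri, hpi in zip(r, Hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# Toy problem (assumed values): A plays the role of the pairwise term, b the unary term.
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
lam = 1.0
H = [[A[i][j] + (lam if i == j else 0.0) for j in range(3)] for i in range(3)]
b = [1.0, 0.0, 1.0]
x_star = conjugate_gradient(H, b)   # exact minimiser of the quadratic energy
```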
Solving the Optimal Trading Trajectory Problem Using a Quantum Annealer
We solve a multi-period portfolio optimization problem using D-Wave Systems'
quantum annealer. We derive a formulation of the problem, discuss several
possible integer encoding schemes, and present numerical examples that show
high success rates. The formulation incorporates transaction costs (including
permanent and temporary market impact), and, significantly, the solution does
not require the inversion of a covariance matrix. The discrete multi-period
portfolio optimization problem we solve is significantly harder than the
continuous variable problem. We present insight into how results may be
improved using suitable software enhancements, and why current quantum
annealing technology limits the size of problem that can be successfully solved
today. The formulation presented is specifically designed to be scalable, with
the expectation that as quantum annealing technology improves, larger problems
will be solvable using the same techniques.
Comment: 7 pages; expanded and update
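
The abstract does not name its integer encoding schemes. The sketch below shows two common ways to represent an integer holding with the binary variables an annealer works on, a binary (power-of-two) encoding and a unary (counting) encoding; both are given as assumptions about what such schemes can look like, and the function names are hypothetical.

```python
def binary_encode(value, num_bits):
    """Binary encoding: value = sum of 2**k * bit_k, least significant bit first."""
    if not 0 <= value < 2 ** num_bits:
        raise ValueError("value does not fit in num_bits")
    return [(value >> k) & 1 for k in range(num_bits)]

def binary_decode(bits):
    return sum(bit << k for k, bit in enumerate(bits))

def unary_encode(value, num_bits):
    """Unary encoding: value = number of bits set to 1."""
    if not 0 <= value <= num_bits:
        raise ValueError("value does not fit in num_bits")
    return [1] * value + [0] * (num_bits - value)

def unary_decode(bits):
    return sum(bits)
```

Binary encoding needs the fewest variables but uses coefficients that grow as powers of two, while unary encoding keeps every coefficient equal to one at the cost of one variable per unit of holding.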
A Domain-Specific Language and Editor for Parallel Particle Methods
Domain-specific languages (DSLs) are of increasing importance in scientific
high-performance computing to reduce development costs, raise the level of
abstraction and, thus, ease scientific programming. However, designing and
implementing DSLs is not an easy task, as it requires knowledge of the
application domain and experience in language engineering and compilers.
Consequently, many DSLs follow a weak approach using macros or text generators,
which lack many of the features that make a DSL comfortable for programmers.
Some of these features---e.g., syntax highlighting, type inference, error
reporting, and code completion---are easily provided by language workbenches,
which combine language engineering techniques and tools in a common ecosystem.
In this paper, we present the Parallel Particle-Mesh Environment (PPME), a DSL
and development environment for numerical simulations based on particle methods
and hybrid particle-mesh methods. PPME uses the meta programming system (MPS),
a projectional language workbench. PPME is the successor of the Parallel
Particle-Mesh Language (PPML), a Fortran-based DSL that used conventional
implementation strategies. We analyze and compare both languages and
demonstrate how the programmer's experience can be improved using static
analyses and projectional editing. Furthermore, we present an explicit domain
model for particle abstractions and the first formal type system for particle
methods.
Comment: Submitted to ACM Transactions on Mathematical Software on Dec. 25,
201
DOPE: Distributed Optimization for Pairwise Energies
We formulate an Alternating Direction Method of Multipliers (ADMM) that
systematically distributes the computations of any technique for optimizing
pairwise functions, including non-submodular potentials. Such discrete
functions are very useful in segmentation and a breadth of other vision
problems. Our method decomposes the problem into a large set of small
sub-problems, each involving a sub-region of the image domain, which can be
solved in parallel. We achieve consistency between the sub-problems through a
novel constraint that can be used for a large class of pairwise functions. We
give an iterative numerical solution that alternates between solving the
sub-problems and updating consistency variables, until convergence. We report
comprehensive experiments, which demonstrate the benefit of our general
distributed solution in the case of the popular serial algorithm of Boykov and
Kolmogorov (BK algorithm) and, also, in the context of non-submodular
functions.
Comment: Accepted at CVPR 201
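
The alternation between solving sub-problems and updating consistency variables can be sketched with textbook consensus ADMM. This is a deliberately simplified stand-in, not the paper's formulation: the sub-problems here are continuous scalar quadratics with closed-form solutions rather than discrete pairwise energies over image sub-regions, and `targets`, `rho`, and the iteration count are assumptions.

```python
def consensus_admm(targets, rho=1.0, iterations=50):
    """Minimise sum_i 0.5 * (x - targets[i])**2 by splitting into local sub-problems."""
    n = len(targets)
    x = [0.0] * n      # local copies, one per sub-problem
    u = [0.0] * n      # scaled dual variables enforcing consistency
    z = 0.0            # shared consensus variable
    for _ in range(iterations):
        # Each sub-problem is independent and could be solved in parallel.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        # Consensus update: average of the local solutions plus duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual update: accumulate each copy's disagreement with the consensus.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return z, x

z, local = consensus_admm([1.0, 2.0, 6.0])
```

For this toy objective the consensus value approaches the mean of `targets`, which gives a simple check that the local copies have been driven into agreement.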
Automatic linearity detection
Given a function, or more generally an operator, the question "Is it linear?" seems simple to answer. In many applications of scientific computing it might be worth determining the answer to this question in an automated way; some functionality, such as operator exponentiation, is only defined for linear operators, and in other problems, time savings are available if it is known that the problem being solved is linear. Linearity detection is closely connected to sparsity detection of Hessians, so for large-scale applications, memory savings can be made if linearity information is known. However, implementing such an automated detection is not as straightforward as one might expect. This paper describes how automatic linearity detection can be implemented in combination with automatic differentiation, both for standard scientific computing software, and within the Chebfun software system. The key ingredients for the method are the observation that linear operators have constant derivatives, and the propagation of two logical vectors as computations are carried out. The values of these vectors are determined by whether output variables have constant derivatives and constant values with respect to each input variable. The propagation of their values through an evaluation trace of an operator yields the desired information about the linearity of that operator.
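
The flag propagation described above might look roughly like the following operator-overloading sketch. It tracks only the two logical vectors, not numerical values or derivatives, and the class name `Traced`, the flag names, and the propagation rules are assumptions for illustration rather than the paper's or Chebfun's implementation. Note that a constant derivative with respect to one input establishes affine dependence on that input alone; for example, `x * y` is flagged as having a constant derivative in each input separately even though it is not jointly linear.

```python
class Traced:
    """Per-input flags: const_val[i] (value independent of input i) and
    const_der[i] (derivative with respect to input i is constant in input i)."""

    def __init__(self, const_val, const_der):
        self.const_val = list(const_val)
        self.const_der = list(const_der)

    @classmethod
    def input(cls, index, n):
        const_val = [True] * n
        const_val[index] = False          # the variable depends on itself
        return cls(const_val, [True] * n)

    @classmethod
    def constant(cls, n):
        return cls([True] * n, [True] * n)

    def _coerce(self, other):
        if isinstance(other, Traced):
            return other
        return Traced.constant(len(self.const_val))   # plain numbers are constants

    def __add__(self, other):
        o = self._coerce(other)
        return Traced([a and b for a, b in zip(self.const_val, o.const_val)],
                      [a and b for a, b in zip(self.const_der, o.const_der)])

    __radd__ = __add__

    def __mul__(self, other):
        o = self._coerce(other)
        const_val = [a and b for a, b in zip(self.const_val, o.const_val)]
        # A product stays affine in input i only if one factor is constant in i
        # and the other has a constant derivative in i.
        const_der = [(ucv and vcd) or (ucd and vcv)
                     for ucv, ucd, vcv, vcd in zip(self.const_val, self.const_der,
                                                   o.const_val, o.const_der)]
        return Traced(const_val, const_der)

    __rmul__ = __mul__

def sin(u):
    """Nonlinear function: the derivative is constant only where u itself is constant."""
    return Traced(u.const_val, u.const_val)

x, y = Traced.input(0, 2), Traced.input(1, 2)
f = 3 * x + y        # constant derivative in both inputs
g = x * x + y        # nonlinear in x, affine in y
h = sin(x) + 2 * y   # nonlinear in x, affine in y
```

Inspecting `f.const_der`, `g.const_der`, and `h.const_der` after evaluating the expressions gives the per-input result for each operator.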