Scalable Full Flow with Learned Binary Descriptors
We propose a method for large displacement optical flow in which local
matching costs are learned by a convolutional neural network (CNN) and a
smoothness prior is imposed by a conditional random field (CRF). We tackle the
computation- and memory-intensive operations on the 4D cost volume with a
min-projection, which reduces memory complexity from quadratic to linear, and
with binary descriptors for efficient matching. This enables evaluation of the
cost on the fly and allows learning and CRF inference on high-resolution
images without ever storing the 4D cost volume. To address the problem of
learning binary descriptors we propose a new hybrid learning scheme. In
contrast to current state-of-the-art approaches for learning binary CNNs, we can
compute the exact non-zero gradient within our model. We compare several
methods for training binary descriptors and show results on publicly available
benchmarks. Comment: GCPR 201
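As a rough illustration of the two ideas named in this abstract, the sketch below computes Hamming matching costs between binary descriptors on the fly and keeps only their min-projections for one pixel. It assumes that the min-projection means taking, for each horizontal displacement u, the minimum cost over all vertical displacements v (and vice versa); that reading, the function names, and the data layout are assumptions for illustration and are not taken from the paper.

```python
def hamming(a: int, b: int) -> int:
    """Matching cost between two binary descriptors packed into integers."""
    return bin(a ^ b).count("1")


def min_projected_costs(desc1, desc2, x, y, radius):
    """Min-project the displacement costs of pixel (x, y) (hypothetical helper).

    desc1, desc2: 2D lists with one packed binary descriptor (int) per pixel.
    Returns (cost_u, cost_v), each of length 2 * radius + 1, where
      cost_u[i] = min over v of cost(u_i, v)
      cost_v[j] = min over u of cost(u, v_j)
    Only O(radius) values are stored instead of the O(radius^2) full window.
    """
    height, width = len(desc2), len(desc2[0])
    size = 2 * radius + 1
    cost_u = [float("inf")] * size
    cost_v = [float("inf")] * size
    for j, v in enumerate(range(-radius, radius + 1)):
        for i, u in enumerate(range(-radius, radius + 1)):
            tx, ty = x + u, y + v
            if not (0 <= tx < width and 0 <= ty < height):
                continue  # this displacement leaves the second image
            # The cost is evaluated on the fly and never stored per (u, v).
            c = hamming(desc1[y][x], desc2[ty][tx])
            cost_u[i] = min(cost_u[i], c)
            cost_v[j] = min(cost_v[j], c)
    return cost_u, cost_v
```

For a search window of side 2r + 1, this stores 2(2r + 1) numbers per pixel rather than (2r + 1)^2, which is one way to read the "quadratic to linear" claim.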
Probabilistic Programming in Python using PyMC
Probabilistic programming (PP) allows flexible specification of Bayesian
statistical models in code. PyMC3 is a new, open-source PP framework with an
intuitive and readable, yet powerful, syntax that is close to the natural syntax
statisticians use to describe models. It features next-generation Markov chain
Monte Carlo (MCMC) sampling algorithms such as the No-U-Turn Sampler (NUTS;
Hoffman, 2014), a self-tuning variant of Hamiltonian Monte Carlo (HMC; Duane,
1987). Probabilistic programming in Python confers a number of advantages
including multi-platform compatibility, an expressive yet clean and readable
syntax, easy integration with other scientific libraries, and extensibility via
C, C++, Fortran or Cython. These features make it relatively straightforward to
write and use custom statistical distributions, samplers and transformation
functions, as required by Bayesian analysis.
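To make the sampler family mentioned above more concrete, here is a minimal sketch of plain Hamiltonian Monte Carlo for a one-dimensional target. It is not PyMC3's API and it is not NUTS: the step size and number of leapfrog steps are fixed by hand, whereas NUTS tunes the path length automatically. The function name `hmc_sample` and its parameters are hypothetical.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, start, n_samples,
               step=0.2, n_leapfrog=8, seed=0):
    """Plain HMC with a unit-mass Gaussian momentum (illustrative only)."""
    rng = random.Random(seed)
    q = start
    samples = []
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)          # resample the momentum
        q_new, p = q, p0
        # Leapfrog integration: half step in p, full steps in q, half step in p.
        p += 0.5 * step * grad_log_prob(q_new)
        for k in range(n_leapfrog):
            q_new += step * p
            if k < n_leapfrog - 1:
                p += step * grad_log_prob(q_new)
        p += 0.5 * step * grad_log_prob(q_new)
        # Metropolis correction using the Hamiltonian H = -log_prob + p^2 / 2.
        h_old = -log_prob(q) + 0.5 * p0 * p0
        h_new = -log_prob(q_new) + 0.5 * p * p
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples


# Usage: draw from a standard normal, whose log-density is -q^2 / 2 up to a constant.
draws = hmc_sample(lambda q: -0.5 * q * q, lambda q: -q, start=0.0, n_samples=2000)
```

In a probabilistic programming framework the log-density and its gradient would be derived from the model specification rather than written by hand as they are here.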
Efficient Multi-way Theta-Join Processing Using MapReduce
Multi-way Theta-join queries are powerful in describing complex relations and
are therefore widely used in practice. However, existing solutions from
traditional distributed and parallel databases for multi-way Theta-join queries
cannot be easily extended to fit a shared-nothing distributed computing
paradigm, which has been shown to support OLAP applications over immense
data volumes. In this work, we study the problem of efficient processing of
multi-way Theta-join queries using MapReduce from a cost-effective perspective.
Although some prior work uses the (key, value) pair-based
programming model to support join operations, efficient processing of multi-way
Theta-join queries has never been fully explored. The substantial challenge
lies in, given a number of processing units (that can run Map or Reduce tasks),
mapping a multi-way Theta-join query to a number of MapReduce jobs and having
them executed in a well-scheduled sequence, such that the total processing time
span is minimized. Our solution mainly includes two parts: 1) cost metrics for
both a single MapReduce job and a number of MapReduce jobs executed in a certain
order; 2) the efficient execution of a chain-typed Theta-join with only one
MapReduce job. Compared with the query evaluation strategy proposed in [23]
and the widely adopted Pig Latin and Hive SQL solutions, our method achieves
a significant improvement in join processing efficiency. Comment: VLDB201
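For readers unfamiliar with the setting, the sketch below simulates how a two-way Theta-join can be expressed as a single MapReduce job by tiling the cross product over a grid of reducers. It is a generic illustration, not the multi-way or chain-typed method of this paper; the function name, the grid parameters, and the round-robin bucket assignment are assumptions.

```python
from collections import defaultdict
from operator import lt


def theta_join_mapreduce(r_rows, s_rows, theta, grid_rows=2, grid_cols=2):
    """Simulate a one-job Theta-join over a grid of reducers (illustrative)."""
    # Map + shuffle: each reducer key (row, col) collects a slice of R and of S.
    shuffled = defaultdict(lambda: ([], []))
    for i, r in enumerate(r_rows):
        # An R tuple belongs to one grid row and is replicated to every column.
        for col in range(grid_cols):
            shuffled[(i % grid_rows, col)][0].append(r)
    for j, s in enumerate(s_rows):
        # An S tuple belongs to one grid column and is replicated to every row.
        for row in range(grid_rows):
            shuffled[(row, j % grid_cols)][1].append(s)
    # Reduce: every (r, s) pair meets at exactly one reducer, which applies theta.
    output = []
    for key in sorted(shuffled):
        r_part, s_part = shuffled[key]
        output.extend((r, s) for r in r_part for s in s_part if theta(r, s))
    return output


# Usage: join R and S on the inequality predicate r < s.
pairs = theta_join_mapreduce([1, 5, 3], [2, 4], lt)
```

The replication in the map phase is the cost such schemes trade against parallelism in the reduce phase, which is why cost metrics for a job matter in this setting.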