1,701 research outputs found
Shampoo: Preconditioned Stochastic Tensor Optimization
Preconditioned gradient methods are among the most general and powerful tools
in optimization. However, preconditioning requires storing and manipulating
prohibitively large matrices. We describe and analyze a new structure-aware
preconditioning algorithm, called Shampoo, for stochastic optimization over
tensor spaces. Shampoo maintains a set of preconditioning matrices, each of
which operates on a single dimension, contracting over the remaining
dimensions. We establish convergence guarantees in the stochastic convex
setting, the proof of which builds upon matrix trace inequalities. Our
experiments with state-of-the-art deep learning models show that Shampoo is
capable of converging considerably faster than commonly used optimizers.
Although it involves a more complex update rule, Shampoo's runtime per step is
comparable to that of simple gradient methods such as SGD, AdaGrad, and Adam.
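The abstract's description of one preconditioner per dimension can be illustrated for the matrix (order-2 tensor) case. The following is a minimal sketch, not the authors' implementation: the function names, the learning rate, the `eps` initialization, and the toy quadratic objective are all assumptions chosen for illustration, and the inverse fourth root is computed by a plain eigendecomposition rather than any optimized routine.

```python
import numpy as np

def inv_root(mat, p):
    """Return mat^(-1/p) for a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.maximum(vals, 1e-12)  # guard against round-off
    return (vecs * vals ** (-1.0 / p)) @ vecs.T

def shampoo_step(W, G, L, R, lr=0.5):
    """One Shampoo-style update for a matrix parameter W with gradient G.

    L accumulates row statistics (contracting over columns) and
    R accumulates column statistics (contracting over rows).
    """
    L = L + G @ G.T
    R = R + G.T @ G
    W = W - lr * inv_root(L, 4) @ G @ inv_root(R, 4)
    return W, L, R

def demo(steps=200, lr=0.5, eps=1e-4):
    """Minimize the toy objective 0.5 * ||W - A||^2 (an assumed example)."""
    rng = np.random.default_rng(0)
    A = rng.normal(size=(3, 4))
    W = np.zeros((3, 4))
    L = eps * np.eye(3)  # left preconditioner, one per row dimension
    R = eps * np.eye(4)  # right preconditioner, one per column dimension
    losses = []
    for _ in range(steps):
        G = W - A  # gradient of the toy objective
        losses.append(0.5 * float(np.sum(G ** 2)))
        W, L, R = shampoo_step(W, G, L, R, lr)
    return losses
```

For a 3 x 4 parameter this stores a 3 x 3 and a 4 x 4 matrix instead of a single 12 x 12 preconditioner, which is the memory saving the abstract alludes to.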
Direct fluorophore conjugation to genomic DNA for microarray-based epigenomic profiling
A methodology for microarray-based epigenomic profiling is presented. The method relies on platinum-based fluorescence labeling reagents for direct (non-enzymatic) labeling of DNA and RNA. This is a work in progress, and only preliminary data are presented here.
Memory-Efficient Adaptive Optimization
Adaptive gradient-based optimizers such as Adagrad and Adam are crucial for
achieving state-of-the-art performance in machine translation and language
modeling. However, these methods maintain second-order statistics for each
parameter, thus introducing significant memory overheads that restrict the size
of the model being used as well as the number of examples in a mini-batch. We
describe an effective and flexible adaptive optimization method with greatly
reduced memory overhead. Our method retains the benefits of per-parameter
adaptivity while allowing significantly larger models and batch sizes. We give
convergence guarantees for our method, and demonstrate its effectiveness in
training very large translation and language models with up to 2-fold speedups
compared to the state of the art.
Approximate reasoning for real-time probabilistic processes
We develop a pseudo-metric analogue of bisimulation for generalized
semi-Markov processes. The kernel of this pseudo-metric corresponds to
bisimulation; thus we have extended bisimulation for continuous-time
probabilistic processes to a much broader class of distributions than
exponential distributions. This pseudo-metric gives a useful handle on
approximate reasoning in the presence of numerical information -- such as
probabilities and time -- in the model. We give a fixed point characterization
of the pseudo-metric. This makes available coinductive reasoning principles for
reasoning about distances. We demonstrate that our approach is insensitive to
potentially ad hoc articulations of distance by showing that it is intrinsic to
an underlying uniformity. We provide a logical characterization of this
uniformity using a real-valued modal logic. We show that several quantitative
properties of interest are continuous with respect to the pseudo-metric. Thus,
if two processes are metrically close, then observable quantitative properties
of interest are indeed close. Comment: Preliminary version appeared in QEST 0
Accurate Iris Localization Using Edge Map Generation and Adaptive Circular Hough Transform for Less Constrained Iris Images
This paper proposes an accurate iris localization algorithm for iris images acquired under near-infrared (NIR) illumination that contain noise from eyelids, eyelashes, lighting reflections, non-uniform illumination, eyeglasses, and eyebrow hair. The two main contributions of the paper are an edge map generation technique for pupil boundary detection and an adaptive circular Hough transform (CHT) algorithm for limbic boundary detection, which make iris localization not only more accurate but also faster. The edge map for pupil boundary detection is generated as the intersection (logical AND) of two binary edge maps obtained using thresholding, morphological operations, and Sobel edge detection, which results in minimal false edges caused by the noise. The adaptive CHT algorithm for limbic boundary detection searches for a set of two arcs in an image instead of a full circle, which counters occlusion of the iris by the eyelids and eyelashes. The proposed CHT and adaptive CHT implementations for pupil and limbic boundary detection, respectively, use a two-dimensional accumulator array that reduces memory requirements. The proposed algorithm achieves accuracies of 99.7% and 99.38% on the challenging CASIA-Iris-Thousand (version 4.0) and CASIA-Iris-Lamp (version 3.0) databases, respectively. The average time cost per image is 905 ms. The proposed algorithm is compared with previous work and shows better results.
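The edge-map intersection idea can be sketched in a few lines. This is an illustrative guess at the general scheme, not the paper's pipeline: the helper names, the thresholds, the 3 x 3 structuring element, and the synthetic test image (a dark disk standing in for the pupil, plus a bright stripe standing in for noise) are all assumptions.

```python
import numpy as np

def neighbours(img):
    """Stack the 3x3 neighbourhood of every pixel (edge-padded)."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)])

def sobel_edges(img, thresh):
    """Binary edge map from the Sobel gradient magnitude."""
    n = neighbours(img.astype(float))
    gx = (n[2] + 2 * n[5] + n[8]) - (n[0] + 2 * n[3] + n[6])
    gy = (n[6] + 2 * n[7] + n[8]) - (n[0] + 2 * n[1] + n[2])
    return np.hypot(gx, gy) > thresh

def mask_boundary(mask):
    """Morphological boundary: 3x3 dilation minus 3x3 erosion."""
    n = neighbours(mask)
    return n.any(axis=0) & ~n.all(axis=0)

def demo():
    """Synthetic image: dark 'pupil' disk and a bright 'noise' stripe."""
    yy, xx = np.mgrid[:64, :64]
    img = np.full((64, 64), 200.0)
    img[:, 50:] = 250.0                                   # spurious edge
    img[(yy - 32) ** 2 + (xx - 32) ** 2 <= 100] = 20.0    # pupil, radius 10
    sob = sobel_edges(img, thresh=100.0)     # finds pupil AND stripe edges
    ring = mask_boundary(img < 60.0)         # thresholding + morphology
    return sob, sob & ring                   # logical AND keeps pupil only
```

In this toy setup the Sobel map alone also fires on the stripe, while the intersected map keeps only pixels near the disk boundary, which is the false-edge suppression the abstract describes.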
MNL-Bandit in non-stationary environments
In this paper, we study the MNL-Bandit problem in a non-stationary
environment and present an algorithm with a worst-case expected regret of
. Here is the number of arms, is the number of
changes and is a variation measure of the unknown
parameters. Furthermore, we show matching lower bounds on the expected regret
(up to logarithmic factors), implying that our algorithm is optimal. Our
approach builds upon the epoch-based algorithm for stationary MNL-Bandit in
Agrawal et al. 2016. However, non-stationarity poses several challenges and we
introduce new techniques and ideas to address these. In particular, we give a
tight characterization for the bias introduced in the estimators due to
non-stationarity and derive new concentration bounds.