Geometric Ergodicity of Gibbs Samplers in Bayesian Penalized Regression Models
We consider three Bayesian penalized regression models and show that the
respective deterministic scan Gibbs samplers are geometrically ergodic
regardless of the dimension of the regression problem. We prove geometric
ergodicity of the Gibbs samplers for the Bayesian fused lasso, the Bayesian
group lasso, and the Bayesian sparse group lasso. Geometric ergodicity along
with a moment condition results in the existence of a Markov chain central
limit theorem for Monte Carlo averages and ensures reliable output analysis.
Our geometric ergodicity results also allow us to provide default starting
values for the Gibbs samplers.
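A deterministic scan updates the components in a fixed order on every sweep. As a minimal illustration of that structure (on a toy bivariate normal target, not on the penalized regression posteriors analyzed above), the following sketch alternates the two full conditional draws and reports Monte Carlo averages:

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter=5000, x0=0.0, y0=0.0, seed=0):
    """Deterministic-scan Gibbs sampler for a standard bivariate normal
    with correlation rho: every sweep updates x | y, then y | x, in a
    fixed order (the 'deterministic scan')."""
    rng = np.random.default_rng(seed)
    sd = np.sqrt(1.0 - rho**2)  # conditional standard deviation
    x, y = x0, y0
    draws = np.empty((n_iter, 2))
    for t in range(n_iter):
        x = rng.normal(rho * y, sd)  # x | y ~ N(rho*y, 1 - rho^2)
        y = rng.normal(rho * x, sd)  # y | x ~ N(rho*x, 1 - rho^2)
        draws[t] = x, y
    return draws

draws = gibbs_bivariate_normal(rho=0.5)
print(draws.mean(axis=0))  # Monte Carlo averages, near (0, 0)
```

Geometric ergodicity of the chain (together with a moment condition) is what licenses treating such averages with a central limit theorem during output analysis.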
High-Dimensional Screening Using Multiple Grouping of Variables
Screening is the problem of finding a superset of the set of non-zero entries
in an unknown p-dimensional vector \beta* given n noisy observations.
Naturally, we want this superset to be as small as possible. We propose a novel
framework for screening, which we refer to as Multiple Grouping (MuG), that
groups variables, performs variable selection over the groups, and repeats
this process multiple times to estimate a sequence of sets, each of which
contains the non-zero entries in \beta*. Screening is done by taking the
intersection of
all these estimated sets. The MuG framework can be used in conjunction with any
group based variable selection algorithm. In the high-dimensional setting,
where p >> n, we show that when MuG is used with the group Lasso estimator,
screening can be consistently performed without using any tuning parameter. Our
numerical simulations clearly show the merits of using the MuG framework in
practice.

Comment: This paper will appear in the IEEE Transactions on Signal Processing.
See http://www.ima.umn.edu/~dvats/MuGScreening.html for more detail.
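The MuG meta-procedure described above (random grouping, group-level selection, intersection across repetitions) can be sketched as follows. The group selector here is a simple correlation score, a hypothetical stand-in for the group Lasso estimator the paper actually uses, and the median threshold is likewise only an illustration:

```python
import numpy as np

def mug_screen(X, y, n_repeats=20, group_size=4, seed=0):
    """Sketch of the MuG framework: repeatedly form random groups of
    variables, select groups, and intersect the selected index sets.
    The per-group score |X_g^T y| is a stand-in for a real group-based
    selector such as the group Lasso."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    kept = set(range(p))
    for _ in range(n_repeats):
        perm = rng.permutation(p)
        groups = [perm[i:i + group_size] for i in range(0, p, group_size)]
        # Score each group; keep variables in the higher-scoring half.
        scores = np.array([np.linalg.norm(X[:, g].T @ y) for g in groups])
        thresh = np.median(scores)
        selected = set()
        for g, s in zip(groups, scores):
            if s > thresh:
                selected.update(g.tolist())
        kept &= selected  # screening: intersect across repetitions
    return sorted(kept)

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 40))
beta = np.zeros(40)
beta[:3] = 5.0
y = X @ beta + 0.1 * rng.standard_normal(100)
# The true support {0, 1, 2} should survive every intersection.
print(mug_screen(X, y))
```

Intersecting across repetitions is what shrinks the superset: a noise variable must land in a selected group in every single repetition to survive, which becomes exponentially unlikely as the number of repetitions grows.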
Revisiting the Gelman-Rubin Diagnostic
Gelman and Rubin's (1992) convergence diagnostic is one of the most popular
methods for terminating a Markov chain Monte Carlo (MCMC) sampler. Since the
seminal paper, researchers have developed sophisticated methods for estimating
the variance of Monte Carlo averages. We show that these estimators find immediate
use in the Gelman-Rubin statistic, a connection not previously established in
the literature. We incorporate these estimators to upgrade both the univariate
and multivariate Gelman-Rubin statistics, leading to improved stability in MCMC
termination time. An immediate advantage is that our new Gelman-Rubin statistic
can be calculated for a single chain. In addition, we establish a one-to-one
relationship between the Gelman-Rubin statistic and effective sample size.
Leveraging this relationship, we develop a principled termination criterion for
the Gelman-Rubin statistic. Finally, we demonstrate the utility of our improved
diagnostic via examples.
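For reference, the classical (1992) univariate diagnostic that the paper revisits compares between-chain and within-chain variance. A minimal sketch of that original statistic (not the upgraded, replicated-batch-means-style version the paper develops):

```python
import numpy as np

def gelman_rubin(chains):
    """Classical univariate Gelman-Rubin statistic from m parallel chains
    of length n: compares between-chain variance B with the average
    within-chain variance W via the potential scale reduction factor."""
    chains = np.asarray(chains)              # shape (m, n)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    B = n * chain_means.var(ddof=1)          # between-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(0)
mixed = rng.standard_normal((4, 2000))       # four well-mixed chains
print(gelman_rubin(mixed))                   # close to 1 when chains agree
```

Note that this classical form needs multiple chains; computing it from a single chain is precisely what the improved variance estimators discussed above make possible.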
On-the-fly Historical Handwritten Text Annotation
The performance of information retrieval algorithms depends upon the
availability of ground truth labels annotated by experts. This is an important
prerequisite, and difficulties arise when the annotated ground truth labels are
incorrect or incomplete due to high levels of degradation. To address this
problem, this paper presents a simple method to perform on-the-fly annotation
of degraded historical handwritten text in ancient manuscripts. The proposed
method aims at quickly generating ground truth and correcting inaccurate
annotations so that the bounding box tightly encapsulates the word and
contains no added noise from the background or surroundings. This method will
potentially be of help to historians and researchers in generating and
correcting word labels in a document dynamically. The effectiveness of the
annotation method is empirically evaluated on an archival manuscript collection
from well-known publicly available datasets.
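One concrete instance of the kind of correction described above, shrinking a loose word box so it contains only ink and no background, can be sketched with a simple crop-to-ink rule. This is only an illustration of the idea, not the paper's method; the threshold and box convention are assumptions:

```python
import numpy as np

def tighten_bbox(image, bbox, ink_threshold=128):
    """Shrink a loose (x0, y0, x1, y1) bounding box on a grayscale page to
    the smallest box containing all 'ink' pixels (darker than the
    threshold), discarding surrounding background."""
    x0, y0, x1, y1 = bbox
    crop = image[y0:y1, x0:x1]
    ys, xs = np.nonzero(crop < ink_threshold)  # rows/cols holding ink
    if len(xs) == 0:
        return bbox  # no ink found; leave the box unchanged
    return (int(x0 + xs.min()), int(y0 + ys.min()),
            int(x0 + xs.max() + 1), int(y0 + ys.max() + 1))

# A 20x20 white page with a dark 'word' on rows 5-9, columns 3-12.
page = np.full((20, 20), 255, dtype=np.uint8)
page[5:10, 3:13] = 0
print(tighten_bbox(page, (0, 0, 20, 20)))  # -> (3, 5, 13, 10)
```

On degraded manuscripts a fixed global threshold would of course pick up noise; this is where the paper's on-the-fly, interactive correction becomes necessary.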
Learning Surrogate Models of Document Image Quality Metrics for Automated Document Image Processing
Computation of document image quality metrics often depends upon the
availability of a ground truth image corresponding to the document. This limits
the applicability of quality metrics in applications such as hyperparameter
optimization of image processing algorithms that operate on-the-fly on unseen
documents. This work proposes the use of surrogate models to learn the behavior
of a given document quality metric on existing datasets where ground truth
images are available. The trained surrogate model can later be used to predict
the metric value on previously unseen document images without requiring access
to ground truth images. The surrogate model is empirically evaluated on the
Document Image Binarization Competition (DIBCO) and the Handwritten Document
Image Binarization Competition (H-DIBCO) datasets.
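The surrogate idea reduces to ordinary supervised regression: on datasets where ground truth exists, fit a model from cheap image features to the metric value, then apply it to unseen pages. The features and the ridge regressor below are illustrative assumptions, not the descriptors or learner used in the paper:

```python
import numpy as np

def image_features(img):
    """Cheap grayscale-page features: mean intensity, contrast, and the
    fraction of dark pixels (stand-ins for richer descriptors)."""
    return np.array([img.mean(), img.std(), (img < 128).mean()])

def fit_surrogate(images, metric_values, ridge=1e-3):
    """Fit a ridge-regression surrogate mapping features -> metric value
    on data where the ground-truth-based metric could be computed.
    Returns a predictor usable on unseen images without ground truth."""
    F = np.array([image_features(im) for im in images])
    F = np.hstack([F, np.ones((len(F), 1))])  # bias column
    w = np.linalg.solve(F.T @ F + ridge * np.eye(F.shape[1]),
                        F.T @ np.asarray(metric_values))
    return lambda img: float(np.append(image_features(img), 1.0) @ w)
```

Once trained, the predictor is ground-truth-free and cheap to evaluate, which is what makes it usable inside a hyperparameter-optimization loop over unseen documents.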
Telescoping Recursive Representations and Estimation of Gauss-Markov Random Fields
We present \emph{telescoping} recursive representations for both continuous
and discrete indexed noncausal Gauss-Markov random fields. Our recursions start
at the boundary (a hypersurface in $\mathbb{R}^d$, $d \ge 1$) and telescope
inwards. For example, for images, the telescoping representation reduces
recursions from $d = 2$ to $d = 1$, i.e., to recursions on a single
dimension. Under
appropriate conditions, the recursions for the random field are linear
stochastic differential/difference equations driven by white noise, for which
we derive recursive estimation algorithms that extend standard algorithms,
like the Kalman-Bucy filter and the Rauch-Tung-Striebel smoother, to noncausal
Markov random fields.

Comment: To appear in the Transactions on Information Theory.
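The single-dimension recursions that the telescoping representation produces are of the kind handled by the standard Kalman filter. A minimal scalar sketch of that classical recursion, for orientation only (the paper's algorithms operate on noncausal random fields, not this simple causal chain):

```python
import numpy as np

def kalman_filter_1d(y, a, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for the causal state-space model
    x_t = a*x_{t-1} + w_t,  y_t = x_t + v_t,
    with w_t ~ N(0, q) and v_t ~ N(0, r)."""
    x, p = x0, p0
    estimates = []
    for yt in y:
        x_pred, p_pred = a * x, a * a * p + q  # time update
        k = p_pred / (p_pred + r)              # Kalman gain
        x = x_pred + k * (yt - x_pred)         # measurement update
        p = (1 - k) * p_pred
        estimates.append(x)
    return np.array(estimates)
```

Pairing the forward filter with a backward Rauch-Tung-Striebel pass gives the smoother; the paper's contribution is extending exactly this machinery to noncausal Markov random fields via the telescoping recursions.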