Human Image Preference and Document Degradation Models
Because most degraded documents are created by people, the preferences individuals have regarding degraded documents are important. Those preferences may determine whether the documents they create are appropriate for machines. The goal of this study was to find relationships between preference and several parameters of a scanner degradation model. The difference in binarization threshold and the difference in edge displacement caused by the degradation both had strong linear relationships to preference. The width of the point spread function did not show such a relationship. These relationships were counterintuitive because degraded characters with thicker stroke widths than the original were preferred to those with stroke widths closer to the original character.
Director Field Model of the Primary Visual Cortex for Contour Detection
We aim to build the simplest possible model capable of detecting long, noisy
contours in a cluttered visual scene. For this, we model the neural dynamics in
the primate primary visual cortex in terms of a continuous director field that
describes the average rate and the average orientational preference of active
neurons at a particular point in the cortex. We then use a linear-nonlinear
dynamical model with long range connectivity patterns to enforce long-range
statistical context present in the analyzed images. The resulting model has
substantially fewer degrees of freedom than traditional models, and yet it can
distinguish large contiguous objects from the background clutter by suppressing
the clutter and by filling-in occluded elements of object contours. This
results in high-precision, high-recall detection of large objects in cluttered
scenes. Parenthetically, our model has a direct correspondence with the
Landau-de Gennes theory of nematic liquid crystal in two dimensions.
Comment: 9 pages, 7 figures
Learning Surrogate Models of Document Image Quality Metrics for Automated Document Image Processing
Computation of document image quality metrics often depends upon the
availability of a ground truth image corresponding to the document. This limits
the applicability of quality metrics in applications such as hyperparameter
optimization of image processing algorithms that operate on-the-fly on unseen
documents. This work proposes the use of surrogate models to learn the behavior
of a given document quality metric on existing datasets where ground truth
images are available. The trained surrogate model can later be used to predict
the metric value on previously unseen document images without requiring access
to ground truth images. The surrogate model is empirically evaluated on the
Document Image Binarization Competition (DIBCO) and the Handwritten Document
Image Binarization Competition (H-DIBCO) datasets.
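The surrogate idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes a single hypothetical scalar feature that can be computed from a document image without ground truth, and it assumes a plain least-squares line as the surrogate. The abstract does not specify the model or its inputs, and the names `fit_surrogate` and `predict_metric` are made up for this example.

```python
def fit_surrogate(features, metric_values):
    """Fit metric ~ slope * feature + intercept by ordinary least squares.

    features: values of a (hypothetical) ground-truth-free image feature.
    metric_values: the quality metric computed WITH ground truth on the
    same training images.
    """
    n = len(features)
    if n != len(metric_values) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_x = sum(features) / n
    mean_y = sum(metric_values) / n
    var_x = sum((x - mean_x) ** 2 for x in features)
    if var_x == 0:
        raise ValueError("feature values must not all be identical")
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(features, metric_values))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_metric(model, feature):
    """Predict the metric for an unseen image; no ground truth is needed."""
    slope, intercept = model
    return slope * feature + intercept
```

Once fitted on images that do have ground truth, `predict_metric` needs only the feature value of a new image, which is what would make such a surrogate usable inside an on-the-fly hyperparameter search.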
Learning Visual Features from Snapshots for Web Search
When applying learning to rank algorithms to Web search, a large number of
features are usually designed to capture the relevance signals. Most of these
features are computed based on the extracted textual elements, link analysis,
and user logs. However, Web pages are not solely linked texts; they have a
structured layout organizing a large variety of elements in different styles.
Such a layout can itself convey useful visual information, indicating the
relevance of a Web page. For example, the query-independent layout (i.e., raw
page layout) can help identify the page quality, while the query-dependent
layout (i.e., page rendered with matched query words) can further tell rich
structural information (e.g., size, position and proximity) of the matching
signals. However, such visual information of layout has been seldom utilized in
Web search in the past. In this work, we propose to learn rich visual features
automatically from the layout of Web pages (i.e., Web page snapshots) for
relevance ranking. Both query-independent and query-dependent snapshots are
considered as the new inputs. We then propose a novel visual perception model
inspired by human visual search behavior during page viewing to extract the
visual features. This model can be learned end-to-end together with traditional
human-crafted features. We also show that such visual features can be
efficiently acquired in the online setting with an extended inverted indexing
scheme. Experiments on benchmark collections demonstrate that learning visual
features from Web page snapshots can significantly improve the performance of
relevance ranking in ad-hoc Web retrieval tasks.
Comment: CIKM 201
Improving Image Restoration with Soft-Rounding
Several important classes of images such as text, barcode and pattern images
have the property that pixels can only take a distinct subset of values. This
knowledge can benefit the restoration of such images, but it has not been
widely considered in current restoration methods. In this work, we describe an
effective and efficient approach to incorporate the knowledge of distinct pixel
values of the pristine images into the general regularized least squares
restoration framework. We introduce a new regularizer that attains zero at the
designated pixel values and becomes a quadratic penalty function in the
intervals between them. When incorporated into the regularized least squares
restoration framework, this regularizer leads to a simple and efficient step
that resembles and extends the rounding operation, which we term
soft-rounding. We apply the soft-rounding enhanced solution to the restoration
of binary text/barcode images and pattern images with multiple distinct pixel
values. Experimental results show that soft-rounding enhanced restoration
methods achieve significant improvement in both visual quality and quantitative
measures (PSNR and SSIM). Furthermore, we show that this regularizer can also
benefit the restoration of general natural images.
Comment: 9 pages, 6 figures
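To make the soft-rounding idea above concrete, here is a minimal per-pixel sketch. It is an assumption-laden illustration, not the paper's regularizer: it assumes the penalty R(x) = min_k (x - v_k)^2 over the designated levels v_k, which is zero at each level and quadratic in between, and it solves min_x 0.5 (x - y)^2 + (lam / 2) R(x) in closed form. The names `soft_round`, `levels`, and `lam` are hypothetical.

```python
def soft_round(y, levels, lam):
    """Pull a pixel value y toward its nearest allowed level.

    Minimizes 0.5 * (x - y)**2 + 0.5 * lam * min_k (x - levels[k])**2.
    For a fixed level v the minimizer is (y + lam * v) / (1 + lam), and
    the lowest objective is reached with the level nearest to y.
    lam = 0 leaves y unchanged; a large lam approaches hard rounding.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    nearest = min(levels, key=lambda v: abs(y - v))
    return (y + lam * nearest) / (1.0 + lam)


def soft_round_image(pixels, levels, lam):
    """Apply soft_round independently to every value in a flat pixel list."""
    return [soft_round(p, levels, lam) for p in pixels]
```

For a binary text image with levels 0 and 1, a value of 0.2 with `lam = 1` moves halfway toward 0, giving 0.1. In a full restoration loop a step like this would alternate with a data-fidelity update, which is not shown here.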