Multiscale Discriminant Saliency for Visual Attention
Bottom-up saliency, an early stage of human visual attention, can be considered a binary classification problem between center and surround classes. The discriminant power of features for this classification is measured as the mutual information between the features and the two class distributions. The estimated discrepancy between the two feature classes depends strongly on the scale levels considered; multi-scale structure and discriminant power are therefore integrated by employing discrete wavelet features and a hidden Markov tree (HMT). From the wavelet coefficients and HMT parameters, quad-tree-like label structures are constructed and used in the maximum a posteriori (MAP) estimation of hidden class variables at the corresponding dyadic sub-squares. The saliency value for each dyadic square at each scale level is then computed from the discriminant-power principle and the MAP estimate. Finally, the final saliency map is integrated across multiple scales by an information-maximization rule. Both standard quantitative tools such as NSS, LCC, and AUC and qualitative assessments are used to evaluate the proposed multiscale discriminant saliency method (MDIS) against the well-known information-based saliency method AIM on its Bruce database with eye-tracking data. Simulation results are presented and analyzed to verify the validity of MDIS and to point out its disadvantages as directions for further research.
Comment: 16 pages, ICCSA 2013 - BIOCA session
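The discriminant-power measure at the heart of this approach is the mutual information between a feature and the binary center/surround class labels. A minimal sketch of that measure follows; it uses simple histogram density estimates (an assumption for illustration, not the paper's wavelet/HMT machinery), and all names are illustrative:

```python
import numpy as np

def discriminant_power(feature, labels, bins=16):
    """Mutual information I(F; C) between a scalar feature and binary
    center (1) / surround (0) labels, estimated from histograms.
    Higher values mean the feature better discriminates the classes."""
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels, dtype=int)
    edges = np.histogram_bin_edges(feature, bins=bins)
    p_f, _ = np.histogram(feature, bins=edges)
    p_f = p_f / len(feature)                       # marginal P(f)
    mi = 0.0
    for c in (0, 1):
        fc = feature[labels == c]
        p_c = len(fc) / len(feature)               # class prior P(c)
        if p_c == 0:
            continue
        p_fc, _ = np.histogram(fc, bins=edges)
        p_fc = p_fc / len(fc)                      # conditional P(f|c)
        mask = (p_fc > 0) & (p_f > 0)
        mi += p_c * np.sum(p_fc[mask] * np.log2(p_fc[mask] / p_f[mask]))
    return mi
```

A feature whose distribution differs between the two classes yields high mutual information; an uninformative feature yields a value near zero.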
Multispectral texture synthesis
Synthesizing texture involves the ordering of pixels in a 2D arrangement so as to display certain known spatial correlations, generally as described by a sample texture. In an abstract sense, these pixels could be gray-scale values, RGB color values, or entire spectral curves. The focus of this work is to develop a practical synthesis framework that maintains this abstract view while synthesizing texture with high spectral dimension, effectively achieving spectral invariance. The principal idea is to use a single monochrome texture synthesis step to capture the spatial information in a multispectral texture. The first step is to use a global color space transform to condense the spatial information in a sample texture into a principal luminance channel. Then, a monochrome texture synthesis step generates the corresponding principal band in the synthetic texture. This spatial information is then used to condition the generation of spectral information. A number of variants of this general approach are introduced. The first uses a multiresolution transform to decompose the spatial information in the principal band into an equivalent scale/space representation. This information is encapsulated into a set of low-order statistical constraints that are used to iteratively coerce white noise into the desired texture. The residual spectral information is then generated using a non-parametric Markov random field (MRF) model. The remaining variants use a non-parametric MRF to generate the spatial and spectral components simultaneously. In this approach, multispectral texture is grown from a seed region by sampling from the set of nearest neighbors in the sample texture as identified by a template matching procedure in the principal band. The effectiveness of both algorithms is demonstrated on a number of texture examples ranging from greyscale to RGB textures, as well as 16, 22, 32 and 63 band spectral images.
In addition to the standard visual test that predominates in the literature, effort is made to quantify the accuracy of the synthesis using informative and effective metrics. These include first- and second-order statistical comparisons as well as statistical divergence tests.
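The global colour-space transform described above condenses a high-dimensional spectral cube into a single principal luminance band. A minimal sketch of that step, using a straightforward PCA projection (an assumed concrete choice of transform; function and variable names are illustrative):

```python
import numpy as np

def principal_band(cube):
    """Project an (H, W, B) multispectral cube onto its first principal
    component, yielding a single 'principal luminance' channel for the
    monochrome texture-synthesis step to operate on."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b).astype(float)
    mean = pixels.mean(axis=0)
    centred = pixels - mean
    cov = (centred.T @ centred) / (len(centred) - 1)  # B x B band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues ascending
    pc1 = eigvecs[:, -1]                              # direction of max variance
    band = (centred @ pc1).reshape(h, w)
    return band, pc1, mean
```

By construction the principal band carries at least as much variance as any single spectral band, which is what makes it a reasonable carrier of the texture's spatial structure.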
Measurement of spatial orientation using a biologically plausible gradient model
A Thesis submitted for the degree of Doctor of Philosophy
Psychophysical investigations of visual density discrimination
Work in spatial vision is reviewed and a new effect of spatial averaging is reported. This shows that dot-separation discriminations are improved if the cue is represented in the intervals within a collection of dots arranged in a lattice, compared to simple two-dot separation discriminations. This phenomenon may be related to integrative processes that mediate texture density estimation.
Four models for density discrimination are described. One involves measurements of spatial filter outputs. Computer simulations show that in principle, density cues can be encoded by a system of four DOG filters with peak sensitivities spanning a range of 3 octaves.
Alternative models involve operations performed over representations in which spatial features are made explicit. One of these involves estimations of numerosity or coverage of the texture elements. Another involves averaging of the interval values between adjacent elements. A neural model for measuring the relevant intervals is described.
It is argued that in principle the input to a density processor does not require the full sequence of operations in the MIRAGE transformation (e.g. Watt and Morgan, 1985). In particular, the regions of activity in the second derivative do not need to be interpreted in terms of edges, bars and blobs in order for density estimation to commence. This also implies that explicit coding of texture elements may be unnecessary.
Data for density discrimination in regular and random dot patterns are reported. These do not support the coverage and counting models and observed performance shows significant departures from predictions based on an analysis of the statistics of the interval distribution in the stimuli. But this result can be understood in relation to other factors in the interval averaging process, and there is empirical support for the hypothesized method for measuring the intervals.
Other experiments show that density is scaled according to stimulus size and possibly perceived depth. It is also shown that information from density analysis can be combined with size estimations to produce highly accurate discriminations of image expansion or object depth changes.
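The filter-based model above encodes density with a small bank of DOG (difference-of-Gaussians) filters whose peak sensitivities span three octaves. A minimal sketch of such a bank follows; the kernel size and sigma values are illustrative assumptions, not the thesis's simulation parameters:

```python
import numpy as np

def dog_filter(size, sigma, ratio=1.6):
    """1-D difference-of-Gaussians kernel: a narrow centre Gaussian
    minus a broader surround Gaussian (both area-normalized)."""
    x = np.arange(size) - size // 2
    g = lambda s: np.exp(-x**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))
    return g(sigma) - g(sigma * ratio)

# Four filters with centre sigmas doubling each time, so the peak
# spatial-frequency sensitivities span a range of 3 octaves.
sigmas = [1.0, 2.0, 4.0, 8.0]
bank = [dog_filter(65, s) for s in sigmas]
```

Convolving a dot-lattice luminance profile with each filter gives a response profile across the bank; in the model, the pattern of responses over scale encodes the dot density.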
The visual representation of texture
This research is concerned with texture: a source of visual information that has motivated a huge amount of psychophysical and computational research. This thesis questions how useful the accepted view of texture perception is. From a theoretical point of view, work to date has largely avoided two critical aspects of a computational theory of texture perception. Firstly, what is texture? Secondly, what is an appropriate representation for texture? This thesis argues that a task-dependent definition of texture is necessary, and proposes a multi-local, statistical scheme for representing texture orientation.
Human performance on a series of psychophysical orientation discrimination tasks is compared to specific predictions from the scheme.
The first set of experiments investigates observers' ability to directly derive statistical estimates from texture. An analogy is reported between the way texture statistics are derived and the visual processing of spatio-luminance features.
The second set of experiments is concerned with the way texture elements are extracted from images (an example of the generic grouping problem in vision). The use of highly constrained experimental tasks, typically texture orientation discriminations, allows for the formulation of simple statistical criteria for setting critical parameters of the model (such as the spatial scale of analysis). It is shown that schemes based on isotropic filtering and symbolic matching do not suffice for performing this grouping, but that the proposed scheme, based on oriented mechanisms, does.
Taken together, these results suggest a view of visual texture processing not as a disparate collection of processes, but as a general strategy for deriving statistical representations of images, common to a range of visual tasks.
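A multi-local statistical representation of texture orientation can be sketched by pooling local gradient orientations into a single circular statistic. The following is an illustrative sketch of that idea, not the thesis's actual model; the gradient-based estimator and the names are assumptions:

```python
import numpy as np

def orientation_statistics(image):
    """Pool local gradient orientations into a circular mean and a
    coherence value in [0, 1]. Orientation is axial (theta and
    theta + pi are equivalent), so angles are doubled before pooling."""
    gy, gx = np.gradient(image.astype(float))
    theta = np.arctan2(gy, gx)            # local orientation (radians)
    weight = np.hypot(gx, gy)             # weight by gradient magnitude
    z = np.sum(weight * np.exp(2j * theta))
    mean_orientation = 0.5 * np.angle(z)  # undo the angle doubling
    coherence = np.abs(z) / max(weight.sum(), 1e-12)
    return mean_orientation, coherence
```

A perfectly oriented pattern gives coherence near 1; an isotropic texture gives coherence near 0, which is the kind of statistic an orientation-discrimination model can threshold.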
Variance Predicts Salience in Central Sensory Processing
Information processing in the sensory periphery is shaped by natural stimulus statistics. In the periphery, a transmission bottleneck constrains performance; thus efficient coding implies that natural signal components with a predictably wider range should be compressed. In a different regime—when sampling limitations constrain performance—efficient coding implies that more resources should be allocated to informative features that are more variable. We propose that this regime is relevant for sensory cortex when it extracts complex features from limited numbers of sensory samples. To test this prediction, we use central visual processing as a model: we show that visual sensitivity for local multi-point spatial correlations, described by dozens of independently-measured parameters, can be quantitatively predicted from the structure of natural images. This suggests that efficient coding applies centrally, where it extends to higher-order sensory features and operates in a regime in which sensitivity increases with feature variability.
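The paper's prediction is that, in the sampling-limited regime, sensitivity should track how variable a feature is across natural samples. A toy sketch of measuring that variability follows; the binary-patch features used here (two-point and four-point correlations) are illustrative stand-ins, not the paper's measured statistics:

```python
import numpy as np

def feature_variability(patches, features):
    """For each named feature (a scalar function of a patch), estimate
    its variance across a sample of patches. Under the paper's
    hypothesis, higher variance predicts higher central sensitivity."""
    return {name: float(np.var([f(p) for p in patches]))
            for name, f in features.items()}

# Illustrative multi-point correlation features on +/-1 binary patches.
features = {
    "two_point": lambda p: np.mean(p[:, :-1] * p[:, 1:]),
    "four_point": lambda p: np.mean(
        p[:-1, :-1] * p[:-1, 1:] * p[1:, :-1] * p[1:, 1:]),
}
```

Applied to an ensemble of natural-image patches, the resulting variances would rank features by their predicted salience.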
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities, such as neuroscience, bioinformatics, and computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts.
Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision
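The sparse-coding step described above—representing a signal as a combination of a few dictionary atoms—can be sketched with a greedy solver. The following minimal orthogonal matching pursuit is one standard choice of algorithm (an assumption for illustration; the monograph surveys several):

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal matching pursuit: approximate x as a combination of
    at most k columns (atoms) of dictionary D. At each step, pick the
    atom most correlated with the residual, then re-fit on the support."""
    residual = x.astype(float).copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef[support] = sol
    return coef
```

With a well-conditioned dictionary and a truly sparse signal, the recovered coefficient vector is sparse and reconstructs the input almost exactly.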
Variable illumination and invariant features for detecting and classifying varnish defects
This work presents a method to detect and classify varnish defects on wood surfaces. Since these defects are only partially visible under certain illumination directions, a single image does not provide enough information for the recognition task. Classification requires inspecting the surface under different illumination directions, which results in an image series. The information is distributed along this series and can be extracted by merging knowledge about the defect shape and the light direction.
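A simple way to pool information distributed across such an illumination series is to fuse it per pixel, keeping both the strongest response and which light direction produced it. This sketch is a simplified stand-in for the merging step the abstract describes, not the authors' method:

```python
import numpy as np

def merge_series(series):
    """Fuse a series of images taken under different illumination
    directions. Returns, per pixel, the maximum response and the index
    of the light direction that produced it; the index map carries the
    directional cue that defect classification can exploit."""
    stack = np.stack(series)                  # (n_lights, H, W)
    return stack.max(axis=0), stack.argmax(axis=0)
```

A defect that only catches light from one direction shows up in the index map even when it is faint in every individual image.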