Sparse Image Representation with Epitomes
Sparse coding, which is the decomposition of a vector using only a few basis
elements, is widely used in machine learning and image processing. The basis
set, also called a dictionary, is learned to adapt to specific data. This
approach has proven to be very effective in many image processing tasks.
Traditionally, the dictionary is an unstructured "flat" set of atoms. In this
paper, we study structured dictionaries which are obtained from an epitome, or
a set of epitomes. The epitome is itself a small image, and the atoms are all
the patches of a chosen size inside this image. This considerably reduces the
number of parameters to learn and provides sparse image decompositions with
shift-invariance properties. We propose a new formulation and an algorithm for
learning the structured dictionaries associated with epitomes, and illustrate
their use in image denoising tasks.
Comment: Computer Vision and Pattern Recognition, Colorado Springs, United States (2011).
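As a rough illustration of why an epitome shrinks the parameter count, the sketch below builds a dictionary whose atoms are all overlapping patches of a small epitome image. It covers only the patch-extraction step, not the paper's learning formulation, and the sizes are illustrative.

```python
import numpy as np

def epitome_dictionary(epitome, patch_size):
    """Extract every overlapping patch of the epitome as one dictionary atom."""
    h, w = epitome.shape
    p = patch_size
    atoms = []
    for i in range(h - p + 1):
        for j in range(w - p + 1):
            atom = epitome[i:i + p, j:j + p].ravel()
            atoms.append(atom / (np.linalg.norm(atom) + 1e-12))  # unit-norm atoms
    return np.stack(atoms, axis=1)  # columns are atoms, shape (p*p, n_atoms)

# A 16x16 epitome with 8x8 patches yields (16-8+1)^2 = 81 atoms from only
# 256 learned pixels, versus 81*64 = 5184 parameters for a flat dictionary
# of the same size; neighbouring atoms are shifted copies of each other.
rng = np.random.default_rng(0)
D = epitome_dictionary(rng.standard_normal((16, 16)), patch_size=8)
print(D.shape)  # (64, 81)
```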
A demonstration of 'broken' visual space
It has long been assumed that there is a distorted mapping between real and ‘perceived’ space, based on demonstrations of systematic errors in judgements of slant, curvature, direction and separation. Here, we have applied a direct test to the notion of a coherent visual space. In an immersive virtual environment, participants judged the relative distance of two squares displayed in separate intervals. On some trials, the virtual scene expanded by a factor of four between intervals although, in line with recent results, participants did not report any noticeable change in the scene. We found that there was no consistent depth ordering of objects that could explain the distance matches participants made in this environment (e.g. A > B > D yet also A < C < D) and hence no single one-to-one mapping between participants’ perceived space and any real 3D environment. Instead, factors that affect pairwise comparisons of distances dictate participants’ performance. These data contradict, more directly than previous experiments, the idea that the visual system builds and uses a coherent 3D internal representation of a scene.
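The core logical test here, whether a set of pairwise "farther than" judgements can be explained by any single depth ordering, amounts to cycle detection in a directed graph. The sketch below is an illustrative reconstruction of that test, not the authors' analysis code.

```python
# Pairwise "farther than" judgements admit a single consistent depth
# ordering iff the comparison graph contains no directed cycle.
def consistent_ordering(farther_than):
    """farther_than: list of (a, b) pairs meaning 'a judged farther than b'."""
    graph = {}
    for a, b in farther_than:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited / on current path / done
    color = {node: WHITE for node in graph}

    def has_cycle(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY or (color[nxt] == WHITE and has_cycle(nxt)):
                return True
        color[node] = BLACK
        return False

    return not any(color[n] == WHITE and has_cycle(n) for n in graph)

# The pattern reported in the abstract: A > B > D yet A < C < D.
judgements = [("A", "B"), ("B", "D"), ("D", "C"), ("C", "A")]
print(consistent_ordering(judgements))  # False: the judgements form a cycle
```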
Monte Carlo localization for teach-and-repeat feature-based navigation
This work presents a combination of teach-and-replay visual navigation and Monte Carlo localization (MCL). It improves a reliable teach-and-replay navigation method by replacing its dependency on precise dead reckoning with Monte Carlo localization to determine the robot's position along the learned path. As a consequence, the navigation method becomes robust to dead-reckoning errors, can be started from any point in the map and can deal with the 'kidnapped robot' problem. Furthermore, the robot is localized with MCL only along the taught path, i.e. in one dimension, which does not require a high number of particles and significantly reduces the computational cost.
Thus, the combination of MCL and teach-and-replay navigation mitigates the disadvantages of both methods. The method was tested using a P3-AT ground robot and a Parrot AR.Drone aerial robot over a long indoor corridor. Experiments show the validity of the approach and establish a solid base for continuing this work.
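To make the one-dimensional localization idea concrete, here is a minimal particle-filter sketch in Python. The path length, noise levels, and measure_likelihood function are hypothetical stand-ins for the feature-matching machinery the paper actually uses.

```python
import numpy as np

rng = np.random.default_rng(1)

def mcl_step(particles, weights, odom_delta, measure_likelihood,
             motion_noise=0.05, path_length=100.0):
    """One predict-update-resample cycle of 1-D Monte Carlo localization.

    particles: candidate positions (metres) along the taught path.
    odom_delta: distance travelled since the last step, from dead reckoning.
    measure_likelihood: maps positions to how well the current camera view
    matches the features recorded at those positions during teaching.
    """
    # Predict: propagate along the path with noisy odometry.
    particles = particles + odom_delta + rng.normal(0, motion_noise, len(particles))
    particles = np.clip(particles, 0.0, path_length)
    # Update: reweight by agreement between seen and taught features.
    weights = weights * measure_likelihood(particles)
    weights /= weights.sum()
    # Resample only when the effective sample size collapses.
    if 1.0 / np.sum(weights**2) < 0.5 * len(particles):
        idx = rng.choice(len(particles), len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights

# Because the state is one-dimensional, a few hundred particles suffice.
particles = rng.uniform(0, 100.0, 300)   # uniform init handles kidnapping
weights = np.full(300, 1.0 / 300)
true_pos = 10.0
for _ in range(40):
    true_pos += 1.0  # robot advances 1 m per step along the path
    lik = lambda x: np.exp(-0.5 * ((x - true_pos) / 2.0) ** 2) + 1e-9
    particles, weights = mcl_step(particles, weights, 1.0, lik)
print(true_pos, np.average(particles, weights=weights))  # estimate tracks true_pos
```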
Comparison of view-based and reconstruction-based models of human navigational strategy
There is good evidence that simple animals such as bees use view-based strategies to return to a familiar location, but humans could use a 3D reconstruction to achieve the same goal. Assuming some noise in the storage and retrieval process, these two types of strategy give rise to different patterns of predicted errors in homing. We describe an experiment that can help distinguish between these models. Participants wore a head-mounted display to carry out a homing task in immersive virtual reality. They viewed three long thin vertical poles and had to remember where they were in relation to the poles before being transported (virtually) to a new location in the scene from where they had to walk back to the original location. The experiment was conducted in both a rich-cue scene (a furnished room) and a sparse scene (no background and no floor or ceiling). As one would expect, in a rich-cue environment the overall error was smaller and in this case the ability to separate the models was reduced. However, for the sparse-cue environment the view-based model outperforms the reconstruction-based model. Specifically, the likelihood of the experimental data is similar to the likelihood of samples drawn from the view-based model (but assessed under both models) while this is not true for samples drawn from the reconstruction-based model.
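The likelihood comparison described above can be sketched as follows. The two generative models here (isotropic noise in a recovered 3D position versus noise applied to image angles of a single landmark) are simplified stand-ins chosen for illustration, not the models fitted in the paper.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(2)
goal = np.array([0.0, 0.0])  # true homing target on the ground plane

def reconstruction_model(n, sigma=0.3):
    """Reconstruction-based: isotropic noise in the recovered 3-D position."""
    return goal + rng.normal(0, sigma, (n, 2))

def view_based_model(n, landmark=np.array([0.0, 4.0]), sigma=0.05):
    """View-based: noise in image measurements (bearing and vertical angle
    of one pole of height 1), projected back onto the ground plane."""
    vec = landmark - goal
    dist = np.linalg.norm(vec)
    bearing = np.arctan2(vec[1], vec[0]) + rng.normal(0, sigma, n)
    elevation = np.arctan2(1.0, dist) + rng.normal(0, sigma, n)
    d = 1.0 / np.tan(elevation)  # distance implied by the noisy elevation
    return landmark - np.stack([d * np.cos(bearing), d * np.sin(bearing)], axis=1)

def log_likelihood(data, model_samples):
    """Log-likelihood of observed endpoints under a model's predicted
    endpoint distribution, estimated by kernel density."""
    kde = gaussian_kde(model_samples.T)
    return np.log(kde(data.T) + 1e-300).sum()

data = view_based_model(50)  # pretend these are the observed endpoints
for name, model in [("view-based", view_based_model),
                    ("reconstruction", reconstruction_model)]:
    print(name, log_likelihood(data, model(2000)))
```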