185,566 research outputs found
Unsupervised Feature Learning by Deep Sparse Coding
In this paper, we propose a new unsupervised feature learning framework,
namely Deep Sparse Coding (DeepSC), that extends sparse coding to a multi-layer
architecture for visual object recognition tasks. The main innovation of the
framework is that it connects the sparse-encoders from different layers by a
sparse-to-dense module. The sparse-to-dense module is a composition of a local
spatial pooling step and a low-dimensional embedding process, which takes
advantage of the spatial smoothness information in the image. As a result, the
new method is able to learn several levels of sparse representation of the
image which capture features at a variety of abstraction levels and
simultaneously preserve the spatial smoothness between the neighboring image
patches. Combining the feature representations from multiple layers, DeepSC
achieves the state-of-the-art performance on multiple object recognition tasks.Comment: 9 pages, submitted to ICL
Object segregation and local gist vision using low-level geometry
Multi-scale representations of lines, edges and keypoints on the basis of simple, complex, and end-stopped cells can be used for object categorisation and recognition. These representations are complemented by saliency maps of colour, texture, disparity and motion information, which also serve to model extremely fast gist vision in parallel with object segregation. We present a low-level geometry model based on a single type of self-adjusting grouping cell, with a circular array of dendrites connected to edge cells located at several angles. Different angles between active edge cells allow the grouping cell to detect geometric primitives like corners, bars and blobs. Such primitives forming different configurations can then be grouped to identify more complex geometry, like object shapes, without much additional effort. The speed of the model permits it to be used for fast gist vision, assuming that edge cells respond to transients in colour, texture, disparity and motion. The big advantage of combining this information at a low level is that local (object) gist can be extracted first, ie, which types of objects are about where in a scene, after which global (scene) gist can be processed at a semantic level
Visual pathways from the perspective of cost functions and multi-task deep neural networks
Vision research has been shaped by the seminal insight that we can understand
the higher-tier visual cortex from the perspective of multiple functional
pathways with different goals. In this paper, we try to give a computational
account of the functional organization of this system by reasoning from the
perspective of multi-task deep neural networks. Machine learning has shown that
tasks become easier to solve when they are decomposed into subtasks with their
own cost function. We hypothesize that the visual system optimizes multiple
cost functions of unrelated tasks and this causes the emergence of a ventral
pathway dedicated to vision for perception, and a dorsal pathway dedicated to
vision for action. To evaluate the functional organization in multi-task deep
neural networks, we propose a method that measures the contribution of a unit
towards each task, applying it to two networks that have been trained on either
two related or two unrelated tasks, using an identical stimulus set. Results
show that the network trained on the unrelated tasks shows a decreasing degree
of feature representation sharing towards higher-tier layers while the network
trained on related tasks uniformly shows high degree of sharing. We conjecture
that the method we propose can be used to analyze the anatomical and functional
organization of the visual system and beyond. We predict that the degree to
which tasks are related is a good descriptor of the degree to which they share
downstream cortical-units.Comment: 16 pages, 5 figure
- …