Fully Point-wise Convolutional Neural Network for Modeling Statistical Regularities in Natural Images
Modeling statistical regularity plays an essential role in ill-posed image
processing problems. Recently, deep learning based methods have been presented
to implicitly learn statistical representation of pixel distributions in
natural images and leverage it as a constraint to facilitate subsequent tasks,
such as color constancy and image dehazing. However, the existing CNN
architecture is prone to variability and diversity of pixel intensity within
and between local regions, which may result in inaccurate statistical
representation. To address this problem, this paper presents a novel fully
point-wise CNN architecture for modeling statistical regularities in natural
images. Specifically, we propose to randomly shuffle the pixels in the original
images and leverage the shuffled image as input to make the CNN more concerned with
the statistical properties. Moreover, since the pixels in the shuffled image
are independent and identically distributed, we can replace all the large
convolution kernels in the CNN with point-wise (1 × 1) convolution kernels while
maintaining the representation ability. Experimental results on two
applications: color constancy and image dehazing, demonstrate the superiority
of our proposed network over the existing architectures, i.e., using
1/10 to 1/100 of the network parameters and computational cost while achieving
comparable performance. Comment: 9 pages, 7 figures. To appear in ACM MM 201
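As a rough illustration of the two ideas in this abstract (pixel shuffling and point-wise kernels), the sketch below uses plain Python lists. The function names `shuffle_pixels`, `pointwise_conv` and `conv_params` are hypothetical and are not taken from the paper; the layer sizes in the final comment are arbitrary examples, not the authors' configuration.

```python
import random

def shuffle_pixels(image, seed=0):
    """Randomly permute pixel positions; each pixel keeps its own channel vector."""
    h, w = len(image), len(image[0])
    flat = [image[r][c] for r in range(h) for c in range(w)]
    random.Random(seed).shuffle(flat)
    return [flat[r * w:(r + 1) * w] for r in range(h)]

def pointwise_conv(image, weights, bias):
    """1 x 1 convolution: one linear map applied to every pixel independently.

    weights[o][i] maps input channel i to output channel o.
    """
    return [[[sum(w_o[i] * px[i] for i in range(len(px))) + b_o
              for w_o, b_o in zip(weights, bias)]
             for px in row]
            for row in image]

def conv_params(c_in, c_out, k):
    """Weights plus biases of a single k x k convolution layer."""
    return c_in * c_out * k * k + c_out

# Example layer sizes (assumed for illustration only):
# conv_params(64, 64, 3) -> 36928, conv_params(64, 64, 1) -> 4160
```

Because a 1 × 1 kernel never looks at neighbouring pixels, applying it to a shuffled image gives the same values as shuffling its output with the same permutation. The per-layer parameter counts above only show where a saving can come from; the 1/10 to 1/100 figure in the abstract refers to the authors' full networks.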
A Physiologically Based System Theory of Consciousness
A system which uses large numbers of devices to perform a complex functionality is forced to adopt a simple functional architecture by the need to construct copies of, repair, and modify the system. A simple functional architecture means that functionality is partitioned into relatively equal sized components on many levels of detail down to device level, a mapping exists between the different levels, and exchange of information between components is minimized. In the instruction architecture functionality is partitioned on every level into instructions, which exchange unambiguous system information and therefore output system commands. The von Neumann architecture is a special case of the instruction architecture in which instructions are coded as unambiguous system information. In the recommendation (or pattern extraction) architecture functionality is partitioned on every level into repetition elements, which can freely exchange ambiguous information and therefore output only system action recommendations which must compete for control of system behavior. Partitioning is optimized to the best tradeoff between even partitioning and minimum cost of distributing data. Natural pressures deriving from the need to construct copies under DNA control, recover from errors, failures and damage, and add new functionality derived from random mutations have resulted in biological brains being constrained to adopt the recommendation architecture. The resultant hierarchy of functional separations can be the basis for understanding psychological phenomena in terms of physiology. A theory of consciousness is described based on the recommendation architecture model for biological brains. Consciousness is defined at a high level in terms of sensory-independent image sequences, including self images, with the role of extending the search of records of individual experience for behavioral guidance in complex social situations.
Functional components of this definition of consciousness are developed, and it is demonstrated that these components can be translated through subcomponents to descriptions in terms of known and postulated physiological mechanisms.
A processing element architecture for high-density focal plane analog programmable array processors
The architecture of the elementary Processing Element (PE) used in a recently designed 128 × 128 Focal Plane Analog Programmable Array Processor is presented. The PE architecture contains the required building blocks to implement bifurcated data flow vision algorithms based on the execution of 3 × 3 convolution masks. The vision chip has been implemented in a standard 0.35 μm CMOS technology. The main PE related figures are: 180 cells/mm², 18 MOPS/cell, and 180 μW/cell. Office of Naval Research (USA) N68171-98-C-9004. European Union IST-1999-19007. Comisión Interministerial de Ciencia y Tecnología TIC1 999-082
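For readers unfamiliar with the term, a 3 × 3 convolution mask replaces each pixel with a weighted sum of its 3 × 3 neighbourhood. The sketch below is a software illustration only; the function name `apply_mask_3x3` is hypothetical and says nothing about how the analog PE circuitry carries out the operation.

```python
def apply_mask_3x3(image, mask):
    """Apply a 3 x 3 mask without padding ('valid' output).

    Each output pixel is the weighted sum of the 3 x 3 neighbourhood whose
    top-left corner is at (r, c) in the input image.
    """
    h, w = len(image), len(image[0])
    return [[sum(mask[i][j] * image[r + i][c + j]
                 for i in range(3) for j in range(3))
             for c in range(w - 2)]
            for r in range(h - 2)]

# A Laplacian-style mask, chosen here purely as an example.
laplacian = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]
```

On a focal-plane array each cell would evaluate its own neighbourhood locally, whereas this sketch simply loops over positions one at a time.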
ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation
We propose a structured prediction architecture, which exploits the local
generic features extracted by Convolutional Neural Networks and the capacity of
Recurrent Neural Networks (RNN) to retrieve distant dependencies. The proposed
architecture, called ReSeg, is based on the recently introduced ReNet model for
image classification. We modify and extend it to perform the more challenging
task of semantic segmentation. Each ReNet layer is composed of four RNNs that
sweep the image horizontally and vertically in both directions, encoding
patches or activations, and providing relevant global information. Moreover,
ReNet layers are stacked on top of pre-trained convolutional layers, benefiting
from generic local features. Upsampling layers follow ReNet layers to recover
the original image resolution in the final predictions. The proposed ReSeg
architecture is efficient, flexible and suitable for a variety of semantic
segmentation tasks. We evaluate ReSeg on several widely-used semantic
segmentation datasets: Weizmann Horse, Oxford Flower, and CamVid; achieving
state-of-the-art performance. Results show that ReSeg can act as a suitable
architecture for semantic segmentation tasks, and may have further applications
in other structured prediction problems. The source code and model
hyperparameters are available on https://github.com/fvisin/reseg. Comment: In CVPR Deep Vision Workshop, 201
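The four-direction sweep described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a plain tanh recurrence is swapped in for whatever recurrent unit the model actually uses, patch extraction is omitted, and the names `rnn_sweep`, `bidirectional` and `renet_like_layer` are hypothetical.

```python
import math

def rnn_sweep(seq, w_in, w_rec):
    """Run a plain tanh RNN over a sequence of feature vectors.

    w_in[j][i] maps input i to hidden unit j; w_rec[j][k] maps hidden k to hidden j.
    """
    hidden = [0.0] * len(w_in)
    states = []
    for x in seq:
        hidden = [math.tanh(sum(a * xi for a, xi in zip(w_in[j], x))
                            + sum(a * hk for a, hk in zip(w_rec[j], hidden)))
                  for j in range(len(w_in))]
        states.append(hidden)
    return states

def bidirectional(seq, fwd, bwd):
    """Concatenate the states of a forward sweep and a backward sweep."""
    f = rnn_sweep(seq, *fwd)
    b = rnn_sweep(seq[::-1], *bwd)[::-1]
    return [hf + hb for hf, hb in zip(f, b)]

def renet_like_layer(fmap, params):
    """fmap[r][c] is a feature vector; params holds four (w_in, w_rec) pairs."""
    # Two RNNs sweep every row, left-to-right and right-to-left.
    rows = [bidirectional(row, params["left_right"], params["right_left"])
            for row in fmap]
    # Two more RNNs sweep every column of the result, in both directions.
    cols = [bidirectional(list(col), params["top_down"], params["bottom_up"])
            for col in zip(*rows)]
    # Transpose back to row-major layout.
    return [list(row) for row in zip(*cols)]
```

After both passes, the feature at each position depends on every input position, which is one way to read the abstract's remark about "providing relevant global information".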
Efficient Yet Deep Convolutional Neural Networks for Semantic Segmentation
Semantic segmentation with deep convolutional neural networks is a demanding,
GPU-intensive task. Because millions of parameters have to be computed, memory
consumption is huge. Moreover, extracting finer
features and conducting supervised training tend to increase the complexity.
Since its introduction, the Fully Convolutional Neural Network, which uses finer
strides and utilizes deconvolutional layers for upsampling, has been the go-to
approach for any image segmentation task. In this paper, we propose two segmentation
architectures which not only need one-third of the parameters to compute but also
give better accuracy than similar architectures. The model weights were
transferred from popular neural nets such as VGG19 and VGG16, which were trained
on the ImageNet classification data-set. Then we transform all the fully connected
layers to convolutional layers and use dilated convolution to decrease the number of
parameters. Lastly, we add finer strides and attach four skip architectures
which are element-wise summed with the deconvolutional layers in steps. We
train and test on different sparse and fine data-sets like Pascal VOC2012,
Pascal-Context and NYUDv2 and show how much better our model performs on these tasks.
Our model also has a faster inference time and consumes less
memory for training and testing on NVIDIA Pascal GPUs, making it a more efficient
and less memory-consuming architecture for pixel-wise segmentation. Comment: 8 page
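A brief sketch of why dilated convolution can save parameters: spacing the kernel taps apart widens the region a kernel covers without adding weights. The one-dimensional example below is an illustration only; the function names are hypothetical and the kernel values are arbitrary, not taken from the paper.

```python
def effective_kernel_size(k, dilation):
    """Span covered by a k-tap kernel whose taps are `dilation` samples apart."""
    return k + (k - 1) * (dilation - 1)

def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D cross-correlation with dilated taps (no padding)."""
    span = effective_kernel_size(len(kernel), dilation)
    return [sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
            for start in range(len(signal) - span + 1)]

# The same three weights cover 3 samples at dilation 1 and 5 samples at dilation 2.
signal = [1, 2, 3, 4, 5, 6]
kernel = [1, 0, -1]
narrow = dilated_conv1d(signal, kernel, dilation=1)
wide = dilated_conv1d(signal, kernel, dilation=2)
```

A dense kernel covering the same 5-sample span would need five weights instead of three; in two dimensions the gap grows to 25 versus 9 for a 3 × 3 kernel at dilation 2.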