105 research outputs found
A deep representation for depth images from synthetic data
Convolutional Neural Networks (CNNs) trained on large scale RGB databases
have become the secret sauce in the majority of recent approaches for object
categorization from RGB-D data. Thanks to colorization techniques, these
methods exploit the filters learned from 2D images to extract meaningful
representations in 2.5D. Still, the perceptual signature of these two kind of
images is very different, with the first usually strongly characterized by
textures, and the second mostly by silhouettes of objects. Ideally, one would
like to have two CNNs, one for RGB and one for depth, each trained on a
suitable data collection, able to capture the perceptual properties of each
channel for the task at hand. This has not been possible so far, due to the
lack of a suitable depth database. This paper addresses this issue, proposing
to opt for synthetically generated images rather than collecting by hand a 2.5D
large scale database. While being clearly a proxy for real data, synthetic
images allow to trade quality for quantity, making it possible to generate a
virtually infinite amount of data. We show that the filters learned from such
data collection, using the very same architecture typically used on visual
data, learns very different filters, resulting in depth features (a) able to
better characterize the different facets of depth images, and (b) complementary
with respect to those derived from CNNs pre-trained on 2D datasets. Experiments
on two publicly available databases show the power of our approach
Object Detection from a Vehicle Using Deep Learning Network and Future Integration with Multi-Sensor Fusion Algorithm
Accuracy in detecting a moving object is critical to autonomous driving or advanced driver assistance systems (ADAS). By including the object classification from multiple sensor detections, the model of the object or environment can be identified more accurately. The critical parameters involved in improving the accuracy are the size and the speed of the moving object. All sensor data are to be used in defining a composite object representation so that it could be used for the class information in the core object’s description. This composite data can then be used by a deep learning network for complete perception fusion in order to solve the detection and tracking of moving objects problem. Camera image data from subsequent frames along the time axis in conjunction with the speed and size of the object will further contribute in developing better recognition algorithms. In this paper, we present preliminary results using only camera images for detecting various objects using deep learning network, as a first step toward multi-sensor fusion algorithm development. The simulation experiments based on camera images show encouraging results where the proposed deep learning network based detection algorithm was able to detect various objects with certain degree of confidence. A laboratory experimental setup is being commissioned where three different types of sensors, a digital camera with 8 megapixel resolution, a LIDAR with 40m range, and ultrasonic distance transducer sensors will be used for multi-sensor fusion to identify the object in real-time
Indoor Semantic Segmentation using depth information
This work addresses multi-class segmentation of indoor scenes with RGB-D
inputs. While this area of research has gained much attention recently, most
works still rely on hand-crafted features. In contrast, we apply a multiscale
convolutional network to learn features directly from the images and the depth
information. We obtain state-of-the-art on the NYU-v2 depth dataset with an
accuracy of 64.5%. We illustrate the labeling of indoor scenes in videos
sequences that could be processed in real-time using appropriate hardware such
as an FPGA.Comment: 8 pages, 3 figure
Understanding Deep Architectures using a Recursive Convolutional Network
A key challenge in designing convolutional network models is sizing them
appropriately. Many factors are involved in these decisions, including number
of layers, feature maps, kernel sizes, etc. Complicating this further is the
fact that each of these influence not only the numbers and dimensions of the
activation units, but also the total number of parameters. In this paper we
focus on assessing the independent contributions of three of these linked
variables: The numbers of layers, feature maps, and parameters. To accomplish
this, we employ a recursive convolutional network whose weights are tied
between layers; this allows us to vary each of the three factors in a
controlled setting. We find that while increasing the numbers of layers and
parameters each have clear benefit, the number of feature maps (and hence
dimensionality of the representation) appears ancillary, and finds most of its
benefit through the introduction of more weights. Our results (i) empirically
confirm the notion that adding layers alone increases computational power,
within the context of convolutional layers, and (ii) suggest that precise
sizing of convolutional feature map dimensions is itself of little concern;
more attention should be paid to the number of parameters in these layers
instead
Clustering Learning for Robotic Vision
We present the clustering learning technique applied to multi-layer
feedforward deep neural networks. We show that this unsupervised learning
technique can compute network filters with only a few minutes and a much
reduced set of parameters. The goal of this paper is to promote the technique
for general-purpose robotic vision systems. We report its use in static image
datasets and object tracking datasets. We show that networks trained with
clustering learning can outperform large networks trained for many hours on
complex datasets.Comment: Code for this paper is available here:
https://github.com/culurciello/CL_paper1_cod
Using Neural Networks for Image Classification
This paper will focus on applying neural network machine learning methods to images for the purpose of automatic detection and classification. The main advantage of using neural network methods in this project is its adeptness at fitting nonÂlinear data and its ability to work as an unsupervised algorithm. The algorithms will be run on common, publically available datasets, namely the MNIST and CIFARÂ10, so that our results will be easily reproducible
- …