25 research outputs found
Learning from minimally labeled data with accelerated convolutional neural networks
The main objective of an Artificial Vision Algorithm is to design a mapping function that takes an image as an input and correctly classifies it into one of the user-determined categories. There are several important properties to be satisfied by the mapping function for visual understanding. First, the function should produce good representations of the visual world, which will be able to recognize images independently of pose, scale and illumination. Furthermore, the designed artificial vision system has to learn these representations by itself. Recent studies on Convolutional Neural Networks (ConvNets) produced promising advancements in visual understanding. These networks attain significant performance upgrades by relying on hierarchical structures inspired by biological vision systems. In my research, I work mainly in two areas: 1) how ConvNets can be programmed to learn the optimal mapping function using the minimum amount of labeled data, and 2) how these networks can be accelerated for practical purposes. In this work, algorithms that learn from unlabeled data are studied. A new framework that exploits unlabeled data is proposed. The proposed framework obtains state-of-the-art performance results in different tasks.
Furthermore, this study presents an optimized streaming method for ConvNets’ hardware accelerator on an embedded platform. It is tested on object classification and detection applications using ConvNets. Experimental results indicate high computational efficiency, and significant performance upgrades over all other existing platforms
An Analysis of the Connections Between Layers of Deep Neural Networks
We present an analysis of different techniques for selecting the connection
be- tween layers of deep neural networks. Traditional deep neural networks use
ran- dom connection tables between layers to keep the number of connections
small and tune to different image features. This kind of connection performs
adequately in supervised deep networks because their values are refined during
the training. On the other hand, in unsupervised learning, one cannot rely on
back-propagation techniques to learn the connections between layers. In this
work, we tested four different techniques for connecting the first layer of the
network to the second layer on the CIFAR and SVHN datasets and showed that the
accuracy can be im- proved up to 3% depending on the technique used. We also
showed that learning the connections based on the co-occurrences of the
features does not confer an advantage over a random connection table in small
networks. This work is helpful to improve the efficiency of connections between
the layers of unsupervised deep neural networks
Diverse Semantic Image Editing with Style Codes
Semantic image editing requires inpainting pixels following a semantic map.
It is a challenging task since this inpainting requires both harmony with the
context and strict compliance with the semantic maps. The majority of the
previous methods proposed for this task try to encode the whole information
from erased images. However, when an object is added to a scene such as a car,
its style cannot be encoded from the context alone. On the other hand, the
models that can output diverse generations struggle to output images that have
seamless boundaries between the generated and unerased parts. Additionally,
previous methods do not have a mechanism to encode the styles of visible and
partially visible objects differently for better performance. In this work, we
propose a framework that can encode visible and partially visible objects with
a novel mechanism to achieve consistency in the style encoding and final
generations. We extensively compare with previous conditional image generation
and semantic image editing algorithms. Our extensive experiments show that our
method significantly improves over the state-of-the-art. Our method not only
achieves better quantitative results but also provides diverse results. Please
refer to the project web page for the released code and demo:
https://github.com/hakansivuk/DivSem
Clustering Learning for Robotic Vision
We present the clustering learning technique applied to multi-layer
feedforward deep neural networks. We show that this unsupervised learning
technique can compute network filters with only a few minutes and a much
reduced set of parameters. The goal of this paper is to promote the technique
for general-purpose robotic vision systems. We report its use in static image
datasets and object tracking datasets. We show that networks trained with
clustering learning can outperform large networks trained for many hours on
complex datasets.Comment: Code for this paper is available here:
https://github.com/culurciello/CL_paper1_cod