128,531 research outputs found
Recommended from our members
DATA-DRIVEN APPROACH TO IMAGE CLASSIFICATION
Image classification has been a core topic in the computer vision community. Its recent success with convolutional neural network (CNN) algorithm has led to various real world applications such as large scale management of photos/videos on cloud/social-media, image based search for online retailers, self-driving cars, building robots and healthcare. Image classification can be broadly categorized into binary, multi-class and multi-label classification problems. Binary classification involves assigning one of the two class labels to an instance. In multi-class classification problem, an instance should be categorized into one of more than two classes. Multi-label classification is a generalized version of the multi-class classification problem where each image is assigned multiple labels as opposed to a single label.
In this work, we first present various methods that take advantage of deep representations (fully connected layer of pre-trained CNN on the ImageNet dataset) and yield better performance on multi-label classification when compared to methods that use over a dozen conventional visual features. Following the success of deep representations, we intend to build a generic end-to-end deep learning framework to address all three problem categories of image classification. However, there are still no well established guidelines (in terms of choosing the number of layers to go deeper, the number of kernels and the size, the type of regularizer, the choice of non-linear function, etc.) to build an efficient deep neural network and often network architecture design is specific to a problem/dataset. Hence, we present some initial efforts in building a computational framework called Deep Decision Network (DDN) which is completely data-driven. DDN is a tree-like structured built stage-wise. During the learning phase, starting from the root network node, DDN automatically builds a network that splits the data into disjoint clusters of classes which would be handled by the subsequent expert networks. This results in a tree-like structured network driven by the data. The proposed approach provides an insight into the data by identifying the group of classes that are hard to classify and require more attention when compared to others. This feature is crucial for people trying to solve the problem with little or no domain knowledge, especially for applications in medical domain. Initially, we evaluate DDN on a binary classification problem and later extend it to more challenging multi-class and multi-label classification problems. The extension of DDN to multi-class and multi-label involves some changes but they still operate under the same underlying principle. In all the three cases, the proposed approach is tested for its recognition performance and scalability on publicly available datasets providing comparison to other methods
End-to-End Tracking and Semantic Segmentation Using Recurrent Neural Networks
In this work we present a novel end-to-end framework for tracking and
classifying a robot's surroundings in complex, dynamic and only partially
observable real-world environments. The approach deploys a recurrent neural
network to filter an input stream of raw laser measurements in order to
directly infer object locations, along with their identity in both visible and
occluded areas. To achieve this we first train the network using unsupervised
Deep Tracking, a recently proposed theoretical framework for end-to-end space
occupancy prediction. We show that by learning to track on a large amount of
unsupervised data, the network creates a rich internal representation of its
environment which we in turn exploit through the principle of inductive
transfer of knowledge to perform the task of it's semantic classification. As a
result, we show that only a small amount of labelled data suffices to steer the
network towards mastering this additional task. Furthermore we propose a novel
recurrent neural network architecture specifically tailored to tracking and
semantic classification in real-world robotics applications. We demonstrate the
tracking and classification performance of the method on real-world data
collected at a busy road junction. Our evaluation shows that the proposed
end-to-end framework compares favourably to a state-of-the-art, model-free
tracking solution and that it outperforms a conventional one-shot training
scheme for semantic classification
- …