Search CORE

5,322 research outputs found

Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction

Author: Lee Honglak
Pan Gang
Sohn Kihyuk
Villegas Ruben
Zhang Yuting
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/01/2016
Field of study

Object detection systems based on the deep convolutional neural network (CNN) have recently made ground- breaking advances on several object detection benchmarks. While the features learned by these high-capacity neural networks are discriminative for categorization, inaccurate localization is still a major source of error for detection. Building upon high-capacity CNN architectures, we address the localization problem by 1) using a search algorithm based on Bayesian optimization that sequentially proposes candidate regions for an object bounding box, and 2) training the CNN with a structured loss that explicitly penalizes the localization inaccuracy. In experiments, we demonstrated that each of the proposed methods improves the detection performance over the baseline method on PASCAL VOC 2007 and 2012 datasets. Furthermore, two methods are complementary and significantly outperform the previous state-of-the-art when combined.Comment: CVPR 201

arXiv.org e-Print Archive

Crossref

Gaussian Processes with Context-Supported Priors for Active Object Localization

Author: Jedynak Bruno
Mitchell Melanie
Rhodes Anthony D.
Witte Jordan
Publication venue
Publication date: 20/09/2017
Field of study

We devise an algorithm using a Bayesian optimization framework in conjunction with contextual visual data for the efficient localization of objects in still images. Recent research has demonstrated substantial progress in object localization and related tasks for computer vision. However, many current state-of-the-art object localization procedures still suffer from inaccuracy and inefficiency, in addition to failing to provide a principled and interpretable system amenable to high-level vision tasks. We address these issues with the current research. Our method encompasses an active search procedure that uses contextual data to generate initial bounding-box proposals for a target object. We train a convolutional neural network to approximate an offset distance from the target object. Next, we use a Gaussian Process to model this offset response signal over the search space of the target. We then employ a Bayesian active search for accurate localization of the target. In experiments, we compare our approach to a state-of-theart bounding-box regression method for a challenging pedestrian localization task. Our method exhibits a substantial improvement over this baseline regression method.Comment: 10 pages, 4 figure

arXiv.org e-Print Archive

Crossref

Multi-Object Classification and Unsupervised Scene Understanding Using Deep Learning Features and Latent Tree Probabilistic Models

Author: Anandkumar Anima
Nimmagadda Tejaswi
Publication venue
Publication date: 01/01/2015
Field of study

Deep learning has shown state-of-art classification performance on datasets such as ImageNet, which contain a single object in each image. However, multi-object classification is far more challenging. We present a unified framework which leverages the strengths of multiple machine learning methods, viz deep learning, probabilistic models and kernel methods to obtain state-of-art performance on Microsoft COCO, consisting of non-iconic images. We incorporate contextual information in natural images through a conditional latent tree probabilistic model (CLTM), where the object co-occurrences are conditioned on the extracted fc7 features from pre-trained Imagenet CNN as input. We learn the CLTM tree structure using conditional pairwise probabilities for object co-occurrences, estimated through kernel methods, and we learn its node and edge potentials by training a new 3-layer neural network, which takes fc7 features as input. Object classification is carried out via inference on the learnt conditional tree model, and we obtain significant gain in precision-recall and F-measures on MS-COCO, especially for difficult object categories. Moreover, the latent variables in the CLTM capture scene information: the images with top activations for a latent node have common themes such as being a grasslands or a food scene, and on on. In addition, we show that a simple k-means clustering of the inferred latent nodes alone significantly improves scene classification performance on the MIT-Indoor dataset, without the need for any retraining, and without using scene labels during training. Thus, we present a unified framework for multi-object classification and unsupervised scene understanding

arXiv.org e-Print Archive

eScholarship - University of California

Caltech Authors

A Survey on Deep Learning in Medical Image Analysis

Author: Bejnordi Babak Ehteshami
Ciompi Francesco
Ghafoorian Mohsen
Kooi Thijs
Litjens Geert
Setio Arnaud Arindra Adiyoso
Sánchez Clara I.
van der Laak Jeroen A. W. M.
van Ginneken Bram
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017
Field of study

Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks and provide concise overviews of studies per application area. Open challenges and directions for future research are discussed.Comment: Revised survey includes expanded discussion section and reworked introductory section on common deep architectures. Added missed papers from before Feb 1st 201

arXiv.org e-Print Archive

Radboud Repository

Multi-task CNN Model for Attribute Prediction

Author: Abdulnabi Abrar H.
Jia Kui
Lu Jiwen
Wang Gang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

This paper proposes a joint multi-task learning algorithm to better predict attributes in images using deep convolutional neural networks (CNN). We consider learning binary semantic attributes through a multi-task CNN model, where each CNN will predict one binary attribute. The multi-task learning allows CNN models to simultaneously share visual knowledge among different attribute categories. Each CNN will generate attribute-specific feature representations, and then we apply multi-task learning on the features to predict their attributes. In our multi-task framework, we propose a method to decompose the overall model's parameters into a latent task matrix and combination matrix. Furthermore, under-sampled classifiers can leverage shared statistics from other classifiers to improve their performance. Natural grouping of attributes is applied such that attributes in the same group are encouraged to share more knowledge. Meanwhile, attributes in different groups will generally compete with each other, and consequently share less knowledge. We show the effectiveness of our method on two popular attribute datasets.Comment: 11 pages, 3 figures, ieee transaction pape

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Improving region based CNN object detector using bayesian optimization

Author: Muhammad Amgad
Publication venue: AUC Knowledge Fountain
Publication date: 01/06/2018
Field of study

Using Deep Neural Networks for object detection tasks has had groundbreaking results on several object detection benchmarks. Although the trained models have high capacity and strong discrimination power, yet inaccurate localization is a major source of error for those detection systems. In my work, I\u27m developing a sequential searching algorithm based on Bayesian Optimization to propose better candidate bounding boxes for the objects of interest. The work is focusing on formulating effective region proposal as an optimization problem and using Bayesian Optimization algorithm as a black-box optimizer to sequentially solve this problem. The proposed algorithm demonstrated the state-of-the-art performance on PASCAL VOC 2007 benchmark under the standard localization requirements

AUC Knowledge Fountain (American Univ. in Cairo)