Energy-Efficient Object Detection using Semantic Decomposition
Machine-learning algorithms offer immense possibilities in the development of
many cognitive applications. In fact, large-scale machine-learning
classifiers now represent the state-of-the-art in a wide range of object
detection/classification problems. However, the network complexity of
large-scale classifiers makes them among the most challenging and
energy-intensive workloads across the computing spectrum. In this paper, we present a
new approach to optimize energy efficiency of object detection tasks using
semantic decomposition to build a hierarchical classification framework. We
observe that certain semantic features, such as color and texture, are common
across various images in real-world datasets for object detection applications. We
exploit these common semantic features to distinguish the objects of interest
from the remaining inputs (non-objects of interest) in a dataset at a lower
computational effort. We propose a 2-stage hierarchical classification
framework, with increasing levels of complexity, wherein the first stage is
trained to recognize the broad representative semantic features relevant to the
object of interest. The first stage rejects the input instances that do not
have the representative features and passes only the relevant instances to the
second stage. Our methodology thus allows us to reject certain information at
lower complexity and utilize the full computational effort of a network only on
a smaller fraction of inputs to perform detection. We use color and texture as
distinctive traits to carry out several experiments for object detection. Our
experiments on the Caltech101/CIFAR10 datasets show that the proposed method
yields 1.93x/1.46x improvements in average energy, respectively, over the
traditional single-classifier model.
Comment: 10 pages, 13 figures, 3 algorithms. Submitted to IEEE TVLSI (under review)
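The two-stage cascade described in the abstract can be sketched as follows; `stage1`, `stage2`, and the rejection threshold are hypothetical placeholders for the paper's trained stages.

```python
import numpy as np

def cascade_predict(x, stage1, stage2, threshold=0.5):
    """Two-stage hierarchical detection: stage1 is a cheap classifier
    over broad semantic features (e.g. color/texture statistics);
    stage2 is the full, expensive classifier, run only on inputs
    that stage1 does not reject."""
    relevance = stage1(x)    # low-cost semantic check
    if relevance < threshold:
        return "non-object"  # rejected early, at low computational effort
    return stage2(x)         # full computational effort on the remainder
```

The energy savings come from the fraction of inputs rejected at the first stage, so only a smaller subset incurs the full network cost.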
Network Decoupling: From Regular to Depthwise Separable Convolutions
Depthwise separable convolution has shown great efficiency in network design,
but requires a time-consuming training procedure with the full training set
available. This paper first analyzes the mathematical relationship between
regular convolutions and depthwise separable convolutions, and proves that the
former can be approximated by the latter in closed form. We show that
depthwise separable convolutions are the principal components of regular
convolutions. We then propose network decoupling (ND), a training-free
method to accelerate convolutional neural networks (CNNs) by transferring
pre-trained CNN models into the MobileNet-like depthwise separable convolution
structure, with a promising speedup yet negligible accuracy loss. We further
verify through experiments that the proposed method is orthogonal to other
training-free methods like channel decomposition, spatial decomposition, etc.
Combining the proposed method with them will bring even larger CNN speedup. For
instance, ND itself achieves about 2X speedup for the widely used VGG16, and
combined with other methods, it reaches 3.7X speedup with graceful accuracy
degradation. We demonstrate that ND is widely applicable to classification
networks like ResNet and object detection networks like SSD300.
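The closed-form approximation can be illustrated with a per-input-channel SVD: each input channel's spatial filters are factored into depthwise (spatial) and pointwise (channel-mixing) components. This is a minimal sketch of the general idea, not the paper's exact decoupling procedure; the function name and tensor layout are assumptions.

```python
import numpy as np

def decouple_conv(kernel, rank=1):
    """Approximate a regular conv kernel of shape (Cout, Cin, k, k) by
    `rank` depthwise + pointwise pairs, via SVD of each input channel's
    (k*k, Cout) filter matrix. Summing the separable pairs recovers the
    original kernel exactly when rank == min(k*k, Cout)."""
    c_out, c_in, k, _ = kernel.shape
    dw = np.zeros((rank, c_in, k, k))   # spatial (depthwise) factors
    pw = np.zeros((rank, c_out, c_in))  # channel-mixing (pointwise) factors
    for c in range(c_in):
        mat = kernel[:, c].reshape(c_out, k * k).T  # (k*k, Cout)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        for r in range(rank):
            dw[r, c] = (u[:, r] * s[r]).reshape(k, k)
            pw[r, :, c] = vt[r]
    return dw, pw
```

A rank-1 truncation corresponds to replacing the regular convolution by a single MobileNet-style depthwise separable pair; higher ranks trade speedup for approximation fidelity.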
Gaussian Filter in CRF Based Semantic Segmentation
Artificial intelligence is making great changes in academia and industry with
the fast development of deep learning, a branch of machine learning and
statistical learning. The fully convolutional network (FCN) [1] is the
standard model for semantic segmentation. Conditional random fields (CRFs),
coded as a CNN [2] or RNN [3] and connected to an FCN, have been successfully
applied in object detection [4]. In this paper, we introduce a
multi-resolution neural network for FCN and apply a Gaussian filter to the
extended CRF kernel neighborhood and the label image to reduce the oscillating
effect of CRF neural network segmentation, thus achieving higher precision and
faster training speed.
Comment: 11 pages, 9 figures, 2 tables
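The core smoothing step can be sketched in pure NumPy: a separable Gaussian filter applied to each class's probability map, followed by per-pixel renormalization. This is a simplified illustration of the idea, not the paper's exact kernel-neighborhood construction; all names and shapes are assumptions.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1-D Gaussian kernel of half-width `radius`."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def smooth_label_probs(probs, sigma=1.0):
    """Smooth each class's probability map (probs: [C, H, W]) with a
    separable Gaussian filter, then renormalize per pixel, damping
    oscillation in CRF-refined segmentation boundaries."""
    radius = int(3 * sigma)
    k = gaussian_kernel_1d(sigma, radius)
    out = np.empty(probs.shape, dtype=float)
    for c in range(probs.shape[0]):
        # edge-pad, then filter rows and columns separately
        padded = np.pad(probs[c].astype(float), radius, mode='edge')
        rows = np.apply_along_axis(
            lambda m: np.convolve(m, k, mode='valid'), 1, padded)
        out[c] = np.apply_along_axis(
            lambda m: np.convolve(m, k, mode='valid'), 0, rows)
    return out / out.sum(axis=0, keepdims=True)
```

Separability keeps the cost linear in the kernel radius rather than quadratic, which matters when the smoothing is applied at every training iteration.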
Accelerated Inference in Markov Random Fields via Smooth Riemannian Optimization
Markov Random Fields (MRFs) are a popular model for several pattern
recognition and reconstruction problems in robotics and computer vision.
Inference in MRFs is intractable in general and related work resorts to
approximation algorithms. Among those techniques, semidefinite programming
(SDP) relaxations have been shown to provide accurate estimates while scaling
poorly with the problem size and being typically slow for practical
applications. Our first contribution is to design a dual ascent method to solve
standard SDP relaxations that takes advantage of the geometric structure of the
problem to speed up computation. This technique, named Dual Ascent Riemannian
Staircase (DARS), is able to solve large problem instances in seconds. Our
second contribution is to develop a second and faster approach. The backbone of
this second approach is a novel SDP relaxation combined with a fast and
scalable solver based on smooth Riemannian optimization. We show that this
approach, named Fast Unconstrained SEmidefinite Solver (FUSES), can solve large
problems in milliseconds. Contrary to local MRF solvers, e.g., loopy belief
propagation, our approaches do not require an initial guess. Moreover, we
leverage recent results from optimization theory to provide per-instance
sub-optimality guarantees. We demonstrate the proposed approaches in
multi-class image segmentation problems. Extensive experimental evidence shows
that (i) FUSES and DARS produce near-optimal solutions, attaining an objective
within 0.1% of the optimum, (ii) FUSES and DARS are remarkably faster than
general-purpose SDP solvers, and FUSES is more than two orders of magnitude
faster than DARS while attaining similar solution quality, (iii) FUSES is
faster than local search methods while being a global solver.
Comment: 16 pages
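As background for the inference problem these solvers target, the pairwise MRF energy that MAP inference minimizes can be written directly; the cost arrays in the example are illustrative placeholders, not the paper's formulation.

```python
import numpy as np

def mrf_energy(labels, unary, pairwise, edges):
    """Energy of a pairwise MRF over discrete labels:
    sum of per-node unary costs plus per-edge pairwise costs.
    MAP inference seeks the labeling that minimizes this quantity."""
    node_cost = sum(unary[i, labels[i]] for i in range(len(labels)))
    edge_cost = sum(pairwise[labels[i], labels[j]] for i, j in edges)
    return node_cost + edge_cost
```

With a Potts-style pairwise cost, neighboring nodes are encouraged to take the same label, the typical setup in multi-class image segmentation.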
Leveraging Domain Knowledge to Improve Microscopy Image Segmentation with Lifted Multicuts
The throughput of electron microscopes has increased significantly in recent
years, enabling detailed analysis of cell morphology and ultrastructure.
Analysis of neural circuits at single-synapse resolution remains the flagship
target of this technique, but applications to cell and developmental biology
are also starting to emerge at scale. The amount of data acquired in such
studies makes manual instance segmentation, a fundamental step in many analysis
pipelines, impossible. While automatic segmentation approaches have improved
significantly thanks to the adoption of convolutional neural networks, their
accuracy still lags behind human annotations and requires additional manual
proofreading. A major hindrance to further improvements is the limited field
of view of the segmentation networks preventing them from exploiting the
expected cell morphology or other prior biological knowledge which humans use
to inform their segmentation decisions. In this contribution, we show how such
domain-specific information can be leveraged by expressing it as long-range
interactions in a graph partitioning problem known as the lifted multicut
problem. Using this formulation, we demonstrate significant improvement in
segmentation accuracy for three challenging EM segmentation problems from
neuroscience and cell biology.
A Survey on Deep Learning Methods for Robot Vision
Deep learning has allowed a paradigm shift in pattern recognition, from using
hand-crafted features together with statistical classifiers to using
general-purpose learning procedures for learning data-driven representations,
features, and classifiers together. The application of this new paradigm has
been particularly successful in computer vision, in which the development of
deep learning methods for vision applications has become a hot research topic.
Given that deep learning has already attracted the attention of the robot
vision community, the main purpose of this survey is to address the use of deep
learning in robot vision. To achieve this, a comprehensive overview of deep
learning and its usage in computer vision is given, which includes a description
of the most frequently used neural models and their main application areas.
Then, the standard methodology and tools used for designing deep-learning based
vision systems are presented. Afterwards, a review of the principal work using
deep learning in robot vision is presented, as well as current and future
trends related to the use of deep learning in robotics. This survey is intended
to be a guide for the developers of robot vision systems.
cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey
The paper presents futuristic challenges discussed in the cvpaper.challenge.
In 2015 and 2016, we thoroughly studied 1,600+ papers in several
conferences/journals such as CVPR/ICCV/ECCV/NIPS/PAMI/IJCV.
Knowledge-guided Semantic Computing Network
It is very useful to integrate human knowledge and experience into
traditional neural networks for faster learning speed, fewer training samples
and better interpretability. However, due to the opaque, black-box nature of
neural networks, it is very difficult to design their architecture, interpret
their features, and predict their performance. Inspired by the
human visual cognition process, we propose a knowledge-guided semantic
computing network which includes two modules: a knowledge-guided semantic tree
and a data-driven neural network. The semantic tree is pre-defined to describe
the spatial structural relations of different semantics, which corresponds
to the tree-like description of objects based on human knowledge. The object
recognition process through the semantic tree only needs simple forward
computing without training. Besides, to enhance the recognition ability of the
semantic tree in terms of diversity, randomness, and variability, we use
the traditional neural network to aid the semantic tree in learning some
indescribable features; training is needed only in this case. The
experimental results on MNIST and GTSRB datasets show that compared with the
traditional data-driven network, our proposed semantic computing network can
achieve better performance with fewer training samples and lower computational
complexity. In particular, our model also has better adversarial robustness
than the traditional neural network, thanks to the incorporated human knowledge.
Comment: 13 pages, 13 figures
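The training-free forward pass through such a semantic tree can be sketched as follows; the node structure and the toy color predicates in the test are hypothetical examples, not the paper's actual tree.

```python
class SemanticNode:
    """Node in a knowledge-guided semantic tree. Internal nodes route an
    input to the first child whose semantic predicate matches; leaves
    carry a class label. The tree itself needs no training: recognition
    is a simple forward computation."""
    def __init__(self, predicate=None, children=None, label=None):
        self.predicate = predicate
        self.children = children or []
        self.label = label

    def classify(self, x):
        if self.label is not None:
            return self.label          # leaf: emit its class label
        for child in self.children:
            if child.predicate(x):     # human-defined semantic check
                return child.classify(x)
        return "unknown"               # no branch matched
```

Because each branch tests an interpretable, human-defined predicate, the path taken through the tree also serves as an explanation of the recognition decision.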
cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey
The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki
Univ. (TDU), and Univ. of Tsukuba that aims to systematically summarize papers
on computer vision, pattern recognition, and related fields. For this
particular review, we focused on reading all 602 conference papers
presented at the CVPR2015, the premier annual computer vision event held in
June 2015, in order to grasp the trends in the field. Further, we propose
"DeepSurvey" as a mechanism embodying the entire process, from reading all
the papers, through the generation of ideas, to the writing of papers.
Comment: Survey Paper
A Review of Co-saliency Detection Technique: Fundamentals, Applications, and Challenges
Co-saliency detection is a newly emerging and rapidly growing research area
in the computer vision community. As a novel branch of visual saliency, co-saliency
detection refers to the discovery of common and salient foregrounds from two or
more relevant images, and can be widely used in many computer vision tasks. The
existing co-saliency detection algorithms mainly consist of three components:
extracting effective features to represent the image regions, exploring the
informative cues or factors to characterize co-saliency, and designing
effective computational frameworks to formulate co-saliency. Although numerous
methods have been developed, the literature is still lacking a deep review and
evaluation of co-saliency detection techniques. In this paper, we aim at
providing a comprehensive review of the fundamentals, challenges, and
applications of co-saliency detection. Specifically, we provide an overview of
some related computer vision works, review the history of co-saliency
detection, summarize and categorize the major algorithms in this research area,
discuss some open issues in this area, present the potential applications of
co-saliency detection, and finally point out some unsolved challenges and
promising future works. We expect this review to be beneficial to both new
and senior researchers in this field, and to give insights to researchers in
other related areas regarding the utility of co-saliency detection algorithms.
Comment: 28 pages, 12 figures, 3 tables