Multi-label Learning via Structured Decomposition and Group Sparsity
In multi-label learning, each sample is associated with several labels.
Existing works indicate that exploring correlations between labels improves
prediction performance. However, embedding the label correlations into the
training process significantly increases the problem size. Moreover, how the
label structure maps into the feature space is not clear. In this
paper, we propose a novel multi-label learning method "Structured Decomposition
+ Group Sparsity (SDGS)". In SDGS, we learn a feature subspace for each label
from the structured decomposition of the training data, and predict the labels
of a new sample from its group sparse representation on the multi-subspace
obtained from the structured decomposition. In particular, in the training
stage, we decompose the data matrix $X$ as $X = \sum_{i=1}^{k} L^i + S$,
wherein the rows of $L^i$ associated with samples that belong to label $i$ are
nonzero and constitute a low-rank matrix, while the other rows are all-zero;
the residual $S$ is a sparse matrix. The row space of $L^i$ is the feature
subspace corresponding to label $i$. This decomposition can be
efficiently obtained via randomized optimization. In the prediction stage, we
estimate the group sparse representation of a new sample on the multi-subspace
via group \emph{lasso}. The nonzero representation coefficients tend to
concentrate on the subspaces of labels that the sample belongs to, and thus an
effective prediction can be obtained. We evaluate SDGS on several real datasets
and compare it with popular methods. Results verify the effectiveness and
efficiency of SDGS.
Comment: 13 pages, 3 tables
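As a concrete illustration of the prediction stage described above, the sketch below estimates a group sparse representation over a multi-subspace dictionary with group lasso, solved by proximal gradient descent with block soft-thresholding. The bases, sizes, and regularization weight are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def group_lasso(D, x, groups, lam=0.1, iters=500):
    """Minimize 0.5 * ||x - D c||^2 + lam * sum_g ||c_g||_2 by proximal gradient."""
    c = np.zeros(D.shape[1])
    step = 1.0 / np.linalg.norm(D, 2) ** 2      # 1 / Lipschitz constant of the gradient
    for _ in range(iters):
        z = c - step * (D.T @ (D @ c - x))      # gradient step on the smooth part
        for g in groups:                        # block soft-thresholding per label group
            norm = np.linalg.norm(z[g])
            z[g] = max(0.0, 1.0 - step * lam / norm) * z[g] if norm > 0 else 0.0
        c = z
    return c

# Toy usage: 3 labels, each with a (hypothetical) 5-dimensional subspace of R^20.
rng = np.random.default_rng(0)
bases = [rng.standard_normal((20, 5)) for _ in range(3)]
D = np.hstack(bases)                             # multi-subspace dictionary
groups = [np.arange(5 * i, 5 * (i + 1)) for i in range(3)]
x = bases[1] @ rng.standard_normal(5)            # sample drawn from label 1's subspace
c = group_lasso(D, x, groups)
print([float(np.linalg.norm(c[g])) for g in groups])  # mass concentrates on group 1
```

The per-group coefficient norms play the role of label scores: the group matching the sample's true subspace carries most of the representation energy.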
A Survey of Model Compression and Acceleration for Deep Neural Networks
Deep neural networks (DNNs) have recently achieved great success in many
visual recognition tasks. However, existing deep neural network models are
computationally expensive and memory intensive, hindering their deployment in
devices with low memory resources or in applications with strict latency
requirements. Therefore, a natural thought is to perform model compression and
acceleration in deep networks without significantly decreasing the model
performance. During the past five years, tremendous progress has been made in
this area. In this paper, we review the recent techniques for compacting and
accelerating DNN models. In general, these techniques are divided into four
categories: parameter pruning and quantization, low-rank factorization,
transferred/compact convolutional filters, and knowledge distillation. Methods
of parameter pruning and quantization are described first; after that, the other
techniques are introduced. For each category, we also provide insightful
analysis about the performance, related applications, advantages, and
drawbacks. Then we go through some very recent successful methods, for example,
dynamic capacity networks and stochastic depth networks. After that, we survey
the evaluation metrics, the main datasets used for evaluating the model
performance, and recent benchmark efforts. Finally, we conclude this paper,
discuss the remaining challenges and possible directions for future work.
Comment: Published in IEEE Signal Processing Magazine, updated version
including more recent work
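To make the first category concrete, here is a minimal, hedged sketch of magnitude pruning and uniform quantization in plain numpy; the sparsity level and bit width are illustrative choices, not prescriptions from the survey.

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(sparsity * w.size)
    thresh = np.partition(np.abs(w).ravel(), k)[k]
    return np.where(np.abs(w) < thresh, 0.0, w)

def uniform_quantize(w, bits=8):
    """Snap weights to 2**bits evenly spaced levels over their observed range."""
    levels = 2 ** bits - 1
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / levels
    return np.round((w - lo) / scale) * scale + lo

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256))        # stand-in for a trained weight matrix
w_p = magnitude_prune(w, sparsity=0.9)
print((w_p == 0).mean())                   # ~0.9 of the weights are now zero
w_q = uniform_quantize(w_p, bits=8)
print(np.abs(w_q - w_p).max())             # quantization error is at most scale / 2
```

In practice the pruned network is fine-tuned afterwards to recover accuracy; this sketch only shows the compression operators themselves.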
Expensive Optimisation: A Metaheuristics Perspective
Stochastic, iterative search methods such as Evolutionary Algorithms (EAs)
have proven to be efficient optimizers. However, they require evaluations of
candidate solutions, which may be prohibitively expensive in many real-world
optimization problems. Use of approximate models or surrogates is being
explored as a way to reduce the number of such evaluations. In this paper we
investigate three such methods. The first method (DAFHEA) partially replaces
an expensive function evaluation by its approximate model. The approximation is
realized with support vector machine (SVM) regression models. The second method
(DAFHEA II) is an enhancement of DAFHEA to accommodate uncertain
environments. The third one uses surrogate ranking with preference learning or
ordinal regression. The fitness of the candidates is estimated by modeling
their rank. The techniques' performance on standard benchmark numerical
optimization problems is reported, and the comparative benefits and
shortcomings of all three techniques are identified.
Comment: 7 pages
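The sketch below illustrates the general surrogate-assisted pattern these methods share: an SVM regression surrogate (scikit-learn's SVR, assumed here for illustration) screens candidates so that only the most promising ones receive the expensive true evaluation. The toy fitness function, population scheme, and schedule are assumptions, not DAFHEA itself.

```python
import numpy as np
from sklearn.svm import SVR

def expensive_fitness(x):                  # stand-in for a costly simulation
    return np.sum(x ** 2)

rng = np.random.default_rng(0)
dim, pop_size = 5, 40
evaluated_X, evaluated_y = [], []

pop = rng.uniform(-5, 5, (pop_size, dim))
for gen in range(20):
    if len(evaluated_y) >= 10:
        # cheap approximate fitness from the surrogate, used only for screening
        surrogate = SVR(kernel="rbf").fit(np.array(evaluated_X), np.array(evaluated_y))
        scores = surrogate.predict(pop)
        elite = pop[np.argsort(scores)[: pop_size // 4]]
    else:
        elite = pop[: pop_size // 4]       # no model yet: evaluate a subset directly
    for x in elite:                        # pay the true cost only for the elite
        evaluated_X.append(x)
        evaluated_y.append(expensive_fitness(x))
    # simple mutation-based offspring around the elite
    pop = np.repeat(elite, 4, axis=0) + rng.normal(0, 0.5, (pop_size, dim))

print(min(evaluated_y))                    # best truly evaluated fitness
```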
Recent Advances in Convolutional Neural Network Acceleration
In recent years, convolutional neural networks (CNNs) have shown great
performance in various fields such as image classification, pattern
recognition, and multi-media compression. Two of the feature properties, local
connectivity and weight sharing, can reduce the number of parameters and
increase processing speed during training and inference. However, as the
dimension of data becomes higher and the CNN architecture becomes more
complicated, end-to-end training and deployment of CNNs become
computationally intensive, which limits their further adoption. Therefore, it
is necessary and urgent to make CNNs run in a
faster way. In this paper, we first summarize the acceleration methods that
contribute to, but are not limited to, CNNs by reviewing a broad variety of
research papers. We propose a taxonomy with three levels, i.e., structure level,
algorithm level, and implementation level, for acceleration methods. We also
analyze the acceleration methods in terms of CNN architecture compression,
algorithm optimization, and hardware-based improvement. Finally, we discuss
different perspectives on these acceleration and optimization methods within
each level. The discussion shows that the methods at each level still leave
considerable room for exploration. By incorporating such a wide range of
disciplines, we expect to provide a comprehensive reference for researchers who
are interested in CNN acceleration.
Comment: Submitted to Neurocomputing
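As one concrete example at the algorithm level, the sketch below applies truncated-SVD low-rank factorization to a layer's weight matrix, replacing one large matrix multiply with two thin ones. Shapes and rank are illustrative assumptions.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W (m x n) as A @ B with A: m x rank, B: rank x n via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]          # absorb singular values into the left factor
    B = Vt[:rank]
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 4608))    # e.g. a 3x3x512 conv kernel flattened per output channel
A, B = low_rank_factorize(W, rank=64)
x = rng.standard_normal(4608)
err = np.linalg.norm(W @ x - A @ (B @ x)) / np.linalg.norm(W @ x)
print(err)  # relative error contributed by the dropped singular values
# Multiply-adds per input drop from 512*4608 ~ 2.4M to 64*(512+4608) ~ 0.33M.
```

A random matrix has a flat spectrum, so the error above is large; trained weight matrices typically have fast-decaying spectra and compress far better at the same rank.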
Exploring Uncertainty Measures for Image-Caption Embedding-and-Retrieval Task
With the wide adoption of black-box machine learning algorithms,
particularly deep neural networks (DNNs), the practical demand for
reliability assessment is rapidly rising. On the basis of the concept that
"Bayesian deep learning knows what it does not know," the uncertainty of DNN
outputs has been investigated as a reliability measure for the classification
and regression tasks. However, in the image-caption retrieval task, well-known
samples are not always easy-to-retrieve samples. This study investigates two
aspects of image-caption embedding-and-retrieval systems. On one hand, we
quantify feature uncertainty by considering image-caption embedding as a
regression task, and use it for model averaging, which can improve the
retrieval performance. On the other hand, we further quantify posterior
uncertainty by considering the retrieval as a classification task, and use it
as a reliability measure, which can greatly improve the retrieval performance
by rejecting uncertain queries. The two uncertainty measures perform
consistently across different datasets (MS COCO and Flickr30k), different
deep learning architectures (dropout and batch normalization), and different
similarity functions.
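A minimal sketch of the Monte Carlo dropout idea such systems build on: keep dropout active at test time and use the spread of repeated stochastic forward passes as an uncertainty estimate, with their mean as a model-averaged embedding. The tiny random-weight MLP below is purely illustrative, not the study's retrieval model.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.standard_normal((16, 32)), rng.standard_normal((32, 8))

def forward(x, p_drop=0.5):
    h = np.maximum(x @ W1, 0.0)                       # ReLU hidden layer
    mask = rng.random(h.shape) > p_drop               # dropout kept ON at test time
    return (h * mask / (1.0 - p_drop)) @ W2           # stochastic embedding

x = rng.standard_normal(16)
samples = np.stack([forward(x) for _ in range(100)])  # T stochastic forward passes
mean_embedding = samples.mean(axis=0)                 # model averaging over passes
uncertainty = float(samples.var(axis=0).sum())        # spread = feature uncertainty
print(uncertainty)
```

Queries whose uncertainty score exceeds a threshold can then be rejected, which is how a reliability measure of this kind improves retrieval in practice.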
Fast and Accurate Pseudoinverse with Sparse Matrix Reordering and Incremental Approach
How can we compute the pseudoinverse of a sparse feature matrix efficiently
and accurately for solving optimization problems? A pseudoinverse is a
generalization of a matrix inverse, which has been extensively utilized as a
fundamental building block for solving linear systems in machine learning.
However, an approximate computation, let alone an exact computation, of the
pseudoinverse is very time-consuming due to its demanding time complexity,
which limits it from being applied to large data. In this paper, we propose
FastPI (Fast PseudoInverse), a novel incremental singular value decomposition
(SVD) based pseudoinverse method for sparse matrices. Based on the observation
that many real-world feature matrices are sparse and highly skewed, FastPI
reorders and divides the feature matrix and incrementally computes low-rank SVD
from the divided components. To show the efficacy of the proposed FastPI, we
apply it to real-world multi-label linear regression problems. Through extensive
experiments, we demonstrate that FastPI computes the pseudoinverse faster than
other approximate methods without loss of accuracy. Results imply that our
method efficiently computes the low-rank pseudoinverse of a large and sparse
matrix that other existing methods cannot handle within limited time and space.
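This is not FastPI itself (which adds sparse matrix reordering and exploits the skewed structure), but a hedged sketch of its two building blocks: an incremental SVD update that folds new columns into an existing factorization, and a pseudoinverse computed from that SVD.

```python
import numpy as np

def svd_append_cols(U, s, Vt, B):
    """Update the thin SVD of A to the thin SVD of [A, B] (Brand-style update)."""
    proj = U.T @ B                          # part of B inside the current column space
    Q, R = np.linalg.qr(B - U @ proj)       # new orthogonal directions
    r, c = len(s), B.shape[1]
    K = np.block([[np.diag(s), proj],
                  [np.zeros((c, r)), R]])   # small core matrix
    Uk, sk, Vkt = np.linalg.svd(K, full_matrices=False)
    W = np.block([[Vt.T, np.zeros((Vt.shape[1], c))],
                  [np.zeros((c, r)), np.eye(c)]])
    return np.hstack([U, Q]) @ Uk, sk, Vkt @ W.T

def pinv_from_svd(U, s, Vt, tol=1e-10):
    """Pseudoinverse V S^+ U^T, inverting only nonnegligible singular values."""
    s_inv = np.where(s > tol, 1.0 / s, 0.0)
    return (Vt.T * s_inv) @ U.T

rng = np.random.default_rng(0)
A, B = rng.standard_normal((100, 30)), rng.standard_normal((100, 10))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U2, s2, Vt2 = svd_append_cols(U, s, Vt, B)       # reuse A's SVD for [A, B]
P = pinv_from_svd(U2, s2, Vt2)
print(np.allclose(P, np.linalg.pinv(np.hstack([A, B])), atol=1e-8))  # True
```

The update only factors the small core matrix K, which is why dividing a matrix and folding the pieces in incrementally can be much cheaper than one full SVD.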
Big Learning with Bayesian Methods
Explosive growth in data and availability of cheap computing resources have
sparked increasing interest in Big learning, an emerging subfield that studies
scalable machine learning algorithms, systems, and applications with Big Data.
Bayesian methods represent one important class of statistical methods for machine
learning, with substantial recent developments on adaptive, flexible and
scalable Bayesian learning. This article provides a survey of the recent
advances in Big learning with Bayesian methods, termed Big Bayesian Learning,
including nonparametric Bayesian methods for adaptively inferring model
complexity, regularized Bayesian inference for improving the flexibility via
posterior regularization, and scalable algorithms and systems based on
stochastic subsampling and distributed computing for dealing with large-scale
applications.
Comment: 21 pages, 6 figures
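As a concrete instance of the stochastic-subsampling idea mentioned above, the sketch below runs stochastic gradient Langevin dynamics (SGLD) on a toy Bayesian linear regression: posterior samples come from minibatch gradients plus injected Gaussian noise. The prior, step size, and burn-in are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 10_000, 3
X = rng.standard_normal((N, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.normal(0, 0.5, N)           # noise variance 0.25

w, eps, batch, samples = np.zeros(d), 1e-5, 100, []
for t in range(2000):
    idx = rng.integers(0, N, batch)
    # minibatch estimate of the log-posterior gradient (standard normal prior)
    grad = -w + (N / batch) * X[idx].T @ (y[idx] - X[idx] @ w) / 0.25
    # Langevin step: half gradient step plus N(0, eps) noise
    w = w + 0.5 * eps * grad + rng.normal(0, np.sqrt(eps), d)
    if t > 1000:                                  # keep post-burn-in samples
        samples.append(w.copy())

print(np.mean(samples, axis=0))                   # posterior mean approaches w_true
```

Each step touches only a minibatch, so the cost per posterior sample is independent of N, which is the point of subsampling-based scalable Bayesian inference.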
Scalable Nonlinear AUC Maximization Methods
The area under the ROC curve (AUC) is a measure of interest in various
machine learning and data mining applications. It has been widely used to
evaluate classification performance on heavily imbalanced data. The kernelized
AUC maximization machines have established a superior generalization ability
compared to linear AUC machines because of their capability in modeling the
complex nonlinear structure underlying most real-world data. However, the high
training complexity renders the kernelized AUC machines infeasible for
large-scale data. In this paper, we present two nonlinear AUC maximization
algorithms that optimize pairwise linear classifiers over a finite-dimensional
feature space constructed via the k-means Nystr\"{o}m method. Our first
algorithm maximizes the AUC metric by optimizing a pairwise squared hinge loss
function using the truncated Newton method. However, the second-order batch AUC
maximization method becomes expensive to optimize for extremely massive
datasets. This motivates us to develop a first-order stochastic AUC maximization
algorithm that incorporates a scheduled regularization update and scheduled
averaging techniques to accelerate the convergence of the classifier.
Experiments on several benchmark datasets demonstrate that the proposed AUC
classifiers are more efficient than kernelized AUC machines while surpassing
or at least matching their AUC performance. The experiments also show that the
proposed stochastic AUC classifier outperforms state-of-the-art online AUC
maximization methods in terms of AUC classification accuracy.
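A hedged sketch of the overall recipe: build an explicit finite-dimensional feature map with the k-means Nystrom method, then minimize a pairwise squared hinge loss for AUC. Plain full-batch gradient descent stands in for the paper's truncated Newton and stochastic solvers, and scikit-learn's KMeans is an assumed dependency; data, landmark count, and kernel width are toy choices.

```python
import numpy as np
from sklearn.cluster import KMeans

def rbf(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(2, 1, (40, 2))])
y = np.r_[np.zeros(200), np.ones(40)]             # heavily imbalanced labels

# k-means Nystrom feature map: phi(x) = K_mm^{-1/2} k(x, landmarks)
landmarks = KMeans(n_clusters=20, n_init=10, random_state=0).fit(X).cluster_centers_
U, s, _ = np.linalg.svd(rbf(landmarks, landmarks))
Phi = rbf(X, landmarks) @ (U / np.sqrt(s + 1e-3)) @ U.T

# pairwise squared hinge loss over all positive-negative pairs
pos, neg = Phi[y == 1], Phi[y == 0]
diffs = pos[:, None, :] - neg[None, :, :]         # (P, N, d) pair differences
w = np.zeros(Phi.shape[1])
for _ in range(500):
    slack = np.maximum(0.0, 1.0 - diffs @ w)      # violated-margin amounts
    grad = -2.0 * (slack[..., None] * diffs).mean(axis=(0, 1))
    w -= 0.05 * grad

scores = Phi @ w
auc = (scores[y == 1][:, None] > scores[y == 0][None, :]).mean()
print(auc)                                         # empirical AUC on the training sample
```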
cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey
The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki
Univ. (TDU), and Univ. of Tsukuba that aims to systematically summarize papers
on computer vision, pattern recognition, and related fields. For this
particular review, we focused on reading all 602 conference papers
presented at CVPR2015, the premier annual computer vision event held in
June 2015, in order to grasp the trends in the field. Further, we are proposing
"DeepSurvey" as a mechanism embodying the entire process from the reading
through all the papers, the generation of ideas, and to the writing of paper.Comment: Survey Pape
ELKI: A large open-source library for data analysis - ELKI Release 0.7.5 "Heidelberg"
This paper documents the release of the ELKI data mining framework, version
0.7.5.
ELKI is an open source (AGPLv3) data mining software written in Java. The
focus of ELKI is research in algorithms, with an emphasis on unsupervised
methods in cluster analysis and outlier detection. In order to achieve high
performance and scalability, ELKI offers data index structures such as the
R*-tree that can provide major performance gains. ELKI is designed to be easy
to extend for researchers and students in this domain, and welcomes
contributions of additional methods. ELKI aims at providing a large collection
of highly parameterizable algorithms, in order to allow easy and fair
evaluation and benchmarking of algorithms.
We first outline the motivation for this release and the plans for the
future, and then give a brief overview of the new functionality in this
version. We also include an appendix presenting an overview of the overall
implemented functionality.