Deep Boosting: Joint Feature Selection and Analysis Dictionary Learning in Hierarchy
This work investigates how traditional image classification pipelines can
be extended into a deep architecture, inspired by recent successes of deep
neural networks. We propose a deep boosting framework based on layer-by-layer
joint feature boosting and dictionary learning. In each layer, we construct a
dictionary of filters by combining the filters from the lower layer, and
iteratively optimize the image representation with a joint
discriminative-generative formulation, i.e. minimization of empirical
classification error plus regularization of analysis image generation over
training images. For optimization, we perform two iterating steps: i) to
minimize the classification error, select the most discriminative features
using the gentle adaboost algorithm; ii) according to the feature selection,
update the filters to minimize the regularization on analysis image
representation using the gradient descent method. Once the optimization has
converged, we learn the higher-layer representation in the same way. Our model
delivers several distinct advantages. First, our layer-wise optimization
provides the potential to build very deep architectures. Second, the generated
image representation is compact and meaningful. In several visual recognition
tasks, our framework outperforms existing state-of-the-art approaches.
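Step (i) above, selecting the most discriminative feature with Gentle AdaBoost, can be sketched as a weighted regression-stump search. This is a minimal illustration under assumptions: the random data, the stump form, and the helper name stand in for the learned filter responses, not the paper's implementation.

```python
import numpy as np

def gentle_boost_select(X, y, w):
    """One Gentle AdaBoost round: pick the feature whose regression stump
    f(x) = b + a*[x_j > t] best fits labels y under sample weights w
    (weighted least squares), as in the feature-selection step (i)."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            idx = X[:, j] > t
            b = np.average(y[~idx], weights=w[~idx])
            a = np.average(y[idx], weights=w[idx]) - b
            f = np.where(idx, a + b, b)
            err = float(np.sum(w * (y - f) ** 2))
            if best is None or err < best[0]:
                best = (err, j, t, a, b)
    return best  # (weighted error, feature index, threshold, a, b)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = np.sign(X[:, 1])            # labels depend only on feature 1
w = np.full(50, 1 / 50)
err, j, t, a, b = gentle_boost_select(X, y, w)
print(j)  # 1 -- the discriminative feature is selected
```

In the full scheme this selection alternates with a gradient-descent update of the filters, and the selected stump's weights would be re-normalized between rounds.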
Training Skinny Deep Neural Networks with Iterative Hard Thresholding Methods
Deep neural networks have achieved remarkable success in a wide range of
practical problems. However, due to the inherent large parameter space, deep
models are notoriously prone to overfitting and difficult to be deployed in
portable devices with limited memory. In this paper, we propose an iterative
hard thresholding (IHT) approach to train Skinny Deep Neural Networks (SDNNs).
An SDNN has much fewer parameters yet can achieve competitive or even better
performance than its full CNN counterpart. More concretely, the IHT approach
trains an SDNN through the following two alternating phases: (I) perform hard
thresholding to drop connections with small activations and fine-tune the
remaining significant filters; (II) re-activate the frozen connections and train the
entire network to improve its overall discriminative capability. We verify the
superiority of SDNNs in terms of efficiency and classification performance on
four benchmark object recognition datasets, including CIFAR-10, CIFAR-100,
MNIST and ImageNet. Experimental results clearly demonstrate that IHT can be
applied to train SDNNs based on various CNN architectures such as NIN and
AlexNet.
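Phase (I)'s hard-thresholding step can be sketched on a raw weight array. This is a toy numpy illustration; the paper applies the operation to CNN connection weights during training, with fine-tuning of the survivors between phases.

```python
import numpy as np

def hard_threshold(weights, sparsity):
    """Phase (I): keep only the largest-magnitude weights, zeroing the
    rest -- the connections that would be dropped before fine-tuning."""
    w = weights.copy()
    k = max(1, int(round((1.0 - sparsity) * w.size)))  # weights to keep
    if k < w.size:
        cutoff = np.sort(np.abs(w), axis=None)[::-1][k - 1]
        w[np.abs(w) < cutoff] = 0.0
    return w

# Toy schedule: threshold, fine-tune the survivors (not shown), then
# re-activate all connections for dense training (phase II).
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
sparse_w = hard_threshold(w, sparsity=0.75)  # keep ~25% of the weights
mask = sparse_w != 0
print(int(mask.sum()))  # 4 of 16 weights survive
```

Alternating this projection with dense re-training is what lets the SDNN recover accuracy despite the much smaller effective parameter count.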
Behavior Query Discovery in System-Generated Temporal Graphs
Computer system monitoring generates huge amounts of logs that record the
interaction of system entities. How to query such data to better understand
system behaviors and identify potential system risks and malicious behaviors
becomes a challenging task for system administrators due to the dynamics and
heterogeneity of the data. System monitoring data are essentially heterogeneous
temporal graphs with nodes being system entities and edges being their
interactions over time. Given the complexity of such graphs, it becomes
time-consuming for system administrators to manually formulate useful queries
in order to examine abnormal activities, attacks, and vulnerabilities in
computer systems.
In this work, we investigate how to query temporal graphs and treat query
formulation as a discriminative temporal graph pattern mining problem. We
introduce TGMiner to mine discriminative patterns from system logs, and these
patterns can be taken as templates for building more complex queries. TGMiner
leverages temporal information in graphs to prune graph patterns that share
similar growth trends without compromising pattern quality. Experimental results
on real system data show that TGMiner is 6-32 times faster than baseline
methods. The discovered patterns were verified by system experts; they achieved
high precision (97%) and recall (91%).
Comment: The full version of the paper "Behavior Query Discovery in
System-Generated Temporal Graphs", to appear in VLDB'1
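The data model above, typed system entities joined by timestamped interaction edges, and the kind of behavior query that mined patterns serve as templates for, can be sketched as follows. The entity names, relations, and the toy matcher are illustrative assumptions, not TGMiner's implementation.

```python
from collections import namedtuple

# System monitoring logs as a heterogeneous temporal graph: typed nodes
# (process, file, socket, ...) joined by timestamped interaction edges.
Edge = namedtuple("Edge", ["src", "dst", "relation", "t"])

events = [
    Edge(("process", "bash"), ("file", "/etc/passwd"), "read",    1),
    Edge(("process", "bash"), ("process", "curl"),     "fork",    2),
    Edge(("process", "curl"), ("socket", "10.0.0.5"),  "connect", 3),
]

def reads_then_connects(events):
    """Toy behavior query: does any process read a file and later
    (directly, or via a forked child) connect to a socket? Mined
    discriminative patterns act as templates for queries like this."""
    for e1 in events:
        if e1.relation != "read":
            continue
        frontier = {e1.src}                # processes reachable from the reader
        for e2 in sorted(events, key=lambda e: e.t):
            if e2.t <= e1.t:
                continue                   # respect temporal order
            if e2.relation == "fork" and e2.src in frontier:
                frontier.add(e2.dst)
            if e2.relation == "connect" and e2.src in frontier:
                return True
    return False

print(reads_then_connects(events))  # True
```

The temporal ordering constraint in the inner loop is the kind of information TGMiner exploits to prune patterns with similar growth trends during mining.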
The Convergence of Machine Learning and Communications
The areas of machine learning and communication technology are converging.
Today's communications systems generate a huge amount of traffic data, which
can help to significantly enhance the design and management of networks and
communication components when combined with advanced machine learning methods.
Furthermore, recently developed end-to-end training procedures offer new ways
to jointly optimize the components of a communication system. Machine learning
methods are also of central importance in many emerging application fields of
communication technology, e.g., smart cities or the internet of things. This
paper gives an overview of the use of machine learning in different areas of
communications and discusses two exemplary applications in wireless networking.
Furthermore, it identifies promising future research topics and discusses their
potential impact.
Comment: 8 pages, 4 figures
Extreme Classification in Log Memory
We present Merged-Averaged Classifiers via Hashing (MACH) for
K-classification with ultra-large values of K. Compared to traditional
one-vs-all classifiers that require O(Kd) memory and inference cost, MACH only
needs O(d log K) memory (d is the dimensionality) while requiring only
O(K log K + d log K) operations for inference. MACH is a generic K-classification algorithm,
with provable theoretical guarantees, which requires O(log K) memory without
any assumption on the relationship between classes. MACH uses universal hashing
to reduce classification with a large number of classes to a few independent
classification tasks with a small (constant) number of classes. We provide
theoretical quantification of the discriminability-memory tradeoff. With MACH,
we can train on the ODP dataset, with 100,000 classes and 400,000 features, on a
single Titan X GPU, reaching a classification accuracy of 19.28%, the
best-reported accuracy on this dataset. Before this work, the best performing
baseline is a one-vs-all classifier that requires 40 billion parameters (160 GB
model size) and achieves 9% accuracy. In contrast, MACH achieves 9% accuracy
with a 480x reduction in model size (a mere 0.3GB). With MACH, we also
demonstrate complete training of the fine-grained ImageNet dataset (compressed
size 104GB), with 21,000 classes, on a single GPU. To the best of our knowledge,
this is the first work to demonstrate complete training of such extreme-class
datasets on a single Titan X.
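The hashing scheme can be sketched in a few lines. This is a minimal numpy illustration under simplifying assumptions: random bucket assignments stand in for MACH's 2-universal hash functions, Dirichlet draws stand in for the R small classifiers' outputs, and the sizes are toy values.

```python
import numpy as np

K, B, R = 1000, 32, 4     # classes, buckets per table, hash tables (toy sizes)
rng = np.random.default_rng(1)

# R independent hashes mapping class id -> bucket id (random assignments
# stand in for the 2-universal hash family used by MACH).
hashes = rng.integers(0, B, size=(R, K))

def mach_scores(bucket_probs):
    """Combine the R small B-way classifiers' outputs into K class scores
    by averaging, for each class, the probability of its bucket under
    each hash table. bucket_probs has shape (R, B)."""
    return np.mean([bucket_probs[r, hashes[r]] for r in range(R)], axis=0)

# Stand-in model outputs: one probability vector over B buckets per table.
probs = rng.dirichlet(np.ones(B), size=R)
scores = mach_scores(probs)
print(scores.shape)  # (1000,)
predicted_class = int(np.argmax(scores))
```

The memory saving is visible directly: the R small classifiers produce R*B outputs rather than K, and distinct classes become separable because they rarely collide in all R tables at once.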
Face Recognition: A Novel Multi-Level Taxonomy based Survey
In a world where security issues have been gaining growing importance, face
recognition systems have attracted increasing attention in multiple application
areas, ranging from forensics and surveillance to commerce and entertainment.
To help understand the landscape and abstraction levels relevant for face
recognition systems, face recognition taxonomies allow a deeper dissection and
comparison of the existing solutions. This paper proposes a new, more
encompassing and richer multi-level face recognition taxonomy, facilitating the
organization and categorization of available and emerging face recognition
solutions; this taxonomy may also guide researchers in the development of more
efficient face recognition solutions. The proposed multi-level taxonomy
considers levels related to the face structure, feature support and feature
extraction approach. Following the proposed taxonomy, a comprehensive survey of
representative face recognition solutions is presented. The paper concludes
with a discussion on current algorithmic and application related challenges
which may define future research directions for face recognition.
Comment: This paper is a preprint of a paper submitted to IET Biometrics. If
accepted, the copy of record will be available at the IET Digital Library.
A Mixtures-of-Experts Framework for Multi-Label Classification
We develop a novel probabilistic approach for multi-label classification that
is based on the mixtures-of-experts architecture combined with recently
introduced conditional tree-structured Bayesian networks. Our approach captures
different input-output relations from multi-label data using the efficient
tree-structured classifiers, while the mixtures-of-experts architecture aims to
compensate for the tree-structured restrictions and build a more accurate
model. We develop and present algorithms for learning the model from data and
for performing multi-label predictions on future data instances. Experiments on
multiple benchmark datasets demonstrate that our approach achieves highly
competitive results and outperforms the existing state-of-the-art multi-label
classification methods.
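The mixture construction can be sketched generically. In this toy numpy version, independent per-label logistic models stand in for the paper's conditional tree-structured Bayesian network experts, and all weights are random placeholders rather than learned parameters.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def moe_label_probs(x, gate_W, experts):
    """Mixture-of-experts prediction for L binary labels:
    P(y_l = 1 | x) = sum_e gate_e(x) * P_e(y_l = 1 | x).
    Independent per-label logistic models stand in for the paper's
    conditional tree-structured Bayesian network experts."""
    g = softmax(gate_W @ x)                                          # (E,) gates
    p = np.array([1.0 / (1.0 + np.exp(-(W @ x))) for W in experts])  # (E, L)
    return g @ p                                                     # (L,) marginals

rng = np.random.default_rng(0)
d, L, E = 5, 3, 2                      # input dim, labels, experts (toy sizes)
x = rng.normal(size=d)
gate_W = rng.normal(size=(E, d))       # random placeholder parameters
experts = [rng.normal(size=(L, d)) for _ in range(E)]
probs = moe_label_probs(x, gate_W, experts)
print(probs.shape)  # (3,)
```

The gate lets different experts own different input regions, which is how the mixture compensates for the restrictions of any single tree-structured expert.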
Machine learning based hyperspectral image analysis: A survey
Hyperspectral sensors enable the study of the chemical properties of scene
materials remotely for the purpose of identification, detection, and chemical
composition analysis of objects in the environment. Hence, hyperspectral images
captured from earth observing satellites and aircraft have been increasingly
important in agriculture, environmental monitoring, urban planning, mining, and
defense. Machine learning algorithms, owing to their outstanding predictive
power, have become a key tool for modern hyperspectral image analysis.
Therefore, a solid understanding of machine learning techniques has become
essential for remote sensing researchers and practitioners. This paper reviews
and compares
recent machine learning-based hyperspectral image analysis methods published in
literature. We organize the methods by the image analysis task and by the type
of machine learning algorithm, and present a two-way mapping between the image
analysis tasks and the types of machine learning algorithms that can be applied
to them. The paper is comprehensive in coverage of both hyperspectral image
analysis tasks and machine learning algorithms. The image analysis tasks
considered are land cover classification, target detection, unmixing, and
physical parameter estimation. The machine learning algorithms covered are
Gaussian models, linear regression, logistic regression, support vector
machines, Gaussian mixture models, latent linear models, sparse linear models,
ensemble learning, directed graphical models, undirected graphical models,
clustering, Gaussian processes, Dirichlet processes, and deep learning. We also
discuss the open challenges in the field
of hyperspectral image analysis and explore possible future directions.
Fuzziness-based Spatial-Spectral Class Discriminant Information Preserving Active Learning for Hyperspectral Image Classification
Traditional Active/Self/Interactive Learning for Hyperspectral Image
Classification (HSIC) increases the size of the training set without
considering the class scatter and the randomness among the existing and new
samples. Moreover, very limited research has been carried out on joint
spectral-spatial information, and a minor but still noteworthy issue is the
stopping criterion, which has not received much attention from the community.
Therefore, this work proposes a novel fuzziness-based spatial-spectral method
that preserves both local and global, within- and between-class discriminant
information (FLG). We first investigate spatial-prior, fuzziness-based
information about misclassified samples. We then compute the total local and
global within- and between-class information and formulate it in a
fine-grained manner. This information is then fed to a discriminative
objective function that queries heterogeneous samples, eliminating the
randomness among the training samples. Experimental results on benchmark HSI
datasets demonstrate the effectiveness of the FLG method on Generative, Extreme
Learning Machine and Sparse Multinomial Logistic Regression (SMLR)-LORSAL
classifiers.
Comment: 13 pages, 7 figures
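The fuzziness criterion at the heart of such sample selection can be sketched as follows. This uses the standard fuzziness measure on class-membership vectors; the spatial-prior and class-scatter weighting that distinguishes FLG is omitted, and the membership values are made-up examples.

```python
import numpy as np

def fuzziness(mu, eps=1e-12):
    """Fuzziness of a class-membership vector mu: maximal when every
    membership is 0.5, zero for crisp 0/1 memberships."""
    mu = np.clip(mu, eps, 1 - eps)
    return float(-np.mean(mu * np.log(mu) + (1 - mu) * np.log(1 - mu)))

# Query the most ambiguous (highest-fuzziness) unlabeled samples first.
memberships = np.array([[0.90, 0.05, 0.05],   # confident -> low fuzziness
                        [0.40, 0.35, 0.25]])  # ambiguous -> high fuzziness
scores = np.array([fuzziness(m) for m in memberships])
query_order = np.argsort(-scores)
print(query_order.tolist())  # [1, 0]
```

Ranking by fuzziness alone already targets the classifier's uncertain region; FLG's contribution is to re-weight these candidates by local and global class-discriminant information so the queried samples are also heterogeneous.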
Identifying Dwarfs Workloads in Big Data Analytics
Big data benchmarking is particularly important and provides applicable
yardsticks for evaluating booming big data systems. However, wide coverage and
great complexity of big data computing impose big challenges on big data
benchmarking. How can we construct a benchmark suite using a minimum set of
units of computation to represent diversity of big data analytics workloads?
Big data dwarfs are abstractions of extracting frequently appearing operations
in big data computing. One dwarf represents one unit of computation, and big
data workloads are decomposed into one or more dwarfs. Furthermore, dwarf
workloads, rather than vast real workloads, are a more cost-efficient and
representative way to evaluate big data systems. In this paper, we extensively
investigate six of the most important or emerging application domains, i.e.,
search engines, social networks, e-commerce, multimedia, bioinformatics, and
astronomy. After analyzing forty representative algorithms, we single out eight
dwarf workloads in big data analytics other than OLAP: linear algebra,
sampling, logic operations, transform operations, set operations, graph
operations, statistical operations, and sort.