Deep Kernel Learning
We introduce scalable deep kernels, which combine the structural properties
of deep learning architectures with the non-parametric flexibility of kernel
methods. Specifically, we transform the inputs of a spectral mixture base
kernel with a deep architecture, using local kernel interpolation, inducing
points, and structure exploiting (Kronecker and Toeplitz) algebra for a
scalable kernel representation. These closed-form kernels can be used as
drop-in replacements for standard kernels, with benefits in expressive power
and scalability. We jointly learn the properties of these kernels through the
marginal likelihood of a Gaussian process. Inference and learning cost O(n)
for n training points, and predictions cost O(1) per test point. On a large
and diverse collection of applications, including a dataset with 2 million
examples, we show improved performance over scalable Gaussian processes with
flexible kernel learning models, and stand-alone deep architectures.
Comment: 19 pages, 6 figures
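The core construction in the abstract above, a base kernel applied to inputs transformed by a deep architecture, can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: a single tanh layer stands in for the deep architecture, an RBF kernel stands in for the spectral mixture base kernel, the weights are fixed made-up values rather than learned through the marginal likelihood, and none of the scalability machinery (inducing points, kernel interpolation, Kronecker or Toeplitz algebra) is included.

```python
import math

def feature_map(x, weights, biases):
    """One hidden layer g(x) = tanh(W x + b); a stand-in for the deep architecture."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def rbf(u, v, lengthscale=1.0):
    """RBF base kernel (assumption: the abstract uses a spectral mixture kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)

def deep_kernel(x1, x2, weights, biases, lengthscale=1.0):
    """k_deep(x1, x2) = k_base(g(x1), g(x2))."""
    return rbf(feature_map(x1, weights, biases),
               feature_map(x2, weights, biases), lengthscale)

# Hypothetical fixed parameters; in the setting described above they would be
# learned jointly with the kernel hyperparameters.
W = [[0.5, -1.0], [1.5, 0.25], [-0.75, 0.5]]
b = [0.0, 0.1, -0.2]

xa, xb = [1.0, 2.0], [0.5, -1.0]
k_ab = deep_kernel(xa, xb, W, b)  # similarity between two different inputs
k_aa = deep_kernel(xa, xa, W, b)  # similarity of an input with itself
```

Because the composed function is still a valid kernel, it can in principle be passed wherever a standard kernel is expected, which is the "drop-in replacement" property the abstract refers to.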
Detection and classification of masses in mammographic images in a multi-kernel approach
According to the World Health Organization, breast cancer is the main cause
of cancer death among adult women in the world. Although breast cancer occurs
indiscriminately in countries with several degrees of social and economic
development, among developing and underdeveloped countries mortality rates
are still high, due to low availability of early detection technologies. From
the clinical point of view, mammography is still the most effective diagnostic
technology, given the wide diffusion of the use and interpretation of these
images. In this work, we propose a method to detect and classify
mammographic lesions using the regions of interest of images. Our proposal
consists in decomposing each image using multi-resolution wavelets. Zernike
moments are extracted from each wavelet component. Using this approach we can
combine both texture and shape features, which can be applied both to the
detection and classification of mammary lesions. We used 355 images of fatty
breast tissue from the IRMA database, with 233 normal instances (no lesion), 72
benign, and 83 malignant cases. Classification was performed by using SVM and
ELM networks with modified kernels, in order to optimize accuracy rates,
reaching 94.11%. Considering both accuracy rates and training times, we defined
the ratio between average percentage accuracy and average training time, in
reverse order. The ratio for our proposal was 50 times higher than the one
obtained using the best state-of-the-art method. Because the proposed model
combines a high accuracy rate with a low learning time, whenever new data are
received it can save hours of learning time relative to the best
state-of-the-art method.
Basic Filters for Convolutional Neural Networks Applied to Music: Training or Design?
When convolutional neural networks are used to tackle learning problems based
on music or, more generally, time series data, raw one-dimensional data are
commonly pre-processed to obtain spectrogram or mel-spectrogram coefficients,
which are then used as input to the actual neural network. In this
contribution, we investigate, both theoretically and experimentally, the
influence of this pre-processing step on the network's performance and ask
whether replacing it with adaptive or learned filters applied directly
to the raw data can improve learning success. The theoretical results show
that approximately reproducing mel-spectrogram coefficients by applying
adaptive filters and subsequent time-averaging is in principle possible. We
also conducted extensive experimental work on the task of singing voice
detection in music. The results of these experiments show that for
classification based on Convolutional Neural Networks the features obtained
from adaptive filter banks followed by time-averaging perform better than the
canonical Fourier-transform-based mel-spectrogram coefficients. Alternative
adaptive approaches with center frequencies or time-averaging lengths learned
from training data perform equally well.
Comment: Completely revised version; 21 pages, 4 figures
Object Relation Detection Based on One-shot Learning
Detecting the relations among objects, such as "cat on sofa" and "person ride
horse", is a crucial task in image understanding, and beneficial to bridging
the semantic gap between images and natural language. Despite the remarkable
progress of deep learning in detection and recognition of individual objects,
it is still a challenging task to localize and recognize the relations between
objects due to the complex combinatorial nature of various kinds of object
relations. Inspired by the recent advances in one-shot learning, we propose a
simple yet effective Semantics Induced Learner (SIL) model for solving this
challenging task. Learning in one-shot manner can enable a detection model to
adapt to a huge number of object relations with diverse appearance effectively
and robustly. In addition, the SIL combines bottom-up and top-down attention
mechanisms, thereby enabling attention at the level of vision and semantics
favorably. Within our proposed model, the bottom-up mechanism, which is based
on Faster R-CNN, proposes object regions, and the top-down mechanism selects
and integrates visual features according to semantic information. Experiments
demonstrate the effectiveness of our framework over other state-of-the-art
methods on two large-scale data sets for object relation detection
Stock Forecasting using M-Band Wavelet-Based SVR and RNN-LSTMs Models
The task of predicting future stock values has always been one that is
heavily desired albeit very difficult. This difficulty arises from stocks with
non-stationary behavior, and without any explicit form. Hence, predictions are
best made through analysis of financial stock data. To handle big data sets,
current convention involves the use of the Moving Average. However, by
utilizing the Wavelet Transform in place of the Moving Average to denoise stock
signals, financial data can be smoothed and more accurately broken down. This
newly transformed, denoised, and more stable stock data can be followed up by
non-parametric statistical methods, such as Support Vector Regression (SVR) and
Recurrent Neural Network (RNN) based Long Short-Term Memory (LSTM) networks to
predict future stock prices. Through the implementation of these methods, one
is left with a more accurate stock forecast, and in turn, increased profits
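The denoising step described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: it uses a single-level Haar (2-band) transform with soft thresholding in place of the M-band wavelets named in the title, the price values and threshold are made up, and the signal length is assumed to be even. The SVR and LSTM prediction stages are not shown.

```python
import math

def haar_forward(signal):
    """One-level Haar transform: returns (approximation, detail) coefficients."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t, zeroing those below t in magnitude."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise(signal, threshold):
    """Keep the approximation, shrink the detail coefficients, and reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))

prices = [10.0, 10.4, 10.1, 10.5, 11.0, 10.6, 11.2, 11.1]  # made-up values
smoothed = denoise(prices, threshold=0.2)
```

With a threshold of zero the signal is reconstructed unchanged, and with a very large threshold each pair of samples collapses to its mean, so the threshold controls how much short-term fluctuation is removed.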
Survey of data mining approaches to user modeling for adaptive hypermedia
The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of the data, the noise within that data, and the necessity of capturing the imprecise nature of human behavior. Data mining and machine learning techniques have the ability to handle large amounts of data and to process uncertainty. These characteristics make these techniques suitable for automatic generation of user models that simulate human decision making. This paper surveys different data mining techniques that can be used to efficiently and accurately capture user behavior. The paper also presents guidelines that show which techniques may be used more efficiently according to the task implemented by the application.
Deep Neural Network Based Hyperspectral Pixel Classification With Factorized Spectral-Spatial Feature Representation
Deep learning has been widely used for hyperspectral pixel classification due
to its ability to generate deep feature representations. However, how to
construct an efficient and powerful network suitable for hyperspectral data is
still under exploration. In this paper, a novel neural network model is
designed for taking full advantage of the spectral-spatial structure of
hyperspectral data. Firstly, we extract pixel-based intrinsic features from
rich yet redundant spectral bands by a subnetwork with supervised pre-training
scheme. Secondly, in order to utilize the local spatial correlation among
pixels, we share the previous subnetwork as a spectral feature extractor for
each pixel in a patch of image, after which the spectral features of all pixels
in a patch are combined and fed into the subsequent classification
subnetwork. Finally, the whole network is further fine-tuned to improve its
classification performance. Specifically, the spectral-spatial factorization
scheme is applied in our model architecture, making the network size and the
number of parameters much smaller than those of existing spectral-spatial deep
networks for hyperspectral image classification. Experiments on the
hyperspectral data sets show that, compared with some state-of-the-art deep
learning methods, our method achieves better classification results while
having a smaller network size and fewer parameters.
Comment: 12 pages, 10 figures
A Survey of Model Compression and Acceleration for Deep Neural Networks
Deep neural networks (DNNs) have recently achieved great success in many
visual recognition tasks. However, existing deep neural network models are
computationally expensive and memory intensive, hindering their deployment in
devices with low memory resources or in applications with strict latency
requirements. Therefore, a natural thought is to perform model compression and
acceleration in deep networks without significantly decreasing the model
performance. During the past five years, tremendous progress has been made in
this area. In this paper, we review the recent techniques for compacting and
accelerating DNN models. In general, these techniques are divided into four
categories: parameter pruning and quantization, low-rank factorization,
transferred/compact convolutional filters, and knowledge distillation. Methods
of parameter pruning and quantization are described first, followed by the other
techniques. For each category, we also provide insightful
analysis about the performance, related applications, advantages, and
drawbacks. Then we go through some very recent successful methods, for example,
dynamic capacity networks and stochastic depths networks. After that, we survey
the evaluation metrics, the main datasets used for evaluating the model
performance, and recent benchmark efforts. Finally, we conclude this paper and
discuss the remaining challenges and possible directions for future work.
Comment: Published in IEEE Signal Processing Magazine, updated version
including more recent work
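Of the four categories listed in the abstract above, parameter pruning and quantization are the easiest to show in a few lines. The sketch below is a generic illustration, not a method taken from the survey: it assumes magnitude-based pruning and uniform quantization over the observed weight range, and the function names and weight values are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned

def quantize_uniform(weights, n_bits):
    """Round each weight to one of 2**n_bits evenly spaced levels in [min, max]."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)  # a constant vector needs no quantization
    step = (hi - lo) / (2 ** n_bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]

weights = [0.82, -0.05, 0.31, -0.67, 0.02, 0.44, -0.09, -0.91]  # made-up values
pruned = magnitude_prune(weights, sparsity=0.5)
quantized = quantize_uniform(pruned, n_bits=4)
```

One limitation of this naive quantizer is that its grid is anchored at the minimum weight, so pruned zeros are generally mapped to a small non-zero level. A practical scheme would need to keep zero on the grid to preserve sparsity.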
Polyphonic Sound Event Detection by using Capsule Neural Networks
Artificial sound event detection (SED) aims to mimic the human ability
to perceive and understand what is happening in the surroundings. Nowadays,
Deep Learning offers valuable techniques for this goal such as Convolutional
Neural Networks (CNNs). The Capsule Neural Network (CapsNet) architecture has
been recently introduced in the image processing field with the intent to
overcome some of the known limitations of CNNs, specifically regarding the
scarce robustness to affine transformations (i.e., perspective, size,
orientation) and the detection of overlapped images. This motivated the authors
to employ CapsNets to deal with the polyphonic-SED task, in which multiple
sound events occur simultaneously. Specifically, we propose to exploit the
capsule units to represent a set of distinctive properties for each individual
sound event. Capsule units are connected through a so-called "dynamic routing"
that encourages learning part-whole relationships and improves the detection
performance in a polyphonic context. This paper reports extensive evaluations
carried out on three publicly available datasets, showing how the CapsNet-based
algorithm not only outperforms standard CNNs but also achieves the
best results relative to state-of-the-art algorithms.
Recent Advances in Convolutional Neural Network Acceleration
In recent years, convolutional neural networks (CNNs) have shown great
performance in various fields such as image classification, pattern
recognition, and multi-media compression. Two of the feature properties, local
connectivity and weight sharing, can reduce the number of parameters and
increase processing speed during training and inference. However, as the
dimension of data becomes higher and the CNN architecture becomes more
complicated, the end-to-end approach or the combined manner of CNN is
computationally intensive, which limits the further implementation of
CNNs. Therefore, it is necessary and urgent to implement CNNs in a
faster way. In this paper, we first summarize the acceleration methods that
contribute to, but are not limited to, CNNs by reviewing a broad variety of research
papers. We propose a taxonomy in terms of three levels, i.e., structure level,
algorithm level, and implementation level, for acceleration methods. We also
analyze the acceleration methods in terms of CNN architecture compression,
algorithm optimization, and hardware-based improvement. Finally, we discuss
different perspectives of these acceleration and optimization
methods within each level. The discussion shows that the methods in each level
still leave considerable room for exploration. By incorporating such a wide range of
disciplines, we expect to provide a comprehensive reference for researchers who
are interested in CNN acceleration.
Comment: submitted to Neurocomputing