8,542 research outputs found

    Deep Kernel Learning

    We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods. Specifically, we transform the inputs of a spectral mixture base kernel with a deep architecture, using local kernel interpolation, inducing points, and structure exploiting (Kronecker and Toeplitz) algebra for a scalable kernel representation. These closed-form kernels can be used as drop-in replacements for standard kernels, with benefits in expressive power and scalability. We jointly learn the properties of these kernels through the marginal likelihood of a Gaussian process. Inference and learning cost O(n) for n training points, and predictions cost O(1) per test point. On a large and diverse collection of applications, including a dataset with 2 million examples, we show improved performance over scalable Gaussian processes with flexible kernel learning models, and stand-alone deep architectures. Comment: 19 pages, 6 figures
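
The core idea in this abstract, a base kernel applied to inputs transformed by a network, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the one-hidden-layer `feature_map`, its fixed weights, and the restriction to a scalar feature are hypothetical, and the sketch omits the kernel interpolation, inducing points, and marginal-likelihood training that the abstract describes. The spectral mixture form used is the standard one-dimensional expression, a weighted sum of Gaussian-damped cosines.

```python
import math

def feature_map(x, w1, b1, w2, b2):
    """Hypothetical one-hidden-layer network g(x): vector in, scalar out."""
    hidden = [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sum(wi * hi for wi, hi in zip(w2, hidden)) + b2

def spectral_mixture(tau, weights, means, variances):
    """1-D spectral mixture kernel evaluated at lag tau."""
    return sum(w * math.exp(-2.0 * math.pi ** 2 * tau ** 2 * v)
               * math.cos(2.0 * math.pi * tau * m)
               for w, m, v in zip(weights, means, variances))

def deep_kernel(x, y, net, sm):
    """k(x, y) = k_SM(g(x) - g(y)): the base kernel sees transformed inputs."""
    tau = feature_map(x, *net) - feature_map(y, *net)
    return spectral_mixture(tau, *sm)

# Illustrative, hand-picked parameters (not learned).
net = ([[0.5, -0.3], [0.1, 0.8]], [0.0, 0.1], [1.0, -0.5], 0.0)
sm = ([0.7, 0.3], [0.0, 1.5], [0.2, 0.05])  # weights, means, variances
x, y = [1.0, 2.0], [0.5, -1.0]
k_xy = deep_kernel(x, y, net, sm)
```

In a full treatment the network weights and the mixture parameters would be optimized together; here they are fixed only to show how the two pieces compose.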

    Detection and classification of masses in mammographic images in a multi-kernel approach

    According to the World Health Organization, breast cancer is the main cause of cancer death among adult women in the world. Although breast cancer occurs indiscriminately in countries with varying degrees of social and economic development, mortality rates remain high in developing and underdeveloped countries due to the low availability of early detection technologies. From the clinical point of view, mammography is still the most effective diagnostic technology, given the wide diffusion of the use and interpretation of these images. In this work we propose a method to detect and classify mammographic lesions using the regions of interest of images. Our proposal consists in decomposing each image using multi-resolution wavelets. Zernike moments are extracted from each wavelet component. Using this approach we can combine both texture and shape features, which can be applied both to the detection and classification of mammary lesions. We used 355 images of fatty breast tissue from the IRMA database, with 233 normal instances (no lesion), 72 benign, and 83 malignant cases. Classification was performed using SVM and ELM networks with modified kernels in order to optimize accuracy rates, reaching 94.11%. Considering both accuracy rates and training times, we defined the ratio between average percentage accuracy and average training time in a reverse order. For our proposal, this ratio was 50 times higher than the ratio obtained using the best state-of-the-art method. Because our proposed model combines a high accuracy rate with a low learning time, it can save hours of training whenever new data is received, relative to the best state-of-the-art method.

    Basic Filters for Convolutional Neural Networks Applied to Music: Training or Design?

    When convolutional neural networks are used to tackle learning problems based on music or, more generally, time series data, raw one-dimensional data are commonly pre-processed to obtain spectrogram or mel-spectrogram coefficients, which are then used as input to the actual neural network. In this contribution, we investigate, both theoretically and experimentally, the influence of this pre-processing step on the network's performance and pose the question of whether replacing it by applying adaptive or learned filters directly to the raw data can improve learning success. The theoretical results show that approximately reproducing mel-spectrogram coefficients by applying adaptive filters and subsequent time-averaging is in principle possible. We also conducted extensive experimental work on the task of singing voice detection in music. The results of these experiments show that for classification based on Convolutional Neural Networks the features obtained from adaptive filter banks followed by time-averaging perform better than the canonical Fourier-transform-based mel-spectrogram coefficients. Alternative adaptive approaches with center frequencies or time-averaging lengths learned from training data perform equally well. Comment: Completely revised version; 21 pages, 4 figures

    Object Relation Detection Based on One-shot Learning

    Detecting the relations among objects, such as "cat on sofa" and "person ride horse", is a crucial task in image understanding, and beneficial to bridging the semantic gap between images and natural language. Despite the remarkable progress of deep learning in detection and recognition of individual objects, it is still a challenging task to localize and recognize the relations between objects due to the complex combinatorial nature of various kinds of object relations. Inspired by the recent advances in one-shot learning, we propose a simple yet effective Semantics Induced Learner (SIL) model for solving this challenging task. Learning in a one-shot manner can enable a detection model to adapt to a huge number of object relations with diverse appearance effectively and robustly. In addition, the SIL combines bottom-up and top-down attention mechanisms, thereby enabling attention at the level of both vision and semantics. Within our proposed model, the bottom-up mechanism, which is based on Faster R-CNN, proposes object regions, and the top-down mechanism selects and integrates visual features according to semantic information. Experiments demonstrate the effectiveness of our framework over other state-of-the-art methods on two large-scale data sets for object relation detection.

    Stock Forecasting using M-Band Wavelet-Based SVR and RNN-LSTMs Models

    The task of predicting future stock values has always been one that is heavily desired albeit very difficult. This difficulty arises from stocks with non-stationary behavior, and without any explicit form. Hence, predictions are best made through analysis of financial stock data. To handle big data sets, current convention involves the use of the Moving Average. However, by utilizing the Wavelet Transform in place of the Moving Average to denoise stock signals, financial data can be smoothed and more accurately broken down. This newly transformed, denoised, and more stable stock data can be followed up by non-parametric statistical methods, such as Support Vector Regression (SVR) and Recurrent Neural Network (RNN) based Long Short-Term Memory (LSTM) networks, to predict future stock prices. Through the implementation of these methods, one is left with a more accurate stock forecast, and in turn, increased profits.
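
The wavelet denoising step mentioned in this abstract can be sketched as follows. Note the substitution: the paper's title refers to M-band wavelets, whereas this sketch uses a single level of the ordinary two-band Haar transform with soft thresholding, chosen only because it fits in a few lines. The function names, the `prices` series, and the threshold value are all hypothetical.

```python
import math

def haar_forward(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]   # local averages (trend)
    detail = [(a - b) / s for a, b in pairs]   # local differences (noise-like)
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed samples."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t, clipping at zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise(signal, t):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, t))

# Made-up price series, for illustration only.
prices = [10.0, 10.4, 10.1, 10.5, 11.0, 10.6, 11.2, 11.1]
smoothed = denoise(prices, t=0.5)
```

Unlike a moving average, the transform is invertible: with a threshold of zero the original series is recovered, so the amount of smoothing is controlled entirely by the threshold. The smoothed series would then be passed to a regressor such as SVR or an LSTM.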

    Survey of data mining approaches to user modeling for adaptive hypermedia

    The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of the data, the noise within that data, and the necessity of capturing the imprecise nature of human behavior. Data mining and machine learning techniques have the ability to handle large amounts of data and to process uncertainty. These characteristics make these techniques suitable for automatic generation of user models that simulate human decision making. This paper surveys different data mining techniques that can be used to efficiently and accurately capture user behavior. The paper also presents guidelines that show which techniques may be used more efficiently according to the task implemented by the application.

    Deep Neural Network Based Hyperspectral Pixel Classification With Factorized Spectral-Spatial Feature Representation

    Deep learning has been widely used for hyperspectral pixel classification due to its ability to generate deep feature representations. However, how to construct an efficient and powerful network suitable for hyperspectral data is still under exploration. In this paper, a novel neural network model is designed to take full advantage of the spectral-spatial structure of hyperspectral data. Firstly, we extract pixel-based intrinsic features from rich yet redundant spectral bands by a subnetwork with a supervised pre-training scheme. Secondly, in order to utilize the local spatial correlation among pixels, we share the previous subnetwork as a spectral feature extractor for each pixel in an image patch, after which the spectral features of all pixels in a patch are combined and fed into the subsequent classification subnetwork. Finally, the whole network is further fine-tuned to improve its classification performance. Specifically, the spectral-spatial factorization scheme is applied in our model architecture, making the network size and the number of parameters much smaller than those of existing spectral-spatial deep networks for hyperspectral image classification. Experiments on the hyperspectral data sets show that, compared with some state-of-the-art deep learning methods, our method achieves better classification results while having a smaller network size and fewer parameters. Comment: 12 pages, 10 figures

    A Survey of Model Compression and Acceleration for Deep Neural Networks

    Deep neural networks (DNNs) have recently achieved great success in many visual recognition tasks. However, existing deep neural network models are computationally expensive and memory intensive, hindering their deployment in devices with low memory resources or in applications with strict latency requirements. Therefore, a natural thought is to perform model compression and acceleration in deep networks without significantly decreasing the model performance. During the past five years, tremendous progress has been made in this area. In this paper, we review the recent techniques for compacting and accelerating DNN models. In general, these techniques are divided into four categories: parameter pruning and quantization, low-rank factorization, transferred/compact convolutional filters, and knowledge distillation. Methods of parameter pruning and quantization are described first, after which the other techniques are introduced. For each category, we also provide insightful analysis about the performance, related applications, advantages, and drawbacks. Then we go through some very recent successful methods, for example, dynamic capacity networks and stochastic depths networks. After that, we survey the evaluation metrics, the main datasets used for evaluating the model performance, and recent benchmark efforts. Finally, we conclude this paper and discuss the remaining challenges and possible directions for future work. Comment: Published in IEEE Signal Processing Magazine, updated version including more recent work
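
The first of the four categories listed in this abstract, parameter pruning and quantization, can be illustrated with a minimal sketch. It assumes one simple variant of each idea (global magnitude pruning and symmetric uniform quantization) applied to a flat list of weights; the survey covers many other schemes, and the function names and example weights here are hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def quantize_uniform(weights, bits):
    """Map weights to signed integer codes on a symmetric uniform grid.

    Assumes bits >= 2. Returns the codes and the scale needed to decode them.
    """
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / levels if max_abs > 0 else 1.0
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate floating-point weights from integer codes."""
    return [c * scale for c in codes]

# Made-up weights, for illustration only.
weights = [0.82, -0.05, 0.31, -0.67, 0.02, 0.11]
pruned = prune_by_magnitude(weights, sparsity=0.5)
codes, scale = quantize_uniform(pruned, bits=8)
restored = dequantize(codes, scale)
```

Pruning reduces the number of nonzero parameters, while quantization reduces the bits stored per parameter; the rounding error of each restored weight is bounded by half the quantization step `scale`.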

    Polyphonic Sound Event Detection by using Capsule Neural Networks

    Artificial sound event detection (SED) aims to mimic the human ability to perceive and understand what is happening in the surroundings. Nowadays, Deep Learning offers valuable techniques for this goal, such as Convolutional Neural Networks (CNNs). The Capsule Neural Network (CapsNet) architecture has been recently introduced in the image processing field with the intent to overcome some of the known limitations of CNNs, specifically regarding the scarce robustness to affine transformations (i.e., perspective, size, orientation) and the detection of overlapped images. This motivated the authors to employ CapsNets to deal with the polyphonic-SED task, in which multiple sound events occur simultaneously. Specifically, we propose to exploit the capsule units to represent a set of distinctive properties for each individual sound event. Capsule units are connected through a so-called "dynamic routing" that encourages learning part-whole relationships and improves the detection performance in a polyphonic context. This paper reports extensive evaluations carried out on three publicly available datasets, showing how the CapsNet-based algorithm not only outperforms standard CNNs but also achieves the best results with respect to state-of-the-art algorithms.

    Recent Advances in Convolutional Neural Network Acceleration

    In recent years, convolutional neural networks (CNNs) have shown great performance in various fields such as image classification, pattern recognition, and multi-media compression. Two of the feature properties, local connectivity and weight sharing, can reduce the number of parameters and increase processing speed during training and inference. However, as the dimension of data becomes higher and the CNN architecture becomes more complicated, the end-to-end approach or the combined manner of CNN is computationally intensive, which becomes a limitation to further implementation of CNNs. Therefore, it is necessary and urgent to implement CNNs in a faster way. In this paper, we first summarize the acceleration methods that contribute to, but are not limited to, CNNs by reviewing a broad variety of research papers. We propose a taxonomy in terms of three levels, i.e., structure level, algorithm level, and implementation level, for acceleration methods. We also analyze the acceleration methods in terms of CNN architecture compression, algorithm optimization, and hardware-based improvement. Finally, we discuss different perspectives of these acceleration and optimization methods within each level. The discussion shows that the methods in each level still have large exploration space. By incorporating such a wide range of disciplines, we expect to provide a comprehensive reference for researchers who are interested in CNN acceleration. Comment: submitted to Neurocomputing