2,609 research outputs found

    Generative Models For Deep Learning with Very Scarce Data

    Full text link
    The goal of this paper is to deal with a data-scarcity scenario in which deep learning techniques tend to fail. We compare the use of two well-established techniques, Restricted Boltzmann Machines and Variational Auto-encoders, as generative models in order to enlarge the training set in a classification framework. Essentially, we rely on Markov Chain Monte Carlo (MCMC) algorithms for generating new samples. We show that this methodology improves generalization compared to other state-of-the-art techniques, e.g. semi-supervised learning with ladder networks. Furthermore, we show that the RBM is better than the VAE at generating new samples for training a classifier with good generalization capabilities.
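
    As a concrete picture of the augmentation step, here is a minimal sketch (not the paper's code) of block Gibbs sampling from an already-trained Bernoulli RBM to synthesize extra training examples; the weights W, the biases b_v and b_h, the layer sizes, and the seed batch are all hypothetical stand-ins.

        # Minimal sketch: MCMC (block Gibbs) sampling from a trained Bernoulli RBM
        # to generate synthetic samples that can be appended to a small training set.
        # W, b_v, b_h are assumed to come from a previously trained RBM.
        import numpy as np

        rng = np.random.default_rng(0)

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def gibbs_sample(W, b_v, b_h, v0, n_steps=200):
            """Alternate v -> h -> v updates and return the final visible sample."""
            v = v0.copy()
            for _ in range(n_steps):
                p_h = sigmoid(v @ W + b_h)            # P(h = 1 | v)
                h = (rng.random(p_h.shape) < p_h) * 1.0
                p_v = sigmoid(h @ W.T + b_v)          # P(v = 1 | h)
                v = (rng.random(p_v.shape) < p_v) * 1.0
            return v

        # Hypothetical shapes: 784 visible units, 128 hidden units.
        W = rng.normal(scale=0.01, size=(784, 128))
        b_v, b_h = np.zeros(784), np.zeros(128)
        seed_batch = (rng.random((16, 784)) < 0.5) * 1.0   # stand-in for real training images
        synthetic = gibbs_sample(W, b_v, b_h, seed_batch)  # would be added to the training set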

    Learning from Noisy Label Distributions

    Full text link
    In this paper, we consider a novel machine learning problem: learning a classifier from noisy label distributions. In this problem, each instance with a feature vector belongs to at least one group. Instead of the true label of each instance, we observe the label distribution of the instances associated with a group, where the label distribution is distorted by unknown noise. Our goals are to (1) estimate the true label of each instance, and (2) learn a classifier that predicts the true label of a new instance. We propose a probabilistic model that treats the true label distributions of the groups and the parameters representing the noise as hidden variables, and that can be learned with a variational Bayesian method. In numerical experiments, we show that the proposed model outperforms existing methods in terms of the estimation of the true labels of instances. Comment: Accepted in ICANN201
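
    To make the observation setting concrete, here is a small illustrative sketch (not the paper's model or estimator) of how group-level label distributions could be distorted by an unknown confusion-style noise; the class count, group sizes, and noise matrix are hypothetical.

        # Illustrative data-generating sketch: instances carry hidden true labels;
        # per group we only observe a label distribution passed through unknown noise.
        import numpy as np

        rng = np.random.default_rng(1)
        n_classes, n_groups, n_per_group = 3, 5, 200

        true_labels = rng.integers(0, n_classes, size=(n_groups, n_per_group))
        noise = np.array([[0.8, 0.1, 0.1],    # hypothetical unknown noise: row k gives
                          [0.1, 0.8, 0.1],    # P(observed class | true class k)
                          [0.1, 0.1, 0.8]])

        observed_dists = []
        for g in range(n_groups):
            true_dist = np.bincount(true_labels[g], minlength=n_classes) / n_per_group
            observed_dists.append(true_dist @ noise)   # what the learner actually sees
        observed_dists = np.array(observed_dists)
        print(observed_dists)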

    Institutional Effects in a Simple Model of Educational Production

    Get PDF
    This paper presents a model of educational production that tries to make sense of recent evidence on effects of institutional arrangements on student performance. In a simple principal-agent framework, students choose their learning effort to maximize their net benefits, while the government chooses educational spending to maximize its net benefits. In the jointly determined equilibrium, schooling quality is shown to depend on several institutionally determined parameters. The impact on student performance of institutions such as central examinations, centralization versus school autonomy, teachers' influence, parental influence, and competition from private schools is analyzed. Furthermore, the model can rationalize why positive resource effects may be lacking in educational production.
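
    In schematic form (a generic, hedged formulation; the paper's exact functional forms and notation may differ), the two optimization problems and the jointly determined equilibrium can be pictured as:

        % Generic principal-agent sketch, not the paper's exact specification:
        % the student chooses effort e, the government chooses spending s, and
        % schooling quality Q is determined jointly; theta collects the
        % institutional parameters (central exams, autonomy, parental influence, ...).
        \[
          e^{*}(s) = \arg\max_{e}\; B_{\mathrm{student}}(e, s; \theta) - C_{\mathrm{student}}(e),
          \qquad
          s^{*} = \arg\max_{s}\; B_{\mathrm{gov}}\big(Q(e^{*}(s), s); \theta\big) - C_{\mathrm{gov}}(s),
        \]
        \[
          Q^{*} = Q\big(e^{*}(s^{*}),\, s^{*}\big).
        \]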

    Bridging the Gap between Probabilistic and Deterministic Models: A Simulation Study on a Variational Bayes Predictive Coding Recurrent Neural Network Model

    Full text link
    The current paper proposes a novel variational Bayes predictive coding RNN model, which can learn to generate fluctuating temporal patterns from exemplars. The model learns to maximize the lower bound of the weighted sum of the regularization and reconstruction error terms. We examined how this weighting can affect the development of different types of information processing while learning fluctuating temporal patterns. Simulation results show that strong weighting of the reconstruction term causes the development of deterministic chaos for imitating the randomness observed in target sequences, while strong weighting of the regularization term causes the development of stochastic dynamics imitating the probabilistic processes observed in targets. Moreover, the results indicate that the most generalized learning emerges between these two extremes. The paper concludes with implications in terms of the underlying neuronal mechanisms for autism spectrum disorder and for free action. Comment: This paper was accepted at the 24th International Conference on Neural Information Processing (ICONIP 2017). The previous submission to arXiv is replaced by this version because there was an error in Equation
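
    The objective in question can be pictured as a weighted evidence lower bound (a hedged sketch in generic notation; the paper's symbols and exact weighting convention may differ):

        % Weighted lower bound traded off between reconstruction and the KL
        % regularizer; w, X, z, theta, phi are generic symbols, not necessarily
        % the paper's notation.
        \[
          \mathcal{L}_{w}
          = \mathbb{E}_{q_{\phi}(z \mid X)}\!\left[\log p_{\theta}(X \mid z)\right]
          \;-\; w \, D_{\mathrm{KL}}\!\left(q_{\phi}(z \mid X)\,\|\,p(z)\right).
        \]
        % Strong weighting of the reconstruction term (small w) was found to push the
        % trained RNN toward deterministic chaos; strong weighting of the regularizer
        % (large w) pushes it toward stochastic dynamics, with the most generalized
        % learning reported between the two extremes.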

    Analysis of dropout learning regarded as ensemble learning

    Full text link
    Deep learning is the state of the art in fields such as visual object recognition and speech recognition. This learning uses a large number of layers and a huge number of units and connections. Therefore, overfitting is a serious problem. To avoid this problem, dropout learning has been proposed. Dropout learning neglects some inputs and hidden units during the learning process with a probability p, and then the neglected inputs and hidden units are combined with the learned network to express the final output. We find that the process of combining the neglected hidden units with the learned network can be regarded as ensemble learning, so we analyze dropout learning from this point of view. Comment: 9 pages, 8 figures, submitted to Conferenc
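
    A minimal numpy sketch of the ensemble view (an assumed toy setup, not from the paper): dropout keeps each hidden unit with probability p during training; at test time the learned activations are scaled by p, which matches the average over the ensemble of "thinned" subnetworks obtained by dropping units.

        # Dropout during training vs. the combined ("scaled") network at test time.
        # The Monte Carlo average over random masks approximates the scaled forward pass.
        import numpy as np

        rng = np.random.default_rng(2)
        p = 0.5                                   # keep probability
        W1, W2 = rng.normal(size=(10, 32)), rng.normal(size=(32, 1))
        x = rng.normal(size=(1, 10))

        def forward_train(x):
            h = np.maximum(x @ W1, 0.0)
            mask = (rng.random(h.shape) < p)      # randomly neglect hidden units
            return (h * mask) @ W2

        def forward_test(x):
            h = np.maximum(x @ W1, 0.0)
            return (h * p) @ W2                   # combining the thinned networks

        ensemble = np.mean([forward_train(x) for _ in range(1000)], axis=0)
        print(ensemble, forward_test(x))          # the two outputs are close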

    Towards Analyzing Semantic Robustness of Deep Neural Networks

    Full text link
    Despite the impressive performance of Deep Neural Networks (DNNs) on various vision tasks, they still exhibit erroneously high sensitivity toward semantic primitives (e.g. object pose). We propose a theoretically grounded analysis of DNN robustness in the semantic space. We qualitatively analyze different DNNs' semantic robustness by visualizing the DNN's global behavior as semantic maps and observe interesting behavior in some DNNs. Since generating these semantic maps does not scale well with the dimensionality of the semantic space, we develop a bottom-up approach to detect robust regions of DNNs. To achieve this, we formalize the problem of finding robust semantic regions of a network as optimizing integral bounds, and we develop expressions for the update directions of the region bounds. We use the developed formulations to quantitatively evaluate the semantic robustness of different popular network architectures. We show through extensive experimentation that several networks, while trained on the same dataset and enjoying comparable accuracy, do not necessarily perform similarly in semantic robustness. For example, InceptionV3 is more accurate despite being less semantically robust than ResNet50. We hope that this tool will serve as a milestone towards understanding the semantic robustness of DNNs. Comment: Presented at the European Conference on Computer Vision (ECCV 2020) Workshop on Adversarial Robustness in the Real World ( https://eccv20-adv-workshop.github.io/ ) [best paper award]. The code is available at https://github.com/ajhamdi/semantic-robustnes
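
    A hedged sketch of building a one-dimensional semantic map (not the authors' code): sweep a single semantic parameter, e.g. object azimuth, on a grid, produce the corresponding input, and record the network's score for the true class. Here `render` and `score_true_class` are hypothetical stand-ins for a renderer and a trained DNN such as ResNet50.

        # Build a semantic map by sweeping one semantic parameter and recording
        # the classifier's score; "robust regions" are parameter intervals where
        # the score stays high (the paper finds them by optimizing integral bounds).
        import numpy as np

        def render(azimuth_deg):
            # Stand-in for a renderer that poses the object at the given azimuth.
            return np.full((224, 224, 3), azimuth_deg / 360.0, dtype=np.float32)

        def score_true_class(image):
            # Stand-in for a trained classifier's softmax score on the true class.
            return float(np.cos(2 * np.pi * image.mean()) * 0.5 + 0.5)

        angles = np.linspace(0.0, 360.0, 73)
        semantic_map = np.array([score_true_class(render(a)) for a in angles])
        robust = angles[semantic_map > 0.9]       # crude robust-region estimate
        print(robust)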

    Outlier detection with partial information:Application to emergency mapping

    Get PDF
    This paper addresses the problem of novelty detection in the case where the observed data are a mixture of a known 'background' process contaminated with an unknown other process, which generates the outliers, or novel observations. The framework we describe here is quite general, employing univariate classification with incomplete information, based on knowledge of the distribution (the probability density function, 'pdf') of the data generated by the 'background' process. The relative proportion of this 'background' component (the prior 'background' probability), the 'pdf' and the prior probabilities of all other components are all assumed unknown. The main contribution is a new classification scheme that identifies the maximum proportion of observed data following the known 'background' distribution. The method exploits the Kolmogorov-Smirnov test to estimate the proportions, and afterwards data are Bayes-optimally separated. Results, demonstrated with synthetic data, show that this approach can produce more reliable results than a standard novelty detection scheme. The classification algorithm is then applied to the problem of identifying outliers in the SIC2004 data set, in order to detect the radioactive release simulated in the 'joker' data set. We propose this method as a reliable means of novelty detection in the emergency situation, which can also be used to identify outliers prior to the application of a more general automatic mapping algorithm. © Springer-Verlag 2007
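
    A stripped-down sketch of the two steps (proportion estimation, then separation). This is a crude stand-in for the paper's scheme, with a Gaussian background, a grid search in place of the full Kolmogorov-Smirnov machinery, and a density threshold in place of the Bayes-optimal rule; the data, tolerance, and mixture weights are synthetic assumptions.

        # Estimate the largest proportion of data consistent with the known
        # background, then flag the least background-like points as novel.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(3)
        background = stats.norm(0.0, 1.0)                   # known 'background' pdf/cdf
        data = np.concatenate([rng.normal(0.0, 1.0, 900),   # background observations
                               rng.normal(5.0, 0.5, 100)])  # unknown contaminating process

        def max_background_proportion(sample, tol=0.02):
            """Largest p with p * F0 below the empirical CDF (up to a KS-style
            tolerance), using F_emp >= p * F0 pointwise for any contamination."""
            xs = np.sort(sample)
            ecdf = np.arange(1, xs.size + 1) / xs.size
            grid = np.linspace(0.0, 1.0, 101)
            feasible = [p for p in grid if np.all(p * background.cdf(xs) <= ecdf + tol)]
            return max(feasible)

        p_bg = max_background_proportion(data)
        # Crude separation: keep the p_bg fraction most probable under the background.
        threshold = np.quantile(background.pdf(data), 1.0 - p_bg)
        outliers = data[background.pdf(data) < threshold]
        print(p_bg, outliers.size)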

    A multivariate framework to study spatio-temporal dependency of electricity load and wind power

    Get PDF
    With massive wind power integration, the spatial distribution of electricity load centers and wind power plants makes it pertinent to study the inter-spatial dependence and temporal correlation for the effective operation of the power system. In this paper, a novel multivariate framework is developed to study this spatio-temporal dependency using vine copulas. Hourly load and wind power data obtained from a US regional transmission operator, spanning 3 years and spatially distributed over 19 load zones and two wind power zones, are considered in this study. The dimensionality of the collected data tends to increase in the future, and to tackle this high-dimensional data, a reproducible sampling algorithm using vine copulas is developed. The sampling algorithm employs k-means clustering along with a singular value decomposition technique to ease the computational burden. The appropriate clustering technique and copula family are selected by goodness-of-clustering and goodness-of-fit tests. The paper concludes with a discussion of the importance of spatio-temporal modeling of load and wind power and the advantage of the proposed multivariate sampling algorithm using vine copulas.
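
    As a rough sketch of the sampling idea (heavily simplified: a single Gaussian copula stands in for the vine copula, and the k-means/SVD reduction step is omitted; the data, dimensions, and marginals are synthetic stand-ins):

        # Copula-style sampling: rank-transform each zone's hourly series to
        # uniforms, estimate cross-zone dependence, sample jointly, and map back
        # to physical units through empirical quantiles.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(4)
        hours, zones = 26_280, 21                            # ~3 years hourly; 19 load + 2 wind zones
        data = rng.gamma(2.0, 1.0, size=(hours, zones))      # stand-in for real load/wind series

        u = (stats.rankdata(data, axis=0) - 0.5) / hours     # pseudo-observations in (0, 1)
        z = stats.norm.ppf(u)
        corr = np.corrcoef(z, rowvar=False)                  # cross-zone dependence structure

        z_new = rng.multivariate_normal(np.zeros(zones), corr, size=1_000)
        u_new = stats.norm.cdf(z_new)
        samples = np.column_stack([np.quantile(data[:, j], u_new[:, j])
                                   for j in range(zones)])   # back to physical units
        print(samples.shape)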

    Foothill: A Quasiconvex Regularization for Edge Computing of Deep Neural Networks

    Full text link
    Deep neural networks (DNNs) have demonstrated success for many supervised learning tasks, ranging from voice recognition and object detection to image classification. However, their increasing complexity may yield poor generalization error and makes them hard to deploy on edge devices. Quantization is an effective approach to compress DNNs in order to meet these constraints. Using a quasiconvex base function to construct a binary quantizer helps in training binary neural networks (BNNs), and adding noise to the input data or using a concrete regularization function helps to improve generalization error. Here we introduce the foothill function, an infinitely differentiable quasiconvex function. This regularizer is flexible enough to deform towards L_1 and L_2 penalties. Foothill can be used as a binary quantizer, as a regularizer, or as a loss. In particular, we show that this regularizer reduces the accuracy gap between BNNs and their full-precision counterparts for image classification on ImageNet. Comment: Accepted at the 16th International Conference on Image Analysis and Recognition (ICIAR 2019)
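
    For intuition only, here is a small sketch of a smooth penalty that behaves like an L_2 penalty near zero and like an L_1 penalty for large weights, in the spirit of the interpolation described above. This is NOT the paper's exact foothill definition; the function name and the shape parameters alpha and beta are hypothetical.

        # Illustrative smooth L1/L2-interpolating penalty (stand-in, not foothill itself):
        # ~ alpha*beta*w**2 near 0 and ~ alpha*|w| far from 0; smooth, even, quasiconvex.
        import numpy as np

        def foothill_like(w, alpha=1.0, beta=1.5):
            w = np.asarray(w, dtype=float)
            return alpha * w * np.tanh(beta * w)

        weights = np.linspace(-3.0, 3.0, 7)
        print(foothill_like(weights))            # elementwise penalty values
        print(foothill_like(weights).sum())      # what would be added to a training loss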

    Lesion detection and Grading of Diabetic Retinopathy via Two-stages Deep Convolutional Neural Networks

    Full text link
    We propose an automatic diabetic retinopathy (DR) analysis algorithm based on a two-stage deep convolutional neural network (DCNN). Compared to existing DCNN-based DR detection methods, the proposed algorithm has the following advantages: (1) Our method can point out the location and type of lesions in the fundus images, as well as give the severity grades of DR. Moreover, since retinal lesions and DR severity appear at different scales in fundus images, the integration of both local and global networks learns more complete and specific features for DR analysis. (2) By introducing an imbalanced weighting map, more attention is given to lesion patches for DR grading, which significantly improves the performance of the proposed algorithm. In this study, we label 12,206 lesion patches and re-annotate the DR grades of 23,595 fundus images from the Kaggle competition dataset. Under the guidance of clinical ophthalmologists, the experimental results show that our local lesion detection net achieves performance comparable to trained human observers, and the proposed imbalanced weighting scheme is also shown to significantly improve the capability of our DCNN-based DR grading algorithm.
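
    A minimal sketch of the weighting idea (an assumed form, not the authors' code): patches that the first-stage lesion detector marks as lesions receive a larger weight in the grading loss, so the grading network attends more to them. The grade count, weight value, and array shapes are hypothetical.

        # Weighted cross-entropy over patches: lesion patches get a larger weight.
        import numpy as np

        def weighted_cross_entropy(probs, labels, lesion_mask, lesion_weight=4.0):
            """probs: (n_patches, n_grades) softmax outputs; labels: (n_patches,) true grades;
            lesion_mask: (n_patches,) 1 for patches flagged by the lesion-detection stage."""
            weights = np.where(lesion_mask == 1, lesion_weight, 1.0)
            nll = -np.log(probs[np.arange(labels.size), labels] + 1e-12)
            return np.sum(weights * nll) / np.sum(weights)

        rng = np.random.default_rng(5)
        probs = rng.dirichlet(np.ones(5), size=8)      # hypothetical 5 DR grades, 8 patches
        labels = rng.integers(0, 5, size=8)
        lesion_mask = rng.integers(0, 2, size=8)
        print(weighted_cross_entropy(probs, labels, lesion_mask))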