Generative Models For Deep Learning with Very Scarce Data
The goal of this paper is to deal with a data-scarcity scenario in which deep learning techniques tend to fail. We compare two well-established techniques, Restricted Boltzmann Machines (RBMs) and Variational Auto-encoders (VAEs), as generative models used to enlarge the training set in a classification framework. Essentially, we rely on Markov Chain Monte Carlo (MCMC) algorithms to generate new samples. We show that this methodology improves generalization compared with other state-of-the-art techniques, e.g. semi-supervised learning with ladder networks. Furthermore, we show that RBMs are better than VAEs at generating new samples for training a classifier with good generalization capabilities.
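As a rough illustration of the sampling step, the sketch below draws new synthetic examples from an already-trained binary RBM by running a Gibbs chain, the standard MCMC procedure for RBMs. The weight matrix W, the biases, and seeding the chain at a real training example are assumptions, since the abstract does not fix these details.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def rbm_gibbs_sample(W, b_vis, b_hid, v0, n_steps=500, rng=None):
        # Gibbs chain on a trained binary RBM: alternately sample hidden
        # units given visibles and visibles given hiddens; the final
        # visible state is returned as a new synthetic training sample.
        rng = rng if rng is not None else np.random.default_rng()
        v = v0.astype(float).copy()
        for _ in range(n_steps):
            h_prob = sigmoid(v @ W + b_hid)              # p(h = 1 | v)
            h = (rng.random(h_prob.shape) < h_prob) * 1.0
            v_prob = sigmoid(h @ W.T + b_vis)            # p(v = 1 | h)
            v = (rng.random(v_prob.shape) < v_prob) * 1.0
        return v

Samples generated this way can be appended to the scarce training set before fitting the classifier.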
Learning from Noisy Label Distributions
In this paper, we consider a novel machine learning problem: learning a classifier from noisy label distributions. In this problem, each instance with a feature vector belongs to at least one group. Then, instead of the true label of each instance, we observe the label distribution of the instances associated with a group, where the label distribution is distorted by unknown noise. Our goals are to (1) estimate the true label of each instance, and (2) learn a classifier that predicts the true label of a new instance. We propose a probabilistic model that treats the true label distributions of the groups, and the parameters representing the noise, as hidden variables; the model can be learned with a variational Bayesian method. In numerical experiments, we show that the proposed model outperforms existing methods at estimating the true labels of instances.
Comment: Accepted at ICANN 2017
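A minimal simulation of the problem setup may help: each group carries a hidden true label distribution, and only a noise-distorted version is observed. The Dirichlet draws and the additive mixing below are illustrative assumptions, not the paper's noise model, which is learned as hidden variables.

    import numpy as np

    rng = np.random.default_rng(0)
    n_groups, n_classes = 5, 3

    # Hidden: true per-group label distributions.
    true_dist = rng.dirichlet(np.ones(n_classes), size=n_groups)

    # Observed: distributions distorted by unknown noise, modeled here
    # (one illustrative choice) as mixing with a random distribution.
    noise_level = 0.3
    noise = rng.dirichlet(np.ones(n_classes), size=n_groups)
    observed_dist = (1 - noise_level) * true_dist + noise_level * noise

The learning task is then to invert this corruption: recover the hidden true distributions and the per-instance labels from observed_dist alone.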
Institutional Effects in a Simple Model of Educational Production
This paper presents a model of educational production that tries to make sense of recent evidence on the effects of institutional arrangements on student performance. In a simple principal-agent framework, students choose their learning effort to maximize their net benefits, while the government chooses educational spending to maximize its net benefits. In the jointly determined equilibrium, schooling quality is shown to depend on several institutionally determined parameters. The impact on student performance of institutions such as central examinations, centralization versus school autonomy, teachers' influence, parental influence, and competition from private schools is analyzed. Furthermore, the model can rationalize why positive resource effects may be lacking in educational production.
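One way to formalize this setup, as a sketch under assumed functional forms rather than the paper's exact specification: the student chooses effort e, the government chooses spending s, and schooling quality q depends on both, so the equilibrium solves the pair of best-response problems

    e^* = \arg\max_{e}\, B\big(q(e, s)\big) - C(e)   % student
    s^* = \arg\max_{s}\, G\big(q(e, s)\big) - s      % government

Institutional parameters (central examinations, school autonomy, competition, and so on) would then enter through B, G, and the production function q, shifting the equilibrium quality q(e^*, s^*).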
Bridging the Gap between Probabilistic and Deterministic Models: A Simulation Study on a Variational Bayes Predictive Coding Recurrent Neural Network Model
This paper proposes a novel variational Bayes predictive coding RNN model, which can learn to generate fluctuating temporal patterns from exemplars. The model learns to maximize the lower bound of the weighted sum of the regularization and reconstruction error terms. We examined how this weighting can affect the development of different types of information processing while learning fluctuating temporal patterns. Simulation results show that strong weighting of the reconstruction term causes the development of deterministic chaos for imitating the randomness observed in the target sequences, while strong weighting of the regularization term causes the development of stochastic dynamics imitating the probabilistic processes observed in the targets. Moreover, the results indicate that the most generalized learning emerges between these two extremes. The paper concludes with implications in terms of the underlying neuronal mechanisms for autism spectrum disorder and for free action.
Comment: Accepted at the 24th International Conference on Neural Information Processing (ICONIP 2017)
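The weighting acts on an evidence lower bound, trading the regularization (KL) term off against reconstruction, in the spirit of a beta-VAE-style objective. Below is a minimal numpy sketch under the assumptions of Gaussian latents and squared-error reconstruction; the paper's RNN model is more elaborate.

    import numpy as np

    def weighted_lower_bound(x, x_recon, mu, logvar, w):
        # Reconstruction error term (negative log-likelihood up to a constant).
        recon = np.sum((x - x_recon) ** 2)
        # Regularization term: KL divergence from N(mu, exp(logvar)) to N(0, 1).
        kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))
        # Small w weights reconstruction strongly (deterministic, chaos-like
        # dynamics); large w weights regularization strongly (stochastic
        # dynamics), per the abstract.
        return -(recon + w * kl)

Training would maximize this bound (equivalently, minimize its negation) over the network parameters.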
Analysis of dropout learning regarded as ensemble learning
Deep learning is the state of the art in fields such as visual object recognition and speech recognition. Such networks use many layers and huge numbers of units and connections, so overfitting is a serious problem. To avoid this problem, dropout learning has been proposed. Dropout learning neglects some inputs and hidden units during training with probability p, and the neglected inputs and hidden units are then combined with the learned network to express the final output. We find that this process of combining the neglected hidden units with the learned network can be regarded as ensemble learning, so we analyze dropout learning from this point of view.
Comment: 9 pages, 8 figures, submitted to Conference
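For a single linear layer the ensemble view can be checked numerically: averaging the outputs of many sub-networks, each with units dropped at probability p, agrees in expectation with one pass whose activations are scaled by (1 - p). The layer and sizes below are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    p = 0.5                                  # drop probability, as above
    W = rng.normal(size=(100, 1))
    x = rng.normal(size=100)

    # Ensemble view: average over many randomly masked sub-networks.
    masks = rng.random((10000, 100)) >= p
    ensemble_out = ((masks * x) @ W).mean(axis=0)

    # Weight-scaling view: a single pass with activations scaled by (1 - p).
    scaled_out = ((1 - p) * x) @ W

    print(ensemble_out, scaled_out)          # close for large ensembles

For deeper nonlinear networks the equivalence is only approximate, which is what makes the ensemble analysis non-trivial.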
Towards Analyzing Semantic Robustness of Deep Neural Networks
Despite the impressive performance of Deep Neural Networks (DNNs) on various vision tasks, they still exhibit erroneously high sensitivity to semantic primitives (e.g. object pose). We propose a theoretically grounded analysis of DNN robustness in the semantic space. We qualitatively analyze the semantic robustness of different DNNs by visualizing their global behavior as semantic maps, and we observe interesting behavior in some DNNs. Since generating these semantic maps does not scale well with the dimensionality of the semantic space, we develop a bottom-up approach to detect the robust regions of DNNs. To achieve this, we formalize the problem of finding the robust semantic regions of a network as one of optimizing integral bounds, and we develop expressions for the update directions of the region bounds. We use these formulations to quantitatively evaluate the semantic robustness of several popular network architectures. We show through extensive experimentation that networks trained on the same dataset and enjoying comparable accuracy do not necessarily perform similarly in semantic robustness. For example, InceptionV3 is more accurate despite being less semantically robust than ResNet50. We hope that this tool will serve as a milestone towards understanding the semantic robustness of DNNs.
Comment: Presented at the European Conference on Computer Vision (ECCV 2020) Workshop on Adversarial Robustness in the Real World (https://eccv20-adv-workshop.github.io/) [best paper award]. The code is available at https://github.com/ajhamdi/semantic-robustnes
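As a sketch of what a semantic map is, the grid sweep below records whether a classifier keeps the correct label while two semantic parameters vary (pose angles here, an illustrative choice); classify and render are assumed user-supplied stand-ins. Note that the paper's bound-optimization method exists precisely to avoid this kind of exhaustive sweep in higher dimensions.

    import numpy as np

    def semantic_map(classify, render, label, az_range, el_range, n=50):
        # Evaluate the network over an n x n grid of semantic parameters;
        # robust regions appear as contiguous True areas in the map.
        azimuths = np.linspace(az_range[0], az_range[1], n)
        elevations = np.linspace(el_range[0], el_range[1], n)
        grid = np.zeros((n, n), dtype=bool)
        for i, az in enumerate(azimuths):
            for j, el in enumerate(elevations):
                image = render(azimuth=az, elevation=el)
                grid[i, j] = (classify(image) == label)
        return grid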
Outlier detection with partial information: Application to emergency mapping
This paper addresses the problem of novelty detection in the case where the observed data is a mixture of a known 'background' process contaminated with an unknown other process, which generates the outliers, or novel observations. The framework described here is quite general, employing univariate classification with incomplete information, based on knowledge of the probability density function (pdf) of the data generated by the background process. The relative proportion of this background component (the prior background probability), as well as the pdfs and prior probabilities of all other components, are assumed unknown. The main contribution is a new classification scheme that identifies the maximum proportion of the observed data following the known background distribution. The method exploits the Kolmogorov-Smirnov test to estimate the proportions, after which the data are Bayes-optimally separated. Results, demonstrated with synthetic data, show that this approach can produce more reliable results than a standard novelty detection scheme. The classification algorithm is then applied to the problem of identifying outliers in the SIC2004 data set, in order to detect the radioactive release simulated in the 'joker' data set. We propose this method as a reliable means of novelty detection in an emergency situation, which can also be used to identify outliers prior to the application of a more general automatic mapping algorithm.
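The proportion estimate can be sketched as follows: the background fraction rho is the largest value for which rho times the known background CDF stays below the empirical CDF, up to a Kolmogorov-Smirnov-style confidence band. This is a simplified stand-in for the paper's estimator, with a synthetic example.

    import numpy as np
    from scipy import stats

    def max_background_proportion(x, bg_cdf, alpha=0.05):
        # Largest rho with rho * F_bg <= F_emp + eps everywhere, where eps
        # is a Dvoretzky-Kiefer-Wolfowitz (KS-type) band at level alpha.
        x = np.sort(x)
        n = len(x)
        f_emp = np.arange(1, n + 1) / n
        f_bg = bg_cdf(x)
        eps = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))
        return float(min(1.0, np.min((f_emp + eps) / np.maximum(f_bg, 1e-12))))

    # 80% standard-normal background contaminated by outliers near +6.
    rng = np.random.default_rng(1)
    data = np.concatenate([rng.normal(0, 1, 800), rng.normal(6, 0.5, 200)])
    print(max_background_proportion(data, stats.norm(0, 1).cdf))  # ~0.8

With rho estimated, points can then be assigned to the background or outlier class by a Bayes-optimal threshold, as the abstract describes.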
A multivariate framework to study spatio-temporal dependency of electricity load and wind power
With massive wind power integration, the spatial distribution of electricity load centers and wind power plants makes it essential to study inter-spatial dependence and temporal correlation for the effective operation of the power system. In this paper, a novel multivariate framework is developed to study this spatio-temporal dependency using vine copulas. Hourly load and wind power data obtained from a US regional transmission operator, spanning 3 years and spatially distributed over 19 load zones and two wind power zones, are considered in this study. Since the dimension of such data collections will only grow, a reproducible sampling algorithm using vine copulas is developed to handle the high-dimensional data. The sampling algorithm employs k-means clustering along with a singular value decomposition technique to ease the computational burden. The appropriate clustering technique and copula family are selected via goodness-of-clustering and goodness-of-fit tests. The paper concludes with a discussion of the importance of spatio-temporal modeling of load and wind power and of the advantages of the proposed multivariate sampling algorithm using vine copulas.
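The dimension-reduction stage can be sketched with standard tools: an SVD compresses the zone-wise hourly data before k-means groups similar hours, easing the cost of the copula fitting that follows. The sizes and parameters below are illustrative assumptions, and the vine copula step itself is omitted.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import TruncatedSVD

    rng = np.random.default_rng(0)
    hourly = rng.random((26280, 21))   # ~3 years of hourly data, 21 zones

    # SVD reduces the 21 zone dimensions to a few leading components.
    reduced = TruncatedSVD(n_components=5, random_state=0).fit_transform(hourly)

    # k-means groups similar hours; a vine copula would then be fitted
    # (and sampled) within each cluster.
    labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(reduced)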
Foothill: A Quasiconvex Regularization for Edge Computing of Deep Neural Networks
Deep neural networks (DNNs) have demonstrated success on many supervised learning tasks, ranging from voice recognition and object detection to image classification. However, their increasing complexity can yield poor generalization error and makes them hard to deploy on edge devices. Quantization is an effective approach to compressing DNNs so that they meet such constraints. Using a quasiconvex base function to construct a binary quantizer helps in training binary neural networks (BNNs), and adding noise to the input data or using a concrete regularization function helps to improve generalization error. Here we introduce the foothill function, an infinitely differentiable quasiconvex function. This regularizer is flexible enough to deform towards $\ell_1$ and $\ell_2$ penalties. Foothill can be used as a binary quantizer, as a regularizer, or as a loss. In particular, we show that this regularizer reduces the accuracy gap between BNNs and their full-precision counterparts for image classification on ImageNet.
Comment: Accepted at the 16th International Conference on Image Analysis and Recognition (ICIAR 2019)
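Since the abstract does not reproduce the foothill definition, the penalty below uses an assumed smooth quasiconvex form, alpha * w * tanh(beta * w), chosen only to exhibit the described deformation between the $\ell_2$ and $\ell_1$ regimes; see the paper for the actual function.

    import numpy as np

    def foothill_like(w, alpha=1.0, beta=1.0):
        # Assumed form, even and quasiconvex: for small beta*|w|,
        # tanh(beta*w) ~ beta*w, so the penalty behaves like
        # alpha*beta*w**2 (l2-like); for large beta*|w| it saturates
        # and behaves like alpha*|w| (l1-like).
        return alpha * w * np.tanh(beta * w)

    # Hypothetical use as a regularizer on a weight matrix W_hat:
    # total_loss = task_loss + lam * foothill_like(W_hat).sum()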
Lesion detection and Grading of Diabetic Retinopathy via Two-stages Deep Convolutional Neural Networks
We propose an automatic diabetic retinopathy (DR) analysis algorithm based on a two-stage deep convolutional neural network (DCNN). Compared to existing DCNN-based DR detection methods, the proposed algorithm has the following advantages: (1) Our method can point out the location and type of lesions in the fundus images, as well as give the severity grade of the DR. Moreover, since retinal lesions and DR severity appear at different scales in fundus images, the integration of both local and global networks learns more complete and specific features for DR analysis. (2) By introducing an imbalanced weighting map, more attention is given to lesion patches for DR grading, which significantly improves the performance of the proposed algorithm. In this study, we labeled 12,206 lesion patches and re-annotated the DR grades of 23,595 fundus images from the Kaggle competition dataset, under the guidance of clinical ophthalmologists. The experimental results show that our local lesion detection net achieves performance comparable to that of trained human observers, and the proposed imbalanced weighting scheme is also shown to significantly improve the capability of our DCNN-based DR grading algorithm.
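A minimal sketch of the imbalanced weighting idea: patches that contain detected lesions receive a larger weight in the grading loss, so the grader attends to them more. The names and the particular weighting scheme here are assumptions, not the paper's exact map.

    import numpy as np

    def weighted_grading_loss(log_probs, grades, lesion_mask, lesion_weight=4.0):
        # Per-patch cross-entropy for the predicted DR grade.
        n = len(grades)
        per_patch = -log_probs[np.arange(n), grades]
        # Lesion patches (mask == 1) are up-weighted; the rest keep weight 1.
        weights = 1.0 + (lesion_weight - 1.0) * lesion_mask
        return float(np.mean(weights * per_patch))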