A Learning Algorithm based on High School Teaching Wisdom
A learning algorithm based on primary school teaching and learning is
presented. The methodology is to continuously evaluate a student and to train
them on the examples on which they repeatedly fail, until they can
correctly answer all types of questions. This incremental learning procedure
produces better learning curves by requiring the student to dedicate
their learning time optimally to the failed examples. When used in machine learning, the
algorithm is found to train a machine on data with maximum variance in the
feature space so that the generalization ability of the network improves. The
algorithm has interesting applications in data mining, model evaluation and
rare object discovery.
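
The evaluate-then-retrain loop described in this abstract can be illustrated with a minimal sketch. It is not the author's algorithm: a plain linear threshold unit (perceptron) stands in for the learner, and the names `predict` and `train_on_failures` are hypothetical. Each round, every example is evaluated and only the failed ones are used for the update, until nothing fails or a round limit is reached.

```python
def predict(w, x):
    # Linear threshold unit; the last weight is the bias (assumed stand-in learner).
    s = w[-1] + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s >= 0 else -1


def train_on_failures(data, max_rounds=1000):
    """Evaluate on all examples, then update only on the ones that failed.

    data: list of (features, label) pairs with label in {-1, +1}.
    Returns the weights and the number of update rounds that were needed.
    """
    w = [0.0] * (len(data[0][0]) + 1)
    for round_no in range(max_rounds):
        # "Exam": collect every example the current model gets wrong.
        failed = [(x, y) for x, y in data if predict(w, x) != y]
        if not failed:
            return w, round_no
        # "Remedial training": perceptron update on the failed examples only.
        for x, y in failed:
            for i, xi in enumerate(x):
                w[i] += y * xi
            w[-1] += y
    return w, max_rounds


if __name__ == "__main__":
    toy = [((2, 0), 1), ((0, 2), 1), ((-2, 0), -1), ((0, -2), -1)]
    weights, rounds = train_on_failures(toy)
    print(weights, rounds)
```

The stopping rule mirrors the abstract's "until they can correctly answer all types of questions"; the round limit is an added safeguard for data that this simple learner cannot separate.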
Learning curves for Soft Margin Classifiers
Typical learning curves for Soft Margin Classifiers (SMCs) learning both
realizable and unrealizable tasks are determined using the tools of Statistical
Mechanics. We derive the analytical behaviour of the learning curves in the
regimes of small and large training sets. The generalization errors present
different decay laws towards the asymptotic values as a function of the
training set size, depending on general geometrical characteristics of the rule
to be learned. Optimal generalization curves are deduced through a fine tuning
of the hyperparameter controlling the trade-off between the error and the
regularization terms in the cost function. Even if the task is realizable, the
optimal performance of the SMC is better than that of a hard margin Support
Vector Machine (SVM) learning the same rule, and is very close to that of the
Bayesian classifier. Comment: 26 pages, 10 figures
Quasar microlensing light curve analysis using deep machine learning
We introduce a deep machine learning approach to studying quasar microlensing
light curves for the first time by analyzing hundreds of thousands of simulated
light curves with respect to the accretion disc size and temperature profile.
Our results indicate that it is possible to successfully classify very large
numbers of diverse light curve data and measure the accretion disc structure.
The detailed shape of the accretion disc brightness profile is found to play a
negligible role, in agreement with Mortonson et al. (2005). The speed and
efficiency of our deep machine learning approach are ideal for quantifying
physical properties in a `big-data' problem setup. This proposed approach looks
promising for analyzing decade-long light curves for thousands of microlensed
quasars, expected to be provided by the Large Synoptic Survey Telescope. Comment: 11 pages, 7 figures, accepted for publication in MNRAS
Alchemical and structural distribution based representation for improved QML
We introduce a representation of any atom in any chemical environment for the
generation of efficient quantum machine learning (QML) models of common
electronic ground-state properties. The representation is based on scaled
distribution functions explicitly accounting for elemental and structural
degrees of freedom. Resulting QML models afford very favorable learning curves
for properties of out-of-sample systems including organic molecules,
non-covalently bonded protein side-chains, (H₂O)₄₀-clusters, as well as
diverse crystals. The elemental components help to lower the learning curves,
and, through interpolation across the periodic table, even enable "alchemical
extrapolation" to covalent bonding between elements not part of training, as
evinced for single, double, and triple bonds among main-group elements.
Unachievable Region in Precision-Recall Space and Its Effect on Empirical Evaluation
Precision-recall (PR) curves and the areas under them are widely used to
summarize machine learning results, especially for data sets exhibiting class
skew. They are often used analogously to ROC curves and the area under ROC
curves. It is known that PR curves vary as class skew changes. What was not
recognized before this paper is that there is a region of PR space that is
completely unachievable, and the size of this region depends only on the skew.
This paper precisely characterizes the size of that region and discusses its
implications for empirical evaluation methodology in machine learning. Comment: ICML2012, fixed citations to use correct tech report number
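
The skew dependence described in this abstract can be illustrated with a short sketch. The formulas are derived here from the definitions of precision and recall, not quoted from the paper. Assume `skew` is the fraction of positive examples, strictly between 0 and 1. At recall r the number of true positives is fixed, and precision is lowest when every negative is a false positive, which gives a minimum precision of skew·r / (skew·r + 1 − skew). Integrating that bound over r from 0 to 1 gives the area below it. The function names are hypothetical.

```python
import math


def min_precision(recall, skew):
    """Lowest precision reachable at `recall` when a fraction `skew` of examples is positive.

    Assumes the worst case in which every negative example is a false positive.
    """
    return skew * recall / (skew * recall + (1.0 - skew))


def unachievable_area(skew):
    """Area of PR space lying below the minimum-precision curve (closed-form integral)."""
    return 1.0 + (1.0 - skew) * math.log(1.0 - skew) / skew


if __name__ == "__main__":
    for skew in (0.01, 0.1, 0.5):
        print(skew, min_precision(1.0, skew), unachievable_area(skew))
```

Under these assumptions the area depends on the skew alone and grows as positives become more common, so areas under PR curves from data sets with different skews are not directly comparable.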