Lasso based feature selection for malaria risk exposure prediction
In life sciences, experts generally use empirical knowledge to recode
variables, choose interactions and perform selection by classical approaches. The
aim of this work is to develop an automatic learning algorithm for variable
selection, in order to determine whether experts can be helped in their decisions
or simply replaced by the machine, and to improve their knowledge and results. The
Lasso method can detect the optimal subset of variables for estimation and
prediction under some conditions. In this paper, we propose a novel approach
which automatically uses all available variables and all interactions. By a
double cross-validation combined with Lasso, we select a best subset of
variables, and with GLM through a simple cross-validation we perform predictions.
The algorithm assures the stability and the consistency of estimators. Comment: in Petra Perner. Machine Learning and Data Mining in Pattern
Recognition, Jul 2015, Hamburg, Germany. Ibai publishing, 2015, Machine
Learning and Data Mining in Pattern Recognition (proceedings of 11th
International Conference, MLDM 2015
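As an illustration of the Lasso selection step mentioned in the abstract above, here is a minimal sketch of Lasso fitted by coordinate descent in plain Python. It is not the authors' algorithm: the double cross-validation for choosing the penalty and the GLM refit are omitted, the data are assumed centered (no intercept), and the function names, the toy data and the penalty value `lam=0.5` are all assumptions made for the example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exact zero inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic updates.

    X is a list of rows, y a list of responses. Assumes centered data,
    so no intercept is fitted (an assumption of this sketch).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return w


def selected_variables(w):
    """Indices of variables whose coefficient was not shrunk to zero."""
    return [j for j, coef in enumerate(w) if coef != 0.0]


# Hypothetical toy data: the response depends on the first column only.
X_demo = [[-2.0, 2.0], [-1.0, -1.0], [0.0, -2.0], [1.0, -1.0], [2.0, 2.0]]
y_demo = [3.0 * row[0] for row in X_demo]
w_demo = lasso_coordinate_descent(X_demo, y_demo, lam=0.5)
print(w_demo, selected_variables(w_demo))
```

In this toy case the irrelevant second column receives a coefficient of exactly zero, which is the property that makes the Lasso usable as a variable selector; the surviving coefficient is shrunk below its true value of 3, which is one reason a refit on the selected subset is commonly applied afterwards.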
Individualized selection of learning objects
Rapidly evolving Internet and web technologies and international efforts on standardization of learning object metadata enable learners in a web-based educational system ubiquitous access to multiple learning resources. It is becoming more necessary and possible to provide individualized help with selecting learning materials to make the most suitable choice among many alternatives.
A framework for individualized learning object selection, called Eliminating and Optimized Selection (EOS), is presented in this thesis. This framework contains a suggestion for extending learning object metadata specifications and presents an approach to selecting a short list of suitable learning objects appropriate for an individual learner in a particular learning context. The key features of the EOS approach are to evaluate the suitability of a learning object in its situated context and to refine the evaluation by using available historical usage information about the learning object. A Learning Preference Survey was conducted to discover and determine the relationships between the importance of learning object attributes and learner characteristics. Two weight models, a Bayesian Network Weight Model and a Naïve Bayes Model, were derived from the data collected in the survey. Given a particular learner, both of these models provide a set of personal weights for learning object features required by the individualized learning object selection.
The optimized selection approach was demonstrated and verified using simulated selections. Seventy simulated learning objects were evaluated for three simulated learners within simulated learning contexts. Both the Bayesian Network Weight Model and the Naïve Bayes Model were used in the selection of simulated learning objects. The results produced by the two algorithms were compared, and the two algorithms were highly correlated with each other in the domain where the testing was conducted.
A Learning Object Selection Study was performed to validate the learning object selection algorithms against human experts. By comparing machine selection and human experts’ selection, we found that the agreement between machine selection and human experts’ selection is higher than agreement among the human experts alone
Recommending Learning Algorithms and Their Associated Hyperparameters
The success of machine learning on a given task depends on, among other
things, which learning algorithm is selected and its associated
hyperparameters. Selecting an appropriate learning algorithm and setting its
hyperparameters for a given data set can be a challenging task, especially for
users who are not experts in machine learning. Previous work has examined using
meta-features to predict which learning algorithm and hyperparameters should be
used. However, choosing a set of meta-features that are predictive of algorithm
performance is difficult. Here, we propose to apply collaborative filtering
techniques to learning algorithm and hyperparameter selection, and find that
doing so avoids determining which meta-features to use and outperforms
traditional meta-learning approaches in many cases. Comment: Short paper--2 pages, 2 table
Boosting as a Product of Experts
In this paper, we derive a novel probabilistic model of boosting as a Product
of Experts. We re-derive the boosting algorithm as a greedy incremental model
selection procedure which ensures that addition of new experts to the ensemble
does not decrease the likelihood of the data. These learning rules lead to a
generic boosting algorithm, POEBoost, which turns out to be similar to the
AdaBoost algorithm under certain assumptions on the expert probabilities. The
paper then extends the POEBoost algorithm to POEBoost.CS, which handles
hypotheses that produce probabilistic predictions. This new algorithm is shown
to have better generalization performance compared to other state-of-the-art
algorithms
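For reference, the abstract above relates its method to AdaBoost. The following is a minimal sketch of standard discrete AdaBoost with one-dimensional threshold stumps, not of POEBoost or POEBoost.CS, whose details are not given here. The function names, the stump family, the clamping constant and the toy data are assumptions made for the example.

```python
import math


def train_adaboost(xs, ys, n_rounds):
    """Standard discrete AdaBoost on 1-D inputs with labels in {-1, +1}.

    Each weak learner is a stump predicting `sign` when x >= t, else -sign.
    Returns a list of (alpha, t, sign) tuples.
    """
    n = len(xs)
    weights = [1.0 / n] * n
    thresholds = sorted(set(xs))
    ensemble = []
    for _ in range(n_rounds):
        best = None
        # Greedy step: pick the stump with the lowest weighted error.
        for t in thresholds:
            for sign in (1, -1):
                preds = [sign if x >= t else -sign for x in xs]
                err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, t, sign, preds)
        err, t, sign, preds = best
        # Clamp the error so the logarithm below stays finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, sign))
        # Up-weight misclassified points, down-weight correct ones.
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * (sign if x >= t else -sign)
                for alpha, t, sign in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical toy data, separable by a single threshold.
xs_demo = [1, 2, 3, 4, 5, 6]
ys_demo = [-1, -1, -1, 1, 1, 1]
ensemble_demo = train_adaboost(xs_demo, ys_demo, n_rounds=3)
print([predict(ensemble_demo, x) for x in xs_demo])
```

Each round adds one weak learner chosen greedily, which is the incremental model-selection view of boosting that the abstract describes in probabilistic terms.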
Machine learning for automatic prediction of the quality of electrophysiological recordings
The quality of electrophysiological recordings varies a lot due to technical and biological variability and neuroscientists inevitably have to select “good” recordings for further analyses. This procedure is time-consuming and prone to selection biases. Here, we investigate replacing human decisions by a machine learning approach. We define 16 features, such as spike height and width, select the most informative ones using a wrapper method and train a classifier to reproduce the judgement of one of our expert electrophysiologists. Generalisation performance is then assessed on unseen data, classified by the same or by another expert. We observe that the learning machine can be equally, if not more, consistent in its judgements as individual experts amongst each other. Best performance is achieved for a limited number of informative features; the optimal feature set being different from one data set to another. With 80–90% of correct judgements, the performance of the system is very promising within the data sets of each expert but judgments are less reliable when it is used across sets of recordings from different experts. We conclude that the proposed approach is relevant to the selection of electrophysiological recordings, provided parameters are adjusted to different types of experiments and to individual experimenters
OBOE: Collaborative Filtering for AutoML Model Selection
Algorithm selection and hyperparameter tuning remain two of the most
challenging tasks in machine learning. Automated machine learning (AutoML)
seeks to automate these tasks to enable widespread use of machine learning by
non-experts. This paper introduces OBOE, a collaborative filtering method for
time-constrained model selection and hyperparameter tuning. OBOE forms a matrix
of the cross-validated errors of a large number of supervised learning models
(algorithms together with hyperparameters) on a large number of datasets, and
fits a low rank model to learn the low-dimensional feature vectors for the
models and datasets that best predict the cross-validated errors. To find
promising models for a new dataset, OBOE runs a set of fast but informative
algorithms on the new dataset and uses their cross-validated errors to infer
the feature vector for the new dataset. OBOE can find good models under
constraints on the number of models fit or the total time budget. To this end,
this paper develops a new heuristic for active learning in time-constrained
matrix completion based on optimal experiment design. Our experiments
demonstrate that OBOE delivers state-of-the-art performance faster than
competing approaches on a test bed of supervised learning problems. Moreover,
the success of the bilinear model used by OBOE suggests that AutoML may be
simpler than was previously understood
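The inference step described in the abstract above (run a few fast models on a new dataset, then infer its feature vector and predict the errors of the remaining models) can be sketched as a small least-squares problem. This is an illustrative sketch under stated assumptions, not the OBOE implementation: the rank is fixed at 2, the model feature vectors are assumed to be already learned from the error matrix, and all names and numbers below are hypothetical.

```python
def infer_dataset_vector(model_factors, observed_errors):
    """Least-squares estimate of a rank-2 dataset vector x.

    model_factors maps model name -> (y0, y1), assumed already learned.
    observed_errors maps model name -> cross-validated error measured on
    the new dataset. Solves the 2x2 normal equations by Cramer's rule.
    """
    a00 = a01 = a11 = b0 = b1 = 0.0
    for name, err in observed_errors.items():
        y0, y1 = model_factors[name]
        a00 += y0 * y0
        a01 += y0 * y1
        a11 += y1 * y1
        b0 += err * y0
        b1 += err * y1
    det = a00 * a11 - a01 * a01
    if abs(det) < 1e-12:
        raise ValueError("observed models do not determine the dataset vector")
    return ((a11 * b0 - a01 * b1) / det, (a00 * b1 - a01 * b0) / det)


def predict_errors(model_factors, dataset_vector):
    """Bilinear prediction: error(model) = <dataset vector, model vector>."""
    x0, x1 = dataset_vector
    return {name: x0 * y0 + x1 * y1
            for name, (y0, y1) in model_factors.items()}


# Hypothetical model feature vectors and two cheap observations.
factors_demo = {
    "knn": (1.0, 0.0),
    "tree": (0.0, 1.0),
    "svm": (1.0, 1.0),
    "forest": (0.5, 0.25),
}
observed_demo = {"knn": 0.2, "tree": 0.3}

x_demo = infer_dataset_vector(factors_demo, observed_demo)
predicted_demo = predict_errors(factors_demo, x_demo)
best_demo = min(predicted_demo, key=predicted_demo.get)
print(x_demo, best_demo)
```

Only two models are evaluated on the new dataset here, yet errors are predicted for all four. Choosing which models to observe under a time budget is the experiment-design question the abstract raises, and this sketch does not address it.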
Multi-task learning for intelligent data processing in granular computing context
Classification is a popular task in many application areas, such as decision making, rating, sentiment analysis and pattern recognition. In recent years, due to the vast and rapid increase in the size of data, classification has mainly been undertaken by means of supervised machine learning. In this context, a classification task involves data labelling, feature extraction, feature selection and learning of classifiers. In traditional machine learning, data is usually single-labelled by experts, i.e., each instance is only assigned one class label, since experts assume that different classes are mutually exclusive and each instance is clear-cut. However, the above assumption does not always hold in real applications. For example, in the context of emotion detection, there could be more than one emotion identified from the same person. On the other hand, feature selection has typically been done by evaluating feature subsets in terms of their relevance to all the classes. However, it is possible that a feature is only relevant to one class, but is irrelevant to all the other classes. Based on the above arguments on data labelling and feature selection, we propose in this paper a framework of multi-task learning. In particular, we consider traditional machine learning to be single-task learning, and argue the necessity to turn it into multi-task learning to allow an instance to belong to more than one class (i.e., multi-task classification) and to achieve class-specific feature selection (i.e., multi-task feature selection). Moreover, we report two experimental studies in terms of fuzzy multi-task classification and rule learning based multi-task feature selection. The results show empirically that it is necessary to undertake multi-task learning for both classification and feature selection
Task Transfer by Preference-Based Cost Learning
The goal of task transfer in reinforcement learning is migrating the action
policy of an agent to the target task from the source task. Given their
successes on robotic action planning, current methods mostly rely on two
requirements: exactly-relevant expert demonstrations or the explicitly-coded
cost function on target task, both of which, however, are inconvenient to
obtain in practice. In this paper, we relax these two strong conditions by
developing a novel task transfer framework where the expert preference is
applied as guidance. In particular, we alternate the following two steps:
First, experts apply pre-defined preference rules to select related
expert demonstrations for the target task. Second, based on the selection
result, we learn the target cost function and trajectory distribution
simultaneously via enhanced Adversarial MaxEnt IRL and generate more
trajectories by the learned target distribution for the next preference
selection. Theoretical analyses of the distribution learning and the
convergence of the proposed algorithm are provided. Extensive simulations on
several benchmarks have been conducted to further verify the effectiveness
of the proposed method. Comment: Accepted to AAAI 2019. Mingxuan Jing and Xiaojian Ma contributed
equally to this wor
Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-Tailed Classification
In real-world scenarios, data tends to exhibit a long-tailed distribution,
which increases the difficulty of training deep networks. In this paper, we
propose a novel self-paced knowledge distillation framework, termed Learning
From Multiple Experts (LFME). Our method is inspired by the observation that
networks trained on less imbalanced subsets of the distribution often yield
better performances than their jointly-trained counterparts. We refer to these
models as 'Experts', and the proposed LFME framework aggregates the knowledge
from multiple 'Experts' to learn a unified student model. Specifically, the
proposed framework involves two levels of adaptive learning schedules:
Self-paced Expert Selection and Curriculum Instance Selection, so that the
knowledge is adaptively transferred to the 'Student'. We conduct extensive
experiments and demonstrate that our method is able to achieve superior
performances compared to state-of-the-art methods. We also show that our method
can be easily plugged into state-of-the-art long-tailed classification
algorithms for further improvements. Comment: ECCV 2020 Spotligh