Lasso based feature selection for malaria risk exposure prediction

Author: Fonton Noël
Kouwayè Bienvenue
Rossi Fabrice
Publication date: 11/07/2015

In the life sciences, experts generally rely on empirical knowledge to recode variables, choose interactions, and perform selection by classical approaches. The aim of this work is to apply an automatic learning algorithm to variable selection, in order to learn whether experts can be helped in their decisions, or simply replaced, by the machine, and whether it can improve their knowledge and results. The Lasso method can detect the optimal subset of variables for estimation and prediction under some conditions. In this paper, we propose a novel approach that automatically uses all available variables and all interactions. Using a double cross-validation combined with the Lasso, we select a best subset of variables, and with a GLM, through a simple cross-validation, we perform predictions. The algorithm ensures the stability and consistency of the estimators.

Comment: in Petra Perner. Machine Learning and Data Mining in Pattern Recognition, Jul 2015, Hamburg, Germany. Ibai publishing, 2015, Machine Learning and Data Mining in Pattern Recognition (proceedings of 11th International Conference, MLDM 2015)
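The selection step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a squared-error Lasso without intercept solved by coordinate descent, a toy full-factorial data set, and hypothetical helper names (`add_interactions`, `lasso_cd`, `cv_error`). Only a single cross-validation loop is shown; the outer loop of the double cross-validation and the GLM refit are left out.

```python
import itertools


def add_interactions(rows):
    """Append every pairwise product x_i * x_j (i < j) to each row."""
    return [list(r) + [r[i] * r[j]
                       for i, j in itertools.combinations(range(len(r)), 2)]
            for r in rows]


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; this is what produces exact zeros."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=200):
    """Coordinate descent for (1/2n)||y - Xw||^2 + lam * ||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, kept up to date
    for _ in range(n_sweeps):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue
            # Correlation of column j with the residual that ignores w[j].
            rho = sum(col[i] * (resid[i] + col[i] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / norm
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= col[i] * delta
                w[j] = new_wj
    return w


def cv_error(X, y, lam, k=4):
    """Mean squared prediction error of the Lasso fit over k folds."""
    n = len(X)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        w = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
        for i in test:
            pred = sum(a * b for a, b in zip(X[i], w))
            total += (y[i] - pred) ** 2
    return total / n


# Toy data (assumption): three binary factors, response driven only by x0 * x1.
rows = list(itertools.product([-1.0, 1.0], repeat=3))
y = [2.0 * r[0] * r[1] for r in rows]
X = add_interactions(rows)  # columns: x0, x1, x2, x0*x1, x0*x2, x1*x2

w = lasso_cd(X, y, lam=0.5)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print(selected)  # index 3 is the x0*x1 interaction column
```

In practice `cv_error` would be evaluated over a grid of `lam` values and the minimiser kept; the grid and fold count here are illustrative choices only.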

arXiv.org e-Print Archive

Individualized selection of learning objects

Author: Liu Jian
Publication venue: 'University of Saskatchewan Library'
Publication date: 01/01/2009

Rapidly evolving Internet and web technologies and international efforts on standardization of learning object metadata give learners in a web-based educational system ubiquitous access to multiple learning resources. It is becoming both more necessary and more feasible to provide individualized help with selecting learning materials, so that learners can make the most suitable choice among many alternatives. A framework for individualized learning object selection, called Eliminating and Optimized Selection (EOS), is presented in this thesis. This framework contains a suggestion for extending learning object metadata specifications and presents an approach to selecting a short list of suitable learning objects appropriate for an individual learner in a particular learning context. The key features of the EOS approach are to evaluate the suitability of a learning object in its situated context and to refine the evaluation by using available historical usage information about the learning object. A Learning Preference Survey was conducted to discover and determine the relationships between the importance of learning object attributes and learner characteristics. Two weight models, a Bayesian Network Weight Model and a Naïve Bayes Model, were derived from the data collected in the survey. Given a particular learner, both of these models provide a set of personal weights for learning object features required by the individualized learning object selection. The optimized selection approach was demonstrated and verified using simulated selections. Seventy simulated learning objects were evaluated for three simulated learners within simulated learning contexts. Both the Bayesian Network Weight Model and the Naïve Bayes Model were used in the selection of simulated learning objects. The results produced by the two algorithms were compared, and they were highly correlated with each other in the domain where the testing was conducted.
A Learning Object Selection Study was performed to validate the learning object selection algorithms against human experts. By comparing machine selection with human experts' selection, we found that the agreement between machine selection and human experts' selection is higher than the agreement among the human experts alone

eCommons@USASK

University of Saskatchewan Research Archive

Recommending Learning Algorithms and Their Associated Hyperparameters

Author: Giraud-Carrier Christophe
Martinez Tony
Mitchell Logan
Smith Michael R.
Publication date: 07/07/2014

The success of machine learning on a given task depends on, among other things, which learning algorithm is selected and its associated hyperparameters. Selecting an appropriate learning algorithm and setting its hyperparameters for a given data set can be a challenging task, especially for users who are not experts in machine learning. Previous work has examined using meta-features to predict which learning algorithm and hyperparameters should be used. However, choosing a set of meta-features that are predictive of algorithm performance is difficult. Here, we propose to apply collaborative filtering techniques to learning algorithm and hyperparameter selection, and find that doing so avoids determining which meta-features to use and outperforms traditional meta-learning approaches in many cases.

Comment: Short paper--2 pages, 2 tables

arXiv.org e-Print Archive

CiteSeerX

Boosting as a Product of Experts

Author: Brown Gary
Edakunni Narayanan U.
Kovacs Tim
Publication date: 01/01/2011

In this paper, we derive a novel probabilistic model of boosting as a Product of Experts. We re-derive the boosting algorithm as a greedy incremental model selection procedure which ensures that the addition of new experts to the ensemble does not decrease the likelihood of the data. These learning rules lead to a generic boosting algorithm, POEBoost, which turns out to be similar to the AdaBoost algorithm under certain assumptions on the expert probabilities. The paper then extends the POEBoost algorithm to POEBoost.CS, which handles hypotheses that produce probabilistic predictions. This new algorithm is shown to have better generalization performance than other state-of-the-art algorithms
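For readers unfamiliar with the baseline that this abstract compares against, the following is a minimal sketch of standard discrete AdaBoost with one-dimensional decision stumps. It is not POEBoost or POEBoost.CS; the function names and the toy data are assumptions made for illustration.

```python
import math


def stump_predict(x, threshold, sign):
    """1-D decision stump: predict `sign` if x > threshold, else -sign."""
    return sign if x > threshold else -sign


def best_stump(xs, ys, weights):
    """Return (weighted error, threshold, sign) of the best stump."""
    pts = sorted(set(xs))
    thresholds = [pts[0] - 1.0] + [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    best = None
    for t in thresholds:
        for s in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, t, s) != y)
            if best is None or err < best[0]:
                best = (err, t, s)
    return best


def adaboost(xs, ys, rounds):
    """Greedily add stumps, reweighting the examples after each round."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, t, s = best_stump(xs, ys, weights)
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, s))
        # Misclassified examples gain weight, correct ones lose weight.
        weights = [w * math.exp(-alpha * y * stump_predict(x, t, s))
                   for x, y, w in zip(xs, ys, weights)]
        z = sum(weights)
        weights = [w / z for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, s) for a, t, s in ensemble)
    return 1 if score >= 0 else -1


# Toy problem (assumption): no single stump separates these labels.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
train_errors = sum(predict(ensemble, x) != y for x, y in zip(xs, ys))
print(train_errors)  # 0 on this toy set after three rounds
```

Each round corresponds to adding one expert to the ensemble; the abstract's contribution is a probabilistic justification for this kind of greedy addition, which the sketch does not attempt to reproduce.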

arXiv.org e-Print Archive

CiteSeerX

Machine learning for automatic prediction of the quality of electrophysiological recordings

Author: AB Wiltschko
BT Priest
C Mathes
CG Galizia
Dominique Martinez
F Franke
H Lei
Jean-Pierre Rospars
Johannes Reisert
M Asmild
MS Lewicki
R Friedrich
R Kohavi
S Panzeri
S Takahashi
SB Wilson
Shereen Elbanna
Sylvia Anton
T Nowotny
Thomas Nowotny
Y Saeys
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

The quality of electrophysiological recordings varies a lot due to technical and biological variability, and neuroscientists inevitably have to select "good" recordings for further analyses. This procedure is time-consuming and prone to selection biases. Here, we investigate replacing human decisions by a machine learning approach. We define 16 features, such as spike height and width, select the most informative ones using a wrapper method, and train a classifier to reproduce the judgement of one of our expert electrophysiologists. Generalisation performance is then assessed on unseen data, classified by the same or by another expert. We observe that the learning machine can be as consistent in its judgements as individual experts are with each other, if not more so. Best performance is achieved for a limited number of informative features, the optimal feature set being different from one data set to another. With 80–90% correct judgements, the performance of the system is very promising within the data sets of each expert, but judgements are less reliable when it is used across sets of recordings from different experts. We conclude that the proposed approach is relevant to the selection of electrophysiological recordings, provided parameters are adjusted to different types of experiments and to individual experimenters
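A wrapper method, as mentioned in this abstract, scores candidate feature subsets by training and evaluating a classifier on each. The sketch below shows one common variant, greedy forward selection, and is only an illustration: the nearest-centroid classifier, the leave-one-out scoring, the three made-up features, and all function names are assumptions, not details taken from the paper.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for i in range(len(X)):
        train = [([X[k][f] for f in features], y[k])
                 for k in range(len(X)) if k != i]
        cents = {}
        for label in set(lbl for _, lbl in train):
            cents[label] = centroid([v for v, lbl in train if lbl == label])
        probe = [X[i][f] for f in features]
        pred = min(cents, key=lambda lbl: sq_dist(probe, cents[lbl]))
        correct += pred == y[i]
    return correct / len(X)


def forward_select(X, y):
    """Add the feature that most improves accuracy; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in remaining]
        top_score, top_f = max(scored, key=lambda s: s[0])
        if top_score <= best:
            break
        selected.append(top_f)
        remaining.remove(top_f)
        best = top_score
    return selected, best


# Made-up recordings: column 1 separates the classes, columns 0 and 2 do not.
X = [
    [0.3, 5.0, 0.7],
    [0.9, 5.2, 0.2],
    [0.5, 4.8, 0.6],
    [0.4, 1.0, 0.5],
    [0.8, 1.2, 0.9],
    [0.6, 0.8, 0.1],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = "good" recording, 0 = "bad"

selected, accuracy = forward_select(X, y)
print(selected, accuracy)  # [1] 1.0
```

Because the score comes from the classifier itself, the selected subset depends on the labelled data used, which is consistent with the abstract's observation that the optimal feature set differs from one data set to another.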

INRIA a CCSD electronic archive server

Directory of Open Access Journals

ProdInra

FigShare

OBOE: Collaborative Filtering for AutoML Model Selection

Author: Akimoto Yuji
Kim Dae Won
Udell Madeleine
Yang Chengrun
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 20/05/2019

Algorithm selection and hyperparameter tuning remain two of the most challenging tasks in machine learning. Automated machine learning (AutoML) seeks to automate these tasks to enable widespread use of machine learning by non-experts. This paper introduces OBOE, a collaborative filtering method for time-constrained model selection and hyperparameter tuning. OBOE forms a matrix of the cross-validated errors of a large number of supervised learning models (algorithms together with hyperparameters) on a large number of datasets, and fits a low rank model to learn the low-dimensional feature vectors for the models and datasets that best predict the cross-validated errors. To find promising models for a new dataset, OBOE runs a set of fast but informative algorithms on the new dataset and uses their cross-validated errors to infer the feature vector for the new dataset. OBOE can find good models under constraints on the number of models fit or the total time budget. To this end, this paper develops a new heuristic for active learning in time-constrained matrix completion based on optimal experiment design. Our experiments demonstrate that OBOE delivers state-of-the-art performance faster than competing approaches on a test bed of supervised learning problems. Moreover, the success of the bilinear model used by OBOE suggests that AutoML may be simpler than was previously understood
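The core idea in this abstract, predicting a new dataset's errors from a low-rank model of past errors, can be sketched in a few lines. This is a deliberately reduced illustration and not the OBOE implementation: it assumes a rank-1 model fitted by alternating least squares, a made-up error matrix, and a hand-picked set of "fast" models instead of the experiment-design heuristic the paper develops. All function names are hypothetical.

```python
def fit_rank1(E, n_iters=20):
    """Fit E[i][j] ~ u[i] * v[j] by alternating least squares."""
    n, m = len(E), len(E[0])
    u = [0.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        vv = sum(x * x for x in v)
        u = [sum(E[i][j] * v[j] for j in range(m)) / vv for i in range(n)]
        uu = sum(x * x for x in u)
        v = [sum(E[i][j] * u[i] for i in range(n)) / uu for j in range(m)]
    return u, v


def predict_errors(v, observed):
    """Infer the new dataset's latent scalar from a few measured errors,
    then predict its error under every model.

    `observed` maps model index -> cross-validated error measured on the
    new dataset.
    """
    num = sum(err * v[j] for j, err in observed.items())
    den = sum(v[j] ** 2 for j in observed)
    x = num / den  # least-squares estimate of the dataset's latent feature
    return [x * vj for vj in v]


# Made-up error matrix: 3 known datasets (rows) x 4 models (columns).
dataset_difficulty = [1.0, 2.0, 3.0]
model_weakness = [0.3, 0.2, 0.1, 0.4]
E = [[d * m for m in model_weakness] for d in dataset_difficulty]

u, v = fit_rank1(E)

# New dataset: only models 0 and 3 (assumed to be the fast ones) were run.
observed = {0: 0.75, 3: 1.0}
predicted = predict_errors(v, observed)
best_model = min(range(len(predicted)), key=lambda j: predicted[j])
print(best_model)  # model 2, which was never run on the new dataset
```

A higher-rank model replaces the scalar `x` with a short feature vector solved by ordinary least squares; choosing which models to run first under a time budget is the part the paper addresses and the sketch leaves out.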

arXiv.org e-Print Archive

Multi-task learning for intelligent data processing in granular computing context

Author: Cocea Mihaela
Ding Weili
Liu Han
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/09/2018

Classification is a popular task in many application areas, such as decision making, rating, sentiment analysis and pattern recognition. In recent years, due to the vast and rapid increase in the size of data, classification has mainly been undertaken by means of supervised machine learning. In this context, a classification task involves data labelling, feature extraction, feature selection and learning of classifiers. In traditional machine learning, data is usually single-labelled by experts, i.e., each instance is only assigned one class label, since experts assume that different classes are mutually exclusive and each instance is clear-cut. However, this assumption does not always hold in real applications. For example, in the context of emotion detection, more than one emotion could be identified from the same person. Similarly, feature selection has typically been done by evaluating feature subsets in terms of their relevance to all the classes. However, it is possible that a feature is relevant to only one class and irrelevant to all the others. Based on the above arguments on data labelling and feature selection, we propose in this paper a framework of multi-task learning. In particular, we consider traditional machine learning to be single-task learning, and argue the necessity of turning it into multi-task learning to allow an instance to belong to more than one class (i.e., multi-task classification) and to achieve class-specific feature selection (i.e., multi-task feature selection). Moreover, we report two experimental studies in terms of fuzzy multi-task classification and rule learning based multi-task feature selection. The results show empirically that it is necessary to undertake multi-task learning for both classification and feature selection

Task Transfer by Preference-Based Cost Learning

Author: Huang Wenbing
Jing Mingxuan
Liu Huaping
Ma Xiaojian
Sun Fuchun
Publication date: 18/02/2019

The goal of task transfer in reinforcement learning is to migrate the action policy of an agent from the source task to the target task. Given their successes in robotic action planning, current methods mostly rely on one of two requirements: exactly relevant expert demonstrations or an explicitly coded cost function on the target task, both of which, however, are inconvenient to obtain in practice. In this paper, we relax these two strong conditions by developing a novel task transfer framework in which expert preference is applied as guidance. In particular, we alternate the following two steps. First, experts apply pre-defined preference rules to select related expert demonstrations for the target task. Second, based on the selection result, we learn the target cost function and trajectory distribution simultaneously via enhanced Adversarial MaxEnt IRL and generate more trajectories from the learned target distribution for the next preference selection. Theoretical analyses of the distribution learning and the convergence of the proposed algorithm are provided. Extensive simulations on several benchmarks have been conducted to further verify the effectiveness of the proposed method.

Comment: Accepted to AAAI 2019. Mingxuan Jing and Xiaojian Ma contributed equally to this work

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-Tailed Classification

Author: B Zhou
E Stamatatos
G Ding
H Han
H He
MA Tahir
NV Chawla
P Jeatrakul
S Guo
SH Khan
T-Y Lin
Y Guo
Y-X Wang
Z Li
Publication venue: Springer Nature
Publication date: 20/09/2020

In real-world scenarios, data tends to exhibit a long-tailed distribution, which increases the difficulty of training deep networks. In this paper, we propose a novel self-paced knowledge distillation framework, termed Learning From Multiple Experts (LFME). Our method is inspired by the observation that networks trained on less imbalanced subsets of the distribution often yield better performance than their jointly trained counterparts. We refer to these models as 'Experts', and the proposed LFME framework aggregates the knowledge from multiple 'Experts' to learn a unified student model. Specifically, the proposed framework involves two levels of adaptive learning schedules, Self-paced Expert Selection and Curriculum Instance Selection, so that the knowledge is adaptively transferred to the 'Student'. We conduct extensive experiments and demonstrate that our method achieves superior performance compared to state-of-the-art methods. We also show that our method can be easily plugged into state-of-the-art long-tailed classification algorithms for further improvements.

Comment: ECCV 2020 Spotlight
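The knowledge-aggregation step can be pictured with the usual temperature-softened distillation loss, summed over several experts. The sketch below is a generic illustration under stated assumptions rather than the LFME objective: the experts here all score the same three classes, the per-expert weights are fixed numbers instead of the self-paced schedule the abstract describes, and the function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits / temperature."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def multi_expert_distill_loss(student_logits, expert_logits, expert_weights,
                              temperature=2.0):
    """Weighted sum of KL terms pulling the student toward each expert.

    The temperature**2 factor is the customary rescaling that keeps
    gradient magnitudes comparable across temperatures.
    """
    student = softmax(student_logits, temperature)
    loss = 0.0
    for logits, weight in zip(expert_logits, expert_weights):
        teacher = softmax(logits, temperature)
        loss += weight * temperature ** 2 * kl_divergence(teacher, student)
    return loss


# Illustrative numbers only.
experts = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]]
weights = [0.7, 0.3]

loss_far = multi_expert_distill_loss([0.0, 0.0, 0.0], experts, weights)
loss_same = multi_expert_distill_loss(experts[0], [experts[0]], [1.0])
print(loss_far > loss_same)  # True: a student matching its expert has zero loss
```

In a self-paced scheme the entries of `weights` would change during training, for example shrinking as the student catches up with an expert; how LFME sets them is specified in the paper, not here.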

arXiv.org e-Print Archive