11 research outputs found

Special issue on advances in learning with label noise

Author: Frénay Benoît
Kabán Ata
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

Crossref

University of Birmingham Research Portal

Radar-based Hail-producing Storm Detection Using Positive Unlabeled Classification

Author: Di Wang
Huizhen Jia
Junzhi Shi
Ping Wang
Publication venue: 'Mechanical Engineering Faculty in Slavonski Brod'
Publication date: 01/01/2020

Machine learning methods have been widely used in many fields of weather forecasting. However, some severe weather, such as hailstorms, is difficult to record completely and accurately, and the resulting inaccurate data sets degrade the performance of machine-learning-based forecasting models. In this paper, a weather-radar-based method for detecting hail-producing storms is proposed. The method uses a bagging class-weighted support vector machine to learn from partly labeled hail case data and the remaining unlabeled data, with features extracted from radar and sounding data. Real case data from three radars in North China are used for evaluation. Results suggest that the proposed method could improve both forecast accuracy and forecast lead time compared with the commonly used radar parameter methods. In addition, the proposed method works better than a method based on a supervised learning model in any situation, especially when the number of positive samples contaminating the unlabeled set is large.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

A Robust Ensemble Approach to Learn From Positive and Unlabeled Data Using SVM Base Models

Author: Claesen Marc
De Moor Bart
De Smet Frank
Suykens Johan
Publication venue: 'Elsevier BV'
Publication date: 01/07/2015

© 2015 Elsevier B.V. We present a novel approach to learn binary classifiers when only positive and unlabeled instances are available (PU learning). This problem is routinely cast as a supervised task with label noise in the negative set. We use an ensemble of SVM models trained on bootstrap resamples of the training data for increased robustness against label noise. The approach can be considered in a bagging framework which provides an intuitive explanation for its mechanics in a semi-supervised setting. We compared our method to state-of-the-art approaches in simulations using multiple public benchmark data sets. The included benchmark comprises three settings with increasing label noise: (i) fully supervised, (ii) PU learning and (iii) PU learning with false positives. Our approach shows a marginal improvement over existing methods in the second setting and a significant improvement in the third. Published in Neurocomputing: http://dx.doi.org/10.1016/j.neucom.2014.10.081
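
The idea of training SVM base models on bootstrap resamples of positive and unlabeled data can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the default resample sizes, the RBF kernel, the class weights, and the averaging of decision values are all assumptions made for illustration, and scikit-learn's SVC stands in for whatever SVM solver the paper used.

```python
import numpy as np
from sklearn.svm import SVC


def bagging_pu_svm_scores(X_pos, X_unl, n_models=25, n_pos=None, n_unl=None,
                          pos_weight=1.0, unl_weight=1.0, seed=0):
    """Score unlabeled points with a bagged ensemble of SVMs (illustrative sketch).

    Unlabeled points are treated as noisy negatives. Each base model sees a
    bootstrap resample of the positives and of the unlabeled set.
    """
    rng = np.random.default_rng(seed)
    # Assumed defaults: resample as many points as there are labeled positives.
    if n_pos is None:
        n_pos = len(X_pos)
    if n_unl is None:
        n_unl = len(X_pos)

    scores = np.zeros(len(X_unl))
    for _ in range(n_models):
        # Bootstrap: draw indices with replacement from each set.
        pos_idx = rng.integers(0, len(X_pos), size=n_pos)
        unl_idx = rng.integers(0, len(X_unl), size=n_unl)
        X_train = np.vstack([X_pos[pos_idx], X_unl[unl_idx]])
        y_train = np.concatenate([np.ones(n_pos), np.zeros(n_unl)])

        # Class weights let misclassified positives cost more (or less)
        # than misclassified unlabeled points.
        model = SVC(kernel="rbf", gamma="scale",
                    class_weight={1: pos_weight, 0: unl_weight})
        model.fit(X_train, y_train)
        scores += model.decision_function(X_unl)

    # Assumed aggregation: mean decision value over all base models.
    return scores / n_models
```

Higher scores suggest that an unlabeled point resembles the labeled positives. Because each base model sees only a small, different subset of the unlabeled data, positives hidden in the unlabeled set influence any single model less, which is the intuition behind the robustness to label noise described above.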

Lirias

A robust ensemble approach to learn from positive and unlabeled data using SVM base models

Author: Bart De Moor
Frank De Smet
Johan A.K. Suykens
Marc Claesen
Publication venue: 'Elsevier BV'

Crossref

Automated Machine Learning for Positive-Unlabelled Learning

Author: Saunders Jack Duke
Publication date: 08/04/2024

Positive-Unlabelled (PU) learning is a field of machine learning that involves learning classifiers from data consisting of positive instances and unlabelled instances, that is, instances that may be either positive or negative but whose label is unknown. PU learning differs from standard binary classification due to the absence of negative instances. This difference is non-trivial and requires different classification frameworks and evaluation metrics. This thesis aims to address gaps in the PU learning literature and make PU learning more accessible to non-experts by introducing Automated Machine Learning (Auto-ML) systems specific to PU learning. Three such systems have been developed: GA-Auto-PU, a Genetic Algorithm (GA)-based Auto-ML system; BO-Auto-PU, a Bayesian Optimisation (BO)-based Auto-ML system; and EBO-Auto-PU, an Evolutionary/Bayesian Optimisation (EBO) hybrid-based Auto-ML system. These three Auto-ML systems are three primary contributions of this work. EBO, the optimiser component of EBO-Auto-PU, is itself a novel optimisation method developed in this work; it has proved effective for the task of Auto-ML and represents another contribution. EBO was developed as a trade-off between GA, which achieved high predictive performance but at high computational expense, and BO, which, when used by the Auto-PU system, did not perform as well as the GA-based system but executed much faster. EBO achieved this aim, providing high predictive performance with a runtime much shorter than that of the GA-based system and not substantially longer than that of the BO-based system. The proposed Auto-ML systems for PU learning were evaluated on three versions of 40 datasets, and thus on 120 learning tasks in total. The 40 datasets consist of 20 real-world biomedical datasets and 20 synthetic datasets. The main evaluation measure was the F-measure, a popular measure in PU learning.
Based on the F-measure results, the three proposed systems generally outperformed two baseline PU learning methods, usually with statistically significant results. Among the three proposed systems, there was in general no statistically significant difference between their results, although a version of the EBO-Auto-PU system performed slightly better overall than the other systems in terms of F-measure. The two other main contributions of this work relate specifically to the field of PU learning. Firstly, we present and use a robust evaluation approach. Evaluating PU learning classifiers is non-trivial, and the literature provides little guidance on how to do so. We present a clear framework for evaluation and use it to evaluate the proposed systems. Secondly, when evaluating the proposed systems, we analyse the most frequently selected components of the optimised PU learning algorithms, that is, the components that constitute the PU learning algorithms produced by the optimisers (for example, the choice of classifiers used in the algorithm, the number of iterations, etc.). This analysis is used to provide guidance on constructing PU learning algorithms for specific dataset characteristics.
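
For readers unfamiliar with the F-measure named above as the main evaluation measure, a minimal sketch of the standard definition follows. It assumes that true labels are available for the evaluation set, which may differ from the evaluation framework the thesis proposes; the function name and the `beta` parameter are illustrative choices, not taken from the thesis.

```python
def f_measure(y_true, y_pred, beta=1.0):
    """Standard F-measure for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No positive was correctly predicted, so the score is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    # Weighted harmonic mean of precision and recall; beta=1 gives F1.
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With `beta=1.0` precision and recall are weighted equally; larger values of `beta` weight recall more heavily.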

Kent Academic Repository