Exploratory Analysis of Multivariate Data (Unsupervised Image Segmentation and Data Driven Linear and Nonlinear Decomposition)

Author: Hilger Klaus Baggesen
Publication date: 01/03/2002

Combining Labelled and Unlabelled Data in the Design of Pattern Classification Systems

Author: Gabrys Bogdan

There has been much interest in techniques that incorporate knowledge from unlabelled data into a supervised learning system, but less effort has been made to compare the effectiveness of different approaches on real-world problems and to analyse the behaviour of the learning system when using different amounts of unlabelled data. This paper presents an analysis of the performance of supervised methods supported by unlabelled data and of some semi-supervised approaches, using different ratios of labelled to unlabelled samples. The experimental results show that, when supported by unlabelled samples, much less labelled data is generally required to build a classifier without compromising classification performance. If only a very limited amount of labelled data is available, the results show high variability, and the performance of the final classifier depends more on how reliable the labelled samples are than on the use of additional unlabelled data. Semi-supervised clustering utilising both labelled and unlabelled data has been shown to offer the most significant improvements when natural clusters are present in the considered problem.

Bournemouth University Research Online

A new fuzzy set merging technique using inclusion-based fuzzy clustering

Authors: Kaymak U; Nefti-Meziani S; Oussalah M
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2008

This paper proposes a new method of merging parameterized fuzzy sets based on clustering in the parameter space, taking into account the degree of inclusion of each fuzzy set in the cluster prototypes. The merging method is applied to fuzzy rule base simplification by automatically replacing the fuzzy sets corresponding to a given cluster with the one pertaining to the cluster prototype. The feasibility and performance of the proposed method are studied using an application in mobile robot navigation. The results indicate that the proposed merging and rule base simplification approach leads to good navigation performance in the application considered and to fuzzy models that are interpretable by experts. In this paper, we concentrate mainly on fuzzy systems with Gaussian membership functions, but the general approach can also be applied to other parameterized fuzzy sets.

University of Salford Institutional Repository

Crossref

University of Birmingham Research Portal

Pure OAI Repository

EUR Research Repository

Oversampling for Imbalanced Learning Based on K-Means and SMOTE

Authors: Bacao Fernando; Douzas Georgios; Last Felix
Publication venue: Elsevier BV
Publication date: 12/12/2017

Learning from class-imbalanced data continues to be a common and challenging problem in supervised learning, as standard classification algorithms are designed to handle balanced class distributions. While different strategies exist to tackle this problem, methods that generate artificial data to achieve a balanced class distribution are more versatile than modifications to the classification algorithm. Such techniques, called oversamplers, modify the training data, allowing any classifier to be used with class-imbalanced datasets. Many algorithms have been proposed for this task, but most are complex and tend to generate unnecessary noise. This work presents a simple and effective oversampling method based on k-means clustering and SMOTE oversampling, which avoids the generation of noise and effectively overcomes imbalances between and within classes. Empirical results of extensive experiments with 71 datasets show that training data oversampled with the proposed method improves classification results. Moreover, k-means SMOTE consistently outperforms other popular oversampling methods. An implementation is made available in the Python programming language.

Comment: 19 pages, 8 figures
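The combination named in the abstract above (cluster with k-means, then apply SMOTE inside selected clusters) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names (kmeans, smote_in_cluster, kmeans_smote) are hypothetical, and the rule for choosing clusters (more than half minority samples) and the even split of synthetic samples across clusters are simplifying assumptions that the abstract does not specify. It also assumes a binary problem.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members (unchanged if empty).
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


def smote_in_cluster(minority, n_new, rng, k_neighbors=5):
    """SMOTE step: interpolate between a minority point and a near neighbour."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k_neighbors]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def kmeans_smote(X, y, minority_label, n_clusters=3, seed=0):
    """Oversample the minority class inside minority-dominated clusters."""
    rng = random.Random(seed)
    labels = kmeans(X, n_clusters, rng)
    eligible = []
    for c in range(n_clusters):
        members = [(p, t) for p, t, l in zip(X, y, labels) if l == c]
        minority = [p for p, t in members if t == minority_label]
        # Assumption: keep clusters that are more than half minority samples.
        if len(minority) >= 2 and len(minority) > len(members) / 2:
            eligible.append(minority)
    n_min = sum(1 for t in y if t == minority_label)
    n_needed = (len(y) - n_min) - n_min  # samples needed to balance (binary)
    if not eligible or n_needed <= 0:
        return list(X), list(y)
    synthetic = []
    for i, minority in enumerate(eligible):
        # Assumption: split the new samples evenly across eligible clusters.
        extra = 1 if i < n_needed % len(eligible) else 0
        share = n_needed // len(eligible) + extra
        synthetic.extend(smote_in_cluster(minority, share, rng))
    return list(X) + synthetic, list(y) + [minority_label] * len(synthetic)
```

Calling kmeans_smote(X, y, minority_label=1) with X as a list of coordinate tuples returns the original samples followed by the synthetic minority samples. Restricting interpolation to minority-dominated clusters is what keeps new points away from majority regions in this sketch; the published method may select and weight clusters differently.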

arXiv.org e-Print Archive

Repositório da Universidade Nova de Lisboa

Analysis of group evolution prediction in complex networks

Authors: Bródka Piotr; Kazienko Przemysław; Koziarski Michał; Saganowski Stanisław
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2019

In a world in which acceptance by and identification with social communities are highly desired, the ability to predict the evolution of groups over time appears to be a vital but very complex research problem. Therefore, we propose a new, adaptable, generic and multi-stage method for Group Evolution Prediction (GEP) in complex networks that facilitates reasoning about the future states of recently discovered groups. The precise GEP modularity enabled us to carry out extensive and versatile empirical studies on many real-world complex/social networks to analyze the impact of numerous setups and parameters, such as time window type and size, group detection method, evolution chain length, and prediction models. Additionally, many new predictive features reflecting the group state at a given time have been identified and tested. Some other research problems, such as enriching learning evolution chains with external data, have been analyzed as well.

arXiv.org e-Print Archive

Directory of Open Access Journals