Search CORE

14,810 research outputs found

Multi-Criterion Mammographic Risk Analysis Supported with Multi-Label Fuzzy-Rough Feature Selection

Author: Qu Yanpeng
Shang Changjing
Shen Qiang
Yang Longzhi
Yue Guanli
Zwiggelaar Reyer
Publication venue
Publication date: 01/09/2019
Field of study

Context and background Breast cancer is one of the most common diseases threatening the human lives globally, requiring effective and early risk analysis for which learning classifiers supported with automated feature selection offer a potential robust solution. Motivation Computer aided risk analysis of breast cancer typically works with a set of extracted mammographic features which may contain significant redundancy and noise, thereby requiring technical developments to improve runtime performance in both computational efficiency and classification accuracy. Hypothesis Use of advanced feature selection methods based on multiple diagnosis criteria may lead to improved results for mammographic risk analysis. Methods An approach for multi-criterion based mammographic risk analysis is proposed, by adapting the recently developed multi-label fuzzy-rough feature selection mechanism. Results A system for multi-criterion mammographic risk analysis is implemented with the aid of multi-label fuzzy-rough feature selection and its performance is positively verified experimentally, in comparison with representative popular mechanisms. Conclusions The novel approach for mammographic risk analysis based on multiple criteria helps improve classification accuracy using selected informative features, without suffering from the redundancy caused by such complex criteria, with the implemented system demonstrating practical efficacy

Northumbria Research Link

Aberystwyth Research Portal

Dealing with imbalanced and weakly labelled data in machine learning using fuzzy and rough set methods

Author: Vluymans Sarah
Publication venue: Ghent University. Faculty of Medicine and Health Sciences ; University of Granada. Department of Computer Science and Artificial Intelligence
Publication date: 01/01/2018
Field of study

Ghent University Academic Bibliography

Grooming Detection using Fuzzy-Rough Feature Selection and Text Classification

Author: Anderson Philip
Li Jie
Naik Nitin
Yang Longzhi
Zuo Zheming
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/07/2018
Field of study

Online child grooming detection has recently attracted intensive research interests from both the machine learning community and digital forensics community due to its great social impact. The existing data-driven approaches usually face the challenges of lack of training data and the uncertainty of classes in terms of the classification or decision boundary. This paper proposes a grooming detection approach in an effort to address such uncertainty based on a data set derived from a publicly available profiling data set. In particular, the approach firstly applies the conventional text feature extraction approach in identifying the most significant words in the data set. This is followed by the application of a fuzzy-rough feature selection approach in reducing the high dimensions of the selected words for fast processing, which at the same time addressing the uncertainty of class boundaries. The experimental results demonstrate the efficiency and efficacy

Northumbria Research Link

Crossref

Teeside University's Research Repository

Inverse Projection Representation and Category Contribution Rate for Robust Tumor Recognition

Author: Chen Yun-Mei
Tian Li
Wu Wen-Ming
Xu Shuang
Yang Li-Jun
Yang Xiao-Hui
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Sparse representation based classification (SRC) methods have achieved remarkable results. SRC, however, still suffer from requiring enough training samples, insufficient use of test samples and instability of representation. In this paper, a stable inverse projection representation based classification (IPRC) is presented to tackle these problems by effectively using test samples. An IPR is firstly proposed and its feasibility and stability are analyzed. A classification criterion named category contribution rate is constructed to match the IPR and complete classification. Moreover, a statistical measure is introduced to quantify the stability of representation-based classification methods. Based on the IPRC technique, a robust tumor recognition framework is presented by interpreting microarray gene expression data, where a two-stage hybrid gene selection method is introduced to select informative genes. Finally, the functional analysis of candidate's pathogenicity-related genes is given. Extensive experiments on six public tumor microarray gene expression datasets demonstrate the proposed technique is competitive with state-of-the-art methods.Comment: 14 pages, 19 figures, 10 table

arXiv.org e-Print Archive

Crossref

A systematic review of data quality issues in knowledge discovery tasks

Author: Corrales David Camilo
Corrales Juan Carlos
Ledezma Agapito Ismael
Publication venue: 'Universidad de Medellin'
Publication date: 07/11/2015
Field of study

Hay un gran crecimiento en el volumen de datos porque las organizaciones capturan permanentemente la cantidad colectiva de datos para lograr un mejor proceso de toma de decisiones. El desafío mas fundamental es la exploración de los grandes volúmenes de datos y la extracción de conocimiento útil para futuras acciones por medio de tareas para el descubrimiento del conocimiento; sin embargo, muchos datos presentan mala calidad. Presentamos una revisión sistemática de los asuntos de calidad de datos en las áreas del descubrimiento de conocimiento y un estudio de caso aplicado a la enfermedad agrícola conocida como la roya del café.Large volume of data is growing because the organizations are continuously capturing the collective amount of data for better decision-making process. The most fundamental challenge is to explore the large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks, nevertheless many data has poor quality. We presented a systematic review of the data quality issues in knowledge discovery tasks and a case study applied to agricultural disease named coffee rust

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Universidad de Medellín: Revistas Científicas

Repositorio Institucional Universidad de Medellín

DIALNET

Fuzzy-rough-learn 0.1 : a Python library for machine learning with fuzzy rough sets

Author: C Cornelis
D Dubois
F Pedregosa
G Nguyen
G van Rossum
LS Riza
M Hall
N Verbiest
R Jensen
RR Yager
S Vluymans
S Vluymans
S Vluymans
YA Malkov
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020
Field of study

We present fuzzy-rough-learn, the first Python library of fuzzy rough set machine learning algorithms. It contains three algorithms previously implemented in R and Java, as well as two new algorithms from the recent literature. We briefly discuss the use cases of fuzzy-rough-learn and the design philosophy guiding its development, before providing an overview of the included algorithms and their parameters

Crossref

Ghent University Academic Bibliography

Feature Selection Inspired Classifier Ensemble Reduction

Author: Chao Fei
Diao Ren
Peng Taoxin
Shen Qiang
Snooke Neal
晁飞
Publication venue
Publication date: 15/07/2014
Field of study

Classifier ensembles constitute one of the main research directions in machine learning and data mining. The use of multiple classifiers generally allows better predictive performance than that achievable with a single model. Several approaches exist in the literature that provide means to construct and aggregate such ensembles. However, these ensemble systems contain redundant members that, if removed, may further increase group diversity and produce better results. Smaller ensembles also relax the memory and storage requirements, reducing system's run-time overhead while improving overall efficiency. This paper extends the ideas developed for feature selection problems to support classifier ensemble reduction, by transforming ensemble predictions into training samples, and treating classifiers as features. Also, the global heuristic harmony search is used to select a reduced subset of such artificial features, while attempting to maximize the feature subset evaluation. The resulting technique is systematically evaluated using high dimensional and large sized benchmark datasets, showing a superior classification performance against both original, unreduced ensembles, and randomly formed subsets. ? 2013 IEEE

Aberystwyth Research Portal

Repository@Napier

Xiamen University Institutional Repository