2,459 research outputs found
Lichen planus induced by interferon-alpha-2B therapy in a patient with cutaneous malignant melanoma
Hierarchical fuzzy rule based classification systems with genetic rule selection for imbalanced data-sets
In many real application areas, the data used are highly skewed and the number of
instances for some classes is much higher than that of the other classes. Solving a classification
task using such an imbalanced data-set is difficult due to the bias of the training
towards the majority classes.
The aim of this paper is to improve the performance of fuzzy rule based classification systems
on imbalanced domains, increasing the granularity of the fuzzy partitions on the
boundary areas between the classes, in order to obtain a better separability. We propose
the use of a hierarchical fuzzy rule based classification system, which is based on the
refinement of a simple linguistic fuzzy model by means of the extension of the structure
of the knowledge base in a hierarchical way and the use of a genetic rule selection process
in order to get a compact and accurate model.
The good performance of this approach is shown through an extensive experimental
study carried out over a large collection of imbalanced data-sets. Spanish Ministry of Education and Science (MEC) under Projects TIN-2005-08386-C05-01 and TIN-2005-08386-C05-0
"Trabajar en casa de familia". Mujeres indígenas migrantes en el empleo doméstico en Panamá
Towards Smart Data Technologies for Big Data Analytics
Currently, the publicly available datasets for Big Data Analytics are of varying quality, and obtaining the expected behavior from Machine Learning algorithms is crucial. Furthermore, since working with a huge amount of data is usually a time-demanding task, high-quality data is required. Smart Data refers to the process of transforming Big Data into clean and reliable data, and this can be accomplished by converting them, reducing unnecessary volume of data or applying some preprocessing techniques with the aim of improving their quality while still obtaining trustworthy results. We present those properties that affect the quality of data. We also comment on the available proposals to analyze the quality of huge amounts of data and to cope with low-quality datasets in a scalable way. Furthermore, the need for a methodology towards Smart Data is highlighted. Instituto de Investigación en Informática
Dynamic affinity-based classification of multi-class imbalanced data with one-versus-one decomposition : a fuzzy rough set approach
Class imbalance occurs when data elements are unevenly distributed among classes, which poses a challenge for classifiers. The core focus of the research community has been on binary-class imbalance, although there is a recent trend toward the general case of multi-class imbalanced data. The IFROWANN method, a classifier based on fuzzy rough set theory, stands out for its performance in two-class imbalanced problems. In this paper, we consider its extension to multi-class data by combining it with one-versus-one decomposition. The latter transforms a multi-class problem into two-class sub-problems. Binary classifiers are applied to these sub-problems, after which their outcomes are aggregated into one prediction. We enhance the integration of IFROWANN in the decomposition scheme in two steps. First, we propose an adaptive weight setting for the binary classifier, addressing the varying characteristics of the sub-problems. We call this modified classifier IFROWANN-WIR. Second, we develop a new dynamic aggregation method called WV–FROST that combines the predictions of the binary classifiers with the global class affinity before making a final decision. In a meticulous experimental study, we show that our complete proposal outperforms the state-of-the-art on a wide range of multi-class imbalanced datasets.
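The one-versus-one decomposition described in this abstract can be illustrated with a minimal sketch. It is not the IFROWANN-WIR classifier or the WV–FROST aggregation: as a stand-in, each pairwise sub-problem uses a simple nearest-centroid rule, and the outcomes are aggregated by plain majority voting. The function names (`train_ovo`, `predict_ovo`) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_ovo(X, y):
    """Build one binary model per pair of classes (stand-in: two centroids)."""
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    return {
        (a, b): (centroid(by_class[a]), centroid(by_class[b]))
        for a, b in combinations(sorted(by_class), 2)
    }


def predict_ovo(models, x):
    """Let every pairwise model vote, then return the most voted class."""
    votes = Counter()
    for (a, b), (ca, cb) in models.items():
        votes[a if sq_dist(x, ca) <= sq_dist(x, cb) else b] += 1
    return votes.most_common(1)[0][0]


X = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1], [11, 0]]
y = ["a", "a", "b", "b", "c", "c", "c"]
models = train_ovo(X, y)
print(predict_ovo(models, [0.2, 0.3]))
```

A problem with three classes yields three pairwise models; in general, k classes yield k(k-1)/2 sub-problems, each trained only on the instances of its two classes.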
An Analysis of the Rule Weights and Fuzzy Reasoning Methods for Linguistic Rule Based Classification Systems Applied to Problems with Highly Imbalanced Data Sets
In this contribution we carry out an analysis of the rule
weights and Fuzzy Reasoning Methods for Fuzzy Rule Based Classification
Systems in the framework of imbalanced data-sets with a high
imbalance degree. We analyze the behaviour of the Fuzzy Rule Based
Classification Systems searching for the best configuration of rule weight
and Fuzzy Reasoning Method also studying the cooperation of some
pre-processing methods of instances. To do so we use a simple rule base
obtained with the method of Chi et al., which extends the well-known
Wang and Mendel method to classification problems.
The results obtained show the necessity to apply an instance preprocessing
step and the clear differences in the use of the rule weight
and Fuzzy Reasoning Method.
Finally, we show empirically that Fuzzy Rule Based Classification Systems
outperform the 1-NN and C4.5 classifiers in the framework of highly
imbalanced data-sets. Spanish Projects TIN-2005-08386-C05-01 & TIC-2005-08386-C05-0
Improving the performance of fuzzy rule-based classification systems with interval-valued fuzzy sets and genetic amplitude tuning
Among the computational intelligence techniques employed to solve classification problems,
Fuzzy Rule-Based Classification Systems (FRBCSs) are a popular tool because of their
interpretable models based on linguistic variables, which are easier to understand for the
experts or end-users.
The aim of this paper is to enhance the performance of FRBCSs by extending the Knowledge
Base with the application of the concept of Interval-Valued Fuzzy Sets (IVFSs). We
consider a post-processing genetic tuning step that adjusts the amplitude of the upper
bound of the IVFS to contextualize the fuzzy partitions and to obtain a more accurate solution
to the problem.
We analyze the quality of this approach using two basic and well-known fuzzy rule
learning algorithms, Chi et al.’s method and the fuzzy hybrid genetics-based machine
learning algorithm. We show the improvement achieved by this model through an extensive
empirical study with a large collection of data-sets. This work has been supported by the Spanish Ministry of Science and
Technology under projects TIN2008-06681-C06-01 and TIN2007-65981
Why Linguistic Fuzzy Rule Based Classification Systems perform well in Big Data Applications?
The significance of addressing Big Data applications is beyond all doubt. The current ability to extract interesting knowledge from large volumes of information provides great advantages to both corporations and academia. Therefore, researchers and practitioners must deal with the problem of scalability so that Machine Learning and Data Mining algorithms can address Big Data properly. To this end, the MapReduce programming framework is by far the most widely used mechanism to implement fault-tolerant distributed applications. This framework implies the design of a divide-and-conquer mechanism in which local models are learned separately in one stage (Map tasks) whereas a second stage (Reduce) is devoted to aggregating all sub-models into a single solution. In this paper, we focus on the analysis of the behavior of Linguistic Fuzzy Rule Based Classification Systems when embedded into a MapReduce working procedure. By retrieving different information regarding the rules learned throughout the MapReduce process, we will be able to identify some of the capabilities of this particular paradigm that allow them to provide a good performance when addressing Big Data problems. In summary, we will show that linguistic fuzzy classifiers are a robust approach when scalability is required. This work has been partially supported by the
Spanish Ministry of Science and Technology under
projects TIN2014-57251-P and TIN2015-68454-R
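The two-stage Map/Reduce procedure described in this abstract can be sketched in a few lines. This is a single-process illustration under stated assumptions, not the authors' implementation and not a real MapReduce job: features are assumed to lie in [0, 1], each is mapped to the nearest of three hypothetical linguistic labels, the rule weight is taken to be the local confidence of the majority class, and conflicting rules are merged by keeping the one with the higher weight. All names (`map_task`, `reduce_task`, `learn`) are illustrative.

```python
from collections import Counter, defaultdict
from functools import reduce

LABELS = ("low", "medium", "high")  # assumed 3-label partition of [0, 1]


def label(v):
    """Best-matching label for triangular sets peaked at 0, 0.5 and 1."""
    return LABELS[min(range(3), key=lambda i: abs(v - i / 2))]


def map_task(partition):
    """Learn a local rule base {antecedent: (class, weight)} from one partition."""
    counts = defaultdict(Counter)
    for x, y in partition:
        counts[tuple(label(v) for v in x)][y] += 1
    rules = {}
    for antecedent, class_counts in counts.items():
        cls, n = class_counts.most_common(1)[0]
        rules[antecedent] = (cls, n / sum(class_counts.values()))
    return rules


def reduce_task(rb1, rb2):
    """Merge two rule bases; on a shared antecedent keep the higher weight."""
    merged = dict(rb1)
    for antecedent, (cls, w) in rb2.items():
        if antecedent not in merged or w > merged[antecedent][1]:
            merged[antecedent] = (cls, w)
    return merged


def learn(data, n_maps):
    """Split the data, learn local models (Map), then aggregate them (Reduce)."""
    partitions = [data[i::n_maps] for i in range(n_maps)]
    return reduce(reduce_task, map(map_task, partitions), {})


data = [([0.1, 0.9], "A"), ([0.0, 1.0], "A"), ([0.9, 0.1], "B"),
        ([1.0, 0.0], "B"), ([0.5, 0.5], "A"), ([0.45, 0.55], "B")]
rule_base = learn(data, n_maps=2)
print(len(rule_base))
```

Because every local rule base uses the same fixed linguistic labels, rules learned in different partitions are directly comparable, which is what makes the merge step a simple dictionary operation in this sketch.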
Leveraging Users’ Trust and Reputation in Social Networks
In online communities, where a huge number of users interact under anonymous identities, it has been observed that e-word of mouth is a very powerful influence tool. So far, this technology is well known in online marketplaces, such as Amazon and eBay, or travel-based platforms like Tripadvisor or Booking. However, these trust-based approaches can be leveraged in other scenarios, from e-democracy to trust-based recommendations in e-health contexts and e-learning systems. The purpose of this contribution is to analyse the main existing trust and reputation mechanisms and to point out new research challenges that need to be addressed in order to fully exploit these systems in real-world online communities. The authors would like to acknowledge the financial support from the EU project H2020-MSCA-IF-2016-DeciTrustNET-746398 and FEDER funds provided in the Spanish project TIN2016-75850-P
An analysis of local and global solutions to address Big Data imbalanced classification: a case study with SMOTE preprocessing
Addressing the huge amount of data continuously generated is an important challenge in the Machine Learning field. The need to adapt the traditional techniques or create new ones is evident. To do so, distributed technologies have to be used to deal with the significant scalability constraints due to the Big Data context.
In many Big Data applications for classification, there are some classes that are highly underrepresented, leading to what is known as the imbalanced classification problem. In this scenario, learning algorithms are often biased towards the majority classes, treating minority ones as outliers or noise.
Consequently, preprocessing techniques to balance the class distribution were developed. This can be achieved by suppressing majority instances (undersampling) or by creating minority examples (oversampling). Regarding the oversampling methods, one of the most widespread is the SMOTE algorithm, which creates artificial examples according to the neighborhood of each minority class instance.
In this work, our objective is to analyze the SMOTE behavior in Big Data as a function of some key aspects such as the oversampling degree, the neighborhood value and, especially, the type of distributed design (local vs. global). Instituto de Investigación en Informátic
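The oversampling step described in this abstract can be sketched as follows. This is a simplified, pure-Python illustration rather than the reference SMOTE implementation: base instances are drawn at random instead of being visited in turn, and the parameter names (`n_new`, `k`, `seed`) are assumptions. The `smote_local` helper reflects one possible reading of the "local" design, in which each data partition is oversampled independently using only its own neighbours; a "global" design would instead search for neighbours across all minority instances.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating towards minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (x itself excluded),
        # ranked by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nn = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nn
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nn)])
    return synthetic


def smote_local(partitions, n_new_each, k=5, seed=0):
    """'Local' variant (assumed reading): oversample each partition on its own."""
    out = []
    for i, part in enumerate(partitions):
        out.extend(smote(part, n_new_each, k, seed + i))
    return out


minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=10, k=2)
print(len(new_points))
```

Each synthetic point lies on the segment between a minority instance and one of its neighbours, so the neighbourhood value `k` controls how far from the original instance new examples may fall. The sketch requires at least two minority instances per call.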