Discovering Regression Rules with Ant Colony Optimization
The majority of Ant Colony Optimization (ACO) algorithms for data mining have dealt with classification or clustering problems. Regression remains an unexplored research area to the best of our knowledge. This paper proposes a new ACO algorithm that generates regression rules for data mining applications. The new algorithm combines components from an existing deterministic (greedy) separate and conquer algorithm, employing the same quality metrics and continuous attribute processing techniques, allowing a comparison of the two. The new algorithm has been shown to decrease the relative root mean square error when compared to the greedy algorithm. Additionally, a different approach to handling continuous attributes was investigated, showing further improvements were possible.
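The abstract above reports results as relative root mean square error but does not define it. As a rough illustration only, the sketch below uses one common definition, in which the squared error of the predictions is normalized by the squared error of always predicting the mean; the paper may use a different normalization. The function name `relative_rmse` is hypothetical.

```python
import math


def relative_rmse(actual, predicted):
    """Relative RMSE under one common definition (an assumption here):
    sqrt( sum((a - p)^2) / sum((a - mean(a))^2) ).
    A value below 1.0 means the predictions beat the mean predictor."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Squared error of the model's predictions.
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of the baseline that always predicts the mean.
    baseline_error = sum((a - mean_actual) ** 2 for a in actual)
    if baseline_error == 0:
        raise ValueError("actual values are constant; relative error is undefined")
    return math.sqrt(squared_error / baseline_error)
```

Under this definition, two rule learners evaluated on the same test set can be compared directly, since both are measured against the same mean-predictor baseline.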
Learning Interpretable Rules for Multi-label Classification
Multi-label classification (MLC) is a supervised learning problem in which,
contrary to standard multiclass classification, an instance can be associated
with several class labels simultaneously. In this chapter, we advocate a
rule-based approach to multi-label classification. Rule learning algorithms are
often employed when one is not only interested in accurate predictions, but
also requires an interpretable theory that can be understood, analyzed, and
qualitatively evaluated by domain experts. Ideally, by revealing patterns and
regularities contained in the data, a rule-based theory yields new insights in
the application domain. Recently, several authors have started to investigate
how rule-based models can be used for modeling multi-label data. Discussing
this task in detail, we highlight some of the problems that make rule learning
considerably more challenging for MLC than for conventional classification.
While mainly focusing on our own previous work, we also provide a short
overview of related work in this area.
Comment: Preprint version. To appear in: Explainable and Interpretable Models
in Computer Vision and Machine Learning. The Springer Series on Challenges in
Machine Learning. Springer (2018). See
http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further
information.
An evolutionary algorithm to discover quantitative association rules in multidimensional time series
An evolutionary approach for finding existing
relationships among several variables of a multidimensional
time series is presented in this work. The proposed model to
discover these relationships is based on quantitative association
rules. This algorithm, called QARGA (Quantitative
Association Rules by Genetic Algorithm), uses a particular
codification of the individuals that allows solving two basic
problems. First, it does not perform a previous attribute
discretization and, second, it is not necessary to set which
variables belong to the antecedent or consequent. Therefore,
it may discover all underlying dependencies among
different variables. To evaluate the proposed algorithm,
three experiments have been carried out. As an initial step,
several public datasets have been analyzed with the purpose
of comparing with other existing evolutionary approaches.
Also, the algorithm has been applied to synthetic time series
(where the relationships are known) to analyze its potential
for discovering rules in time series. Finally, a real-world
multidimensional time series composed of several climatological
variables has been considered. All the results show
a remarkable performance of QARGA.
Ministerio de Ciencia y Tecnología TIN2007-68084-C02-02; Junta de Andalucía P07-TIC-0261
Mining quantitative association rules based on evolutionary computation and its application to atmospheric pollution
This research presents the mining of quantitative association rules based on evolutionary computation techniques.
First, a real-coded genetic algorithm that extends the well-known binary-coded CHC algorithm has been designed to determine
the intervals that define the rules without needing to discretize the attributes. The proposed algorithm is evaluated on synthetic
datasets under different levels of noise in order to test its performance, and the reported results are then compared with those of
a recently published multi-objective differential evolution algorithm. Furthermore, rules from real-world time series such as
temperature, humidity, wind speed and direction, ozone, nitrogen monoxide, and sulfur dioxide have been discovered
with the objective of finding all existing relations between atmospheric pollution and climatological conditions.
Ministerio de Ciencia y Tecnología TIN2007-68084-C-00; Junta de Andalucía P07-TIC-0261
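To make the notion of a quantitative association rule more concrete, here is a minimal sketch of how the support and confidence of a rule defined by numeric intervals could be computed. It is not the algorithm from the abstract above, which evolves the intervals with a genetic algorithm; it only scores one fixed rule. The function names, attribute names, and data values are illustrative assumptions.

```python
def covers(row, intervals):
    """True if every attribute in `intervals` lies inside its closed [low, high] range."""
    return all(low <= row[attr] <= high for attr, (low, high) in intervals.items())


def rule_support_confidence(rows, antecedent, consequent):
    """Score the rule `antecedent -> consequent` over a list of records.

    support    = fraction of rows matching both sides
    confidence = fraction of antecedent-matching rows that also match the consequent
    """
    n_antecedent = sum(1 for r in rows if covers(r, antecedent))
    n_both = sum(1 for r in rows if covers(r, antecedent) and covers(r, consequent))
    support = n_both / len(rows) if rows else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Made-up toy records, for illustration only (not data from the paper).
rows = [
    {"temperature": 30, "ozone": 80},
    {"temperature": 32, "ozone": 90},
    {"temperature": 18, "ozone": 40},
    {"temperature": 31, "ozone": 45},
]
# Hypothetical rule: temperature in [28, 35] -> ozone in [70, 100]
support, confidence = rule_support_confidence(
    rows, {"temperature": (28, 35)}, {"ozone": (70, 100)}
)
```

Because the rule is stated directly over intervals, no prior discretization of the attributes is needed; an evolutionary search would adjust the interval bounds and use scores like these in its fitness function.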
Data mining in soft computing framework: a survey
The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally, fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included.
An Overview of the Use of Neural Networks for Data Mining Tasks
In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research activity, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies, whose classification, prediction and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights, from a data mining perspective, the implementation of NN, using supervised and unsupervised learning, for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.
The Usage of Association Rule Mining to Identify Influencing Factors on Deafness After Birth
Background: Providing complete, high-quality health care services plays a very important role in enabling
people to understand the factors related to personal and social health and to choose suitable healthy
behaviors in order to achieve a healthy life. For this reason, demographic and
clinical data on individuals are collected, and this large volume of data is a valuable resource for
analyzing, exploring, and discovering useful information and relationships. This study used association
rule techniques from data mining to identify the factors affecting hearing loss after
birth in Iran. Materials and Methods: This is a data-oriented study. The study population
consists of questionnaires collected in several provinces of the country. First, all questionnaire data
were stored as an information table in SQL Server, with data entry performed
through custom software written in C# .NET; then the Association algorithm in SQL Server Data Tools and
the Clementine software were used to determine the rules and hidden patterns in the gathered
data. Findings: Two factors, the number of deaf brothers and the degree of consanguinity of the parents,
have a significant impact on the severity of an individual's deafness. Also, when the severity of hearing loss
is greater than or equal to moderately severe, people use hearing aids, and men are
less interested in using hearing aids. Conclusion: In families in which the parents are in a consanguineous
marriage between first-degree relatives (girl/boy cousins) or second-degree relatives
(girl/boy cousins), and especially first-degree relatives, the number of people with severe hearing loss or
deafness is higher, and in the use of hearing aids the gender of the patient is more important than the
severity of the hearing loss