Using a unified measure function for heuristics, discretization, and rule quality evaluation in Ant-Miner
Ant-Miner is a classification rule discovery algorithm based on the Ant Colony Optimization (ACO) meta-heuristic. cAnt-Miner is an extended version of the algorithm that handles continuous attributes on the fly during rule construction, while µAnt-Miner is an extension that selects the rule's class prior to its construction and utilizes multiple pheromone types, one for each permitted rule class. In this paper, we combine these two algorithms to derive a new approach to learning classification rules with ACO. The proposed approach uses a single measure function 1) to compute the heuristics for rule term selection, 2) as a criterion for discretizing continuous attributes, and 3) to evaluate the quality of the constructed rule for pheromone update. We explore the effect of different measure functions on the output model in terms of predictive accuracy and model size. Empirical evaluation shows that, according to Friedman's statistical test, the hypothesis that different measure functions produce different results is accepted.
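The idea of one measure serving three roles can be illustrated with the rule-quality measure of the original Ant-Miner, sensitivity × specificity. This is a hedged sketch, not the paper's implementation; the function and counts below are illustrative.

```python
def rule_quality(tp, fp, tn, fn):
    """Sensitivity x specificity, the classic Ant-Miner rule-quality measure."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity * specificity

# The same function can score a candidate rule term (heuristic value),
# a candidate cut point for a continuous attribute, or a finished rule
# (pheromone update), by evaluating the confusion counts it induces.
quality = rule_quality(tp=40, fp=5, tn=50, fn=5)
```

Swapping in a different measure function (e.g. an entropy- or confidence-based one) at all three call sites is exactly the experiment the abstract describes.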
Imprecise probability models for inference in exponential families
When considering sampling models described by a distribution from an exponential family, it is possible to create two types of imprecise probability models. One is based on the corresponding conjugate distribution and the other on the corresponding predictive distribution. In this paper, we show how these types of models can be constructed for any (regular, linear, canonical) exponential family, such as the centered normal distribution.
To illustrate the possible use of such models, we take a look at credal classification. We show that they are very natural and potentially promising candidates for describing the attributes of a credal classifier, also in the case of continuous attributes.
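The simplest exponential-family instance of an imprecise model built from conjugate priors is the Bernoulli case, where a set of Beta priors yields Walley's imprecise Dirichlet model. The sketch below shows that special case only, as a flavor of the construction; it is not the paper's general recipe.

```python
def imprecise_beta_mean(successes, n, s=2.0):
    """Lower/upper posterior means for a Bernoulli chance under the set of
    conjugate priors Beta(s*t, s*(1-t)) with t ranging over (0, 1) --
    Walley's imprecise Dirichlet model with prior-strength parameter s."""
    lower = successes / (n + s)          # prior weight pushed to failure (t -> 0)
    upper = (successes + s) / (n + s)    # prior weight pushed to success (t -> 1)
    return lower, upper

# 7 successes in 10 trials, s = 2: the interval [7/12, 9/12]
lo, hi = imprecise_beta_mean(successes=7, n=10)
```

The interval width s/(n+s) shrinks as data accumulates, which is what makes such models usable as attribute descriptions in a credal classifier.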
Knowledge discovery through creating formal contexts
Knowledge discovery is important for systems that have computational intelligence, helping them learn and adapt to changing environments. By representing, in a formal way, the context in which an intelligent system operates, it is possible to discover knowledge through an emerging data technology called Formal Concept Analysis (FCA). This paper describes a tool called FcaBedrock that converts data into Formal Contexts for FCA. The paper describes how, through a process of guided automation, data preparation techniques such as attribute exclusion and value restriction allow data to be interpreted to meet the requirements of the analysis. Creating Formal Contexts using FcaBedrock is shown to be straightforward and versatile. Large data sets are easily converted into a standard FCA format.
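A formal context of the kind FcaBedrock produces is just a triple of objects, attributes, and an incidence relation; the two derivation operators of FCA then recover concepts from it. This is a minimal sketch of those definitions (not FcaBedrock itself; the toy data is illustrative).

```python
# A formal context: each object is mapped to the set of attributes it has.
context = {
    "lion":  {"predator", "mammal"},
    "eagle": {"predator", "flies"},
    "bat":   {"mammal", "flies"},
}

def common_attributes(objects):
    """Derivation operator: attributes shared by every object in the set."""
    sets = [context[o] for o in objects]
    return set.intersection(*sets) if sets else set()

def objects_having(attributes):
    """Derivation operator: objects possessing every attribute in the set."""
    return {o for o, attrs in context.items() if attributes <= attrs}

# A pair (extent, intent) closed under both operators is a formal concept.
extent = objects_having({"mammal"})   # {"lion", "bat"}
intent = common_attributes(extent)    # {"mammal"}
assert objects_having(intent) == extent
```

Data preparation steps like attribute exclusion and value restriction amount to editing this incidence table before the concepts are computed.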
Rough sets for predicting the Kuala Lumpur Stock Exchange Composite Index returns
This study aims to demonstrate the usability of the rough set approach in capturing the relationship between technical indicators and the level of the Kuala Lumpur Stock Exchange Composite Index (KLCI) over time. Stock markets are affected by many interrelated economic, political, and even psychological factors, so it is generally very difficult to predict their movements. There is an extensive literature describing attempts to use artificial intelligence techniques, in particular neural networks and genetic algorithms, for analyzing stock market variations. However, both have drawbacks: neural network results are hard to interpret, and genetic algorithms create large data redundancies. A relatively new approach, rough sets, is suggested for its simple knowledge representation, its ability to deal with uncertainty, and its reduction of data redundancy. In this study, several different discretization algorithms were used during data preprocessing. The simulation results show that the rough set approach can be a promising alternative to existing methods for stock market prediction.
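One standard discretization algorithm such a preprocessing step might use is equal-frequency binning of a continuous indicator. A minimal sketch, with a hypothetical indicator series (the abstract does not specify which algorithms or indicators were used):

```python
def equal_frequency_cuts(values, bins):
    """Cut points that split the sorted values into roughly equal-sized bins."""
    srt = sorted(values)
    n = len(srt)
    return [srt[(i * n) // bins] for i in range(1, bins)]

# Hypothetical RSI-like indicator values, binned into low / medium / high
indicator = [28, 35, 41, 47, 52, 58, 63, 70, 76]
cuts = equal_frequency_cuts(indicator, bins=3)   # -> [47, 63]
```

Rough-set rule induction then operates on the resulting symbolic bins rather than on the raw continuous values, which is why the choice of discretization algorithm matters.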
An Efficient Search Strategy for Aggregation and Discretization of Attributes of Bayesian Networks Using Minimum Description Length
Bayesian networks are convenient graphical expressions for high-dimensional probability distributions, representing complex relationships between a large number of random variables. They have been employed extensively in areas such as bioinformatics, artificial intelligence, diagnosis, and risk management. The recovery of the structure of a network from data is of prime importance for the purposes of modeling, analysis, and prediction. Most recovery algorithms in the literature assume either discrete or continuous but Gaussian data. For general continuous data, discretization is usually employed, but it often destroys the very structure one is out to recover. Friedman and Goldszmidt suggest an approach based on the minimum description length principle that chooses a discretization which preserves the information in the original data set; however, it is difficult, if not impossible, to implement for even moderately sized networks. In this paper we provide an extremely efficient search strategy which allows one to use the Friedman and Goldszmidt discretization in practice.
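The MDL principle behind such a discretization can be illustrated with a generic entropy-based split test: accept a cut point only if the bits it saves on the data exceed the bits needed to describe the cut. This is a hedged sketch of the principle in the style of classic MDL discretization, not the Friedman and Goldszmidt network score.

```python
import math

def entropy(labels):
    """Shannon entropy (bits) of a sequence of class labels."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def mdl_gain(labels, split_index):
    """Bits saved by splitting at split_index, minus the bits spent
    encoding which of the n-1 possible cut positions was chosen."""
    left, right = labels[:split_index], labels[split_index:]
    n = len(labels)
    cost_no_split = n * entropy(labels)
    cost_split = (len(left) * entropy(left)
                  + len(right) * entropy(right)
                  + math.log2(n - 1))
    return cost_no_split - cost_split

# Labels sorted by the continuous attribute: a clean class change at index 5
labels = ["a"] * 5 + ["b"] * 5
gain = mdl_gain(labels, 5)   # positive: this cut pays for itself
```

Searching over all candidate cuts in all variables of a network is what makes the exact procedure expensive, and efficient search over that space is the contribution the abstract describes.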