Temporal fuzzy association rule mining with 2-tuple linguistic representation
This paper reports on an approach that contributes towards the problem of discovering fuzzy association rules that exhibit a temporal pattern. The novel application of the 2-tuple linguistic representation identifies fuzzy association rules in a temporal context, whilst maintaining the interpretability of linguistic terms. Iterative Rule Learning (IRL) with a Genetic Algorithm (GA) simultaneously induces rules and tunes the membership functions. The discovered rules were compared with those from a traditional method of discovering fuzzy association rules, and the results demonstrate how the traditional method can lose information because rules occur at the intersection of membership function boundaries. New information can be mined with the proposed approach by improving upon rules discovered with the traditional method and by discovering new rules
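The effect described in this abstract can be illustrated with a minimal sketch. It assumes uniformly spaced triangular membership functions and a lateral displacement `alpha` in [-0.5, 0.5) measured in label widths; the function names, the [0, 1] domain and the three-label partition are hypothetical choices for illustration, not taken from the paper.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def two_tuple_membership(x, label_index, alpha, lo=0.0, hi=1.0, n_labels=3):
    """Membership of x in label `label_index` shifted sideways by `alpha`.

    Assumption: labels are evenly spaced on [lo, hi] and alpha is expressed
    as a fraction of the distance between adjacent peaks.
    """
    step = (hi - lo) / (n_labels - 1)          # distance between adjacent peaks
    peak = lo + (label_index + alpha) * step   # displaced peak position
    return triangular(x, peak - step, peak, peak + step)


# x = 0.25 lies where the first and second labels intersect.
print(two_tuple_membership(0.25, 0, 0.0))    # undisplaced first label
print(two_tuple_membership(0.25, 1, 0.0))    # undisplaced second label
print(two_tuple_membership(0.25, 1, -0.5))   # second label shifted left
```

Under these assumptions, a value on the boundary has only half membership in either undisplaced label, whereas a displaced label can cover it fully, which is one way to read the claim that rules at the intersection of membership functions are otherwise lost.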
Mining Frequent Itemsets Using Genetic Algorithm
In general, frequent itemsets are generated from large data sets by applying
association rule mining algorithms such as Apriori, Partition, Pincer-Search,
Incremental and Border, which take too much computer time to
compute all the frequent itemsets. By using a Genetic Algorithm (GA) we can
improve the scenario. The major advantage of using a GA in the discovery of
frequent itemsets is that it performs a global search and its time complexity is
lower than that of other algorithms, as the genetic algorithm is based on the
greedy approach. The main aim of this paper is to find all the frequent
itemsets from given data sets using a genetic algorithm
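As a rough illustration of the idea, the following sketch encodes each candidate itemset as a bit string and uses its support as the fitness. The encoding, the operators (truncation selection, one-point crossover, single-bit mutation) and all names and parameters are assumptions made here, not the paper's method. Because a GA is a stochastic search, this sketch collects the frequent itemsets it happens to visit and is not guaranteed to enumerate all of them.

```python
import random


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def decode(bits, items):
    """Turn a bit string into the itemset it encodes."""
    return frozenset(item for bit, item in zip(bits, items) if bit)


def ga_frequent_itemsets(transactions, min_support,
                         generations=50, pop_size=20, seed=0):
    """Collect frequent itemsets visited by a simple GA.

    Assumes at least two distinct items occur in `transactions`.
    """
    rng = random.Random(seed)
    items = sorted(set().union(*transactions))
    population = [[rng.randint(0, 1) for _ in items] for _ in range(pop_size)]
    found = set()
    for _ in range(generations):
        scored = []
        for bits in population:
            itemset = decode(bits, items)
            fitness = support(itemset, transactions) if itemset else 0.0
            if itemset and fitness >= min_support:
                found.add(itemset)
            scored.append((fitness, bits))
        # Truncation selection: keep the better half as parents.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        parents = [bits for _, bits in scored[: pop_size // 2]]
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(items))   # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(len(items))] ^= 1  # flip one random bit
            children.append(child)
        population = children
    return found


transactions = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
for itemset in sorted(ga_frequent_itemsets(transactions, 0.5), key=sorted):
    print(sorted(itemset), support(itemset, transactions))
```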
Web Usage Mining with Evolutionary Extraction of Temporal Fuzzy Association Rules
In Web usage mining, fuzzy association rules that have a temporal property can provide useful knowledge about when associations occur. However, there is a problem with traditional temporal fuzzy association rule mining algorithms. Some rules occur at the intersection of fuzzy sets' boundaries where there is less support (lower membership), so the rules are lost. A genetic algorithm (GA)-based solution is described that uses the flexible nature of the 2-tuple linguistic representation to discover rules that occur at the intersection of fuzzy set boundaries. The GA-based approach is enhanced from previous work by including a graph representation and an improved fitness function. A comparison of the GA-based approach with a traditional approach on real-world Web log data discovered rules that were lost with the traditional approach. The GA-based approach is recommended as complementary to existing algorithms, because it discovers extra rules. (C) 2013 Elsevier B.V. All rights reserved
Predicting human preferences using the block structure of complex social networks
With ever-increasing available data, predicting individuals' preferences and
helping them locate the most relevant information has become a pressing need.
Understanding and predicting preferences is also important from a fundamental
point of view, as part of what has been called a "new" computational social
science. Here, we propose a novel approach based on stochastic block models,
which have been developed by sociologists as plausible models of complex
networks of social interactions. Our model is in the spirit of predicting
individuals' preferences based on the preferences of others but, rather than
fitting a particular model, we rely on a Bayesian approach that samples over
the ensemble of all possible models. We show that our approach is considerably
more accurate than leading recommender algorithms, with major relative
improvements between 38% and 99% over industry-level algorithms. Besides, our
approach sheds light on decision-making processes by identifying groups of
individuals that have consistently similar preferences, and enabling the
analysis of the characteristics of those groups
Fuzzy clustering of univariate and multivariate time series by genetic multiobjective optimization
Given a set of time series, it is of interest to discover subsets that share similar properties. For instance, this may be useful for identifying and estimating a single model that may fit conveniently several time series, instead of performing the usual identification and estimation steps for each one. On the other hand time series in the same cluster are related with respect to the measures assumed for cluster analysis and are suitable for building multivariate time series models. Though many approaches to clustering time series exist, in this view the most effective method seems to have to rely on choosing some features relevant for the problem at hand and seeking for clusters according to their measurements, for instance the autoregressive coefficients, spectral measures or the eigenvectors of the covariance matrix. Some new indexes based on goodness-of-fit criteria will be proposed in this paper for fuzzy clustering of multivariate time series. A general purpose fuzzy clustering algorithm may be used to estimate the proper cluster structure according to some internal criteria of cluster validity. Such indexes are known to measure actually definite often conflicting cluster properties, compactness or connectedness, for instance, or distribution, orientation, size and shape. It is argued that the multiobjective optimization supported by genetic algorithms is a most effective choice in such a difficult context. In this paper we use the Xie-Beni index and the C-means functional as objective functions to evaluate the cluster validity in a multiobjective optimization framework. The concept of Pareto optimality in multiobjective genetic algorithms is used to evolve a set of potential solutions towards a set of optimal non-dominated solutions. Genetic algorithms are well suited for implementing difficult optimization problems where objective functions do not usually have good mathematical properties such as continuity, differentiability or convexity.
In addition the genetic algorithms, as population based methods, may yield a complete Pareto front at each step of the iterative evolutionary procedure. The method is illustrated by means of a set of real data and an artificial multivariate time series data set.
Keywords: Fuzzy clustering, Internal criteria of cluster validity, Genetic algorithms, Multiobjective optimization, Time series, Pareto optimality
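The two objective functions named in this abstract, and the Pareto filtering applied to them, can be sketched as follows. This is a plain-Python illustration using the textbook forms of the fuzzy C-means functional and the Xie-Beni index with fuzzifier `m`; the function names and the small data set are hypothetical, and the paper's own feature extraction and genetic operators are not reproduced.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def cmeans_functional(points, centers, u, m=2.0):
    """J_m = sum over clusters i and points k of u[i][k]**m * ||x_k - v_i||^2."""
    return sum(u[i][k] ** m * sq_dist(x, v)
               for i, v in enumerate(centers)
               for k, x in enumerate(points))


def xie_beni(points, centers, u, m=2.0):
    """Compactness (J_m) divided by n times the minimum centre separation."""
    separation = min(sq_dist(centers[i], centers[j])
                     for i in range(len(centers))
                     for j in range(i + 1, len(centers)))
    return cmeans_functional(points, centers, u, m) / (len(points) * separation)


def dominates(f, g):
    """True if objective vector f Pareto-dominates g (both minimised)."""
    return (all(a <= b for a, b in zip(f, g))
            and any(a < b for a, b in zip(f, g)))


def pareto_front(objectives):
    """Keep only the objective vectors that no other vector dominates."""
    return [f for f in objectives
            if not any(dominates(g, f) for g in objectives)]


points = [(0.0,), (1.0,), (4.0,), (5.0,)]
centers = [(0.5,), (4.5,)]
u = [[1.0, 1.0, 0.0, 0.0],   # memberships in cluster 0
     [0.0, 0.0, 1.0, 1.0]]   # memberships in cluster 1
print(cmeans_functional(points, centers, u), xie_beni(points, centers, u))
print(pareto_front([(1, 5), (2, 2), (3, 3), (5, 1)]))
```

In a multiobjective GA each candidate clustering would be scored with both functions, and only the non-dominated candidates would be carried forward as the current approximation of the Pareto front.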
How to improve robustness in Kohonen maps and display additional information in Factorial Analysis: application to text mining
This article is an extended version of a paper presented in the WSOM'2012
conference [1]. We display a combination of factorial projections, SOM
algorithm and graph techniques applied to a text mining problem. The corpus
contains 8 medieval manuscripts which were used to teach arithmetic techniques
to merchants. Among the techniques for Data Analysis, those used for
Lexicometry (such as Factorial Analysis) highlight the discrepancies between
manuscripts. The reason for this is that they focus on the deviation from the
independence between words and manuscripts. Still, we also want to discover and
characterize the common vocabulary among the whole corpus. Using the properties
of stochastic Kohonen maps, which define neighborhood between inputs in a
non-deterministic way, we highlight the words which seem to play a special role
in the vocabulary. We call them fickle and use them to improve both Kohonen map
robustness and significance of FCA visualization. Finally we use graph
algorithmic to exploit this fickleness for classification of words