
    Visualization of an Approach to Data Clustering

    Using visualization and clustering goals as guidelines, this thesis explores a graphic implementation of a data clustering technique that repositions vertices by applying the physical laws of charges and springs to the components of the graph. The resulting visualizations show both that the approach succeeds and which kinds of data sets lend themselves to a clustering routine. Because its product is visual, the algorithm is most useful as an aid in understanding the grouping pattern of a data set. Whether for rapid analysis or to assist in presentation, the visual result of the clustering approach is a useful tool for discovering trends in a data set.
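
    The charge-and-spring idea in this abstract can be illustrated with a small simulation. The sketch below is not the thesis's implementation: it assumes inverse-square repulsion between all vertex pairs and Hooke-style springs along edges, and the function name `force_layout` and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def force_layout(nodes, edges, iterations=300, repulsion=1.0, stiffness=0.1,
                 rest_length=1.0, step=0.05, max_move=0.5, seed=0):
    """Return {node: (x, y)} after a simple charge-and-spring simulation.

    All constants are illustrative assumptions, not values from the thesis.
    """
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Charges: every pair of vertices repels with magnitude repulsion / d^2.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = max(math.hypot(dx, dy), 1e-9)
                f = repulsion / (d * d)
                force[a][0] += f * dx / d
                force[a][1] += f * dy / d
                force[b][0] -= f * dx / d
                force[b][1] -= f * dy / d
        # Springs: each edge pulls (or pushes) its endpoints toward rest_length.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = max(math.hypot(dx, dy), 1e-9)
            f = stiffness * (d - rest_length)
            force[a][0] -= f * dx / d
            force[a][1] -= f * dy / d
            force[b][0] += f * dx / d
            force[b][1] += f * dy / d
        # Move each vertex along its net force, capping the displacement so
        # that two nearly coincident vertices cannot fling each other away.
        for n in nodes:
            fx, fy = force[n]
            magnitude = math.hypot(fx, fy)
            scale = step
            if step * magnitude > max_move:
                scale = max_move / magnitude
            pos[n][0] += scale * fx
            pos[n][1] += scale * fy
    return {n: (x, y) for n, (x, y) in pos.items()}


if __name__ == "__main__":
    # Two triangles joined by a single bridging edge.
    nodes = ["a", "b", "c", "x", "y", "z"]
    edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("x", "y"), ("y", "z"), ("x", "z"), ("c", "x")]
    for name, (x, y) in force_layout(nodes, edges).items():
        print(f"{name}: ({x:.2f}, {y:.2f})")
```

    In a layout of this kind, densely connected vertices are held together by their springs while repulsion pushes everything else apart, which is what makes groups visible to the eye.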

    Fuzzy set covering as a new paradigm for the induction of fuzzy classification rules

    In 1965 Lotfi A. Zadeh proposed fuzzy sets as a generalization of crisp (or classic) sets to address the inability of crisp sets to model the uncertainty and vagueness inherent in the real world. Initially, fuzzy sets did not receive a very warm welcome, as many academics were skeptical of a theory of "imprecise" mathematics. In the middle to late 1980s the success of fuzzy controllers brought fuzzy sets into the limelight, and many applications using fuzzy sets started appearing. The first machine learning algorithms had started appearing in the early 1970s. The AQ family of algorithms pioneered by Ryszard S. Michalski is a good example of the family of set covering algorithms. This class of learning algorithm induces concept descriptions by a greedy construction of rules that describe (or cover) positive training examples but not negative training examples. The learning process is iterative: in each iteration one rule is induced, and the positive examples covered by the rule are removed from the set of positive training examples. Because positive instances are separated from negative instances, the term separate-and-conquer has been used to contrast this learning strategy with decision tree induction, which uses a divide-and-conquer learning strategy. This dissertation proposes fuzzy set covering as a powerful rule induction strategy. We survey existing fuzzy learning algorithms and conclude that very few of them follow a greedy rule construction strategy and that no publication to date has made the link between fuzzy sets and set covering explicit. We first develop the theoretical aspects of fuzzy set covering, and then apply these in proposing the first fuzzy learning algorithm that applies set covering and makes explicit use of a partial order for fuzzy classification rule induction. We also investigate several strategies to improve upon the basic algorithm, such as better search heuristics and different rule evaluation metrics.
    We then continue by proposing a general unifying framework for fuzzy set covering algorithms. We demonstrate the benefits of the framework and propose several further fuzzy set covering algorithms that fit within the framework. We compare fuzzy and crisp rule induction, and provide arguments in favour of fuzzy set covering as a rule induction strategy. We also show that our learning algorithms outperform other fuzzy rule learners on real-world data. We further explore the idea of simultaneous concept learning in the fuzzy case, and go on to propose the first fuzzy decision list induction algorithm. Finally, we propose a first strategy for encoding the rule sets generated by our fuzzy set covering algorithms inside an equivalent neural network.
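
    The separate-and-conquer loop described in this abstract can be sketched for the crisp case. The code below is neither AQ nor the dissertation's fuzzy algorithm. It is a minimal illustration that assumes examples are dictionaries of nominal attribute values, rules are conjunctions of attribute = value tests, and candidate tests are scored by precision. The names `covers`, `learn_one_rule`, and `learn_rules` are hypothetical.

```python
def covers(rule, example):
    """A rule is a dict of attribute -> required value (a conjunction)."""
    return all(example.get(attr) == value for attr, value in rule.items())


def learn_one_rule(positives, negatives):
    """Greedily add tests until the rule covers no negative example."""
    rule = {}
    pos, neg = list(positives), list(negatives)
    while neg:
        # Candidate tests come from still-covered positives, so each one
        # covers at least one positive example.
        candidates = {(a, v) for ex in pos for a, v in ex.items()
                      if a not in rule}
        if not candidates:
            break  # inconsistent data: no test left to exclude negatives

        def score(test):
            attr, value = test
            p = sum(1 for ex in pos if ex.get(attr) == value)
            n = sum(1 for ex in neg if ex.get(attr) == value)
            return (p / (p + n), p)  # precision first, then coverage

        attr, value = max(sorted(candidates), key=score)
        rule[attr] = value
        pos = [ex for ex in pos if covers(rule, ex)]
        neg = [ex for ex in neg if covers(rule, ex)]
    return rule


def learn_rules(positives, negatives):
    """Separate-and-conquer: induce a rule, remove what it covers, repeat."""
    rules = []
    remaining = list(positives)
    while remaining:
        rule = learn_one_rule(remaining, negatives)
        rules.append(rule)
        remaining = [ex for ex in remaining if not covers(rule, ex)]
    return rules


if __name__ == "__main__":
    play = [{"outlook": "sunny", "windy": "no"},
            {"outlook": "overcast", "windy": "yes"},
            {"outlook": "overcast", "windy": "no"}]
    stay_home = [{"outlook": "sunny", "windy": "yes"},
                 {"outlook": "rainy", "windy": "yes"}]
    for rule in learn_rules(play, stay_home):
        print(rule)
```

    Each induced rule covers some positive examples and no negative ones, and the outer loop stops once every positive example is covered by at least one rule. A fuzzy variant would replace the boolean `covers` test with a degree of membership, which is where the dissertation's contribution lies.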

    Interactive exploration of fuzzy clusters using Neighborgrams

    We describe an interactive method to generate a set of fuzzy clusters for classes of interest of a given, labeled data set. The presented method is therefore best suited for applications where the focus of analysis lies on a model for the minority class or for small to medium-sized data sets. The clustering algorithm creates one-dimensional models of the neighborhood for a set of patterns by constructing cluster candidates for each pattern of interest and then chooses the best subset of clusters that form a global model of the data. The accompanying visualization of these neighborhoods allows the user to interact with the clustering process by selecting, discarding, or fine-tuning potential cluster candidates. Clusters can be crisp or fuzzy, and the latter lead to a substantial improvement in classification accuracy. We demonstrate the performance of the underlying algorithm on several data sets from the StatLog project and show its usefulness for visual cluster exploration on the Iris data and a large molecular data set from the National Cancer Institute.
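
    The two steps named in this abstract, building a one-dimensional neighborhood model per pattern of interest and then choosing a subset of cluster candidates, can be sketched as follows. This is a simplified crisp illustration, not the published algorithm: it assumes Euclidean distance, a cluster radius that stops just before the first pattern of another class, and greedy selection by coverage. The function names are hypothetical.

```python
import math


def neighborgram(center, data, target):
    """Sorted (distance, is_target_class) pairs: a 1-D view of the
    neighborhood around one pattern of interest."""
    point, _ = center
    return sorted((math.dist(point, x), label == target) for x, label in data)


def cluster_candidate(center, data, target):
    """Radius and coverage of the largest pure ball around `center`."""
    radius, coverage = 0.0, 0
    for dist, same_class in neighborgram(center, data, target):
        if not same_class:
            break  # first conflicting pattern limits the cluster
        radius, coverage = dist, coverage + 1
    return radius, coverage


def greedy_clusters(data, target):
    """Repeatedly keep the candidate covering the most uncovered patterns."""
    others = [p for p in data if p[1] != target]
    remaining = [p for p in data if p[1] == target]
    clusters = []
    while remaining:
        pool = remaining + others
        best = max(remaining,
                   key=lambda c: cluster_candidate(c, pool, target)[1])
        radius, _ = cluster_candidate(best, pool, target)
        clusters.append((best[0], radius))
        # The center is always within its own radius, so the loop terminates.
        remaining = [p for p in remaining
                     if math.dist(best[0], p[0]) > radius]
    return clusters


if __name__ == "__main__":
    data = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
            ((5, 5), "a"), ((5, 6), "a"), ((3, 3), "b")]
    for center, radius in greedy_clusters(data, "a"):
        print(center, radius)
```

    In the interactive setting the abstract describes, the automatic `max` selection would be replaced or overridden by the user, who inspects each candidate's neighborhood and accepts, discards, or adjusts it.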
