
    Combination of molecular similarity measures using data fusion

    Many different measures of structural similarity have been suggested for matching chemical structures, each such measure focusing upon some particular type of molecular characteristic. The multi-faceted nature of biological activity suggests that an appropriate similarity measure should encompass many different types of characteristic, and this article discusses the use of data fusion methods to combine the results of searches based on multiple similarity measures. Experiments with several different types of dataset and activity suggest that data fusion provides a simple, but effective, approach to the combination of individual similarity measures. The best results were generally obtained with a fusion rule that sums the rank positions achieved by each molecule in searches using individual measures
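The rank-sum fusion rule mentioned in the abstract can be illustrated with a minimal sketch. The molecule names, the three measures, and all scores below are hypothetical, and the alphabetical tie-break is an assumption made only to keep the output deterministic.

```python
def ranks(scores):
    """Map each molecule to its rank (1 = most similar) under one similarity measure."""
    ordered = sorted(scores, key=lambda mol: scores[mol], reverse=True)
    return {mol: pos for pos, mol in enumerate(ordered, start=1)}


def fuse_by_rank_sum(score_lists):
    """Sum rank positions across measures; a lower total means a better fused rank."""
    totals = {}
    for scores in score_lists:
        for mol, rank in ranks(scores).items():
            totals[mol] = totals.get(mol, 0) + rank
    # Ties are broken alphabetically (an assumption, not taken from the paper).
    return sorted(totals, key=lambda mol: (totals[mol], mol))


# Hypothetical similarity scores of three molecules to one query structure.
fingerprint = {"mol_a": 0.91, "mol_b": 0.40, "mol_c": 0.75}
shape = {"mol_a": 0.55, "mol_b": 0.30, "mol_c": 0.80}
pharmacophore = {"mol_a": 0.60, "mol_b": 0.70, "mol_c": 0.65}

fused = fuse_by_rank_sum([fingerprint, shape, pharmacophore])
print(fused)
```

Here no single measure decides the outcome: `mol_c` is never worse than second, so its rank total is the lowest and it comes first in the fused list.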

    On the semantics of fuzzy logic

    This paper presents a formal characterization of the major concepts and constructs of fuzzy logic in terms of notions of distance, closeness, and similarity between pairs of possible worlds. The formalism is a direct extension (by recognition of multiple degrees of accessibility, conceivability, or reachability) of the major modal logic concepts of possible and necessary truth. Given a function that maps pairs of possible worlds into a number between 0 and 1, generalizing the conventional concept of an equivalence relation, the major constructs of fuzzy logic (conditional and unconditioned possibility distributions) are defined in terms of this similarity relation using familiar concepts from the mathematical theory of metric spaces. This interpretation is different in nature and character from the typical, chance-oriented meanings associated with probabilistic concepts, which are grounded on the mathematical notion of set measure. The similarity structure defines a topological notion of continuity in the space of possible worlds (and in that of its subsets, i.e., propositions) that allows a form of logical “extrapolation” between possible worlds. This logical extrapolation operation corresponds to the major deductive rule of fuzzy logic (the compositional rule of inference, or generalized modus ponens, of Zadeh), an inferential operation that generalizes its classical counterpart because it can be used when propositions representing available evidence match only approximately the antecedents of conditional propositions. The relations between the similarity-based interpretation of the role of conditional possibility distributions and the approximate inferential procedures of Baldwin are also discussed. A straightforward extension of the theory to the case where the similarity scale is symbolic rather than numeric is described.
    The problem of generating similarity functions from a given set of possibility distributions, with the latter interpreted as defining a number of (graded) discernibility relations and the former as the result of combining them into a joint measure of distinguishability between possible worlds, is briefly discussed.
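As a rough illustration of the compositional rule of inference named in the abstract, the sketch below uses its common discrete sup-min form, B'(y) = max over x of min(A'(x), R(x, y)). The choice of min as the conjunction, the variable names, and all membership values are assumptions for illustration; the paper's similarity-based reading of the rule is not reproduced here.

```python
def compose(observed, relation):
    """Sup-min composition: B'(y) = max over x of min(A'(x), R(x, y))."""
    outputs = {y for row in relation.values() for y in row}
    return {
        y: max(
            min(observed.get(x, 0.0), row.get(y, 0.0))
            for x, row in relation.items()
        )
        for y in outputs
    }


# Hypothetical conditional possibility distribution R(x, y) for the rule
# "if the temperature is warm then the fan is fast".
relation = {
    "cool": {"slow": 1.0, "fast": 0.2},
    "warm": {"slow": 0.3, "fast": 1.0},
}

# Hypothetical evidence that matches the antecedent "warm" only approximately.
observed = {"cool": 0.4, "warm": 0.8}

conclusion = compose(observed, relation)
print(conclusion)
```

With evidence that is mostly but not fully "warm", the conclusion assigns possibility 0.8 to "fast" and 0.4 to "slow", which is the kind of approximate match that classical modus ponens cannot handle.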

    An Exploration of Rule Clustering in Cellular Automata Rule Spaces

    The study of complex systems examines the global behavior of a system and how the individual parts of the system affect that behavior [1]. It spans many fields of science, such as biology, physics, engineering, and computer science. One area of complex systems that has not been fully explored is cellular automata. Since their discovery by John von Neumann, there has been no consistent way of categorizing similarities between cellular automata rules or of collecting similar rules for observation. This thesis introduces an approach to identifying clusters of similar rules and extracting rules from a cluster. Several similarity measures were developed to establish similarity between rules. All of them are outlined in this thesis, but only one was selected for determining similarity in this approach. Based on a partitioning of the rule space, this approach uses lo and h with their inherent primitives p0 and pi to obtain a cluster identification string [5]. The cluster Id. is determined by the output of the surrounding neighbors of any rule in the cluster. This cluster Id. can be used to produce a set of rules, all yielding the same or similar output.

    Enhancing the Efficiency of a Decision Support System through the Clustering of Complex Rule-Based Knowledge Bases and Modification of the Inference Algorithm

    Decision support systems founded on rule-based knowledge representation should be equipped with rule management mechanisms. Effective exploration of new knowledge in every domain of human life requires new algorithms of knowledge organization and a thorough search of the created data structures. In this work, the author introduces an optimization of both the knowledge base structure and the inference algorithm. A new, hierarchically organized knowledge base structure is proposed that draws on cluster analysis, together with a new forward-chaining inference algorithm that searches only the so-called representatives of rule clusters. Making use of the similarity approach, the algorithm tries to discover new facts (new knowledge) from rules and facts already known. The author defines and analyses four different representative generation methods for rule clusters. The experiments analyse the impact of the proposed methods on the efficiency of a decision support system with such knowledge representation. To this end, the four representative generation methods and various clustering parameters (similarity measure, clustering method, etc.) were examined. The proposed modification of both the knowledge base structure and the inference algorithm yielded satisfactory results.

    K2-ABC: Approximate Bayesian Computation with Kernel Embeddings

    Complicated generative models often result in a situation where computing the likelihood of observed data is intractable, while simulating from the conditional density given a parameter value is relatively easy. Approximate Bayesian Computation (ABC) is a paradigm that enables simulation-based posterior inference in such cases by measuring the similarity between simulated and observed data in terms of a chosen set of summary statistics. However, there is no general rule to construct sufficient summary statistics for complex models. Insufficient summary statistics will "leak" information, which leads to ABC algorithms yielding samples from an incorrect (partial) posterior. In this paper, we propose a fully nonparametric ABC paradigm which circumvents the need for manually selecting summary statistics. Our approach, K2-ABC, uses maximum mean discrepancy (MMD) as a dissimilarity measure between the distributions over observed and simulated data. MMD is easily estimated as the squared difference between their empirical kernel embeddings. Experiments on a simulated scenario and a real-world biological problem illustrate the effectiveness of the proposed algorithm
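To make the MMD-based comparison concrete, here is a minimal sketch of a squared-MMD estimate between an observed and a simulated sample. The Gaussian kernel, its bandwidth, the biased (V-statistic) estimator, the scalar toy data, and the `epsilon` value are all assumptions chosen for brevity; the exponential weight is a soft-acceptance form offered as an illustration, not a verified reproduction of the paper's algorithm.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel on scalars; kernel and bandwidth are illustrative choices."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, kernel=gaussian_kernel):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        return sum(kernel(u, v) for u in a for v in b) / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def abc_weight(observed, simulated, epsilon=0.1):
    """Soft acceptance weight exp(-MMD^2 / epsilon); epsilon is a hypothetical tuning value."""
    return math.exp(-mmd_squared(observed, simulated) / epsilon)


# Hypothetical data: one observed sample and two simulated samples.
observed = [0.0, 0.1, -0.2]
simulated_close = [0.05, -0.1, 0.15]
simulated_far = [3.0, 3.2, 2.9]

print(abc_weight(observed, simulated_close))
print(abc_weight(observed, simulated_far))
```

A parameter whose simulated data resemble the observations receives a weight near 1, while one whose data lie far away receives a weight near 0, without any hand-picked summary statistics.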

    Elementary Cellular Automata, Fractal Dimensions and Mutual Information.

    We explore a quantitative description of Wolfram's classification of elementary cellular automata based on fractal dimensions. We find the fractal dimension to be a global measure for classifying elementary cellular automata, independent of initial conditions. On the other hand, our analysis of rules in Class 3 numerically confirms the existence of a wide range of dynamics among them. The main reason is that the rules with the Sierpinski structure in Class 3 can behave like rules in Class 2, depending on their initial conditions. Furthermore, we apply mutual information to investigate how elementary cellular automata handle information over successive time steps. We discover a similarity among Rule 30, Rule 110 and Rule 22, and give further supporting evidence for Wolfram's conjecture that Rule 30 and Rule 22 may be universal.
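For readers unfamiliar with the rule numbers above, the following sketch shows how an elementary cellular automaton is evolved under Wolfram's numbering, where bit k of the rule number gives the new state for the neighbourhood whose three cells spell k in binary. The lattice width, periodic boundaries, and single-cell seed are assumptions; the fractal-dimension and mutual-information analyses from the abstract are not implemented.

```python
def step(cells, rule):
    """Apply one synchronous update of an elementary CA with periodic boundaries."""
    n = len(cells)
    nxt = []
    for i in range(n):
        # Read the (left, centre, right) neighbourhood as a 3-bit number.
        idx = (cells[i - 1] << 2) | (cells[i] << 1) | cells[(i + 1) % n]
        # Bit idx of the Wolfram rule number is the cell's next state.
        nxt.append((rule >> idx) & 1)
    return nxt


def run(rule, width=11, steps=4):
    """Evolve a single live cell in the centre and return every generation."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history


for row in run(30):
    print("".join("#" if c else "." for c in row))
```

Swapping in `run(90)` produces the nested Sierpinski-like triangle, which is one way to see the kind of structure the abstract refers to.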