10 research outputs found

    On the Informativeness of the DNA Promoter Sequences Domain Theory

    Full text link
    The DNA promoter sequences domain theory and database have become popular for testing systems that integrate empirical and analytical learning. This note reports a simple change and reinterpretation of the domain theory in terms of M-of-N concepts, involving no learning, that results in an accuracy of 93.4% on the 106 items of the database. Moreover, an exhaustive search of the space of M-of-N domain theory interpretations indicates that the expected accuracy of a randomly chosen interpretation is 76.5%, and that a maximum accuracy of 97.2% is achieved in 12 cases. This demonstrates the informativeness of the domain theory, without the complications of understanding the interactions between various learning algorithms and the theory. In addition, our results help characterize the difficulty of learning using the DNA promoters theory.
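
    As a rough illustration of the M-of-N reading used in the note, the sketch below (Python) evaluates an M-of-N rule over a sequence. The positions, bases, and threshold are invented placeholders, not the actual promoter domain theory.

    # Minimal sketch of an M-of-N concept: the rule fires when at least
    # m of its n (position, base) constraints match the sequence.
    def m_of_n(sequence, constraints, m):
        hits = sum(1 for pos, base in constraints if sequence[pos] == base)
        return hits >= m

    # Hypothetical contact-region constraints; classify as a promoter if
    # at least 4 of the 6 hold.
    rule = [(3, 't'), (4, 'g'), (5, 'a'), (6, 'c'), (7, 'a'), (8, 't')]
    print(m_of_n('aattgacattgacgta', rule, m=4))  # True: 6 of 6 match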

    Data-Driven Theory Refinement Algorithms for Bioinformatics

    Get PDF
    Bioinformatics and related applications call for efficient algorithms for knowledge-intensive learning and data-driven knowledge refinement. Knowledge-based artificial neural networks offer an attractive approach to extending or modifying incomplete knowledge bases or domain theories. We present results of experiments with several such algorithms for data-driven knowledge discovery and theory refinement in some simple bioinformatics applications. Results of experiments on the ribosome binding site and promoter site identification problems indicate that the performance of the KBDistAl and Tiling Pyramid algorithms compares quite favorably with that of substantially more computationally demanding techniques.
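
    The knowledge-based initialization the abstract refers to can be pictured with a KBANN-style mapping from a propositional rule to a neuron's initial weights, so the unit starts out computing the rule and is then refined from data. The weight constant and the example rule below are illustrative assumptions, not the algorithms evaluated in the paper.

    import math

    W = 4.0  # large weight so the unit initially approximates the crisp rule

    def unit_from_conjunction(antecedents):
        # antecedents: (feature_index, is_positive) pairs of one rule body
        weights = {i: (W if pos else -W) for i, pos in antecedents}
        n_pos = sum(1 for _, pos in antecedents if pos)
        bias = -(n_pos - 0.5) * W  # net > 0 only when the rule is satisfied
        return weights, bias

    def activate(weights, bias, x):
        net = bias + sum(w * x[i] for i, w in weights.items())
        return 1.0 / (1.0 + math.exp(-net))  # sigmoid

    # Hypothetical rule: site :- motif_present, not hairpin (features 0, 1)
    w, b = unit_from_conjunction([(0, True), (1, False)])
    print(activate(w, b, [1, 0]))  # ~0.88: rule satisfied
    print(activate(w, b, [1, 1]))  # ~0.12: negated antecedent violated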

    Tractability of Theory Patching

    Full text link
    In this paper we consider the problem of 'theory patching', in which we are given a domain theory, some of whose components are indicated to be possibly flawed, and a set of labeled training examples for the domain concept. The theory patching problem is to revise only the indicated components of the theory, such that the resulting theory correctly classifies all the training examples. Theory patching is thus a type of theory revision in which revisions are made to individual components of the theory. Our concern in this paper is to determine for which classes of logical domain theories the theory patching problem is tractable. We consider both propositional and first-order domain theories, and show that the theory patching problem is equivalent to that of determining what information contained in a theory is 'stable' regardless of what revisions might be performed to the theory. We show that determining stability is tractable if the input theory satisfies two conditions: that revisions to each theory component have monotonic effects on the classification of examples, and that theory components act independently in the classification of examples in the theory. We also show how the concepts introduced can be used to determine the soundness and completeness of particular theory patching algorithms.
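
    The stability notion can be made concrete with a brute-force toy (Python): an example's label is stable if every allowed revision of the flagged components yields the same classification. The theory, the flagged clause, and the restriction of revisions to clause deletion are all simplifying assumptions for illustration; the paper's contribution is characterizing when stability can be decided tractably.

    from itertools import product

    def classify(clauses, example_atoms):
        # Forward-chain definite clauses (head, body) from the example's atoms.
        facts = set(example_atoms)
        changed = True
        while changed:
            changed = False
            for head, body in clauses:
                if head not in facts and all(b in facts for b in body):
                    facts.add(head)
                    changed = True
        return 'goal' in facts

    theory = [('goal', ['a', 'b']), ('b', ['c'])]
    revisable = {1}  # index of the possibly flawed clause

    def stable(example_atoms):
        labels = set()
        for keep in product([True, False], repeat=len(revisable)):
            kept = dict(zip(sorted(revisable), keep))
            t = [c for i, c in enumerate(theory) if kept.get(i, True)]
            labels.add(classify(t, example_atoms))
        return len(labels) == 1

    print(stable(['a', 'c']))  # False: label depends on the flagged clause
    print(stable(['c']))       # True: 'goal' is unreachable either way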

    Connectionist Theory Refinement: Genetically Searching the Space of Network Topologies

    Full text link
    An algorithm that learns from a set of examples should ideally be able to exploit the available resources of (a) abundant computing power and (b) domain-specific knowledge to improve its ability to generalize. Connectionist theory-refinement systems, which use background knowledge to select a neural network's topology and initial weights, have proven to be effective at exploiting domain-specific knowledge; however, most do not exploit available computing power. This weakness occurs because they lack the ability to refine the topology of the neural networks they produce, thereby limiting generalization, especially when given impoverished domain theories. We present the REGENT algorithm which uses (a) domain-specific knowledge to help create an initial population of knowledge-based neural networks and (b) genetic operators of crossover and mutation (specifically designed for knowledge-based networks) to continually search for better network topologies. Experiments on three real-world domains indicate that our new algorithm is able to significantly increase generalization compared to a standard connectionist theory-refinement system, as well as our previous algorithm for growing knowledge-based networks.
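
    The genetic search itself reduces to a familiar skeleton, sketched below with a deliberately toy encoding (a topology as a tuple of hidden-layer sizes, and a stand-in fitness function); REGENT's actual operators act on knowledge-based networks, and its fitness comes from training and evaluating each candidate.

    import random

    random.seed(0)

    def fitness(topology):
        # Stand-in: in practice, cross-validated accuracy of the trained net.
        return -abs(sum(topology) - 12) - 0.1 * len(topology)

    def crossover(a, b):
        cut = random.randint(1, min(len(a), len(b)))
        return a[:cut] + b[cut:]

    def mutate(t):
        t = list(t)
        i = random.randrange(len(t))
        t[i] = max(1, t[i] + random.choice([-2, -1, 1, 2]))  # resize a layer
        if random.random() < 0.2:
            t.append(random.randint(1, 8))                   # add a layer
        return tuple(t)

    population = [tuple(random.randint(1, 8) for _ in range(2)) for _ in range(10)]
    for _ in range(20):
        population.sort(key=fitness, reverse=True)
        parents = population[:5]
        population = parents + [mutate(crossover(*random.sample(parents, 2)))
                                for _ in range(5)]
    print(max(population, key=fitness))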

    Probabilistic Inductive Querying Using ProbLog

    Get PDF
    We study how probabilistic reasoning and inductive querying can be combined within ProbLog, a recent probabilistic extension of Prolog. ProbLog can be regarded as a database system that supports both probabilistic and inductive reasoning through a variety of querying mechanisms. After a short introduction to ProbLog, we provide a survey of the different types of inductive queries that ProbLog supports, and show how it can be applied to the mining of large biological networks.
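
    What a ProbLog query computes can be approximated by hand on a tiny example: sum, over the possible worlds induced by independent probabilistic facts, the probability of each world in which the query succeeds. The three-edge network below is invented, and real ProbLog avoids this exponential enumeration through knowledge compilation.

    from itertools import product

    # Probabilistic edges of a toy interaction network: fact -> probability.
    p_edge = {('a', 'b'): 0.8, ('b', 'c'): 0.6, ('a', 'c'): 0.3}

    def connected(edges, src, dst):
        seen, stack = {src}, [src]
        while stack:
            u = stack.pop()
            for x, y in edges:
                if x == u and y not in seen:
                    seen.add(y)
                    stack.append(y)
        return dst in seen

    facts = list(p_edge)
    prob = 0.0
    for world in product([True, False], repeat=len(facts)):
        w = 1.0
        for fact, present in zip(facts, world):
            w *= p_edge[fact] if present else 1.0 - p_edge[fact]
        if connected([f for f, keep in zip(facts, world) if keep], 'a', 'c'):
            prob += w
    print(prob)  # P(path(a, c)) = 0.3 + 0.7 * 0.8 * 0.6 = 0.636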

    Revising Horn Formulas

    Get PDF
    Boolean formulas can be used to model real-world facts. In some situations we may have a Boolean formula that closely approximates a real-world fact, but we need to fine-tune it so that it models the real-world fact exactly. This is a problem of theory revision where the theory is in the form of a Boolean formula. An algorithm is presented for revising a class of Boolean formulas that are expressible as conjunctions of Horn clauses. Each of the clauses in the formulas considered here has a unique unnegated variable that does not appear in any other clause, and is not 'F'. The revision algorithm uses equivalence and membership queries to revise a given formula into a formula that is equivalent to an unknown target formula having the same set of unnegated variables. The amount of time required by the algorithm to perform this revision is logarithmic in the number of variables and polynomial in the number of clauses in the unknown formula. An early version of this work was presented at the 2003 Midwest Artificial Intelligence and Cognitive Science Conference [4].
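
    A membership query in this model asks an oracle whether an assignment satisfies the hidden target formula; the evaluator below (Python) shows what such an oracle computes for the clause class described, with an invented two-clause target.

    def satisfies(assignment, clauses):
        # Each clause (head, body) encodes body -> head; the formula is
        # their conjunction. Heads are unique to their clause.
        return all(assignment[head] or not all(assignment[b] for b in body)
                   for head, body in clauses)

    # Invented target: (x & y -> h1) & (y & z -> h2)
    target = [('h1', ['x', 'y']), ('h2', ['y', 'z'])]
    print(satisfies({'x': 1, 'y': 1, 'z': 0, 'h1': 1, 'h2': 0}, target))  # True
    print(satisfies({'x': 1, 'y': 1, 'z': 0, 'h1': 0, 'h2': 0}, target))  # False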

    Science as an Anomaly-Driven Enterprise: A Computational Approach to Generating Acceptable Theory Revisions in the Face of Anomalous Data

    Get PDF
    Anomalous data lead to scientific discoveries. Although machine learning systems can be forced to resolve anomalous data, these systems use general learning algorithms to do so. To determine whether anomaly-driven approaches to discovery produce more accurate models than the standard approaches, we built a program called Kalpana. We also used Kalpana to explore means for identifying those anomaly resolutions that are acceptable to domain experts. Our experiments indicated that anomaly-driven approaches can lead to a richer set of model revisions than standard methods. Additionally, we identified semantic and syntactic measures that are significantly correlated with the acceptability of model revisions. These results suggest that by interpreting data within the context of a model, and by interpreting model revisions within the context of domain knowledge, discovery systems can more readily suggest accurate and acceptable anomaly resolutions.

    Bias-Driven Revision of Logical Domain Theories

    No full text
    The theory revision problem is the problem of how best to go about revising a deficient domain theory using information contained in examples that expose inaccuracies. In this paper we present our approach to the theory revision problem for propositional domain theories. The approach described here, called PTR, uses probabilities associated with domain theory elements to numerically track the 'flow' of proof through the theory. This allows us to measure the precise role of a clause or literal in allowing or preventing a (desired or undesired) derivation for a given example. This information is used to efficiently locate and repair flawed elements of the theory. PTR is proved to converge to a theory which correctly classifies all examples, and shown experimentally to be fast and accurate even for deep theories.
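
    The flow bookkeeping can be caricatured in a few lines: attach a probability to each clause, score a derivation by the product of the clauses it uses, and the weakest link points at the element to repair first. The theory and numbers below are invented, and PTR's actual update and convergence machinery is in the paper.

    theory = {
        'promoter':     (['contact', 'conformation'], 0.9),
        'contact':      (['minus35', 'minus10'],      0.7),
        'conformation': (['bendable'],                0.4),  # suspect clause
    }

    def proof_flow(goal, facts, used=()):
        # Returns (probability mass reaching goal, clauses used on the way).
        if goal in facts:
            return 1.0, used
        if goal not in theory:
            return 0.0, used
        body, p = theory[goal]
        flow = p
        for sub in body:
            sub_flow, used = proof_flow(sub, facts, used)
            flow *= sub_flow
        return flow, used + ((goal, p),)

    flow, used = proof_flow('promoter', {'minus35', 'minus10', 'bendable'})
    print(flow)                           # 0.9 * 0.7 * 0.4 = 0.252
    print(min(used, key=lambda c: c[1]))  # ('conformation', 0.4): repair first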
