5 research outputs found

    Inductive logic programming using bounded hypothesis space

    Inductive Logic Programming (ILP) systems solve an inductive learning task by deriving a hypothesis that explains the given examples. Applying ILP systems to real applications poses many challenges: the search space is large, noise is present in the learning task, and in domains such as software engineering hypotheses are required to satisfy domain-specific syntactic constraints. ILP systems use language biases to define the hypothesis space, and learning can be seen as a search within the defined hypothesis space. Past systems apply search heuristics to traverse a large hypothesis space. This is unsuitable for systems implemented using Answer Set Programming (ASP), for which scalability is a constraint: the hypothesis space must be grounded by the ASP solver before the learning task is solved, which makes these systems unable to solve large learning tasks. This work explores how to learn using bounded hypothesis spaces and iterative refinement. Hypotheses that explain all examples are learnt by refining smaller partial hypotheses. This improves the scalability of ASP-based systems, as the learning task is split into multiple smaller, manageable refinement tasks. The thesis presents how syntactic integrity constraints on the hypothesis space can be used to strengthen hypothesis selection criteria, removing hypotheses with undesirable structure. The notion of constraint-driven bias is introduced, where hypotheses are required to be acceptable with respect to the given meta-level integrity constraints. Building upon the ILP system ASPAL, the system RASPAL, which learns through iterative hypothesis refinement, is implemented. RASPAL's algorithm is proven, under certain assumptions, to be complete and consistent. Both systems have been applied to a case study in learning users' behaviours from data collected from their mobile usage. This demonstrates their capability for learning with noise, and the difference in their efficiency. Constraint-driven bias has been implemented for both systems, and applied to a task in specification revision and to learning stratified programs.

    Using Analogy to Acquire Commonsense Knowledge from Human Contributors

    The goal of the work reported here is to capture the commonsense knowledge of non-expert human contributors. Achieving this goal will enable more intelligent human-computer interfaces and pave the way for computers to reason about our world. In the domain of natural language processing, it will provide the world knowledge much needed for semantic processing of natural language. To acquire knowledge from contributors not trained in knowledge engineering, I take the following four steps: (i) develop a knowledge representation (KR) model for simple assertions in natural language, (ii) introduce cumulative analogy, a class of nearest-neighbor based analogical reasoning algorithms over this representation, (iii) argue that cumulative analogy is well suited for knowledge acquisition (KA) based on a theoretical analysis of the effectiveness of KA with this approach, and (iv) test the KR model and the effectiveness of the cumulative analogy algorithms empirically. To investigate the effectiveness of cumulative analogy for KA empirically, Learner, an open source system for KA by cumulative analogy, has been implemented, deployed, and evaluated. (The site, "1001 Questions," is available at http://teach-computers.org/learner.html). Learner acquires assertion-level knowledge by constructing shallow semantic analogies between a KA topic and its nearest neighbors and posing these analogies as natural language questions to human contributors. Suppose, for example, that based on the knowledge about "newspapers" already present in the knowledge base, Learner judges "newspaper" to be similar to "book" and "magazine." Further suppose that assertions "books contain information" and "magazines contain information" are also already in the knowledge base. Then Learner will use cumulative analogy from the similar topics to ask humans whether "newspapers contain information."
Because similarity between topics is computed based on what is already known about them, Learner exhibits bootstrapping behavior: the quality of its questions improves as it gathers more knowledge. By summing evidence for and against posing any given question, Learner also exhibits noise tolerance, limiting the effect of incorrect similarities. The KA power of shallow semantic analogy from nearest neighbors is one of the main findings of this thesis. I perform an analysis of commonsense knowledge collected by another research effort that did not rely on analogical reasoning and demonstrate that there is indeed a sufficient amount of correlation in the knowledge base to motivate using cumulative analogy from nearest neighbors as a KA method. Empirically, the percentages of questions answered affirmatively, answered negatively, and judged to be nonsensical in the cumulative analogy case compare favorably with the baseline, no-similarity case, which relies on random objects rather than nearest neighbors. Of the questions generated by cumulative analogy, contributors answered 45% affirmatively, 28% negatively and marked 13% as nonsensical; in the control, no-similarity case, 8% of questions were answered affirmatively, 60% negatively and 26% were marked as nonsensical.
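The newspaper example in the abstract above can be illustrated with a minimal sketch. It is not Learner's actual algorithm: the toy knowledge base, the function names, and the choice of "number of shared assertions" as the similarity measure are all assumptions made for illustration, and only evidence in favor of a question is summed (the abstract also mentions evidence against).

```python
from collections import defaultdict

# Hypothetical toy knowledge base: topic -> set of (relation, value) assertions.
KB = {
    "newspaper": {("made of", "paper"), ("has", "pages")},
    "book": {("made of", "paper"), ("has", "pages"), ("contain", "information")},
    "magazine": {("made of", "paper"), ("has", "pages"), ("contain", "information")},
    "rock": {("is", "hard")},
}

def similarity(kb, a, b):
    """Assumed similarity measure: number of assertions two topics share."""
    return len(kb[a] & kb[b])

def nearest_neighbors(kb, topic, k=2):
    """Return up to k (score, neighbor) pairs with non-zero similarity."""
    scored = [(similarity(kb, topic, other), other) for other in kb if other != topic]
    scored = [(s, o) for s, o in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]

def candidate_questions(kb, topic, k=2):
    """Sum neighbor similarity as evidence for each assertion the topic lacks."""
    evidence = defaultdict(int)
    for score, neighbor in nearest_neighbors(kb, topic, k):
        for assertion in kb[neighbor] - kb[topic]:
            evidence[assertion] += score
    return sorted(evidence.items(), key=lambda pair: (-pair[1], pair[0]))

for (relation, value), weight in candidate_questions(KB, "newspaper"):
    print(f"Do newspapers {relation} {value}? (evidence {weight})")
```

In this sketch both "book" and "magazine" support the same missing assertion, so their similarity scores accumulate on a single candidate question, which is the "cumulative" part of the idea.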

    Induction of knowledge with uncertainty in fuzzy relational databases (Inducción de conocimiento con incertidumbre en bases de datos relacionales borrosas)

    This work presents a system for learning logical definitions with uncertainty from a fuzzy relational database. The field of interest is therefore inductive logic programming, to which the work makes several contributions, mainly concerning the input data and the results produced. The input data belong to a fuzzy relational database. They are therefore expressed as tables of tuples (relations), in which each tuple may carry a degree of membership in the corresponding relation. These are thus fuzzy relations, directly identifiable with fuzzy concepts (so common in reality as seen from a human point of view), and not ordinary relations with fuzzy attributes (which is how "fuzziness" is understood in many existing systems). The output data are expressed as logical definitions of a relation (ordinary or fuzzy), each consisting of one Horn clause or a disjunction of several. These Horn clauses are built from literals, applied (generally) to variables and associated with fuzzy or ordinary relations. Fuzzy literals can additionally be modified by linguistic labels. These definitions therefore combine predicate logic with fuzzy logic in what may be called "fuzzy predicate logic", which constitutes a contribution to automatic knowledge induction. In addition, the induced definitions carry an associated uncertainty factor, as in other existing systems. The starting point of the work is a well-known system for inducing logical definitions: FOIL, created by Quinlan in 1990 and based on predicate logic. On top of this initial system, besides the fuzzy-logic extensions already mentioned, a further series of modifications and extensions is made, aimed at improving knowledge induction. These improvements are made mainly in its heuristic part, by defining a literal evaluation function based on interest measures, which corrects some deficiencies of the original system and increases the quality of the induced rules. Other modifications are aimed at introducing background knowledge through intensionally defined relations, similarly to other systems such as FOCL. As a tangible result of the thesis, a system, FZFOIL, has been developed and tested; it is publicly available under the GNU license.

    Evolutionary program induction directed by logic grammars.

    by Wong Man Leung.
    Thesis (Ph.D.)--Chinese University of Hong Kong, 1995.
    Includes bibliographical references (leaves 227-236).

    List of Figures
    List of Tables
    Chapter 1: Introduction
        1.1. Automatic programming and program induction
        1.2. Motivation
        1.3. Contributions of the research
        1.4. Outline of the thesis
    Chapter 2: An Overview of Evolutionary Algorithms
        2.1. Evolutionary algorithms
        2.2. Genetic Algorithms (GAs)
            2.2.1. The canonical genetic algorithm
                2.2.1.1. Selection methods
                2.2.1.2. Recombination methods
                2.2.1.3. Inversion and Reordering
            2.2.2. Implicit parallelism and the building block hypothesis
            2.2.3. Steady state genetic algorithms
            2.2.4. Hybrid algorithms
        2.3. Genetic Programming (GP)
            2.3.1. Introduction to the traditional GP
            2.3.2. Automatic Defined Function (ADF)
            2.3.3. Module Acquisition (MA)
            2.3.4. Strongly Typed Genetic Programming (STGP)
        2.4. Evolution Strategies (ES)
        2.5. Evolutionary Programming (EP)
    Chapter 3: Inductive Logic Programming
        3.1. Inductive concept learning
        3.2. Inductive Logic Programming (ILP)
            3.2.1. Interactive ILP
            3.2.2. Empirical ILP
        3.3. Techniques and methods of ILP
    Chapter 4: Genetic Logic Programming and Applications
        4.1. Introduction
        4.2. Representations of logic programs
        4.3. Crossover of logic programs
        4.4. Genetic Logic Programming System (GLPS)
        4.5. Applications
            4.5.1. The Winston's arch problem
            4.5.2. The modified Quinlan's network reachability problem
            4.5.3. The factorial problem
    Chapter 5: The logic grammars based genetic programming system (LOGENPRO)
        5.1. Logic grammars
        5.2. Representations of programs
        5.3. Crossover of programs
        5.4. Mutation of programs
        5.5. The evolution process of LOGENPRO
        5.6. Discussion
    Chapter 6: Applications of LOGENPRO
        6.1. Learning functional programs
            6.1.1. Learning S-expressions using LOGENPRO
            6.1.2. The DOT PRODUCT problem
            6.1.2. Learning sub-functions using explicit knowledge
        6.2. Learning logic programs
            6.2.1. Learning logic programs using LOGENPRO
            6.2.2. The Winston's arch problem
            6.2.3. The modified Quinlan's network reachability problem
            6.2.4. The factorial problem
            6.2.5. Discussion
        6.3. Learning programs in C
    Chapter 7: Knowledge Discovery in Databases
        7.1. Inducing decision trees using LOGENPRO
            7.1.1. Decision trees
            7.1.2. Representing decision trees as S-expressions
            7.1.3. The credit screening problem
            7.1.4. The experiment
        7.2. Learning logic program from imperfect data
            7.2.1. The chess endgame problem
            7.2.2. The setup of experiments
            7.2.3. Comparison of LOGENPRO with FOIL
            7.2.4. Comparison of LOGENPRO with BEAM-FOIL
            7.2.5. Comparison of LOGENPRO with mFOIL1
            7.2.6. Comparison of LOGENPRO with mFOIL2
            7.2.7. Comparison of LOGENPRO with mFOIL3
            7.2.8. Comparison of LOGENPRO with mFOIL4
            7.2.9. Comparison of LOGENPRO with mFOIL5
            7.2.10. Discussion
        7.3. Learning programs in Fuzzy Prolog
    Chapter 8: An Adaptive Inductive Logic Programming System
        8.1. Adaptive Inductive Logic Programming
        8.2. A generic top-down ILP algorithm
        8.3. Inducing procedural search biases
            8.3.1. The evolution process
            8.3.2. The experimentation setup
            8.3.3. Fitness calculation
        8.4. Experimentation and evaluations
            8.4.1. The member predicate
            8.4.2. The member predicate in a noisy environment
            8.4.3. The multiply predicate
            8.4.4. The uncle predicate
        8.5. Discussion
    Chapter 9: Conclusion and Future Work
        9.1. Conclusion
        9.2. Future work
            9.2.1. Applying LOGENPRO to discover knowledge from databases
            9.2.2. Learning recursive programs
            9.2.3. Applying LOGENPRO in engineering design
            9.2.4. Exploiting parallelism of evolutionary algorithms
    Reference
    Appendix A

    Object-oriented data mining

    EThOS - Electronic Theses Online Service, GB, United Kingdom