2,598 research outputs found

    Inducing Probabilistic Grammars by Bayesian Model Merging

    We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are incorporated by adding ad-hoc rules to a working grammar; subsequently, elements of the model (such as states or nonterminals) are merged to achieve generalization and a more compact representation. The choice of what to merge and when to stop is governed by the Bayesian posterior probability of the grammar given the data, which formalizes a trade-off between a close fit to the data and a default preference for simpler models ("Occam's Razor"). The general scheme is illustrated using three types of probabilistic grammars: hidden Markov models, class-based n-grams, and stochastic context-free grammars.
    Comment: To appear in Grammatical Inference and Applications, Second International Colloquium on Grammatical Inference; Springer Verlag, 1994. 13 page
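
The trade-off described in this abstract can be written out with Bayes' rule. The symbols below are assumptions introduced only for illustration: $M$ stands for a candidate grammar and $X$ for the sample corpus; the abstract itself does not fix a notation or a particular prior.

```latex
\[
  P(M \mid X) \;=\; \frac{P(M)\,P(X \mid M)}{P(X)} \;\propto\; P(M)\,P(X \mid M),
\]
\[
  \log P(M \mid X) \;=\; \underbrace{\log P(M)}_{\text{preference for simpler models}}
  \;+\; \underbrace{\log P(X \mid M)}_{\text{fit to the data}} \;-\; \log P(X).
\]
```

Since $P(X)$ does not depend on $M$, comparing two candidate grammars only requires comparing the sum of the log prior and the log likelihood.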

    Bayesian Grammar Induction for Language Modeling

    We describe a corpus-based induction algorithm for probabilistic context-free grammars. The algorithm employs a greedy heuristic search within a Bayesian framework, and a post-pass using the Inside-Outside algorithm. We compare the performance of our algorithm to n-gram models and the Inside-Outside algorithm in three language modeling tasks. In two of the tasks, the training data is generated by a probabilistic context-free grammar, and in both tasks our algorithm outperforms the other techniques. The third task involves naturally occurring data, and in this task our algorithm does not perform as well as n-gram models but vastly outperforms the Inside-Outside algorithm.
    Comment: 8 pages, LaTeX, uses aclap.st
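
For readers unfamiliar with how a probabilistic context-free grammar assigns a probability to a sentence, the following is a minimal sketch of the "inside" half of the Inside-Outside algorithm. It is not the induction algorithm of the paper. It assumes a grammar in Chomsky normal form, and the function name, the dictionary layout and the toy grammar are all hypothetical choices made for this illustration.

```python
from collections import defaultdict


def inside_probability(words, lexical, binary, start="S"):
    """Return P(words) under a PCFG in Chomsky normal form.

    lexical: dict mapping (nonterminal, word) -> rule probability
    binary:  dict mapping (parent, left, right) -> rule probability
    (Both layouts are assumptions made for this sketch.)
    """
    n = len(words)
    # beta[(i, j)][A] = probability that nonterminal A derives words[i:j]
    beta = defaultdict(lambda: defaultdict(float))

    # Spans of length 1 are covered by lexical rules A -> word.
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                beta[(i, i + 1)][a] += p

    # Longer spans sum over every split point and every rule A -> B C.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    left = beta[(i, k)].get(b, 0.0)
                    right = beta[(k, j)].get(c, 0.0)
                    if left and right:
                        beta[(i, j)][a] += p * left * right

    return beta[(0, n)].get(start, 0.0)


# Toy grammar (hypothetical): S -> S S with 0.4, S -> "a" with 0.6.
toy_lexical = {("S", "a"): 0.6}
toy_binary = {("S", "S", "S"): 0.4}
print(inside_probability(["a", "a", "a"], toy_lexical, toy_binary))
```

A language model built on such a grammar scores held-out text with these sentence probabilities, which is what allows a comparison with n-gram models.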

    Grammar Generation and Optimization from Multiple Inputs

    Humans use multiple modes, such as speech, text, facial expressions, hand gestures and pictures, to communicate with one another. Using these modes makes human communication simpler and faster. In recent years, several techniques have been used to bring human-computer interaction closer to this. However, developing and maintaining the multimodal grammars needed to integrate and understand input in multimodal interfaces (that is, interfaces with multiple input modes) is costly, which motivates the search for more robust algorithms. The proposed system generates a grammar from multiple inputs, called a multimodal grammar, and evaluates the grammar's description length. Furthermore, to optimize the multimodal grammar, the proposed system uses learning operators that improve the grammar description
    DOI: 10.17762/ijritcc2321-8169.15016

    E-Generalization Using Grammars

    We extend the notion of anti-unification to cover equational theories and present a method based on regular tree grammars to compute a finite representation of E-generalization sets. We present a framework to combine Inductive Logic Programming and E-generalization that includes an extension of Plotkin's lgg theorem to the equational case. We demonstrate the potential power of E-generalization by three example applications: computation of suggestions for auxiliary lemmas in equational inductive proofs, computation of construction laws for given term sequences, and learning of screen editor command sequences.
    Comment: 49 pages, 16 figures; the author address given in the header is now outdated; full version of an article in the "Artificial Intelligence Journal", appeared as a technical report in 2003. An open-source C implementation and some examples are found at the Ancillary file
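
As background for this abstract, here is a minimal sketch of plain syntactic anti-unification (Plotkin's least general generalization) of two first-order terms. It deliberately does not handle equational theories, so it is not the E-generalization method of the paper. The term encoding (tuples of the form `(functor, arg1, ...)`, strings for constants) and the generated variable names `X0`, `X1`, ... are assumptions made for this illustration; the names could clash with constants that happen to be spelled the same way.

```python
def anti_unify(s, t, table=None):
    """Least general generalization of two terms (syntactic case only).

    Terms are strings (constants) or tuples (functor, arg1, ..., argN).
    `table` maps each pair of mismatching subterms to one shared variable,
    so the same mismatch is always generalized to the same variable.
    """
    if table is None:
        table = {}
    if s == t:
        return s
    same_shape = (
        isinstance(s, tuple)
        and isinstance(t, tuple)
        and s[0] == t[0]
        and len(s) == len(t)
    )
    if same_shape:
        # Same functor and arity: generalize argument by argument.
        args = tuple(anti_unify(a, b, table) for a, b in zip(s[1:], t[1:]))
        return (s[0],) + args
    # Mismatch: introduce (or reuse) a variable for this pair of subterms.
    if (s, t) not in table:
        table[(s, t)] = f"X{len(table)}"
    return table[(s, t)]


# f(a, g(a)) and f(b, g(b)) generalize to f(X0, g(X0)).
print(anti_unify(("f", "a", ("g", "a")), ("f", "b", ("g", "b"))))
```

Reusing one variable per mismatching pair is what makes the result least general: `f(X0, g(X1))` would also cover both inputs, but it loses the fact that the two positions always agree.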

    Statistical relational learning of semantic models and grammar rules for 3D building reconstruction from 3D point clouds

    Formal grammars are well suited to estimating models with an a priori unknown number of parameters, such as buildings, and have proven their worth for the 3D modeling and reconstruction of cities. However, generating and designing the corresponding grammar rules is laborious and relies on expert knowledge. This thesis presents novel approaches that reduce this effort using advanced machine learning methods, resulting in automatically learned, sophisticated grammar rules. Learning a wide range of sophisticated rules that reflect the variety and complexity of buildings is a challenging task. This is especially the case when building structures, the underlying aggregation hierarchies, the building parameters, and the constraints among them are all to be learned simultaneously for a semantic interpretation. Thus, this thesis follows an incremental approach that separates structure learning from learning the parameter distributions of building parts. Moreover, existing procedural approaches with formal grammars are better suited to generating virtual city models than to reconstructing existing buildings. To address this, Inductive Logic Programming (ILP) techniques are transferred and applied for the first time to the field of 3D building modeling. This enables the automatic learning of declarative logic programs, which are equivalent to attribute grammars and separate the representation of buildings and their parts from the reconstruction task. Stepwise bottom-up learning, starting from the smallest atomic features of a building part together with the semantic, topological and geometric constraints, is key to successfully learning a whole building part. Only a few examples are sufficient to learn from precise as well as noisy observations. Learning from uncertain data is realized using probability density functions, decision trees and uncertain projective geometry.
This enables the handling and modeling of uncertain topology and geometric reasoning that takes noise into account. The uncertainty of the models themselves is also considered. To this end, a novel method is developed for learning Weighted Attribute Context-Free Grammars (WACFG). On the one hand, the structure of façades (the context-free part of the grammar) is learned from annotated derivation trees using specific Support Vector Machines (SVMs). The latter are able to derive probabilistic models from structured data and to predict the most likely tree for given observations. On the other hand, to the best of my knowledge, Statistical Relational Learning (SRL), especially Markov Logic Networks (MLNs), is applied for the first time to learn building part parameters (shape and location) as well as the constraints among these parts. The use of SRL makes it possible to profit from an elegant logical relational description and to benefit from the efficiency of statistical inference methods. In order to model latent prior knowledge and exploit the architectural regularities of buildings, a novel method is developed for the automatic identification of translational as well as axial symmetries. For symmetry identification, a supervised machine learning approach based on an SVM classifier is followed. Building upon the classification results, algorithms are designed for representing symmetries using context-free grammars from authoritative building footprints. In all steps, machine learning is performed on real-world data such as 3D point clouds and building footprints. Uncertainty and occlusions are handled. The presented methods have been successfully applied to real data.
The corresponding classification and reconstruction results are shown.
Statistical relational learning of semantic models and grammar rules for 3D building reconstruction from 3D point clouds. Formal grammars are very well suited to estimating models with an a priori unknown number of parameters and have therefore proven to be a good approach for reconstructing cities by means of 3D city models. However, designing and creating the associated grammar rules requires expert knowledge and involves great effort. In this work, methods were developed that reduce this effort with the help of powerful machine learning techniques and enable the automatic learning of rules. Learning extensive grammars that reflect the variety and complexity of buildings and their components is a challenging task. This is especially the case when, for semantic interpretation, the structures and aggregation hierarchies and the parameters of the objects to be learned are all to be learned simultaneously. For this reason, an incremental approach is followed here that purposefully separates the learning of structures from the learning of parameter distributions and constraints. Existing procedural approaches with formal grammars are better suited to generating synthetic city models and are of only limited use for reconstructing existing buildings. To address this, techniques of Inductive Logic Programming (ILP) are transferred in this thesis to the field of 3D building modeling for the first time. This leads to the learning of declarative logic programs that are equivalent to attribute grammars in expressive power and that separate the representation of buildings from the reconstruction task.
Learning the disaggregated atomic components first, together with the semantic, topological and geometric relations, proved to be the key to learning a building part as a whole. Learning was based on a few example models, both precise and noisy. To enable the latter, probability density distributions, decision trees and uncertain projective geometry were used. This allowed uncertain topological relations and imprecise geometry to be handled and modeled. In order to represent the uncertainty of the models themselves, a method for learning Weighted Attributed Context-Free Grammars (WACFG) was developed. On the one hand, the structure of façades (the context-free part of the grammar) was learned from annotated derivation trees using specific Support Vector Machines (SVMs), which are able to derive probabilistic models from structured data and to make predictions. On the other hand, to the best of my knowledge, methods of Statistical Relational Learning (SRL), in particular Markov Logic Networks (MLNs), were used for the first time to learn the parameters of buildings as well as the existing relations and constraints between their components. Using SRL makes it possible to combine the elegant relational descriptions of logic with efficient methods of statistical inference. In order to model latent prior knowledge and exploit architectural regularities, a method was developed for the automatic detection of translational and mirror symmetries and for their representation by means of context-free grammars. For this purpose, an SVM classifier was developed and implemented using supervised learning. Building on this, algorithms for inducing grammar rules from building footprint data were designed.

    Cumulative subject index volumes 52-55

    Solving Bongard Problems with a Visual Language and Pragmatic Reasoning

    More than 50 years ago, Bongard introduced 100 visual concept learning problems as a testbed for intelligent vision systems. These problems are now known as Bongard problems. Although they are well known in the cognitive science and AI communities, only moderate progress has been made towards building systems that can solve a substantial subset of them. In the system presented here, visual features are extracted through image processing and then translated into a symbolic visual vocabulary. We introduce a formal language that allows complex visual concepts to be represented in terms of this vocabulary. Using this language and Bayesian inference, complex visual concepts can be induced from the examples provided in each Bongard problem. In contrast to other concept learning problems, the examples from which concepts are induced are not random in Bongard problems; instead, they are carefully chosen to communicate the concept, and hence require pragmatic reasoning. Taking pragmatic reasoning into account, we find good agreement between the concepts with high posterior probability and the solutions formulated by Bongard himself. While this approach is far from solving all Bongard problems, it solves the biggest fraction yet.
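
The Bayesian step mentioned in this abstract can be illustrated with a very small sketch. It is not the system described in the paper: the hypothesis space, the prior weights and the all-or-nothing likelihood (a concept must hold for every left-hand example and for no right-hand example) are assumptions chosen for brevity, and pragmatic reasoning about how the examples were selected is not modeled at all.

```python
def concept_posterior(hypotheses, left, right):
    """Posterior over candidate concepts for one two-sided problem.

    hypotheses: dict mapping name -> (prior_weight, predicate)
    left:  examples that should satisfy the concept
    right: examples that should not satisfy the concept
    (All names and the data layout are hypothetical.)
    """
    scores = {}
    for name, (prior, predicate) in hypotheses.items():
        consistent = all(predicate(x) for x in left) and not any(
            predicate(x) for x in right
        )
        # Likelihood is 1 for consistent concepts and 0 otherwise.
        scores[name] = prior if consistent else 0.0
    total = sum(scores.values())
    if total == 0:
        return scores  # no candidate concept explains the examples
    return {name: score / total for name, score in scores.items()}


# Toy vocabulary: each example is a dict of symbolic features.
candidates = {
    "triangle": (0.5, lambda x: x["shape"] == "triangle"),
    "large": (0.3, lambda x: x["size"] == "large"),
    "large triangle": (
        0.2,
        lambda x: x["shape"] == "triangle" and x["size"] == "large",
    ),
}
left_side = [
    {"shape": "triangle", "size": "large"},
    {"shape": "triangle", "size": "large"},
]
right_side = [{"shape": "circle", "size": "large"}]
print(concept_posterior(candidates, left_side, right_side))
```

In this toy case the concept "large" is ruled out by the large circle on the right, and the remaining probability mass is split between "triangle" and "large triangle" in proportion to their priors.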