857 research outputs found

    Using Contextual Representations to Efficiently Learn Context-Free Languages

    No full text
    International audienceWe present a polynomial update time algorithm for the inductive inference of a large class of context-free languages using the paradigm of positive data and a membership oracle. We achieve this result by moving to a novel representation, called Contextual Binary Feature Grammars (CBFGs), which are capable of representing richly structured context-free languages as well as some context sensitive languages. These representations explicitly model the lattice structure of the distribution of a set of substrings and can be inferred using a generalisation of distributional learning. This formalism is an attempt to bridge the gap between simple learnable classes and the sorts of highly expressive representations necessary for linguistic representation: it allows the learnability of a large class of context-free languages, that includes all regular languages and those context-free languages that satisfy two simple constraints. The formalism and the algorithm seem well suited to natural language and in particular to the modeling of first language acquisition. Preliminary experimental results confirm the effectiveness of this approach

    From treebank resources to LFG F-structures

    Get PDF
    We present two methods for automatically annotating treebank resources with functional structures. Both methods define systematic patterns of correspondence between partial PS configurations and functional structures. These are applied to PS rules extracted from treebanks, or directly to constraint set encodings of treebank PS trees

    Specifying Software Languages: Grammars, Projectional Editors, and Unconventional Approaches

    Get PDF
    We discuss several approaches for defining software languages, together with Integrated Development Environments for them. Theoretical foundation is grammar-based models: they can be used where proven correctness of specifications is required. From a practical point of view, we discuss how language specification can be made more accessible by focusing on language workbenches and projectional editing, and discuss how it can be formalized. We also give a brief overview of unconventional ideas to language definition, and outline three open problems connected to the approaches we discuss

    A novel Markov logic rule induction strategy for characterizing sports video footage

    Get PDF
    The grounding of high-level semantic concepts is a key requirement of video annotation systems. Rule induction can thus constitute an invaluable intermediate step in characterizing protocol-governed domains, such as broadcast sports footage. We here set out a novel “clause grammar template” approach to the problem of rule-induction in video footage of court games that employs a second-order meta grammar for Markov Logic Network construction. The aim is to build an adaptive system for sports video annotation capable, in principle, both of learning ab initio and also adaptively transferring learning between distinct rule domains. The method is tested with respect to both a simulated game predicate generator and also real data derived from tennis footage via computer-vision based approaches including HOG3D based player-action classification, Hough-transform based court detection, and graph-theoretic ball-tracking. Experiments demonstrate that the method exhibits both error resilience and learning transfer in the court domain context. Moreover the clause template approach naturally generalizes to any suitably-constrained, protocol-governed video domain characterized by feature noise or detector error

    A novel Markov logic rule induction strategy for characterizing sports video footage

    Get PDF
    The grounding of high-level semantic concepts is a key requirement of video annotation systems. Rule induction can thus constitute an invaluable intermediate step in characterizing protocol-governed domains, such as broadcast sports footage. We here set out a novel “clause grammar template” approach to the problem of rule-induction in video footage of court games that employs a second-order meta grammar for Markov Logic Network construction. The aim is to build an adaptive system for sports video annotation capable, in principle, both of learning ab initio and also adaptively transferring learning between distinct rule domains. The method is tested with respect to both a simulated game predicate generator and also real data derived from tennis footage via computer-vision based approaches including HOG3D based player-action classification, Hough-transform based court detection, and graph-theoretic ball-tracking. Experiments demonstrate that the method exhibits both error resilience and learning transfer in the court domain context. Moreover the clause template approach naturally generalizes to any suitably-constrained, protocol-governed video domain characterized by feature noise or detector error

    Creating a Semantic Graph from Wikipedia

    Get PDF
    With the continued need to organize and automate the use of data, solutions are needed to transform unstructred text into structred information. By treating dependency grammar functions as programming language functions, this process produces \property maps which connect entities (people, places, events) with snippets of information. These maps are used to construct a semantic graph. By inputting Wikipedia, a large graph of information is produced representing a section of history. The resulting graph allows a user to quickly browse a topic and view the interconnections between entities across history

    Comparative Experiments on Disambiguating Word Senses: An Illustration of the Role of Bias in Machine Learning

    Full text link
    This paper describes an experimental comparison of seven different learning algorithms on the problem of learning to disambiguate the meaning of a word from context. The algorithms tested include statistical, neural-network, decision-tree, rule-based, and case-based classification techniques. The specific problem tested involves disambiguating six senses of the word ``line'' using the words in the current and proceeding sentence as context. The statistical and neural-network methods perform the best on this particular problem and we discuss a potential reason for this observed difference. We also discuss the role of bias in machine learning and its importance in explaining performance differences observed on specific problems.Comment: 10 page

    Effective Use of Linguistic Features for Sentiment Analysis of Korean

    Get PDF

    K + K = 120 : Papers dedicated to László Kálmán and András Kornai on the occasion of their 60th birthdays

    Get PDF
    corecore