    Improving treebank-based automatic LFG induction for Spanish

    We describe several improvements to the method of treebank-based LFG induction for Spanish from the Cast3LB treebank (O’Donovan et al., 2005). We discuss the different categories of problems encountered and present the solutions adopted. Some of the problems are solved by simply adopting existing linguistic analyses, as in our treatment of clitic doubling and null subjects. In other cases there is no standard LFG account of the phenomenon we wish to model, and we adopt a conservative compromise solution. This is exemplified by our treatment of Spanish periphrastic constructions. In yet another case, the less configurational nature of Spanish means that the LFG annotation algorithm has to rely mostly on Cast3LB function tags, and consequently a reliable method of adding those tags to parse trees had to be developed. This method achieves over 6% improvement over the baseline for the Cast3LB-function-tag assignment task, and over 3% improvement over the baseline for LFG f-structure construction from function-tag-enriched trees.

    Fighting with the Sparsity of Synonymy Dictionaries

    Graph-based synset induction methods, such as MaxMax and Watset, induce synsets by performing a global clustering of a synonymy graph. However, such methods are sensitive to the structure of the input synonymy graph: sparseness of the input dictionary can substantially reduce the quality of the extracted synsets. In this paper, we propose two different approaches designed to alleviate the incompleteness of the input dictionaries. The first one performs a pre-processing of the graph by adding missing edges, while the second one performs a post-processing by merging similar synset clusters. We evaluate these approaches on two datasets for the Russian language and discuss their impact on the performance of synset induction methods. Finally, we perform an extensive error analysis of each approach and discuss prominent alternative methods for coping with the problem of the sparsity of the synonymy dictionaries. Comment: In Proceedings of the 6th Conference on Analysis of Images, Social Networks, and Texts (AIST'2017): Springer Lecture Notes in Computer Science (LNCS)

    Generating Property-Directed Potential Invariants By Backward Analysis

    This paper addresses the issue of lemma generation in a k-induction-based formal analysis of transition systems, in the linear real/integer arithmetic fragment. A backward analysis, powered by quantifier elimination, is used to output preimages of the negation of the proof objective, viewed as unauthorized states, or gray states. Two heuristics are proposed to take advantage of this source of information. First, a thorough exploration of the possible partitionings of the gray state space discovers new relations between state variables, representing potential invariants. Second, an inexact exploration regroups and over-approximates disjoint areas of the gray state space, also to discover new relations between state variables. k-induction is used to isolate the invariants and check if they strengthen the proof objective. These heuristics can be used on the first preimage of the backward exploration, and each time a new one is output, refining the information on the gray states. In our context of critical avionics embedded systems, we show that our approach is able to outperform other academic or commercial tools on examples of interest in our application field. The method is introduced and motivated through two main examples, one of which was provided by Rockwell Collins, in a collaborative formal verification framework. Comment: In Proceedings FTSCS 2012, arXiv:1212.657

    Rerepresenting and Restructuring Domain Theories: A Constructive Induction Approach

    Theory revision integrates inductive learning and background knowledge by combining training examples with a coarse domain theory to produce a more accurate theory. There are two challenges that theory revision and other theory-guided systems face. First, a representation language appropriate for the initial theory may be inappropriate for an improved theory. While the original representation may concisely express the initial theory, a more accurate theory forced to use that same representation may be bulky, cumbersome, and difficult to reach. Second, a theory structure suitable for a coarse domain theory may be insufficient for a fine-tuned theory. Systems that produce only small, local changes to a theory have limited value for accomplishing complex structural alterations that may be required. Consequently, advanced theory-guided learning systems require flexible representation and flexible structure. An analysis of various theory revision systems and theory-guided learning systems reveals specific strengths and weaknesses in terms of these two desired properties. Designed to capture the underlying qualities of each system, a new system uses theory-guided constructive induction. Experiments in three domains show improvement over previous theory-guided systems. This leads to a study of the behavior, limitations, and potential of theory-guided constructive induction. Comment: See http://www.jair.org/ for an online appendix and other files accompanying this article

    Optimal Algorithm for Bayesian Incentive-Compatible Exploration

    We consider a social planner faced with a stream of myopic selfish agents. The goal of the social planner is to maximize the social welfare; however, it is limited to using only information asymmetry (regarding previous outcomes) and cannot use any monetary incentives. The planner recommends actions to agents, but her recommendations need to be Bayesian Incentive Compatible to be followed by the agents. Our main result is an optimal algorithm for the planner in the case where the action realizations are deterministic and have limited support, making significant progress on this open problem. Our optimal protocol has two interesting features. First, it always completes the exploration of a priori more beneficial actions before exploring a priori less beneficial actions. Second, the randomization in the protocol is correlated across agents and actions (and not independent at each decision time). Comment: EC 201

    Certified Context-Free Parsing: A formalisation of Valiant's Algorithm in Agda

    Valiant (1975) developed an algorithm for recognition of context-free languages. As of today, it remains the algorithm with the best asymptotic complexity for this purpose. In this paper, we present an algebraic specification, implementation, and proof of correctness of a generalisation of Valiant's algorithm. The generalisation can be used for recognition, parsing or generic calculation of the transitive closure of upper triangular matrices. The proof is certified by the Agda proof assistant. The certification is representative of state-of-the-art methods for specification and proofs in proof assistants based on type theory. As such, this paper can be read as a tutorial for the Agda system.