    A Universal Part-of-Speech Tagset

    To facilitate future research in unsupervised induction of syntactic structure and to standardize best practices, we propose a tagset that consists of twelve universal part-of-speech categories. In addition to the tagset, we develop a mapping from 25 different treebank tagsets to this universal set. As a result, when combined with the original treebank data, this universal tagset and mapping produce a dataset consisting of common parts-of-speech for 22 different languages. We highlight the use of this resource via two experiments, including one that reports competitive accuracies for unsupervised grammar induction without gold standard part-of-speech tags.

    Ludii -- The Ludemic General Game System

    While current General Game Playing (GGP) systems facilitate useful research in Artificial Intelligence (AI) for game-playing, they are often somewhat specialised and computationally inefficient. In this paper, we describe the "ludemic" general game system Ludii, which has the potential to provide an efficient tool for AI researchers as well as game designers, historians, educators and practitioners in related fields. Ludii defines games as structures of ludemes (high-level, easily understandable game concepts), which allows for concise and human-understandable game descriptions. We formally describe Ludii and outline its main benefits: generality, extensibility, understandability and efficiency. Experimentally, Ludii outperforms one of the most efficient Game Description Language (GDL) reasoners, based on a propositional network, in all games available in the Tiltyard GGP repository. Moreover, Ludii is also competitive in terms of performance with the more recently proposed Regular Boardgames (RBG) system, and has various advantages in qualitative aspects such as generality. Comment: Accepted at ECAI 202

    Clinical and Genomic Characterization of Recurrent Enterococcal Bloodstream Infection in Patients With Acute Leukemia

    Background. Rates and risk factors for recurrent enterococcal bloodstream infection (R-EBSI) and whether the same genetic lineage causes index EBSI and R-EBSI are unknown in patients with acute leukemia (AL) receiving chemotherapy. Methods. Ninety-two AL patients with EBSI from 2010 to 2015 were included. Enterococcal bloodstream infection was defined by ≥1 positive blood cultures for Enterococcus faecium or Enterococcus faecalis and fever, hypotension, or chills. Clearance was defined by ≥1 negative cultures ≥24 hours after last positive culture and defervescence. Recurrent enterococcal bloodstream infection was defined by a positive blood culture for Enterococcus ≥24 hours after clearance. Categorical variables were reported as proportions and compared by the χ2 test. Continuous variables were summarized by median and interquartile range (IQR) and compared by the Wilcoxon-Mann-Whitney Test. P values <.05 were considered significant. Whole-genome sequencing was performed on available paired BSI isolates from 7 patients. Results. Twenty-four patients (26%) had 31 episodes of R-EBSI. Median time to R-EBSI (IQR) was 26 (13–50) days. Patients with R-EBSI had significantly longer durations of fever and metronidazole exposure during their index EBSI. Thirty-nine percent of E. faecium R-EBSI isolates became daptomycin-nonsusceptible Enterococcus (DNSE) following daptomycin therapy for index EBSI. Whole-genome sequencing analysis confirmed high probability of genetic relatedness of index EBSI and R-EBSI isolates for 4/7 patients. Conclusions. Recurrent enterococcal bloodstream infection and DNSE are common in patients with AL and tend to occur within the first 30 days of index EBSI. Duration of fever and metronidazole exposure may be useful in determining risk for R-EBSI. Whole-genome sequencing analysis demonstrates that the same strain causes both EBSI and R-EBSI in some patients.

    Directional adposition use in English, Swedish and Finnish

    Directional adpositions such as "to the left of" describe where a Figure is in relation to a Ground. English and Swedish directional adpositions refer to the location of a Figure in relation to a Ground, whether both are static or in motion. In contrast, the Finnish directional adpositions edellä (in front of) and jäljessä (behind) solely describe the location of a moving Figure in relation to a moving Ground (Nikanne, 2003). When using directional adpositions, a frame of reference must be assumed to interpret their meaning. For example, the meaning of "to the left of" in English can be based on a relative (speaker or listener based) reference frame or an intrinsic (object based) reference frame (Levinson, 1996). When a Figure and a Ground are both in motion, it is possible for a Figure to be described as being behind or in front of the Ground, even if neither has intrinsic features. As shown by Walker (in preparation), there are good reasons to assume that in the latter case a motion based reference frame is involved. This means that if Finnish speakers use edellä (in front of) and jäljessä (behind) more frequently in situations where both the Figure and Ground are in motion, a difference in reference frame use between Finnish on the one hand and English and Swedish on the other could be expected. We asked native English, Swedish and Finnish speakers to select adpositions from a language-specific list to describe the location of a Figure relative to a Ground when both were shown to be moving on a computer screen. We were interested in any differences between Finnish, English and Swedish speakers. All languages showed a predominant use of directional spatial adpositions referring to the lexical concepts TO THE LEFT OF, TO THE RIGHT OF, ABOVE and BELOW. There were no differences between the languages in directional adposition use or reference frame use, including reference frame use based on motion. 
We conclude that despite differences in the grammars of the languages involved, and potential differences in reference frame system use, the three languages investigated encode Figure location in relation to Ground location in a similar way when both are in motion. Levinson, S. C. (1996). Frames of reference and Molyneux’s question: Crosslinguistic evidence. In P. Bloom, M.A. Peterson, L. Nadel & M.F. Garrett (Eds.), Language and Space (pp. 109-170). Massachusetts: MIT Press. Nikanne, U. (2003). How Finnish postpositions see the axis system. In E. van der Zee & J. Slack (Eds.), Representing direction in language and space. Oxford, UK: Oxford University Press. Walker, C. (in preparation). Motion encoding in language, the use of spatial locatives in a motion context. Unpublished doctoral dissertation, University of Lincoln, Lincoln, United Kingdom.

    In search of isoglosses: continuous and discrete language embeddings in Slavic historical phonology

    This paper investigates the ability of neural network architectures to effectively learn diachronic phonological generalizations in a multilingual setting. We employ models using three different types of language embedding (dense, sigmoid, and straight-through). We find that the Straight-Through model outperforms the other two in terms of accuracy, but the Sigmoid model's language embeddings show the strongest agreement with the traditional subgrouping of the Slavic languages. We find that the Straight-Through model has learned coherent, semi-interpretable information about sound change, and outline directions for future research.

    Learning implicational models of universal grammar parameters

    The use of parameters in the description of natural language syntax has to balance the need to discriminate among (sometimes subtly different) languages, which can be seen as a cross-linguistic version of Chomsky's descriptive adequacy (Chomsky, 1964), against the complexity of the acquisition task that a large number of parameters would imply, which is a problem for explanatory adequacy. Here we first present a novel approach in which machine learning is used to detect hidden dependencies in a table of parameters. The result is a dependency graph in which some of the parameters can be fully predicted from others. These findings can then be subjected to linguistic analysis, which may either refute them by providing typological counter-examples of languages not included in the original dataset, dismiss them on theoretical grounds, or uphold them as tentative empirical laws worthy of further study. Machine learning is also used to explore the full sets of parameters that are sufficient to distinguish one historically established language family from others. These results provide a new type of empirical evidence about the historical adequacy of parameter theories.

    Learning Tree Distributions by Hidden Markov Models

    Hidden tree Markov models allow learning distributions for tree structured data while being interpretable as nondeterministic automata. We provide a concise summary of the main approaches in the literature, focusing in particular on the causality assumptions introduced by the choice of a specific tree visit direction. We then sketch a novel non-parametric generalization of the bottom-up hidden tree Markov model with its interpretation as a nondeterministic tree automaton with infinite states. Comment: Accepted in LearnAut2018 workshop.