
    A Reranking Approach for Dependency Parsing with Variable-sized Subtree Features

    Employing higher-order subtree structures in graph-based dependency parsing has yielded substantial improvements in accuracy, but suffers from inefficiency that grows with the order of the subtrees. We present a new reranking approach for dependency parsing that can utilize complex subtree representations by applying efficient subtree-selection heuristics. We demonstrate the effectiveness of the approach in experiments conducted on the Penn Treebank and the Chinese Treebank. Our system improves the baseline accuracy from 91.88% to 93.37% for English, and from 87.39% to 89.16% for Chinese.
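The reranking setup described above can be sketched as rescoring a parser's k-best trees with features over variable-sized subtrees (a head together with all of its children). This is a minimal illustration, not the paper's actual feature templates or learned weights; `subtree_features`, `rerank`, and the weight table are hypothetical names introduced here.

```python
# Minimal sketch of k-best reranking with variable-sized subtree
# features. Trees are head arrays: tree[i] = head of token i, -1 for root.
# Feature templates and weights are illustrative, not the paper's.

def subtree_features(tree):
    """Variable-sized subtree features: each head paired with the full
    sorted tuple of its children, plus first-order arcs as a back-off."""
    children = {}
    for dep, head in enumerate(tree):
        children.setdefault(head, []).append(dep)
    feats = []
    for head, deps in children.items():
        feats.append(("head-children", head, tuple(sorted(deps))))
        for dep in deps:
            feats.append(("arc", head, dep))
    return feats

def rerank(kbest, weights):
    """Pick the candidate maximizing base score plus feature weights."""
    def score(item):
        base_score, tree = item
        return base_score + sum(weights.get(f, 0.0) for f in subtree_features(tree))
    return max(kbest, key=score)

# A reranker can overturn the base parser's ranking:
kbest = [(1.2, [-1, 0, 0]), (1.0, [-1, 0, 1])]
weights = {("head-children", 0, (1, 2)): -0.5, ("head-children", 1, (2,)): 0.9}
best = rerank(kbest, weights)  # selects the second candidate
```

Because features are extracted from whole candidate trees rather than during search, the reranker can use subtrees of arbitrary size without the combinatorial cost that higher-order decoding incurs.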

    Arc-Standard Spinal Parsing with Stack-LSTMs

    We present a neural transition-based parser for spinal trees, a dependency representation of constituent trees. The parser uses Stack-LSTMs that compose constituent nodes with dependency-based derivations. In experiments, we show that this model adapts to different styles of dependency relations, but this choice has little effect for predicting constituent structure, suggesting that LSTMs induce useful states by themselves. Comment: IWPT 201
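The arc-standard transition system underlying the parser above can be sketched without the neural machinery: a stack and a buffer are manipulated by SHIFT, LEFT-ARC, and RIGHT-ARC actions chosen by some scoring function (here a plug-in `oracle` callback, a stand-in for the Stack-LSTM classifier; the composition of constituent nodes is omitted).

```python
# Minimal arc-standard transition system. heads[i] = head of token i,
# -1 for the root. `oracle` stands in for the learned action classifier.

def parse(words, oracle):
    heads = [-1] * len(words)
    stack, buffer = [], list(range(len(words)))
    while buffer or len(stack) > 1:
        action = oracle(stack, buffer)
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":    # second-from-top becomes dependent of top
            dep = stack.pop(-2)
            heads[dep] = stack[-1]
        elif action == "RIGHT-ARC":   # top becomes dependent of second-from-top
            dep = stack.pop()
            heads[dep] = stack[-1]
    return heads

# Scripted derivation for "the cat sleeps" (the -> cat -> sleeps):
actions = iter(["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "LEFT-ARC"])
heads = parse(["the", "cat", "sleeps"], lambda s, b: next(actions))
# heads == [1, 2, -1]
```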

    Towards a machine-learning architecture for lexical functional grammar parsing

    Data-driven grammar induction aims at producing wide-coverage grammars of human languages. Initial efforts in this field produced relatively shallow linguistic representations such as phrase-structure trees, which only encode constituent structure. Recent work on inducing deep grammars from treebanks addresses this shortcoming by also recovering non-local dependencies and grammatical relations. My aim is to investigate the issues arising when adapting an existing Lexical Functional Grammar (LFG) induction method to a new language and treebank, and to find solutions which will generalize robustly across multiple languages. The research hypothesis is that by exploiting machine-learning algorithms to learn morphological features, lemmatization classes and grammatical functions from treebanks we can reduce the amount of manual specification and improve robustness, accuracy and domain- and language-independence for LFG parsing systems. Function labels can often be relatively straightforwardly mapped to LFG grammatical functions. Learning them reliably permits grammar induction to depend less on language-specific LFG annotation rules. I therefore propose ways to improve acquisition of function labels from treebanks and translate those improvements into better-quality f-structure parsing. In a lexicalized grammatical formalism such as LFG, a large amount of syntactically relevant information comes from lexical entries. It is, therefore, important to be able to perform morphological analysis in an accurate and robust way for morphologically rich languages. I propose a fully data-driven supervised method to simultaneously lemmatize and morphologically analyze text, and obtain competitive or improved results on a range of typologically diverse languages.
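The "lemmatization classes" mentioned above can be illustrated as suffix-edit rules: each (form, lemma) pair is encoded as "cut N characters, append a suffix", so lemmatization reduces to classification over a closed rule set. This is a simplified sketch of the general idea, not the thesis's actual encoding; `edit_rule` and `apply_rule` are hypothetical names.

```python
# Sketch of lemmatization classes as suffix-edit rules. A classifier
# would predict the rule; here we only show the rule encoding itself.

def edit_rule(form, lemma):
    """Derive a (cut, suffix) rule from the longest common prefix."""
    i = 0
    while i < min(len(form), len(lemma)) and form[i] == lemma[i]:
        i += 1
    return (len(form) - i, lemma[i:])

def apply_rule(form, rule):
    cut, suffix = rule
    return form[: len(form) - cut] + suffix

rule = edit_rule("walked", "walk")       # (2, "") — strip "ed"
lemma = apply_rule("talked", rule)       # the same rule generalizes: "talk"
```

Framing lemmatization this way lets one shared classifier also predict the morphological tag jointly with the rule, which is what makes the simultaneous analysis described in the abstract tractable.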

    Exploring Higher Order Dependency Parsers

    Syntactic analysis is one of the most important steps in the computational processing of natural languages. In this work we focus on the dependency grammar formalism, because its main principles, in particular the relation between a governing and a dependent node, have proven useful for a number of different languages, with special regard to explaining word order and the relation between surface structure and meaning. Most modern efficient algorithms for dependency parsing are based on factoring the dependency trees, and in most of these approaches the parser loses much of the contextual information during the factorization process. In this work we investigate how syntactic-semantic features affect higher-order discriminative machine-learning methods for dependency parsing, and we show that linguistic features in many cases bring significant improvements in accuracy. We first give an overview of several discriminative learning methods for graph-based statistical dependency parsers and explain the concept of higher order, which generalizes the work of Koo and Collins (2010) and McDonald et al. (2006). This leads to the core of the work: feature engineering for higher-order dependency parsers. We experiment with several syntactic-semantic features and try to explain their theoretical foundations. Experiments are conducted on two different languages -...
    There have been approaches to build higher-order dependency parsers: second order [Carreras2007] and third order [Koo and Collins2010]. In the thesis, the approach by Koo and Collins should be further exploited in one or more ways. Possible directions of further exploitation include, but are not limited to: investigating possibilities of extending the approach to non-projective parsing; integrating labeled parsing; and joining word senses during the parsing phase [Eisner2000].
    Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics
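The difference between first-order and higher-order factorization discussed above can be sketched by contrasting a tree score that sums over single arcs with one that also scores adjacent-sibling parts, as in second-order models. The scoring functions below are hypothetical look-up callbacks, not a trained model.

```python
# Sketch of first-order vs. second-order tree factorization.
# tree[d] = head of token d, -1 for the root.

def first_order_score(tree, arc_score):
    """Sum of scores of individual head-dependent arcs."""
    return sum(arc_score(h, d) for d, h in enumerate(tree) if h >= 0)

def second_order_score(tree, arc_score, sib_score):
    """First-order score plus scores of adjacent-sibling parts,
    the extra context that single-arc factorization throws away."""
    total = first_order_score(tree, arc_score)
    children = {}
    for d, h in enumerate(tree):
        children.setdefault(h, []).append(d)
    for h, deps in children.items():
        if h < 0:
            continue
        for prev, nxt in zip(deps, deps[1:]):
            total += sib_score(h, prev, nxt)
    return total
```

Third-order models such as Koo and Collins (2010) extend this further to parts like grand-siblings and tri-siblings, recovering more of the context lost in factorization at a higher decoding cost.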