
294 research outputs found

Evaluation of LTAG parsing with supertag compaction

Author: Carroll John
Shaumyan Olga
Weir David
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2002
Field of study

One of the biggest concerns that has been raised over the feasibility of using large-scale LTAGs in NLP is the amount of redundancy within a grammar's elementary tree set. This has led to various proposals on how best to represent grammars in a way that makes them compact and easily maintained (Vijay-Shanker and Schabes, 1992; Becker, 1993; Becker, 1994; Evans, Gazdar and Weir, 1995; Candito, 1996). Unfortunately, while this work can help to make the storage of grammars more efficient, it does nothing to prevent the problem reappearing when the grammar is processed by a parser and the complete set of trees is reproduced. In this paper we are concerned with an approach that addresses this problem of computational redundancy in the trees, and evaluate its effectiveness.

CiteSeerX

Sussex Research Online

Building a wide coverage dynamic grammar

Author: A. Joshi
C. Doran
C. Phillips
D. Milward
E.P. Stabler
M. Marcus
M.J. Steedman
P. Sturt
R. Frank
S.P. Abney
W. Marslen-Wilson
Y. Kamide
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005
Field of study

Crossref

Institutional Research Information System University of Turin

Using Natural Language Parsers for Authorship Attribution

Author: Magnera Westerly A D
Publication venue: RIT Scholar Works
Publication date: 01/01/2003
Field of study

The goal of authorship attribution is to find a set of unconscious writing characteristics or style features that distinguish text written by one person from text written by another. Once these features are found, they can be used to pair a text with the individual who wrote it. It is now well accepted that authors develop distinct and unconscious writing features. Over one thousand stylometric features (style markers) have been proposed in a variety of research disciplines [44], but none of that research has looked at the syntactic structure of the text. I conjecture that the distinct writing features of an author are not limited to the features already studied, but also include syntactic features. To support this hypothesis, I ran experiments using two open source parsing programs and analyzed the results to see whether the features these programs produced were enough to determine the most probable author of a text. Parsing programs are designed to determine syntactic structures in natural language. They take a text or a writing sample and produce output showing the grammatical relationships between the words in the text. They provide a means to test the hypothesis that authors' syntactic use of words provides enough identifying characteristics to differentiate between them. Using two open source natural language parsing programs, the Link Grammar Parser and Collins' Parser, this research tested whether an author's sentence structure is unique enough to provide a means of recognizing the probable author of a text. Initial data was collected on a pool of test authors. Sample texts by each author were run through both parsers. The output of each parser was analyzed using two multivariate analysis methods: discriminant analysis and k-means clustering. My results show that syntactic sentence structures may be a viable method for authorship attribution.
The Link Grammar Parser shows promise as a way to augment existing authorship attribution methods. Collins' Parser provided even better results that should be solid enough to stand on their own as a new and viable alternative to existing methods. Collins' Parser also provided new predictors that might improve current authorship attribution methods. For example, elements and phrases with wh- words and the length of noun phrases are highly correlated with authorship in this study.
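
As an illustration of the kind of syntactic predictors the abstract mentions (noun-phrase length and phrases with wh- words), here is a minimal sketch in Python. It assumes the parser output is a Penn Treebank-style bracketed string with labels such as `NP` and `WHNP`; the function names and the feature set are hypothetical and are not taken from the thesis.

```python
def parse_bracketed(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1                      # skip the opening "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])   # a leaf word
                pos += 1
        pos += 1                      # skip the closing ")"
        return (label, children)

    return read_node()


def leaves(node):
    """Return the words covered by a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1] for word in leaves(child)]


def syntactic_features(tree):
    """Count noun phrases, their mean length in words, and wh- phrases."""
    np_lengths = []
    wh_phrases = 0
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, str):
            continue
        label, children = node
        if label == "NP":
            np_lengths.append(len(leaves(node)))
        if label.startswith("WH"):    # assumed labels: WHNP, WHADVP, WHPP, ...
            wh_phrases += 1
        stack.extend(children)
    mean_len = sum(np_lengths) / len(np_lengths) if np_lengths else 0.0
    return {
        "np_count": len(np_lengths),
        "mean_np_length": mean_len,
        "wh_phrases": wh_phrases,
    }


example = "(SBARQ (WHNP (WP Who)) (SQ (VBD wrote) (NP (DT the) (JJ long) (NN letter))))"
features = syntactic_features(parse_bracketed(example))
print(features)
```

Feature vectors of this kind, averaged over each author's sample texts, could then be passed to a discriminant analysis or k-means routine of the reader's choice.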

RIT Scholar Works

Advances in discriminative dependency parsing

Author: Koo Terry (Terry Y.)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2010
Field of study

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student submitted PDF version of thesis. Includes bibliographical references (p. 167-176). Achieving a greater understanding of natural language syntax and parsing is a critical step in producing useful natural language processing systems. In this thesis, we focus on the formalism of dependency grammar as it allows one to model important head-modifier relationships with a minimum of extraneous structure. Recent research in dependency parsing has highlighted the discriminative structured prediction framework (McDonald et al., 2005a; Carreras, 2007; Suzuki et al., 2009), which is characterized by two advantages: first, the availability of powerful discriminative learning algorithms like log-linear and max-margin models (Lafferty et al., 2001; Taskar et al., 2003), and second, the ability to use arbitrarily-defined feature representations. This thesis explores three advances in the field of discriminative dependency parsing. First, we show that the classic Matrix-Tree Theorem (Kirchhoff, 1847; Tutte, 1984) can be applied to the problem of non-projective dependency parsing, enabling both log-linear and max-margin parameter estimation in this setting. Second, we present novel third-order dependency parsing algorithms that extend the amount of context available to discriminative parsers while retaining computational complexity equivalent to existing second-order parsers. Finally, we describe a simple but effective method for augmenting the features of a dependency parser with information derived from standard clustering algorithms; our semi-supervised approach is able to deliver consistent benefits regardless of the amount of available training data. by Terry Koo. Ph.D.
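
The abstract does not spell out the construction, so the following is only a sketch of the standard directed Matrix-Tree Theorem as it is commonly applied to non-projective dependency trees, not the thesis's own code. It assumes node 0 is an artificial root that may take several dependents, and that `w[h][m]` is a non-negative weight for the arc from head `h` to modifier `m`; all names are hypothetical. The determinant of the Laplacian minor is checked against brute-force enumeration of every dependency tree.

```python
from fractions import Fraction
from itertools import product


def determinant(matrix):
    """Exact determinant by Gaussian elimination over fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                 # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def partition_by_matrix_tree(w):
    """Sum of tree weights as the determinant of the Laplacian minor."""
    n = len(w) - 1                     # number of words; node 0 is the root
    lap = [[0] * n for _ in range(n)]
    for m in range(1, n + 1):
        for h in range(n + 1):
            if h == m:
                continue
            lap[m - 1][m - 1] += w[h][m]       # total incoming weight of m
            if h >= 1:
                lap[h - 1][m - 1] -= w[h][m]   # off-diagonal entry
    return determinant(lap)


def is_tree(heads):
    """True if every word reaches the root (node 0) without a cycle."""
    for start in range(1, len(heads) + 1):
        seen = set()
        node = start
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]
    return True


def partition_by_enumeration(w):
    """Sum of tree weights by listing every head assignment."""
    n = len(w) - 1
    total = 0
    for heads in product(range(n + 1), repeat=n):
        if any(heads[m - 1] == m for m in range(1, n + 1)):
            continue                   # a word cannot head itself
        if not is_tree(heads):
            continue
        weight = 1
        for m in range(1, n + 1):
            weight *= w[heads[m - 1]][m]
        total += weight
    return total


# Uniform weights over 3 words: the sum counts the 16 possible trees.
uniform = [[1] * 4 for _ in range(4)]
print(partition_by_matrix_tree(uniform), partition_by_enumeration(uniform))
```

The enumeration grows exponentially with sentence length, while the determinant is cubic, which is what makes the sum over all non-projective trees tractable for log-linear training.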

DSpace@MIT


Learning for semantic parsing using statistical syntactic parsing techniques

Author: Ge Ruifang
Publication venue
Publication date: 15/10/2014
Field of study

Natural language understanding is a sub-field of natural language processing, which builds automated systems to understand natural language. It is such an ambitious task that it is sometimes referred to as an AI-complete problem, implying that its difficulty is equivalent to solving the central artificial intelligence problem -- making computers as intelligent as people. Despite its complexity, natural language understanding continues to be a fundamental problem in natural language processing in terms of its theoretical and empirical importance. In recent years, startling progress has been made at different levels of natural language processing tasks, which provides great opportunity for deeper natural language understanding. In this thesis, we focus on the task of semantic parsing, which maps a natural language sentence into a complete, formal meaning representation in a meaning representation language. We present two novel state-of-the-art learned syntax-based semantic parsers using statistical syntactic parsing techniques, motivated by the following two reasons. First, syntax-based semantic parsing is theoretically well-founded in computational semantics. Second, adopting a syntax-based approach allows us to directly leverage the enormous progress made in statistical syntactic parsing. The first semantic parser, Scissor, adopts an integrated syntactic-semantic parsing approach, in which a statistical syntactic parser is augmented with semantic parameters to produce a semantically-augmented parse tree (SAPT). This integrated approach allows both syntactic and semantic information to be available during parsing time to obtain an accurate combined syntactic-semantic analysis. The performance of Scissor is further improved by using discriminative reranking for incorporating non-local features. The second semantic parser, SynSem, exploits an existing syntactic parser to produce disambiguated parse trees that drive the compositional semantic interpretation.
This pipeline approach allows semantic parsing to conveniently leverage the most recent progress in statistical syntactic parsing. We report experimental results on two real applications: an interpreter for coaching instructions in robotic soccer and a natural-language database interface, showing that the improvement of Scissor and SynSem over other systems is mainly on long sentences, where the knowledge of syntax given in the form of annotated SAPTs or syntactic parses from an existing parser helps semantic composition. SynSem also significantly improves results with limited training data, and is shown to be robust to syntactic errors. Computer Science

Texas ScholarWorks

Current trends

Author
Publication venue
Publication date: 01/01/2019
Field of study

Deep parsing is the fundamental process aiming at the representation of the syntactic structure of phrases and sentences. In the traditional methodology this process is based on lexicons and grammars that roughly represent the properties of words and the interactions of words and structures in sentences. Several linguistic frameworks, such as Head-driven Phrase Structure Grammar (HPSG), Lexical Functional Grammar (LFG), Tree Adjoining Grammar (TAG), Combinatory Categorial Grammar (CCG), etc., offer different structures and combining operations for building grammar rules. These already contain mechanisms for expressing properties of Multiword Expressions (MWE), which, however, need improvement in how they account for idiosyncrasies of MWEs on the one hand and their similarities to regular structures on the other hand. This collaborative book constitutes a survey on various attempts at representing and parsing MWEs in the context of linguistic theories and applications.

Institutional Repository of the Freie Universität Berlin

Permutation forests for modeling word order in machine translation

Author: Stanojević M.
Publication venue
Publication date: 01/01/2017
Field of study

International Migration, Integration and Social Cohesion online publications

Representation and parsing of multiword expressions

Author
Publication venue: Language Science Press
Publication date: 01/04/2020
Field of study

This book consists of contributions related to the definition, representation and parsing of MWEs. These reflect current trends in the representation and processing of MWEs. They cover various categories of MWEs such as verbal, adverbial and nominal MWEs, various linguistic frameworks (e.g. tree-based and unification-based grammars), various languages (including English, French, Modern Greek, Hebrew, Norwegian), and various applications (namely MWE detection, parsing, automatic translation) using both symbolic and statistical approaches.

Directory of Open Access Books (DOAB)

A Computational Model of Syntactic Processing: Ambiguity Resolution from Interpretation

Author: Niv Michael
Publication venue
Publication date: 01/01/1993
Field of study

Syntactic ambiguity abounds in natural language, yet humans have no difficulty coping with it. In fact, the process of ambiguity resolution is almost always unconscious. It is not infallible, however, as example 1 demonstrates. 1. The horse raced past the barn fell. This sentence is perfectly grammatical, as is evident when it appears in the following context: 2. Two horses were being shown off to a prospective buyer. One was raced past a meadow, and the other was raced past a barn. ... Grammatical yet unprocessable sentences such as 1 are called `garden-path sentences.' Their existence provides an opportunity to investigate the human sentence processing mechanism by studying how and when it fails. The aim of this thesis is to construct a computational model of language understanding which can predict processing difficulty. The data to be modeled are known examples of garden path and non-garden path sentences, and other results from psycholinguistics. It is widely believed that there are two distinct loci of computation in sentence processing: syntactic parsing and semantic interpretation. One longstanding controversy is which of these two modules bears responsibility for the immediate resolution of ambiguity. My claim is that it is the latter, and that the syntactic processing module is a very simple device which blindly and faithfully constructs all possible analyses for the sentence up to the current point of processing. The interpretive module serves as a filter, occasionally discarding certain of these analyses which it deems less appropriate for the ongoing discourse than their competitors. This document is divided into three parts. The first is introductory, and reviews a selection of proposals from the sentence processing literature. The second part explores a body of data which has been adduced in support of a theory of structural preferences --- one that is inconsistent with the present claim.
I show how the current proposal can be specified to account for the available data, and moreover to predict where structural preference theories will go wrong. The third part is a theoretical investigation of how well the proposed architecture can be realized using current conceptions of linguistic competence. In it, I present a parsing algorithm and a meaning-based ambiguity resolution method. Comment: 128 pages, LaTeX source compressed and uuencoded, figures separate macros: rotate.sty, lingmacros.sty, psfig.tex. Dissertation, Computer and Information Science Dept., October 199

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

CERN Document Server