135 research outputs found

    From news to comment: Resources and benchmarks for parsing the language of web 2.0

    We investigate the problem of parsing the noisy language of social media. We evaluate four Wall-Street-Journal-trained statistical parsers (Berkeley, Brown, Malt and MST) on a new dataset containing 1,000 phrase structure trees for sentences from microblogs (tweets) and discussion forum posts. We compare the four parsers on their ability to produce Stanford dependencies for these Web 2.0 sentences. We find that the parsers have a particular problem with tweets and that a substantial part of this problem is related to POS tagging accuracy. We attempt three retraining experiments involving Malt, Brown and an in-house Berkeley-style parser and obtain a statistically significant improvement for all three parsers.

    LFG without C-structures

    We explore the use of two dependency parsers, Malt and MST, in a Lexical Functional Grammar parsing pipeline. We compare this to the traditional LFG parsing pipeline which uses constituency parsers. We train the dependency parsers not on classical LFG f-structures but rather on modified dependency-tree versions of these in which all words in the input sentence are represented and multiple heads are removed. For the purposes of comparison, we also modify the existing CFG-based LFG parsing pipeline so that these "LFG-inspired" dependency trees are produced. We find that the differences in parsing accuracy over the various parsing architectures are small.

    "cba to check the spelling" investigating parser performance on discussion forum posts

    We evaluate the Berkeley parser on text from an online discussion forum. We evaluate the parser output with and without gold tokens and spellings (using Sparseval and Parseval), and we compile a list of problematic phenomena for this domain. The Parseval f-score for a small development set is 77.56. This increases to 80.27 when we apply a set of simple transformations to the input sentences and to the Wall Street Journal (WSJ) training sections.
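
The Parseval f-score quoted above is the harmonic mean of labelled bracketing precision and recall. The following is a minimal sketch of that calculation, assuming each tree has already been flattened to a list of (label, start, end) spans; the function name `parseval_f1` and the toy spans are hypothetical, and evaluation tools apply further conventions that are not modelled here.

```python
from collections import Counter


def parseval_f1(gold_spans, pred_spans):
    """Return labelled bracketing (precision, recall, F1).

    Each argument is a list of (label, start, end) tuples for one tree.
    Counters are used so that duplicate brackets are matched one-to-one.
    """
    gold = Counter(gold_spans)
    pred = Counter(pred_spans)
    # Multiset intersection: brackets present in both trees.
    matched = sum((gold & pred).values())
    precision = matched / sum(pred.values()) if pred else 0.0
    recall = matched / sum(gold.values()) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example (invented spans, not taken from the paper's data).
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
pred = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]
precision, recall, f1 = parseval_f1(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Scores such as 77.56 are this F1 value expressed as a percentage and aggregated over all sentences in the evaluation set.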

    An Empirical Comparison of Parsing Methods for Stanford Dependencies

    Stanford typed dependencies are a widely desired representation of natural language sentences, but parsing is one of the major computational bottlenecks in text analysis systems. In light of the evolving definition of the Stanford dependencies and developments in statistical dependency parsing algorithms, this paper revisits the question of Cer et al. (2010): what is the tradeoff between accuracy and speed in obtaining Stanford dependencies in particular? We also explore the effects of input representations on this tradeoff: part-of-speech tags, the novel use of an alternative dependency representation as input, and distributional representations of words. We find that direct dependency parsing is a more viable solution than it was found to be in the past. An accompanying software release can be found at http://www.ark.cs.cmu.edu/TBSD. Comment: 13 pages, 2 figures.

    Cross-Domain Generalization of Neural Constituency Parsers

    Neural parsers obtain state-of-the-art results on benchmark treebanks for constituency parsing -- but to what degree do they generalize to other domains? We present three results about the generalization of neural parsers in a zero-shot setting: training on trees from one corpus and evaluating on out-of-domain corpora. First, neural and non-neural parsers generalize comparably to new domains. Second, incorporating pre-trained encoder representations into neural parsers substantially improves their performance across all domains, but does not give a larger relative improvement for out-of-domain treebanks. Finally, despite the rich input representations they learn, neural parsers still benefit from structured output prediction of output trees, yielding higher exact match accuracy and stronger generalization both to larger text spans and to out-of-domain corpora. We analyze generalization on English and Chinese corpora, and in the process obtain state-of-the-art parsing results for the Brown, Genia, and English Web treebanks. Comment: ACL 2019. DF and NK contributed equally.

    Automatic inference of causal reasoning chains from student essays

    While there has been an increasing focus on higher-level thinking skills arising from the Common Core Standards, many high-school and middle-school students struggle to combine and integrate information from multiple sources when writing essays. Writing is an important learning skill, and there is increasing evidence that writing about a topic develops a deeper understanding in the student. However, grading essays is time consuming for teachers, resulting in an increasing focus on shallower forms of assessment that are easier to automate, such as multiple-choice tests. Existing essay grading software has attempted to ease this burden but relies on shallow lexico-syntactic features and is unable to understand the structure or validity of a student’s arguments or explanations. Without the ability to understand a student’s reasoning processes, it is impossible to write automated formative assessment systems to assist students with improving their thinking skills through essay writing. In order to understand the arguments put forth in an explanatory essay in the science domain, we need a method of representing the causal structure of a piece of explanatory text. Psychologists use a representation called a causal model to represent a student’s understanding of an explanatory text. This consists of a number of core concepts, and a set of causal relations linking them into one or more causal chains, forming a causal model. In this thesis I present a novel system for automatically constructing causal models from student scientific essays using Natural Language Processing (NLP) techniques. The problem was decomposed into 4 sub-problems: assigning essay concepts to words, detecting causal relations between these concepts, resolving coreferences within each essay, and using the structure of the whole essay to reconstruct a causal model.
Solutions to each of these sub-problems build upon the predictions from the solutions to earlier problems, forming a sequential pipeline of models. Designing a system in this way allows later models to correct for false positive predictions from upstream models. However, this also has the disadvantage that errors made in earlier models can propagate through the system, negatively impacting the downstream models and limiting their accuracy. Producing robust solutions for the first 2 sub-problems, detecting concepts and parsing causal relations between them, was critical in building a robust system. A number of sequence labeling models were trained to classify the concepts associated with each word, with the most effective approach being a bidirectional recurrent neural network (RNN), a deep learning model commonly applied to word labeling problems. This is because the RNN used pre-trained word embeddings to better generalize to rarer words, and was able to use information from both ends of each sentence to infer a word’s concept. The concepts predicted by this model were then used to develop causal relation parsing models for detecting causal connections between these concepts. A shift-reduce dependency parsing model was trained using the SEARN algorithm and out-performed a number of other approaches by better utilizing the structure of the problem and directly optimizing the error metric used. Two pre-trained coreference resolution systems were used to resolve coreferences within the essays. However, a word tagging model trained to predict anaphors, combined with a heuristic for determining the antecedent, out-performed these two systems. Finally, a model was developed for parsing a causal model from an entire essay, utilizing the solutions to the three previous problems. A beam search algorithm was used to produce multiple parses for each sentence, which in turn were combined to generate multiple candidate causal models for each student essay.
A reranking algorithm was then used to select the optimal causal model from all of the generated candidates. An important contribution of this work is that it represents a system for parsing a complete causal model of a scientific essay from a student’s written answer. Existing systems have been developed to parse individual causal relations, but no existing system attempts to parse a sequence of linked causal relations forming a causal model from an explanatory scientific essay. It is hoped that this work can lead to the development of more robust essay grading software and formative assessment tools, and can be extended to build solutions for extracting causality from text in other domains. In addition, I also present 2 novel approaches for optimizing the micro-F1 score within the design of two of the algorithms studied: the dependency parser and the reranking algorithm. The dependency parser uses a custom cost function to estimate the impact of parsing mistakes on the overall micro-F1 score, while the reranking algorithm allows the micro-F1 score to be optimized by tuning the beam search parameter to balance recall and precision.
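
For readers unfamiliar with the metric, micro-F1 pools true positives, false positives and false negatives over all items before computing precision and recall, so frequent labels weigh more than rare ones. The sketch below illustrates only the standard definition; the function name `micro_f1` and the relation labels are hypothetical, and the thesis's custom cost function and reranker are not reproduced here.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over parallel lists of label sets.

    gold[i] and pred[i] are the sets of labels (for example, causal
    relations) annotated and predicted for item i.
    """
    tp = fp = fn = 0
    for gold_labels, pred_labels in zip(gold, pred):
        tp += len(gold_labels & pred_labels)   # predicted and correct
        fp += len(pred_labels - gold_labels)   # predicted but wrong
        fn += len(gold_labels - pred_labels)   # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


# Toy example with invented relation labels for three sentences.
gold = [{"A->B"}, {"B->C", "C->D"}, set()]
pred = [{"A->B"}, {"B->C"}, {"D->E"}]
score = micro_f1(gold, pred)
print(f"micro-F1 = {score:.3f}")
```

Because the counts are pooled, predicting more relations per sentence tends to raise recall and lower precision, which is the balance the abstract describes tuning.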