    Segregatory Coordination and Ellipsis in Text Generation

    In this paper, we provide an account of how to generate sentences with coordination constructions from clause-sized semantic representations. An algorithm is developed to generate sentences with ellipsis, gapping, right-node-raising, and non-constituent coordination constructions. Various examples from the linguistic literature are used to demonstrate that the algorithm does its job well. Comment: 7 pages, uses colacl.st

    Parsing coordinations

    The present paper is concerned with statistical parsing of constituent structures in German. The paper presents four experiments that aim at improving parsing performance on coordinate structures: 1) reranking the n-best parses of a PCFG parser, 2) enriching the input to a PCFG parser with gold scopes for any conjunct, 3) reranking the parser output for all possible scopes for conjuncts that are permissible with regard to clause structure. Experiment 4 reranks a combination of parses from experiments 1 and 3. The experiments show that n-best parsing combined with reranking improves results by a large margin. Providing the parser with different scope possibilities and reranking the resulting parses increases the F-score from 69.76 for the baseline to 74.69. While this F-score is similar to that of the first experiment (n-best parsing and reranking), the first experiment yields higher recall (75.48% vs. 73.69%) and the third one higher precision (75.43% vs. 73.26%). Combining the two methods gives the best result, with an F-score of 76.69.
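The trade-off between precision and recall quoted above can be made concrete with the balanced F-score, the harmonic mean of the two. The sketch below is illustrative only: the function name `f_score` is hypothetical, and the abstract does not say how its scores were aggregated, so the value computed from the quoted precision and recall need not reproduce the reported F-score exactly.

```python
def f_score(precision, recall):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs as quoted in the abstract (in percent).
experiment_1 = {"precision": 73.26, "recall": 75.48}  # n-best parsing + reranking
experiment_3 = {"precision": 75.43, "recall": 73.69}  # scope variants + reranking

f1_exp1 = f_score(**experiment_1)
f1_exp3 = f_score(**experiment_3)

if __name__ == "__main__":
    print(f"Experiment 1: {f1_exp1:.2f}")
    print(f"Experiment 3: {f1_exp3:.2f}")
```

Because the harmonic mean penalises imbalance only mildly when the two values are close, swapping a point of recall for a point of precision leaves the F-score nearly unchanged, which is consistent with the abstract calling the two results similar.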

    Investigating eye movement acquisition and analysis technologies as a causal factor in differential prevalence of crossed and uncrossed fixation disparity during reading and dot scanning

    Previous studies examining binocular coordination during reading have reported conflicting results in terms of the nature of disparity (e.g. Kliegl, Nuthmann, & Engbert (Journal of Experimental Psychology General 135:12-35, 2006); Liversedge, White, Findlay, & Rayner (Vision Research 46:2363-2374, 2006)). One potential cause of this inconsistency is differences in acquisition devices and associated analysis technologies. We tested this by directly comparing binocular eye movement recordings made using SR Research EyeLink 1000 and the Fourward Technologies Inc. DPI binocular eye-tracking systems. Participants read sentences or scanned horizontal rows of dot strings; for each participant, half the data were recorded with the EyeLink, and the other half with the DPIs. The viewing conditions in both testing laboratories were set to be very similar. Monocular calibrations were used. The majority of fixations recorded using either system were aligned, although data from the EyeLink system showed greater disparity magnitudes. Critically, for unaligned fixations, the data from both systems showed a majority of uncrossed fixations. These results suggest that variability in previous reports of binocular fixation alignment is attributable to the specific viewing conditions associated with a particular experiment (variables such as luminance and viewing distance), rather than acquisition and analysis software and hardware.

    Data-oriented parsing and the Penn Chinese treebank

    We present an investigation into parsing the Penn Chinese Treebank using a Data-Oriented Parsing (DOP) approach. DOP comprises an experience-based approach to natural language parsing. Most published research in the DOP framework uses PS-trees as its representation schema. Drawbacks of the DOP approach centre around issues of efficiency. We incorporate recent advances in DOP parsing techniques into a novel DOP parser which generates a compact representation of all subtrees which can be derived from any full parse tree. We compare our work to previous work on parsing the Penn Chinese Treebank, and provide both a quantitative and qualitative evaluation. While our results in terms of Precision and Recall are slightly below those published in related research, our approach requires no manual encoding of head rules, nor is a development phase per se necessary. We also note that certain constructions which were problematic in this previous work can be handled correctly by our DOP parser. Finally, we observe that the ‘DOP Hypothesis’ is confirmed for parsing the Penn Chinese Treebank.

    A testsuite for testing parser performance on complex German grammatical constructions [TePaCoC - a corpus for testing parser performance on complex German grammatical constructions]

    Traditionally, parsers are evaluated against gold standard test data. This can cause problems if there is a mismatch between the data structures and representations used by the parser and the gold standard. A particular case in point is German, for which two treebanks (TiGer and TüBa-D/Z) are available with highly different annotation schemes for the acquisition of (e.g.) PCFG parsers. The differences between the TiGer and TüBa-D/Z annotation schemes make fair and unbiased parser evaluation difficult [7, 9, 12]. The resource (TEPACOC) presented in this paper takes a different approach to parser evaluation: instead of providing evaluation data in a single annotation scheme, TEPACOC uses comparable sentences and their annotations for 5 selected key grammatical phenomena (with 20 sentences per phenomenon) from both the TiGer and TüBa-D/Z resources. This provides a comparable testsuite of 2 times 100 sentences, which allows us to evaluate TiGer-trained parsers against the TiGer part of TEPACOC, and TüBa-D/Z-trained parsers against the TüBa-D/Z part of TEPACOC for key phenomena, instead of comparing them against a single (and potentially biased) gold standard. To overcome the problem of inconsistency in human evaluation and to bridge the gap between the two different annotation schemes, we provide an extensive error classification, which enables us to compare parser output across the two different treebanks. In the remaining part of the paper we present the testsuite and describe the grammatical phenomena covered in the data. We discuss the different annotation strategies used in the two treebanks to encode these phenomena and present our error classification of potential parser errors.

    Periodic letter strokes within a word affect fixation disparity during reading

    We investigated the way in which binocular coordination in reading is affected by the spatial structure of text. Vergence eye movements were measured (EyeLink II) in 32 observers while they read 120 single German sentences (Potsdam Sentence Corpus) silently for comprehension. The similarity in shape between the neighboring strokes of component letters, as measured by the first peak in the horizontal auto-correlation of the images of the words, was found to be associated with (i) a smaller minimum fixation disparity (i.e. vergence error) during fixation; (ii) a longer time to reach this minimum disparity and (iii) a longer overall fixation duration. The results were obtained only for binocular reading: no effects of auto-correlation could be observed for monocular reading. The findings help to explain the longer reading times reported for words and fonts with high auto-correlation and may also begin to provide a causal link between poor binocular control and reading difficulties. © ARVO
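The measure named in this abstract, the first peak in the horizontal auto-correlation of a word image, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the word image is reduced to a one-dimensional luminance profile (for example, column sums of ink), and the helper names `autocorrelation` and `first_peak` are hypothetical. The abstract does not describe the authors' actual image processing.

```python
def autocorrelation(profile, lag):
    """Normalised auto-correlation of a 1-D profile at a given horizontal lag."""
    n = len(profile)
    mean = sum(profile) / n
    centered = [x - mean for x in profile]
    variance = sum(c * c for c in centered)
    if variance == 0:
        return 0.0
    shifted = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return shifted / variance


def first_peak(profile, max_lag=None):
    """Return (lag, value) of the first local maximum after lag 0, or None."""
    max_lag = max_lag or len(profile) // 2
    values = [autocorrelation(profile, lag) for lag in range(max_lag + 1)]
    for lag in range(1, max_lag):
        if values[lag] > values[lag - 1] and values[lag] >= values[lag + 1]:
            return lag, values[lag]
    return None


# Hypothetical profile: one vertical stroke every 4 pixels, repeated 8 times.
striped_profile = [1, 0, 0, 0] * 8
peak = first_peak(striped_profile)

if __name__ == "__main__":
    print(peak)
```

In this toy profile the first peak falls at the stroke spacing, and its height reflects how similar neighbouring strokes are; a word with irregular strokes would give a lower or absent peak.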

    CHR as grammar formalism. A first report

    Grammars written as Constraint Handling Rules (CHR) can be executed as efficient and robust bottom-up parsers that provide a straightforward, non-backtracking treatment of ambiguity. Abduction with integrity constraints as well as other dynamic hypothesis generation techniques fit naturally into such grammars and are exemplified for anaphora resolution, coordination and text interpretation. Comment: 12 pages. Presented at ERCIM Workshop on Constraints, Prague, Czech Republic, June 18-20, 200
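The idea of bottom-up parsing without backtracking can be sketched outside CHR as plain forward chaining over a store of facts. This is not CHR and not the authors' system; it is a minimal Python analogy in which the function `bottom_up_parse`, the toy lexicon and the binary rules are all hypothetical. Each fact `(category, start, end)` plays the role of a constraint in the store, and every reading of an ambiguous input simply coexists there.

```python
def bottom_up_parse(tokens, lexicon, rules):
    """Forward-chain binary grammar rules until no new constituent appears.

    lexicon maps a word to its possible categories.
    rules maps a pair (left, right) of categories to a parent category.
    Returns the set of all derived facts (category, start, end).
    """
    store = set()
    for i, token in enumerate(tokens):
        for category in lexicon.get(token, ()):
            store.add((category, i, i + 1))

    changed = True
    while changed:
        changed = False
        for (left, right), parent in rules.items():
            # Combine every adjacent pair of matching constituents.
            new_facts = {
                (parent, s1, e2)
                for (c1, s1, e1) in store if c1 == left
                for (c2, s2, e2) in store if c2 == right and s2 == e1
            }
            if not new_facts <= store:
                store |= new_facts
                changed = True
    return store


# Toy grammar and input, invented for illustration.
toy_lexicon = {"the": ["Det"], "dog": ["N"], "barks": ["V", "N"]}
toy_rules = {("Det", "N"): "NP", ("NP", "V"): "S"}
facts = bottom_up_parse(["the", "dog", "barks"], toy_lexicon, toy_rules)

if __name__ == "__main__":
    for fact in sorted(facts):
        print(fact)
```

Note that the ambiguous word "barks" contributes both a verb and a noun fact to the store, and neither is ever retracted; the sentence reading is found without any choice point being undone.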

    A Frobenius Algebraic Analysis for Parasitic Gaps

    The interpretation of parasitic gaps is an ostensible case of non-linearity in natural language composition. Existing categorial analyses, both in the typelogical and in the combinatory traditions, rely on explicit forms of syntactic copying. We identify two types of parasitic gapping where the duplication of semantic content can be confined to the lexicon. Parasitic gaps in adjuncts are analysed as forms of generalized coordination with a polymorphic type schema for the head of the adjunct phrase. For parasitic gaps affecting arguments of the same predicate, the polymorphism is associated with the lexical item that introduces the primary gap. Our analysis is formulated in terms of Lambek calculus extended with structural control modalities. A compositional translation relates syntactic types and derivations to the interpreting compact closed category of finite dimensional vector spaces and linear maps with Frobenius algebras over it. When interpreted over the necessary semantic spaces, the Frobenius algebras provide the tools to model the proposed instances of lexical polymorphism. Comment: SemSpace 2019, to appear in Journal of Applied Logic

    Attempto Controlled English (ACE)

    Attempto Controlled English (ACE) allows domain specialists to interactively formulate requirements specifications in domain concepts. ACE can be accurately and efficiently processed by a computer, but is expressive enough to allow natural usage. The Attempto system translates specification texts in ACE into discourse representation structures and optionally into Prolog. Translated specification texts are incrementally added to a knowledge base. This knowledge base can be queried in ACE for verification, and it can be executed for simulation, prototyping and validation of the specification. Comment: 13 pages, compressed, uuencoded Postscript, to be presented at CLAW 96, The First International Workshop on Controlled Language Applications, Katholieke Universiteit Leuven, 26-27 March 199