Segregatory Coordination and Ellipsis in Text Generation
In this paper, we provide an account of how to generate sentences with
coordination constructions from clause-sized semantic representations. An
algorithm is developed to generate sentences with ellipsis, gapping,
right-node-raising, and non-constituent coordination constructions. Various
examples from linguistic literature will be used to demonstrate that the
algorithm does its job well.
Comment: 7 pages, uses colacl.st
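As a flavour of one of the constructions handled, gapping elides a repeated verb in non-initial conjuncts ("John likes tea and Mary coffee"). A toy sketch of that single construction, assuming a simple subject–verb–object clause format (the function and representation are ours, not the paper's algorithm):

```python
def gap(clauses):
    """Toy gapping: elide the verb of non-initial conjuncts when it
    repeats the first clause's verb (illustrative only)."""
    (s0, v0, o0), *rest = clauses
    parts = [f"{s0} {v0} {o0}"]
    for s, v, o in rest:
        # matching verb -> gapped conjunct "subject object"
        parts.append(f"{s} {o}" if v == v0 else f"{s} {v} {o}")
    return " and ".join(parts)

gap([("John", "likes", "tea"), ("Mary", "likes", "coffee")])
# -> "John likes tea and Mary coffee"
```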
Parsing coordinations
The present paper is concerned with statistical parsing of constituent structures in German. The paper presents four experiments that aim at improving parsing performance on coordinate structures: 1) reranking the n-best parses of a PCFG parser, 2) enriching the input to a PCFG parser with gold scopes for each conjunct, 3) reranking the parser output for all conjunct scopes that are permissible with regard to clause structure. Experiment 4 reranks a combination of the parses from experiments 1 and 3. The experiments show that n-best parsing combined with reranking improves results by a large margin. Providing the parser with different scope possibilities and reranking the resulting parses increases the F-score from 69.76 for the baseline to 74.69. While this F-score is similar to that of the first experiment (n-best parsing and reranking), the first experiment yields higher recall (75.48% vs. 73.69%) and the third higher precision (75.43% vs. 73.26%). Combining the two methods yields the best result, with an F-score of 76.69.
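For reference, the F-scores quoted are the harmonic mean of precision and recall. A minimal check (how the quoted precision/recall values pair with experiments is our reading of the abstract; the quoted 74.69 presumably derives from exact counts rather than the rounded percentages):

```python
def f1(precision, recall):
    """Balanced F-score: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Experiment 3 as we read the abstract: P = 75.43, R = 73.69.
exp3_f = round(f1(75.43, 73.69), 2)  # approx. 74.55 from the rounded figures
```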
Investigating eye movement acquisition and analysis technologies as a causal factor in differential prevalence of crossed and uncrossed fixation disparity during reading and dot scanning
Previous studies examining binocular coordination during reading have reported conflicting results in terms of the nature of disparity (e.g. Kliegl, Nuthmann, & Engbert (Journal of Experimental Psychology: General 135:12-35, 2006); Liversedge, White, Findlay, & Rayner (Vision Research 46:2363-2374, 2006)). One potential cause of this inconsistency is differences in acquisition devices and associated analysis technologies. We tested this by directly comparing binocular eye movement recordings made using the SR Research EyeLink 1000 and the Fourward Technologies Inc. DPI binocular eye-tracking systems. Participants read sentences or scanned horizontal rows of dot strings; for each participant, half the data were recorded with the EyeLink and the other half with the DPIs. The viewing conditions in both testing laboratories were set to be very similar. Monocular calibrations were used. The majority of fixations recorded with either system were aligned, although data from the EyeLink system showed greater disparity magnitudes. Critically, for unaligned fixations, the data from both systems showed a majority of uncrossed fixations. These results suggest that variability in previous reports of binocular fixation alignment is attributable to the specific viewing conditions associated with a particular experiment (variables such as luminance and viewing distance), rather than to acquisition and analysis software and hardware.
Data-oriented parsing and the Penn Chinese treebank
We present an investigation into parsing the Penn Chinese Treebank using a Data-Oriented Parsing (DOP) approach. DOP
is an experience-based approach to natural language parsing. Most published research in the DOP framework uses phrase-structure (PS) trees as its representation schema. Drawbacks of the DOP approach centre around issues of efficiency. We incorporate recent advances in DOP parsing techniques into a novel DOP parser which generates a compact representation of all subtrees that can be derived from any full parse tree.
We compare our work to previous work on parsing the Penn Chinese Treebank, and provide both a quantitative and a qualitative evaluation. While our results in terms of precision and recall are slightly below those published in related research, our approach requires no manual encoding of head rules, nor is a development phase per se necessary.
We also note that certain constructions which were problematic in this previous work can be handled correctly by our DOP parser. Finally, we observe that the ‘DOP Hypothesis’ is confirmed for parsing the Penn Chinese Treebank.
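Since DOP takes every subtree (fragment) of every treebank parse as a grammar unit, the notion can be made concrete with a small fragment extractor over tuple-encoded trees (this is the generic DOP notion of fragments, not the paper's compact representation):

```python
from itertools import product

def fragments(tree):
    """All DOP fragments rooted at `tree`.

    A tree is (label, child, ...); a leaf word is a plain string.
    For each nonterminal child we either cut (leaving a frontier
    node) or substitute one of that child's own fragments.
    """
    label, *children = tree
    options = []
    for child in children:
        if isinstance(child, str):                    # terminal: always kept
            options.append([child])
        else:                                         # nonterminal child
            options.append([(child[0],)] + fragments(child))
    return [(label, *combo) for combo in product(*options)]
```

For the tree (S (NP she) (VP (V saw) (NP it))) this yields 10 fragments rooted at S, including the bare rule-like fragment (S NP VP); this combinatorial growth is exactly the efficiency problem the abstract refers to.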
A testsuite for testing parser performance on complex German grammatical constructions [TePaCoC]
Traditionally, parsers are evaluated against gold standard test data. This can cause problems if there is a mismatch between the data structures and representations used by the parser and the gold standard. A particular case in point is German, for which two treebanks (TiGer and TüBa-D/Z) are available with highly different annotation schemes for the acquisition of (e.g.) PCFG parsers. The differences between the TiGer and TüBa-D/Z annotation schemes make fair and unbiased parser evaluation difficult [7, 9, 12]. The resource (TePaCoC) presented in this paper takes a different approach to parser evaluation: instead of providing evaluation data in a single annotation scheme, TePaCoC uses comparable sentences and their annotations for 5 selected key grammatical phenomena (20 sentences per phenomenon) from both the TiGer and TüBa-D/Z resources. This provides a 2 × 100-sentence comparable testsuite which allows us to evaluate TiGer-trained parsers against the TiGer part of TePaCoC, and TüBa-D/Z-trained parsers against the TüBa-D/Z part, for the key phenomena, instead of comparing them against a single (and potentially biased) gold standard. To overcome the problem of inconsistency in human evaluation and to bridge the gap between the two annotation schemes, we provide an extensive error classification, which enables us to compare parser output across the two treebanks. In the remaining part of the paper we present the testsuite and describe the grammatical phenomena covered in the data. We discuss the different annotation strategies used in the two treebanks to encode these phenomena and present our error classification of potential parser errors.
Periodic letter strokes within a word affect fixation disparity during reading
We investigated the way in which binocular coordination in reading is affected by the spatial structure of text. Vergence eye movements were measured (EyeLink II) in 32 observers while they read 120 single German sentences (Potsdam Sentence Corpus) silently for comprehension. The similarity in shape between the neighboring strokes of component letters, as measured by the first peak in the horizontal auto-correlation of the images of the words, was found to be associated with (i) a smaller minimum fixation disparity (i.e. vergence error) during fixation; (ii) a longer time to reach this minimum disparity and (iii) a longer overall fixation duration. The results were obtained only for binocular reading: no effects of auto-correlation could be observed for monocular reading. The findings help to explain the longer reading times reported for words and fonts with high auto-correlation and may also begin to provide a causal link between poor binocular control and reading difficulties.
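The auto-correlation measure can be illustrated on a one-dimensional horizontal luminance profile: the lag of the first peak after lag 0 gives the dominant stroke period. A sketch in pure Python (the actual study correlated word images; the function names and profile format are ours):

```python
def autocorr(signal):
    """Normalized autocorrelation of a 1-D signal at every lag."""
    n = len(signal)
    mean = sum(signal) / n
    c = [x - mean for x in signal]
    var = sum(x * x for x in c)
    return [sum(c[i] * c[i + lag] for i in range(n - lag)) / var
            for lag in range(n)]

def first_peak_lag(signal):
    """Lag of the first local maximum after lag 0 -- a proxy for the
    dominant stroke period in a word's horizontal luminance profile."""
    ac = autocorr(signal)
    for lag in range(1, len(ac) - 1):
        if ac[lag - 1] < ac[lag] >= ac[lag + 1]:
            return lag
    return None

first_peak_lag([0, 1] * 6)  # a strictly periodic "stroke" pattern -> 2
```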
CHR as grammar formalism. A first report
Grammars written as Constraint Handling Rules (CHR) can be executed as
efficient and robust bottom-up parsers that provide a straightforward,
non-backtracking treatment of ambiguity. Abduction with integrity constraints
as well as other dynamic hypothesis generation techniques fit naturally into
such grammars and are exemplified for anaphora resolution, coordination and
text interpretation.
Comment: 12 pages. Presented at ERCIM Workshop on Constraints, Prague, Czech
Republic, June 18-20, 200
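The flavour of a CHR grammar can be suggested with a toy bottom-up parser: lexical and phrasal facts are span-annotated constraints, rules fire until a fixed point, and ambiguous analyses simply coexist in the constraint store, with no backtracking. (The grammar and lexicon below are our own toy examples, written in Python rather than actual CHR.)

```python
# Facts are (category, start, end) spans; rules combine adjacent spans.
GRAMMAR = {("Det", "N"): "NP", ("NP", "VP"): "S", ("V", "NP"): "VP"}
LEXICON = {"the": "Det", "dog": "N", "saw": "V", "cat": "N"}

def parse(tokens):
    """Saturate the constraint store with all derivable (cat, i, j) spans."""
    facts = {(LEXICON[w], i, i + 1) for i, w in enumerate(tokens)}
    changed = True
    while changed:                       # fire rules to a fixed point
        changed = False
        for (a, i, j) in list(facts):
            for (b, k, l) in list(facts):
                if j == k and (a, b) in GRAMMAR:
                    new = (GRAMMAR[(a, b)], i, l)
                    if new not in facts:
                        facts.add(new)
                        changed = True
    return facts
```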
A Frobenius Algebraic Analysis for Parasitic Gaps
The interpretation of parasitic gaps is an ostensible case of non-linearity
in natural language composition. Existing categorial analyses, both in the
typelogical and in the combinatory traditions, rely on explicit forms of
syntactic copying. We identify two types of parasitic gapping where the
duplication of semantic content can be confined to the lexicon. Parasitic gaps
in adjuncts are analysed as forms of generalized coordination with a
polymorphic type schema for the head of the adjunct phrase. For parasitic gaps
affecting arguments of the same predicate, the polymorphism is associated with
the lexical item that introduces the primary gap. Our analysis is formulated in
terms of Lambek calculus extended with structural control modalities. A
compositional translation relates syntactic types and derivations to the
interpreting compact closed category of finite dimensional vector spaces and
linear maps with Frobenius algebras over it. When interpreted over the
necessary semantic spaces, the Frobenius algebras provide the tools to model
the proposed instances of lexical polymorphism.
Comment: SemSpace 2019, to appear in Journal of Applied Logic
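For orientation, the Frobenius algebra in question is the standard one on a finite-dimensional space $V$ with a fixed basis $\{v_i\}$ (the usual FdVect construction; notation ours, not the paper's):

\[
  \mu(v_i \otimes v_j) = \delta_{ij}\, v_i, \qquad
  \eta(1) = \sum_i v_i, \qquad
  \Delta(v_i) = v_i \otimes v_i, \qquad
  \epsilon(v_i) = 1.
\]

The comultiplication $\Delta$ is the copying map that allows the semantic content of a single lexical item to be duplicated across the primary and parasitic gap.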
Attempto Controlled English (ACE)
Attempto Controlled English (ACE) allows domain specialists to interactively
formulate requirements specifications in domain concepts. ACE can be accurately
and efficiently processed by a computer, but is expressive enough to allow
natural usage. The Attempto system translates specification texts in ACE into
discourse representation structures and optionally into Prolog. Translated
specification texts are incrementally added to a knowledge base. This knowledge
base can be queried in ACE for verification, and it can be executed for
simulation, prototyping and validation of the specification.
Comment: 13 pages, compressed, uuencoded Postscript, to be presented at CLAW
96, The First International Workshop on Controlled Language Applications,
Katholieke Universiteit Leuven, 26-27 March 199
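A discourse representation structure pairs a set of discourse referents with conditions over them, and such conditions map naturally onto Prolog facts. A hypothetical illustration of the data flow (not Attempto's actual implementation or API):

```python
from dataclasses import dataclass

@dataclass
class DRS:
    referents: list   # discourse referents introduced by the text
    conditions: list  # (predicate, *args) tuples over the referents

    def to_prolog(self):
        """Render each condition as a Prolog-style atom."""
        return [f"{p}({', '.join(args)})" for (p, *args) in self.conditions]

# "A customer inserts a card."  (a sentence in the ACE style)
d = DRS(["X", "Y"], [("customer", "X"), ("card", "Y"), ("insert", "X", "Y")])
d.to_prolog()  # -> ['customer(X)', 'card(Y)', 'insert(X, Y)']
```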