A generation-oriented workbench for performance grammar: Capturing linear order variability in German and Dutch
We describe a generation-oriented workbench for the Performance Grammar (PG) formalism, highlighting the treatment of certain word order and movement constraints in Dutch and German. PG enables a simple and uniform treatment of a heterogeneous collection of linear order phenomena in the domain of verb constructions (variably known as Cross-serial Dependencies, Verb Raising, Clause Union, Extraposition, Third Construction, Particle Hopping, etc.). The central data structures enabling this feature are clausal “topologies”: one-dimensional arrays associated with clauses, whose cells (“slots”) provide landing sites for the constituents of the clause. Movement operations are enabled by unification of lateral slots of topologies at adjacent levels of the clause hierarchy. The PGW generator assists the grammar developer in testing whether the implemented syntactic knowledge allows all and only the well-formed permutations of constituents
Simulating the Noun-Verb Asymmetry in the Productivity of Children's Speech
Several authors propose that children may acquire syntactic categories on the basis of co-occurrence statistics of words in the input. This paper assesses the relative merits of two such accounts by assessing the type and amount of productive language that results from computing co-occurrence statistics over conjoint and independent preceding and following contexts. This is achieved through the implementation of these methods in MOSAIC, a computational model of syntax acquisition that produces utterances that can be directly compared to child speech, and has a developmental component (i.e. produces increasingly long utterances). It is shown that the computation of co-occurrence statistics over conjoint contexts or frames results in a pattern of productive speech that more closely resembles that displayed by language learning children. The simulation of the developmental patterning of children's productive speech furthermore suggests two refinements to this basic mechanism: inclusion of utterance boundaries, and the weighting of frames for their lexical content
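The difference between conjoint and independent contexts can be illustrated with a minimal sketch. It is not MOSAIC's implementation: the function names, the boundary marker and the toy utterances are all assumptions made for illustration. A conjoint context (frame) is the pair of preceding and following word taken together, while independent contexts record the preceding and following words separately.

```python
from collections import defaultdict

BOUNDARY = "<#>"  # hypothetical marker for an utterance boundary


def collect_contexts(utterances, use_boundaries=True):
    """Collect conjoint frames and independent contexts for every word."""
    conjoint = defaultdict(set)   # word -> {(preceding, following)}
    preceding = defaultdict(set)  # word -> {preceding word}
    following = defaultdict(set)  # word -> {following word}
    for utterance in utterances:
        tokens = utterance.split()
        if use_boundaries:
            tokens = [BOUNDARY] + tokens + [BOUNDARY]
        for i in range(1, len(tokens) - 1):
            prev_word, word, next_word = tokens[i - 1], tokens[i], tokens[i + 1]
            conjoint[word].add((prev_word, next_word))
            preceding[word].add(prev_word)
            following[word].add(next_word)
    return conjoint, preceding, following


def shares_frame(conjoint, a, b):
    """True if a and b occur in at least one identical conjoint frame."""
    return bool(conjoint.get(a, set()) & conjoint.get(b, set()))


def shares_independent(preceding, following, a, b):
    """True if a and b share a preceding word and, separately, a following word."""
    same_before = preceding.get(a, set()) & preceding.get(b, set())
    same_after = following.get(a, set()) & following.get(b, set())
    return bool(same_before) and bool(same_after)


utterances = ["you want it", "you see me", "I see it", "you need it"]
conjoint, preceding, following = collect_contexts(utterances)
print(shares_frame(conjoint, "want", "need"))
print(shares_frame(conjoint, "want", "see"))
print(shares_independent(preceding, following, "want", "see"))
```

In this toy input, "want" and "see" never occur in the same frame, yet they share a preceding word and a following word, so the independent criterion links more word pairs than the conjoint one.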
A neural blackboard architecture of sentence structure
We present a neural architecture for sentence representation. Sentences are represented in terms of word representations as constituents. A word representation consists of a neural assembly distributed over the brain. Sentence representation does not result from associations between neural word assemblies. Instead, word assemblies are embedded in a neural architecture, in which the structural (thematic) relations between words can be represented. Arbitrary thematic relations between arguments and verbs can be represented. Arguments can consist of nouns and phrases, as in sentences with relative clauses. A number of sentences can be stored simultaneously in this architecture. We simulate how probe questions about thematic relations can be answered. We discuss how differences in sentence complexity, such as the difference between subject-extracted versus object-extracted relative clauses and the difference between right-branching versus center-embedded structures, can be related to the underlying neural dynamics of the model. Finally, we illustrate how memory capacity for sentence representation can be related to the nature of reverberating neural activity, which is used to store information temporarily in this architecture
The Unification Space implemented as a localist neural net: predictions and error-tolerance in a constraint-based parser
We introduce a novel computer implementation of the Unification-Space parser (Vosse and Kempen in Cognition 75:105–143, 2000) in the form of a localist neural network whose dynamics is based on interactive activation and inhibition. The wiring of the network is determined by Performance Grammar (Kempen and Harbusch in Verb constructions in German and Dutch. Benjamins, Amsterdam, 2003), a lexicalist formalism with feature unification as binding operation. While the network is processing input word strings incrementally, the evolving shape of parse trees is represented in the form of changing patterns of activation in nodes that code for syntactic properties of words and phrases, and for the grammatical functions they fulfill. The system is capable, at least qualitatively and rudimentarily, of simulating several important dynamic aspects of human syntactic parsing, including garden-path phenomena and reanalysis, effects of complexity (various types of clause embeddings), fault-tolerance in case of unification failures and unknown words, and predictive parsing (expectation-based analysis, surprisal effects). English is the target language of the parser described
From lexical bundles to lexical frames: uncovering the extent of phraseological variation in academic writing
The contextual knowledge of a word is closely related to the knowledge of phraseological sequences as words are often used in the phraseological forms, either continuous or discontinuous. Much has been done to examine the continuous phraseological sequences for various purposes. However, studies on phraseology often overlook the potentially useful discontinuous phraseological sequences that allow for more flexible and productive use of language forms. To bridge the gap in phraseology studies, this study therefore employed a corpus-driven approach to analyse the characteristics of a form of discontinuous phraseological sequence, namely lexical frames in a one-million-word corpus of research articles in International Business Management (IBM). The characteristics of lexical frames were observed in four aspects: the degrees of variability and predictability of lexical frames, the structures as well as the variable slot fillers of lexical frames. The corpus tool, Collocate 1.0, was used to extract three- and four-word lexical bundles while kfNgram was used to extract three- and four-word lexical frames from the lexical bundles. The results revealed that three-word lexical frames are more prevalent in IBM. The degree of variability analysis indicated that there are more fixed lexical frames in the category of three-word lexical frames compared to the four-word category. In terms of the degree of predictability, the category of four-word lexical frames contains more predictable lexical frames than the three-word category. Also, most lexical frames are function word frames and the lexical frames are mostly filled up by content words rather than function words. This study contributes to the understanding of phraseological variation in academic writing
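As a rough illustration of how lexical frames can be derived from lexical bundles, the sketch below groups bundles that are identical except for one word. It does not reproduce kfNgram or Collocate: the function name, the restriction to internal slots, the two-filler threshold and the sample bundles are assumptions.

```python
from collections import defaultdict


def lexical_frames(bundles, min_fillers=2):
    """Group bundles that differ in exactly one internal word into frames."""
    fillers_by_frame = defaultdict(set)
    for bundle in bundles:
        words = tuple(bundle.split())
        # Assumption: only internal positions may act as the variable slot.
        for slot in range(1, len(words) - 1):
            frame = words[:slot] + ("*",) + words[slot + 1:]
            fillers_by_frame[frame].add(words[slot])
    # Keep only frames whose slot is filled by several distinct words.
    return {
        " ".join(frame): sorted(fillers)
        for frame, fillers in fillers_by_frame.items()
        if len(fillers) >= min_fillers
    }


# Hypothetical four-word bundles, not taken from the IBM corpus.
bundles = [
    "the end of the",
    "the rest of the",
    "the role of the",
    "in terms of the",
    "as a result of",
]
print(lexical_frames(bundles))
```

Here the frame "the * of the" is a function word frame whose slot is filled by content words, which is the pattern the abstract reports as most common.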
Syntactic structure assembly in human parsing: A computational model based on competitive inhibition and a lexicalist grammar
We present the design, implementation and simulation results of a psycholinguistic model of human syntactic processing that meets major empirical criteria. The parser operates in conjunction with a lexicalist grammar and is driven by syntactic information associated with heads of phrases. The dynamics of the model are based on competition by lateral inhibition ('competitive inhibition'). Input words activate lexical frames (i.e. elementary trees anchored to input words) in the mental lexicon, and a network of candidate 'unification links' is set up between frame nodes. These links represent tentative attachments that are graded rather than all-or-none. Candidate links that, due to grammatical or 'treehood' constraints, are incompatible, compete for inclusion in the final syntactic tree by sending each other inhibitory signals that reduce the competitor's attachment strength. The outcome of these local and simultaneous competitions is controlled by dynamic parameters, in particular by the Entry Activation and the Activation Decay rate of syntactic nodes, and by the Strength and Strength Build-up rate of Unification links. In case of a successful parse, a single syntactic tree is returned that covers the whole input string and consists of lexical frames connected by winning Unification links. 
Simulations are reported of a significant range of psycholinguistic parsing phenomena in both normal and aphasic speakers of English: (i) various effects of linguistic complexity (single versus double, center versus right-hand self-embeddings of relative clauses; the difference between relative clauses with subject and object extraction; the contrast between a complement clause embedded within a relative clause versus a relative clause embedded within a complement clause); (ii) effects of local and global ambiguity, and of word-class and syntactic ambiguity (including recency and length effects); (iii) certain difficulty-of-reanalysis effects (contrasts between local ambiguities that are easy to resolve versus ones that lead to serious garden-path effects); (iv) effects of agrammatism on parsing performance, in particular the performance of various groups of aphasic patients on several sentence types
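The idea of graded links competing through lateral inhibition can be sketched as a toy update loop. The update rule, the parameter names build_up and inhibition, and all numeric values are assumptions chosen for illustration. They are not the model's actual equations or parameter settings.

```python
def compete(strengths, build_up=0.1, inhibition=0.2, steps=200):
    """Toy lateral inhibition among mutually incompatible candidate links.

    Each link grows toward 1.0 at rate build_up and is reduced in
    proportion to the summed strength of its rivals. Strengths are
    clamped to the interval [0.0, 1.0].
    """
    current = list(strengths)
    for _ in range(steps):
        total = sum(current)
        updated = []
        for s in current:
            rivals = total - s  # combined strength of competing links
            new_s = s + build_up * (1.0 - s) - inhibition * rivals
            updated.append(min(1.0, max(0.0, new_s)))
        current = updated  # all links are updated simultaneously
    return current


# Two incompatible attachments; the first starts slightly stronger.
final = compete([0.6, 0.5])
print(final)
```

With these assumed values the small initial advantage is amplified on every step, so the weaker link is driven to zero and the stronger one wins the attachment.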
HHMM at SemEval-2019 Task 2: Unsupervised Frame Induction using Contextualized Word Embeddings
We present our system for semantic frame induction that showed the best performance in Subtask B.1 and finished as the runner-up in Subtask A of the SemEval 2019 Task 2 on unsupervised semantic frame induction (QasemiZadeh et al., 2019). Our approach separates this task into two independent steps: verb clustering using word and their context embeddings and role labeling by combining these embeddings with syntactical features. A simple combination of these steps shows very competitive results and can be extended to process other datasets and languages. Comment: 5 pages, 3 tables, accepted at SemEval 201