From Partial to Strictly Incremental Constituent Parsing
We study incremental constituent parsers to assess their capacity to output
trees based on prefix representations alone. Guided by strictly left-to-right
generative language models and tree-decoding modules, we build parsers that
adhere to a strong definition of incrementality across languages. This builds
upon work that asserted incrementality, but that mostly only enforced it on
either the encoder or the decoder. Finally, we conduct an analysis against
non-incremental and partially incremental models.
Comment: Accepted at EACL 202
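As a rough illustration of what strict incrementality demands of a parser, the sketch below keeps a connected partial constituent tree after every word, conditioning only on the prefix read so far. It is not the authors' architecture: the names (Node, score_actions, parse_incrementally), the tiny attach/new-constituent action set, and the stub scorer are assumptions standing in for the learned left-to-right language model and tree-decoding module.

```python
# Illustrative sketch only: a toy strictly incremental constituent builder.
# The real parsers pair a left-to-right generative language model with a
# learned tree-decoding module; here the scorer is a stub and the action
# inventory is hypothetical.

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str                                   # non-terminal or token label
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None                   # set for leaves only


def score_actions(stack: List[Node], word: str) -> str:
    """Stub for the tree-decoding module: decides how to attach the new word.

    A trained model would condition only on the prefix read so far
    (stack contents plus the incoming word), never on future tokens.
    """
    return "attach" if stack else "new_constituent"


def parse_incrementally(words: List[str]) -> Node:
    """Consume words strictly left to right, keeping a connected partial tree."""
    stack: List[Node] = []
    for word in words:
        leaf = Node(label="TOK", word=word)
        action = score_actions(stack, word)
        if action == "new_constituent" or not stack:
            stack.append(Node(label="X", children=[leaf]))
        else:                                    # attach leaf to current constituent
            stack[-1].children.append(leaf)
        # After every word the stack holds a valid partial parse of the prefix.
    return Node(label="S", children=stack)


if __name__ == "__main__":
    tree = parse_incrementally("the parser reads one word at a time".split())
    print(len(tree.children), "top-level constituents")
```

The point of the sketch is the control flow: nothing to the right of the current word is ever consulted, which is the property a partially incremental encoder-decoder pair does not guarantee.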
Persian Semantic Role Labeling Based on Dependency Tree
Semantic role labeling is the task of attaching semantic tags to the words of a sentence according to the event the sentence describes. Persian semantic role labeling is challenging: most methods to date depend on a large number of handcrafted features and rely on feature engineering to attain high performance. Moreover, because Persian is a free-word-order, subject-object-verb language, a verbal predicate's arguments are often distant from it, creating long-range dependencies that such methods can hardly model. Our goal is to achieve better performance with minimal feature engineering while also capturing long-range dependencies within a sentence. To this end, this paper develops a deep model for Persian semantic role labeling based on the dependency tree. In the proposed method, the potential arguments of each verbal predicate are identified through dependency relations, and the dependency path between the predicate and each candidate argument is embedded using information from the dependency tree. A bi-directional recurrent neural network with long short-term memory units then transforms the word features into semantic role scores. Experiments were carried out on the first Persian semantic role corpus and on a corpus provided by the authors. The achieved macro-average F1-measure is 80.01 for the first corpus and 82.48 for the second.
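The pipeline described above lends itself to a compact sketch: candidate arguments are read off the dependency tree, the predicate-argument dependency path is embedded, and a bidirectional LSTM maps the path to role scores. The PyTorch code below is an illustrative reconstruction under assumed shapes and names (PathSRL, ROLE_SET, embedding and hidden sizes), not the authors' implementation.

```python
# Hedged sketch of a dependency-path BiLSTM scorer for SRL; all hyper-parameters
# and the role inventory are assumptions for illustration.

import torch
import torch.nn as nn

ROLE_SET = ["ARG0", "ARG1", "ARG2", "ARGM-TMP", "O"]   # assumed role inventory


class PathSRL(nn.Module):
    def __init__(self, vocab_size: int, dep_rel_size: int,
                 emb_dim: int = 100, hidden: int = 128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.rel_emb = nn.Embedding(dep_rel_size, emb_dim)   # dependency relation on the path
        self.bilstm = nn.LSTM(2 * emb_dim, hidden,
                              bidirectional=True, batch_first=True)
        self.scorer = nn.Linear(2 * hidden, len(ROLE_SET))

    def forward(self, path_words: torch.Tensor, path_rels: torch.Tensor) -> torch.Tensor:
        """path_words / path_rels: (batch, path_len) indices along the
        dependency path from the predicate to one candidate argument."""
        x = torch.cat([self.word_emb(path_words), self.rel_emb(path_rels)], dim=-1)
        out, _ = self.bilstm(x)               # (batch, path_len, 2 * hidden)
        return self.scorer(out[:, -1, :])     # role scores for the candidate argument


if __name__ == "__main__":
    model = PathSRL(vocab_size=10_000, dep_rel_size=50)
    words = torch.randint(0, 10_000, (4, 6))   # 4 candidate paths of length 6
    rels = torch.randint(0, 50, (4, 6))
    print(model(words, rels).shape)            # torch.Size([4, 5])
```

Scoring one predicate-argument path at a time is what lets the recurrent network summarize arbitrarily long dependency chains, which is how the long-range dependency problem is sidestepped without handcrafted features.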
Learning Language from a Large (Unannotated) Corpus
A novel approach to the fully automated, unsupervised extraction of
dependency grammars and associated syntax-to-semantic-relationship mappings
from large text corpora is described. The suggested approach builds on the
authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well
as on a number of prior papers and approaches from the statistical language
learning literature. If successful, this approach would enable the mining of
all the information needed to power a natural language comprehension and
generation system, directly from a large, unannotated corpus.
Comment: 29 pages, 5 figures, research proposal
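One ingredient that work in this statistical language-learning line typically relies on is word-pair mutual information estimated from raw text, with each sentence linked into a dependency skeleton by maximum spanning tree. The toy sketch below illustrates only that step; pmi_table, mst_links, the window size, and the greedy linkage are assumptions for illustration, not the Link Grammar / RelEx / OpenCog pipeline itself.

```python
# Toy sketch: PMI over word pairs from an unannotated corpus, then a greedy
# maximum-spanning-tree linkage per sentence. Illustrative only.

from collections import Counter
from itertools import count
from math import log


def pmi_table(sentences, window=3):
    """Co-occurrence PMI over a small window; `sentences` is a list of token lists."""
    word_counts, pair_counts, total = Counter(), Counter(), 0
    for sent in sentences:
        for i, w in enumerate(sent):
            word_counts[w] += 1
            total += 1
            for j in range(i + 1, min(i + 1 + window, len(sent))):
                pair_counts[frozenset((w, sent[j]))] += 1

    def pmi(a, b):
        joint = pair_counts.get(frozenset((a, b)), 0)
        if joint == 0:
            return float("-inf")
        return log(joint * total / (word_counts[a] * word_counts[b]))

    return pmi


def mst_links(sentence, pmi):
    """Greedy (Prim-style) maximum-spanning-tree linkage over PMI scores."""
    vocab = set(sentence)
    connected, links = {sentence[0]}, []
    while len(connected) < len(vocab):
        best = max(
            ((a, b) for a in connected for b in vocab - connected),
            key=lambda ab: pmi(*ab),
        )
        links.append(best)
        connected.add(best[1])
    return links


if __name__ == "__main__":
    corpus = [s.split() for s in ["the cat sat on the mat",
                                  "the dog sat on the rug",
                                  "a cat chased the dog"]]
    pmi = pmi_table(corpus)
    print(mst_links("the cat sat on the mat".split(), pmi))
```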
Neurobiology of incremental speech comprehension
Understanding spoken language requires the rapid transition from perceptual processing of the auditory input through a variety of cognitive processes involved in constructing the mental representation of the message that the speaker is intending to convey. Listeners carry out these complex processes very rapidly and accurately as they hear each word incrementally unfolding in a sentence. However, little is known about the specific spatiotemporal patterning of this wide range of incremental processing operations that underpin the dynamic transitions from the speech input to the development of a meaning interpretation of an utterance. This thesis aims to address this set of issues by investigating the spatiotemporal dynamics of brain activity as spoken sentences unfold over time in order to illuminate the neurocomputational properties of the human language processing system and determine how the representation of a spoken sentence develops incrementally as each upcoming word is heard.
Using a novel application of multidimensional probabilistic modelling combined with models from computational linguistics, I developed models of a variety of computational processes associated with accessing and processing the syntactic and semantic properties of sentences and tested these models at various points as sentences unfolded over time. Since a wide range of incremental processes occur very rapidly during speech comprehension, it is crucial to keep track of the temporal dynamics of the neural computations involved. To do this, I used combined electroencephalography and magnetoencephalography (EMEG) to record neural activity with millisecond resolution and analyzed the recordings in source space using univariate and/or multivariate approaches. The results confirm the value of this combination of methods in examining the properties of incremental speech processing. My findings corroborate the predictive nature of human speech comprehension and demonstrate that the effects of early semantic constraint are not dependent on explicit syntactic knowledge.
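To make the time-resolved logic concrete, the sketch below shows one hypothetical form such an analysis could take: correlating a single model-derived predictor with source-space activity at every sample and every source, across word-aligned epochs. The shapes, variable names, and the plain Pearson correlation are illustrative assumptions; the thesis itself applies richer univariate and multivariate statistics to EMEG source estimates.

```python
# Hedged sketch of a time-resolved model-to-brain correlation map.
# Assumed shapes: (n_trials, n_sources, n_times) epoched source activity and
# one model value per trial.

import numpy as np


def timecourse_correlation(source_data: np.ndarray,
                           model_predictor: np.ndarray) -> np.ndarray:
    """Return an (n_sources, n_times) map of Pearson r across trials."""
    n_trials = source_data.shape[0]
    x = (model_predictor - model_predictor.mean()) / model_predictor.std()
    z = (source_data - source_data.mean(axis=0)) / source_data.std(axis=0)
    # Pearson r at each (source, time) point across trials.
    return np.einsum("t,tsv->sv", x, z) / n_trials


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal((120, 50, 400))    # 120 word epochs, 50 sources, 400 samples
    predictor = rng.standard_normal(120)
    r_map = timecourse_correlation(data, predictor)
    print(r_map.shape)                             # (50, 400)
```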