Using Sentence Plausibility to Learn the Semantics of Transitive Verbs
The functional approach to compositional distributional semantics considers
transitive verbs to be linear maps that transform the distributional vectors
representing nouns into a vector representing a sentence. We conduct an initial
investigation that uses a matrix consisting of the parameters of a logistic
regression classifier trained on a plausibility task as a transitive verb
function. We compare our method to a commonly used corpus-based method for
constructing a verb matrix and find that the plausibility training may be more
effective for disambiguation tasks.
Comment: Full updated paper for the NIPS learning semantics workshop, with some
minor errata fixed
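In the functional approach described above, a transitive verb is a linear map (a matrix) that turns the noun vectors of its arguments into a sentence vector. A minimal sketch of this idea, with hypothetical toy dimensions and random stand-ins for the trained parameters (in the paper, the verb matrix would come from a logistic regression classifier trained on a plausibility task):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4  # toy embedding dimension, chosen for illustration

# Hypothetical distributional vectors for the subject and object nouns.
subj = rng.normal(size=dim)
obj = rng.normal(size=dim)

# A transitive verb as a linear map: a dim x dim matrix. In the paper's
# proposal its entries would be the parameters of a plausibility classifier.
verb = rng.normal(size=(dim, dim))

# One common composition in this framework: apply the verb matrix to the
# object and combine with the subject (here, elementwise product).
sentence = subj * (verb @ obj)
print(sentence.shape)  # (4,)
```

The composition function itself varies across papers in this line of work; the elementwise-product variant above is just one frequently used option.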
Do as I say, not as I do: a lexical distributional account of English locative verb class acquisition
Children overgeneralise verbs to ungrammatical structures early in acquisition, but retreat from these overgeneralisations as they learn semantic verb classes. In a large corpus of English locative utterances (e.g., the woman sprayed water onto the wall/wall with water), we found structural biases which changed over development and which could explain overgeneralisation behaviour. Children and adults had similar verb classes and a correspondence analysis suggested that lexical distributional regularities in the adult input could help to explain the acquisition of these classes. A connectionist model provided an explicit account of how structural biases could be learned over development and how these biases could be reduced by learning verb classes from distributional regularities
Evaluating Composition Models for Verb Phrase Elliptical Sentence Embeddings
Ellipsis is a natural language phenomenon where part of a sentence is missing and its information must be recovered from its surrounding context, as in “Cats chase dogs and so do foxes.” Formal semantics has different methods for resolving ellipsis and recovering the missing information, but the problem has not been considered for distributional semantics, where words have vector embeddings and combinations thereof provide embeddings for sentences. In elliptical sentences these combinations go beyond linear, as copying of elided information is necessary. In this paper, we develop different models for embedding VP-elliptical sentences. We extend existing verb disambiguation and sentence similarity datasets to ones containing elliptical phrases and evaluate our models on these datasets for a variety of non-linear combinations and their linear counterparts. We compare results of these compositional models to state-of-the-art holistic sentence encoders. Our results show that non-linear addition and a non-linear tensor-based composition outperform the naive non-compositional baselines and the linear models, and that sentence encoders perform well on sentence similarity, but not on verb disambiguation.
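The key point above is that resolving VP ellipsis requires copying the elided material before composing, which takes the embedding beyond a single linear combination of the words. A toy sketch with hypothetical embeddings and a deliberately simple additive clause composition:

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 4  # toy embedding dimension

# Hypothetical word embeddings, for illustration only.
cats, chase, dogs, foxes = (rng.normal(size=dim) for _ in range(4))

def clause(subject, verb, obj):
    # Simple additive composition for a single clause.
    return subject + verb + obj

# "Cats chase dogs and so do foxes.": resolving the ellipsis copies the
# elided VP ("chase dogs") into the second clause before composing; this
# reuse of material is what a purely linear combination of the surface
# words cannot express.
sentence = clause(cats, chase, dogs) + clause(foxes, chase, dogs)
print(sentence.shape)  # (4,)
```

The paper evaluates richer non-linear and tensor-based variants of this copying step; the additive version here only illustrates where the copy happens.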
An autoencoder-based neural network model for selectional preference: evidence from pseudo-disambiguation and cloze tasks
Intuitively, some predicates have a better fit with certain arguments than others. Usage-based models of language emphasize the importance of semantic similarity in shaping the structuring of constructions (form and meaning). In this study, we focus on modeling the semantics of transitive constructions in Finnish and present an autoencoder-based neural network model trained on semantic vectors based on Word2vec. This model builds on the distributional hypothesis according to which semantic information is primarily shaped by contextual information. Specifically, we focus on the realization of the object. The performance of the model is evaluated in two tasks: a pseudo-disambiguation and a cloze task. Additionally, we contrast the performance of the autoencoder with a previously implemented neural model. In general, the results show that our model achieves excellent performance on these tasks in comparison to the other models. The results are discussed in terms of usage-based construction grammar.
Summary. Aki-Juhani Kyröläinen, M. Juhani Luotolahti and Filip Ginter: An autoencoder-based neural network model of selectional preference. Intuitively, some arguments seem to fit certain predicates better than others. Usage-based models of language emphasize the importance of semantic similarity in shaping the structure of constructions (both form and meaning). In this study, we model the semantics of Finnish transitive constructions and present a neural network model, an autoencoder. The model builds on the hypothesis of distributional semantics, according to which semantic information arises primarily from context. Specifically, we focus on the object. We evaluate the model with both a pseudo-disambiguation task and a cloze task. We compare the autoencoder's results with previously developed neural network models and show that our model performs very well in comparison with them. We present the results in the context of usage-based construction grammar. Keywords: neural network; autoencoder; semantic vector; usage-based model; Finnish language
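In a pseudo-disambiguation task like the one above, the model must prefer an attested verb-object pair over one with a randomly substituted argument. A hypothetical sketch of how an autoencoder can score such pairs by reconstruction error (weights are random stand-ins here; in the study they would be trained on Word2vec vectors of Finnish transitive constructions):

```python
import numpy as np

rng = np.random.default_rng(2)
dim, hidden = 8, 3  # toy input and bottleneck sizes

# Hypothetical trained autoencoder weights (random for illustration).
W_enc = rng.normal(size=(hidden, dim))
W_dec = rng.normal(size=(dim, hidden))

def reconstruct(x):
    h = np.tanh(W_enc @ x)  # encode the verb-object pair vector
    return W_dec @ h        # decode back to the input space

def fit_score(pair_vec):
    # Lower reconstruction error = better predicate-argument fit.
    return -np.linalg.norm(pair_vec - reconstruct(pair_vec))

# Pseudo-disambiguation: compare an attested object against a random one
# and pick the pair the model reconstructs more faithfully.
attested = rng.normal(size=dim)
random_obj = rng.normal(size=dim)
preferred = max([attested, random_obj], key=fit_score)
print(preferred.shape)
```

The intuition is that a model trained only on attested pairs reconstructs plausible combinations more faithfully than implausible ones, so reconstruction error doubles as a selectional-preference score.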
About Edible Restaurants: Conflicts between Syntax and Semantics as Revealed by ERPs
In order to investigate conflicts between semantics and syntax, we recorded ERPs while participants read Dutch sentences. Sentences containing conflicts between syntax and semantics (Fred eats in a sandwich…/Fred eats a restaurant…) elicited an N400. These results show that conflicts between syntax and semantics do not necessarily lead to P600 effects and are in line with the processing competition account. According to this parallel account, the syntactic and semantic processing streams are fully interactive and information from one level can influence the processing at another level. The relative strength of the cues of the processing streams determines which level is affected most strongly by the conflict. The processing competition account maintains the distinction between the N400 as an index of semantic processing and the P600 as an index of structural processing.
A Computational Model of Syntactic Processing: Ambiguity Resolution from Interpretation
Syntactic ambiguity abounds in natural language, yet humans have no
difficulty coping with it. In fact, the process of ambiguity resolution is
almost always unconscious. It is not infallible, however, as example 1
demonstrates.
1. The horse raced past the barn fell.
This sentence is perfectly grammatical, as is evident when it appears in the
following context:
2. Two horses were being shown off to a prospective buyer. One was raced past
a meadow, and the other was raced past a barn. ...
Grammatical yet unprocessable sentences such as 1 are called `garden-path
sentences.' Their existence provides an opportunity to investigate the human
sentence processing mechanism by studying how and when it fails. The aim of
this thesis is to construct a computational model of language understanding
which can predict processing difficulty. The data to be modeled are known
examples of garden path and non-garden path sentences, and other results from
psycholinguistics.
It is widely believed that there are two distinct loci of computation in
sentence processing: syntactic parsing and semantic interpretation. One
longstanding controversy is which of these two modules bears responsibility for
the immediate resolution of ambiguity. My claim is that it is the latter, and
that the syntactic processing module is a very simple device which blindly and
faithfully constructs all possible analyses for the sentence up to the current
point of processing. The interpretive module serves as a filter, occasionally
discarding certain of these analyses which it deems less appropriate for the
ongoing discourse than their competitors.
This document is divided into three parts. The first is introductory, and
reviews a selection of proposals from the sentence processing literature. The
second part explores a body of data which has been adduced in support of a
theory of structural preferences --- one that is inconsistent with the present
claim. I show how the current proposal can be specified to account for the
available data, and moreover to predict where structural preference theories
will go wrong. The third part is a theoretical investigation of how well the
proposed architecture can be realized using current conceptions of linguistic
competence. In it, I present a parsing algorithm and a meaning-based ambiguity
resolution method.
Comment: 128 pages, LaTeX source compressed and uuencoded, figures separate;
macros: rotate.sty, lingmacros.sty, psfig.tex. Dissertation, Computer and
Information Science Dept., October 199
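The architecture proposed above has two pieces: a parser that blindly keeps every analysis of the input so far, and an interpretive module that filters out analyses implausible in the ongoing discourse. A toy sketch of that control structure, in which every analysis, score, and threshold is a hypothetical stand-in for the thesis's actual machinery:

```python
# Sketch of the two-module architecture: parse everything, filter by
# interpretation. All structures and scores below are illustrative only.

def parse_all(words):
    # Stand-in for a parser that blindly and faithfully builds every
    # analysis of the input seen so far; each "analysis" is a label.
    if words[:3] == ["the", "horse", "raced"]:
        return ["main-verb reading", "reduced-relative reading"]
    return ["single reading"]

def plausibility(analysis, discourse):
    # Stand-in interpretive module: scores each analysis against the
    # ongoing discourse. The reduced relative is plausible only when
    # the discourse makes two horses available (as in example 2).
    if analysis == "reduced-relative reading":
        return 0.9 if "two horses" in discourse else 0.1
    return 0.5

def filter_analyses(analyses, discourse, threshold=0.2):
    # Discard analyses deemed less appropriate for the discourse; a
    # garden path arises when the ultimately correct analysis is the
    # one that gets discarded.
    return [a for a in analyses if plausibility(a, discourse) >= threshold]

kept = filter_analyses(parse_all(["the", "horse", "raced"]), discourse="")
print(kept)  # only the main-verb reading survives, predicting the garden path
```

In a null discourse the reduced-relative analysis of "the horse raced..." is pruned, so the later word "fell" finds no surviving analysis: exactly the failure mode the garden-path data exhibit, and one that a supporting context ("Two horses...") avoids.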
Psych verbs, the linking problem, and the acquisition of language
In acquiring language, children must learn to appropriately place the different participants of an event (e.g., causal agent, affected entity) into the correct syntactic positions (e.g., subject, object) so that listeners will know who did what to whom. While many of these mappings can be characterized by broad generalizations, both within and across languages (e.g., semantic agents tend to be mapped onto syntactic subjects), not all verbs fit neatly into these generalizations. One particularly striking example is verbs of psychological state: The experiencer of the state can appear as either the subject (Agnes fears/hates/loves Bartholomew) or the direct object (Agnes frightens/angers/delights Bartholomew). The present studies explore whether this apparent variability in subject/object mapping may actually result from differences in these verbs’ underlying meanings. Specifically, we suggest that verbs like fear describe a habitual attitude towards some entity whereas verbs like frighten describe an externally caused emotional episode. We find that this distinction systematically characterizes verbs in English, Mandarin, and Korean. This pattern is generalized to novel verbs by adults in English, Japanese, and Russian, and even by English-speaking children who are just beginning to acquire psych verbs. These results support a broad role for systematic mappings between semantics and syntax in language acquisition.