Learning semantic sentence representations from visually grounded language without lexical knowledge
Current approaches to learning semantic representations of sentences often
use prior word-level knowledge. The current study aims to leverage visual
information in order to capture sentence level semantics without the need for
word embeddings. We use a multimodal sentence encoder trained on a corpus of
images with matching text captions to produce visually grounded sentence
embeddings. Deep Neural Networks are trained to map the two modalities to a
common embedding space such that for an image the corresponding caption can be
retrieved and vice versa. We show that our model achieves results comparable to
the current state-of-the-art on two popular image-caption retrieval benchmark
data sets: MSCOCO and Flickr8k. We evaluate the semantic content of the
resulting sentence embeddings using the data from the Semantic Textual
Similarity benchmark task and show that the multimodal embeddings correlate
well with human semantic similarity judgements. The system achieves
state-of-the-art results on several of these benchmarks, which shows that a
system trained solely on multimodal data, without assuming any word
representations, is able to capture sentence level semantics. Importantly, this
result shows that we do not need prior knowledge of lexical level semantics in
order to model sentence level semantics. These findings demonstrate the
importance of visual information in semantics.
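The image-caption retrieval setup described above can be sketched as a margin-based ranking objective over a shared embedding space. The following is a minimal numpy sketch, assuming toy encoder outputs and a standard triplet hinge loss; the paper's actual encoders, loss, and dimensions may differ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for encoder outputs: in the paper, deep networks map images
# and captions into a common space; random vectors simulate that here.
# EMBED_DIM and all data below are illustrative assumptions.
EMBED_DIM = 8

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Pretend these are encoder outputs for 4 matching image/caption pairs.
image_emb = l2_normalize(rng.normal(size=(4, EMBED_DIM)))
caption_emb = l2_normalize(image_emb + 0.1 * rng.normal(size=(4, EMBED_DIM)))

def triplet_hinge_loss(img, cap, margin=0.2):
    """Max-margin terms pushing matching pairs above mismatched ones."""
    sims = img @ cap.T                     # cosine similarities (unit vectors)
    pos = np.diag(sims)                    # similarity of matching pairs
    # hinge over caption retrieval (rows) and image retrieval (columns)
    loss_c = np.maximum(0.0, margin + sims - pos[:, None])
    loss_i = np.maximum(0.0, margin + sims - pos[None, :])
    np.fill_diagonal(loss_c, 0.0)
    np.fill_diagonal(loss_i, 0.0)
    return loss_c.sum() + loss_i.sum()

def retrieve_caption(img_vec, captions):
    """Nearest caption to an image in the shared space."""
    return int(np.argmax(captions @ img_vec))

print(triplet_hinge_loss(image_emb, caption_emb))
print(retrieve_caption(image_emb[0], caption_emb))
```

Once trained this way, the caption embeddings double as sentence representations whose pairwise cosine similarities can be compared against human similarity judgements.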
Political Text Scaling Meets Computational Semantics
During the last fifteen years, automatic text scaling has become one of the
key tools of the Text as Data community in political science. Prominent text
scaling algorithms, however, rely on the assumption that latent positions can
be captured just by leveraging the information about word frequencies in
documents under study. We challenge this traditional view and present a new,
semantically aware text scaling algorithm, SemScale, which combines recent
developments in the area of computational linguistics with unsupervised
graph-based clustering. We conduct an extensive quantitative analysis over a
collection of speeches from the European Parliament in five different languages
and from two different legislative terms, and show that a scaling approach
relying on semantic document representations is often better at capturing known
underlying political dimensions than the established frequency-based (i.e.,
symbolic) scaling method. We further validate our findings through a series of
experiments focused on text preprocessing and feature selection, document
representation, scaling of party manifestos, and a supervised extension of our
algorithm. To catalyze further research on this new branch of text scaling
methods, we release a Python implementation of SemScale with all included data
sets and evaluation procedures. Comment: Updated version - accepted for Transactions on Data Science (TDS).
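Semantic document representations can drive one-dimensional scaling in several ways. The following is a generic spectral sketch (a cosine-similarity graph over documents plus the Laplacian's Fiedler vector), not SemScale's actual algorithm; the document vectors are random placeholders for real embeddings:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy document representations: in practice these would come from
# (multilingual) word or document embeddings; random unit vectors here
# are placeholders. This is a generic spectral sketch, not SemScale.
docs = rng.normal(size=(6, 16))
docs /= np.linalg.norm(docs, axis=1, keepdims=True)

# Fully connected similarity graph over documents.
sim = docs @ docs.T
np.fill_diagonal(sim, 0.0)
sim = np.clip(sim, 0.0, None)            # keep edge weights non-negative

# One-dimensional scaling via the Fiedler vector of the graph Laplacian:
# documents close in the semantic graph receive nearby scale positions.
deg = np.diag(sim.sum(axis=1))
laplacian = deg - sim
eigvals, eigvecs = np.linalg.eigh(laplacian)   # eigenvalues in ascending order
positions = eigvecs[:, 1]                      # second-smallest eigenvector

print(np.round(positions, 3))
```

The appeal of such a scheme is that the positions depend on semantic proximity between documents rather than on raw word-frequency overlap, which is the contrast the abstract draws with frequency-based scaling.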
The formulation of argument structure in SLI: an eye-movement study
This study investigated the formulation of verb argument structure in Catalan- and Spanish-speaking children with specific language impairment (SLI) and typically developing age-matched controls. We compared how language production can be guided by conceptual factors, such as the organization of the entities participating in an event and knowledge regarding argument structure. Eleven children with SLI (aged 3;8 to 6;6) and eleven control children participated in an eye-tracking experiment in which participants had to describe events with different argument structures in the presence of visual scenes. Picture descriptions, latency times and eye movements were recorded and analyzed. The picture description results showed that the percentage of responses in which children with SLI substituted a non-target verb for the target verb was significantly different from that for the control group. Children with SLI made more omissions of obligatory arguments, especially of themes, as verb argument complexity increased. Moreover, when the number of arguments of the verb increased, the children took more time to begin their descriptions, but no differences between groups were found. For verb-type latency, all children were significantly faster to start describing one-argument events than two- and three-argument events. No differences in latency time were found between two- and three-argument events, and there were no significant differences between the groups. Eye-movement data showed that children with SLI looked less at the event zone than the age-matched controls during the first two seconds. These differences between the groups were significant for three-argument verbs, and only marginally significant for one- and two-argument verbs. Children with SLI also spent significantly less time looking at the theme zones than their age-matched controls. We suggest that both processing limitations and deficits in the semantic representation of verbs may play a role in these difficulties.
Filling Knowledge Gaps in a Broad-Coverage Machine Translation System
Knowledge-based machine translation (KBMT) techniques yield high quality in
domains with detailed semantic models, limited vocabulary, and controlled input
grammar. Scaling up along these dimensions means acquiring large knowledge
resources. It also means behaving reasonably when definitive knowledge is not
yet available. This paper describes how we can fill various KBMT knowledge
gaps, often using robust statistical techniques. We describe quantitative and
qualitative results from JAPANGLOSS, a broad-coverage Japanese-English MT
system. Comment: 7 pages, compressed and uuencoded PostScript. To appear: IJCAI-9
Lexical representation explains cortical entrainment during speech comprehension
Results from a recent neuroimaging study on spoken sentence comprehension
have been interpreted as evidence for cortical entrainment to hierarchical
syntactic structure. We present a simple computational model that predicts the
power spectra from this study, even though the model's linguistic knowledge is
restricted to the lexical level, and word-level representations are not
combined into higher-level units (phrases or sentences). Hence, the cortical
entrainment results can also be explained from the lexical properties of the
stimuli, without recourse to hierarchical syntax. Comment: Submitted for publication.
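The lexical-level explanation can be illustrated with a toy signal: if a purely word-level property (such as word class) happens to alternate every two words, its power spectrum peaks at the two-word "phrase" rate without any syntactic combination. A sketch under assumed word rates and toy property values, not the model from the study:

```python
import numpy as np

# Assume 4 words per second, and sentences in which function and content
# words alternate, so a toy lexical property repeats every 2 words --
# i.e. at 2 Hz, the two-word "phrase" rate. All values are illustrative.
WORD_RATE_HZ = 4
SAMPLES_PER_WORD = 16
fs = WORD_RATE_HZ * SAMPLES_PER_WORD          # samples per second

# Toy word-level signal: function word -> 1.0, content word -> 0.3.
n_words = 64
lexical = np.tile([1.0, 0.3], n_words // 2)
signal = np.repeat(lexical, SAMPLES_PER_WORD) # hold each value over a word

# Power spectrum of the mean-removed lexical signal.
spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)

peak_hz = freqs[np.argmax(spectrum)]
print(peak_hz)  # peak at the 2 Hz alternation rate, with no syntax involved
```

The point of the sketch is that a spectral peak at a phrase-like frequency can arise from lexical properties of the stimulus alone, which is the shape of the abstract's argument against interpreting such peaks as evidence for hierarchical structure building.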