
    Learning semantic sentence representations from visually grounded language without lexical knowledge

    Current approaches to learning semantic representations of sentences often use prior word-level knowledge. The current study aims to leverage visual information in order to capture sentence-level semantics without the need for word embeddings. We use a multimodal sentence encoder trained on a corpus of images with matching text captions to produce visually grounded sentence embeddings. Deep neural networks are trained to map the two modalities to a common embedding space such that for an image the corresponding caption can be retrieved and vice versa. We show that our model achieves results comparable to the current state-of-the-art on two popular image-caption retrieval benchmark data sets: MSCOCO and Flickr8k. We evaluate the semantic content of the resulting sentence embeddings using the data from the Semantic Textual Similarity benchmark task and show that the multimodal embeddings correlate well with human semantic similarity judgements. The system achieves state-of-the-art results on several of these benchmarks, which shows that a system trained solely on multimodal data, without assuming any word representations, is able to capture sentence-level semantics. Importantly, this result shows that we do not need prior knowledge of lexical-level semantics in order to model sentence-level semantics. These findings demonstrate the importance of visual information in semantics.
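    The retrieval setup described above is usually trained with a margin-based ranking objective over the shared embedding space. The following is a minimal, illustrative sketch of such a triplet loss on toy vectors; the function names, margin value, and embeddings are assumptions for illustration, not details taken from the paper (which trains deep encoders end to end):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

def triplet_ranking_loss(img, cap, distractor_cap, margin=0.2):
    """Hinge loss pushing the matching caption's similarity above a
    distractor caption's similarity by at least `margin`."""
    pos = cosine(img, cap)
    neg = cosine(img, distractor_cap)
    return max(0.0, margin - pos + neg)

# Toy embeddings: the matching caption is already much closer to the
# image than the distractor, so the hinge is inactive (loss is 0.0).
img = [1.0, 0.0, 0.5]
cap = [0.9, 0.1, 0.4]
distractor = [0.0, 1.0, 0.0]
loss = triplet_ranking_loss(img, cap, distractor)
```

    In practice the same hinge is applied symmetrically (image-to-caption and caption-to-image) so that retrieval works in both directions.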

    Political Text Scaling Meets Computational Semantics

    During the last fifteen years, automatic text scaling has become one of the key tools of the Text as Data community in political science. Prominent text scaling algorithms, however, rely on the assumption that latent positions can be captured just by leveraging the information about word frequencies in documents under study. We challenge this traditional view and present a new, semantically aware text scaling algorithm, SemScale, which combines recent developments in the area of computational linguistics with unsupervised graph-based clustering. We conduct an extensive quantitative analysis over a collection of speeches from the European Parliament in five different languages and from two different legislative terms, and show that a scaling approach relying on semantic document representations is often better at capturing known underlying political dimensions than the established frequency-based (i.e., symbolic) scaling method. We further validate our findings through a series of experiments focused on text preprocessing and feature selection, document representation, scaling of party manifestos, and a supervised extension of our algorithm. To catalyze further research on this new branch of text scaling methods, we release a Python implementation of SemScale with all included data sets and evaluation procedures. Comment: Updated version, accepted for Transactions on Data Science (TDS).
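    To illustrate the general idea of scaling from semantic document representations (this is not the SemScale algorithm itself, which combines computational-linguistics features with graph-based clustering), one simple way to place documents on a one-dimensional latent scale from pairwise semantic distances is classical multidimensional scaling. A pure-Python sketch under that assumption:

```python
import math

def one_dim_scale(dist):
    """Classical MDS to one dimension: double-center the squared-distance
    matrix, then extract its leading eigenvector by power iteration."""
    n = len(dist)
    d2 = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in d2]
    grand = sum(row) / n
    # Double-centered Gram matrix B = -1/2 * J D^2 J
    b = [[-0.5 * (d2[i][j] - row[i] - row[j] + grand) for j in range(n)]
         for i in range(n)]
    v = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]  # start vector
    for _ in range(200):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the leading eigenvalue.
    lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return [x * math.sqrt(max(lam, 0.0)) for x in v]

# Four documents whose pairwise semantic distances are already collinear
# are recovered exactly (up to sign and translation).
d = [[abs(i - j) for j in range(4)] for i in range(4)]
positions = one_dim_scale(d)
```

    With distances derived from document embeddings (e.g. one minus cosine similarity), the recovered coordinates serve as positions on a single latent dimension, which is the shape of output a text scaling method produces.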

    The formulation of argument structure in SLI: an eye-movement study

    This study investigated the formulation of verb argument structure in Catalan- and Spanish-speaking children with specific language impairment (SLI) and typically developing age-matched controls. We compared how language production can be guided by conceptual factors, such as the organization of the entities participating in an event and knowledge regarding argument structure. Eleven children with SLI (aged 3;8 to 6;6) and eleven control children participated in an eye-tracking experiment in which participants had to describe events with different argument structure in the presence of visual scenes. Picture descriptions, latency time and eye movements were recorded and analyzed. The picture description results showed that the percentage of responses in which children with SLI substituted a non-target verb for the target verb was significantly different from that for the control group. Children with SLI made more omissions of obligatory arguments, especially of themes, as the verb argument complexity increased. Moreover, when the number of arguments of the verb increased, the children took more time to begin their descriptions, but no differences between groups were found. For verb type latency, all children were significantly faster to start describing one-argument events than two- and three-argument events. No differences in latency time were found between two- and three-argument events. There were no significant differences between the groups. Eye-movement data showed that children with SLI looked less at the event zone than the age-matched controls during the first two seconds. These differences between the groups were significant for three-argument verbs, and only marginally significant for one- and two-argument verbs. Children with SLI also spent significantly less time looking at the theme zones than their age-matched controls. We suggest that both processing limitations and deficits in the semantic representation of verbs may play a role in these difficulties.

    Filling Knowledge Gaps in a Broad-Coverage Machine Translation System

    Knowledge-based machine translation (KBMT) techniques yield high quality in domains with detailed semantic models, limited vocabulary, and controlled input grammar. Scaling up along these dimensions means acquiring large knowledge resources. It also means behaving reasonably when definitive knowledge is not yet available. This paper describes how we can fill various KBMT knowledge gaps, often using robust statistical techniques. We describe quantitative and qualitative results from JAPANGLOSS, a broad-coverage Japanese-English MT system. Comment: 7 pages, compressed and uuencoded PostScript. To appear: IJCAI-9
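    The gap-filling strategy the abstract describes can be caricatured as a back-off at the lexical level: consult the hand-built semantic lexicon first, fall back to statistically induced knowledge, and degrade gracefully when neither applies. The lexicons and words below are toy assumptions for illustration, not JAPANGLOSS data:

```python
# Hypothetical lexicons, for illustration only.
semantic_lexicon = {"inu": "dog", "neko": "cat"}        # hand-built, high quality
statistical_lexicon = {"hon": "book", "mizu": "water"}  # induced from corpora

def translate_word(word):
    """Prefer the knowledge-based entry; fall back to the statistically
    acquired one; finally pass the word through rather than failing."""
    if word in semantic_lexicon:
        return semantic_lexicon[word]
    if word in statistical_lexicon:
        return statistical_lexicon[word]
    return word  # no knowledge yet: behave reasonably on unknown input

def translate(words):
    return [translate_word(w) for w in words]
```

    The same back-off pattern generalizes beyond the lexicon to grammar and semantic-model gaps: use the detailed knowledge where it exists, and a cheaper statistical estimate elsewhere.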

    Lexical representation explains cortical entrainment during speech comprehension

    Results from a recent neuroimaging study on spoken sentence comprehension have been interpreted as evidence for cortical entrainment to hierarchical syntactic structure. We present a simple computational model that predicts the power spectra from this study, even though the model's linguistic knowledge is restricted to the lexical level, and word-level representations are not combined into higher-level units (phrases or sentences). Hence, the cortical entrainment results can also be explained from the lexical properties of the stimuli, without recourse to hierarchical syntax. Comment: Submitted for publication.
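    The quantity being predicted here is a power spectrum of the response to the stimulus. As a hedged illustration of the underlying frequency-tagging logic (the toy signal, sampling rate, and word count below are assumptions, not the study's materials), a signal that is merely periodic at the word level already produces a spectral peak at the word rate, with no phrase- or sentence-level structure involved:

```python
import math

def power_spectrum(signal):
    """Naive DFT power spectrum |X_k|^2 of a real-valued signal,
    for frequency bins k = 0 .. n//2."""
    n = len(signal)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        spec.append(re * re + im * im)
    return spec

# Toy stimulus: a word-level quantity pulsing once per word,
# at 4 samples per word over 16 words (64 samples total).
samples_per_word, n_words = 4, 16
signal = [1.0 if t % samples_per_word == 0 else 0.0
          for t in range(samples_per_word * n_words)]
spec = power_spectrum(signal)
# The word rate corresponds to bin k = 64 / 4 = 16, where the power peaks.
```

    A lexically driven model of the response would inherit exactly this kind of peak from the stimulus, which is the abstract's point: spectral peaks alone do not force a hierarchical-syntax interpretation.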