Multimedia search without visual analysis: the value of linguistic and contextual information
This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content, and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features
Synonymy and Translation
This paper aims to give some insight into the interaction between theoretical concepts in the field of formal semantics on the one hand, and application-oriented linguistic research on the other, more specifically the research in the machine translation project Rosetta. The central notion is ‘synonymy’. It will be used to discuss sameness of meaning for expressions belonging to different languages.
Sommige niet, andere wel; de verklaring van een raadselachtig verschil
This paper (in English) presents a comparison of two analyses of so-called existential sentences: Milsark (1977) and Zwarts (1981). The goal is to increase the insight into the peculiarities of the Dutch quantifying expression (or determiner) 'sommige' (in English: some of the). The contrast between '*Er spelen sommige kinderen op straat' and 'Er spelen enkele kinderen op straat' is the focus of attention
Speech-based recognition of self-reported and observed emotion in a dimensional space
The differences between self-reported and observed emotion have been investigated only marginally in the context of speech-based automatic emotion recognition. We address this issue by comparing self-reported emotion ratings to observed emotion ratings and examining how differences between these two types of ratings affect the development and performance of automatic emotion recognizers developed with these ratings. A dimensional approach to emotion modeling is adopted: the ratings are based on continuous arousal and valence scales. We describe the TNO-Gaming Corpus, which contains spontaneous vocal and facial expressions elicited via a multiplayer videogame and includes emotion annotations obtained via self-report and observation by outside observers. Comparisons show that there are discrepancies between self-reported and observed emotion ratings, which are also reflected in the performance of the emotion recognizers developed. Using Support Vector Regression in combination with acoustic and textual features, recognizers of arousal and valence are developed that can predict points in a 2-dimensional arousal-valence space. The results of these recognizers show that self-reported emotion is much harder to recognize than observed emotion, and that averaging ratings from multiple observers improves performance.
Temporal Language Models for the Disclosure of Historical Text
Contains fulltext: 228230.pdf (preprint version) (Open Access)
The Influence of Basic Tokenization on Biomedical Document Retrieval
Tokenization is a fundamental preprocessing step in Information Retrieval systems in which text is turned into index terms. This paper quantifies and compares the influence of various simple tokenization techniques on document retrieval effectiveness in two domains: biomedicine and news. As expected, biomedical retrieval is more sensitive to small changes in the tokenization method. The tokenization strategy can make the difference between a mediocre and a well-performing IR system, especially in the biomedical domain.
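As a rough illustration of why small tokenization choices matter for biomedical text, the sketch below applies two simple tokenizers to the same string. The two functions and the sample sentence are assumptions chosen for illustration; they are not necessarily the techniques or data evaluated in the paper.

```python
import re


def tokenize_whitespace(text):
    """Hypothetical tokenizer A: lowercase, then split on whitespace only."""
    return text.lower().split()


def tokenize_alnum(text):
    """Hypothetical tokenizer B: lowercase, then keep maximal runs of
    letters and digits, so hyphens, brackets and full stops split tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Illustrative sentence with typical biomedical surface forms.
sample = "NF-kB (p65) binds IL-2."

print(tokenize_whitespace(sample))
print(tokenize_alnum(sample))
```

The first tokenizer keeps punctuation attached to terms such as "(p65)" and "il-2.", while the second splits "IL-2" into two separate index terms, so the same query can match different documents depending on which strategy built the index.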
Arousal and Valence Prediction in Spontaneous Emotional Speech: Felt versus Perceived Emotion
In this paper, we describe emotion recognition experiments carried out for spontaneous affective speech with the aim of comparing the added value of annotation of felt emotion versus annotation of perceived emotion. Using speech material available in the TNO-GAMING corpus (a corpus containing audiovisual recordings of people playing videogames), speech-based affect recognizers were developed that can predict Arousal and Valence scalar values. Two types of recognizers were developed in parallel: one trained with felt emotion annotations (generated by the gamers themselves) and one trained with perceived/observed emotion annotations (generated by a group of observers). The experiments showed that, in speech, with the methods and features currently used, observed emotions are easier to predict than felt emotions. The results suggest that recognition performance strongly depends on how and by whom the emotion annotations are carried out.