
    Is the Red Square Big? MALeViC: Modeling Adjectives Leveraging Visual Contexts

    This work aims to model how the meaning of gradable size adjectives (`big', `small') can be learned from visually grounded contexts. Inspired by cognitive and linguistic evidence showing that the use of these expressions relies on setting a threshold that depends on the specific context, we investigate the ability of multi-modal models to assess whether an object is `big' or `small' in a given visual scene. In contrast with the standard computational approach, which simplistically treats gradable adjectives as `fixed' attributes, we pose the problem as relational: to be successful, a model has to consider the full visual context. By means of four main tasks, we show that state-of-the-art models (but not a relatively strong baseline) can learn the function subtending the meaning of size adjectives, though their performance decreases as the tasks move from simple to more complex. Crucially, models fail to develop abstract representations of gradable adjectives that can be used compositionally. Comment: Accepted at EMNLP-IJCNLP 201
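The relational view described above can be made concrete with a small sketch: whether an object counts as `big' depends on a threshold computed from the other objects in the scene. The function name, the area-based representation, and the interpolation rule below are illustrative assumptions, not the paper's exact formulation.

```python
def is_big(target_area: float, context_areas: list[float], k: float = 0.5) -> bool:
    """Call the target 'big' if its area exceeds a threshold placed
    between the mean and the maximum area of the other objects in
    the scene (k controls where the threshold sits)."""
    mean_area = sum(context_areas) / len(context_areas)
    max_area = max(context_areas)
    threshold = mean_area + k * (max_area - mean_area)
    return target_area > threshold

# The same object can flip label depending on the visual context:
print(is_big(40.0, [10.0, 12.0, 15.0]))   # big among small objects
print(is_big(40.0, [60.0, 80.0, 100.0]))  # not big among large objects
```

The point of the sketch is that `big' is not a fixed attribute of the object: the identical `target_area` yields different labels under different contexts, which is exactly what a model restricted to per-object features cannot capture.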

    HiER 2015. Proceedings of the 9th Hildesheimer Evaluierungs- und Retrievalworkshop

    Digitalization is shaping our information environments. Disruptive technologies are entering our everyday lives ever more strongly and rapidly, changing our information and communication behavior. Information markets are in transition. The 9th Hildesheimer Evaluierungs- und Retrievalworkshop (HiER 2015) addresses the design and evaluation of information systems against the background of this accelerating digitalization. The focus is on the following topics: digital humanities, internet search and online marketing, information seeking and user-centered development, and e-learning

    Examples and Specifications that Prove a Point: Identifying Elaborative and Argumentative Discourse Relations

    Examples and specifications occur frequently in text, but little is known about how they function in discourse and how readers interpret them. Looking at how they are annotated in existing discourse corpora, we find that annotators often disagree on these types of relations; specifically, there is disagreement about whether these relations are elaborative (additive) or argumentative (pragmatic causal). To investigate how readers interpret examples and specifications, we conducted a crowdsourced discourse annotation study. The results show that these relations can indeed have two functions: they can both illustrate/specify a situation and serve as an argument for a claim. These findings suggest that examples and specifications can have multiple simultaneous readings. We discuss the implications of these results for discourse annotation.
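A minimal way to surface the disagreement pattern described above is to look at how annotator votes distribute over the two candidate labels for each item. The helper below is an illustrative sketch (the function name and label strings are assumptions, not the study's actual annotation schema).

```python
from collections import Counter

def label_distribution(annotations: list[str]) -> dict[str, float]:
    """Fraction of annotators choosing each relation label for one item."""
    counts = Counter(annotations)
    total = len(annotations)
    return {label: n / total for label, n in counts.items()}

# An item whose votes split between the two readings discussed above:
dist = label_distribution(["elaborative", "argumentative", "elaborative",
                           "argumentative", "elaborative"])
print(dist)  # {'elaborative': 0.6, 'argumentative': 0.4}
```

Items with near-even splits like this one are the interesting cases: under the two-function view, such splits reflect genuinely simultaneous readings rather than annotator error.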

    The value of numbers in clinical text classification

    Clinical text often includes numbers of various types and formats. However, most current text classification approaches do not take advantage of these numbers. This study aims to demonstrate that using numbers as features can significantly improve the performance of text classification models, and that such features can feasibly be extracted from clinical text. Unsupervised learning was used to identify patterns of number usage in clinical text. These patterns were analyzed manually and converted into pattern-matching rules. Information extraction was then used to incorporate numbers as features into a document representation model, and we evaluated text classification models trained on that representation. Our experiments were performed with two document representation models (vector space model and word embedding model) and two classification models (support vector machines and neural networks). The results showed that even a handful of numerical features can significantly improve text classification performance. We conclude that commonly used document representations do not represent numbers in a way that machine learning algorithms can effectively exploit as features. Although we demonstrated that traditional information extraction can effectively convert numbers into features, further community-wide research is required to systematically incorporate number representation into the word embedding process
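The pattern-matching rules mentioned above can be sketched as a small set of regular expressions that turn numeric mentions into named features. The rule set, feature names, and example note below are assumptions for illustration, not the study's actual rules or data.

```python
import re

# Illustrative rules mapping numeric mentions to named features.
RULES = {
    "blood_pressure": re.compile(r"\b(\d{2,3})\s*/\s*(\d{2,3})\b"),
    "temperature_c":  re.compile(r"\b(3[5-9](?:\.\d)?)\s*C\b", re.I),
    "dose_mg":        re.compile(r"\b(\d+(?:\.\d+)?)\s*mg\b", re.I),
}

def numeric_features(text: str) -> dict[str, float]:
    """Extract the first match for each rule as a named numeric feature."""
    feats = {}
    for name, pattern in RULES.items():
        m = pattern.search(text)
        if m:
            feats[name] = float(m.group(1))
    return feats

note = "BP 120/80, temp 37.2 C, started amoxicillin 500 mg TID"
print(numeric_features(note))
# {'blood_pressure': 120.0, 'temperature_c': 37.2, 'dose_mg': 500.0}
```

The resulting feature dictionary can then be appended to whatever document representation the classifier uses (e.g. as extra dimensions alongside a bag-of-words vector), which is the step that lets the model exploit the numbers rather than treat them as opaque tokens.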

    A corpus analysis of online news comments using the Appraisal framework

    We present detailed analyses of the distribution of Appraisal categories (Martin and White, 2005) in a corpus of online news comments. The corpus consists of just over one thousand comments posted in response to a variety of opinion pieces on the website of the Canadian newspaper The Globe and Mail. We annotated all the comments with labels corresponding to the different categories of the Appraisal framework. Analyses of the annotations show that comments are overwhelmingly negative, and that they favour two of the subtypes of Attitude (Judgement and Appreciation) over the third, Affect. The paper contributes a methodology for annotating Appraisal, and results that show the interaction of Appraisal with negation, the constructive (or not) nature of comments, and the level of toxicity found in them. The results show that highly opinionated language tends to be expressed as an objective opinion (Judgement and Appreciation) rather than as an emotional reaction (Affect). This finding, together with the interplay of evaluative language with constructiveness and toxicity in the comments, can be applied to the automatic moderation of comments
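The distributional analysis described above reduces to tallying subtype and polarity labels over the annotated comments. The toy annotations below are invented for illustration (only the category names come from the Appraisal framework cited above); the real corpus is, of course, far larger.

```python
from collections import Counter

# Each annotation: (Attitude subtype, polarity). Data is illustrative only.
annotations = [
    ("Judgement", "negative"), ("Appreciation", "negative"),
    ("Judgement", "negative"), ("Affect", "negative"),
    ("Appreciation", "positive"), ("Judgement", "negative"),
]

subtype_counts = Counter(subtype for subtype, _ in annotations)
polarity_counts = Counter(polarity for _, polarity in annotations)

print(subtype_counts.most_common())   # Judgement and Appreciation dominate
print(polarity_counts.most_common())  # comments skew negative
```

Even this tiny sample mirrors the reported pattern: Judgement and Appreciation outnumber Affect, and negative polarity dominates, which is the kind of aggregate signal a moderation pipeline could consume.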