457 research outputs found
On the Processing and Analysis of Microtexts: From Normalization to Semantics
Trátase dun resumo estendido da ponencia[Abstract] User-generated content published on microblogging social platforms constitutes an invaluable source of information for diverse purposes: health surveillance, business intelligence, political analysis, etc. We present an overview of our work on the field of microtext processing covering the entire pipeline: from input preprocessing to high-level text mining applications.Ministerio de EconomÃa Industria y Competitividad; FFI2014-51978-C2-2-RMinisterio de EconomÃa Industria y Competitividad; TIN2017–85160–C2–1-RMinisterio de EconomÃa Industria y Competitividad; TIN2017–85160–C2–2-RMinisterio de EconomÃa Industria y Competitividad; BES-2015-073768Ministerio de EconomÃa Industria y Competitividad; FFI2014-51978-C2-1-RXunta de Galicia; ED431D 2017/1
Language and Linguistics in a Complex World Data, Interdisciplinarity, Transfer, and the Next Generation. ICAME41 Extended Book of Abstracts
This is a collection of papers, work-in-progress reports, and other contributions that were part of the ICAME41 digital conference
Language and Linguistics in a Complex World Data, Interdisciplinarity, Transfer, and the Next Generation. ICAME41 Extended Book of Abstracts
This is a collection of papers, work-in-progress reports, and other contributions that were part of the ICAME41 digital conference
Descubriendo temas en Twitter sobre el brote del COVID-19 en España
[Resumen] En este trabajo, analizamos lo que los usuarios han estado discutiendo en Twitter durante el comienzo de la pandemia causada por el COVID-19. Concretamente, analizamos tres fases diferenciadas de la crisis del COVID-19 en España: el propio tiempo de pre-crisis, el estallido de la enfermedad y el confinamiento. Para llevar esto a cabo, primero recolectamos una gran cantidad de tuits que son preprocesados. A continuación, agrupamos los tuits en distintas temáticas usando un modelo de Latent Dirichlet Allocation, y definimos estrategias generativas y discriminativas para extraer las palabras clave y oraciones más representativas para cada tema. Finalmente, incluimos un exhaustivo análisis cualitativo sobre dichos temas, y cómo estos se corresponden con distintas problemáticas surgidas en España en distintos momentos de la crisis.[Abstract] In this work, we apply topic modeling to study what users have been discussing in Twitter during the beginning of the COVID-19 pandemic. More particularly, we explore the period of time that includes three differentiated phases of the COVID-19 crisis in Spain: the pre-crisis time, the outbreak, and the beginning of the lockdown. To do so, we first collect a large corpus of Spanish tweets and clean them. Then, we cluster the tweets into topics using a Latent Dirichlet Allocation model, and define generative and discriminative routes to later extract the most relevant keywords and sentences for each topic. Finally, we provide an exhaustive qualitative analysis about how such topics correspond to the situation in Spain at different stages of the crisis.MMAT has been partially funded by Barcelona Supercomputing Center (BSC) through the Spanish Plan for advancement of Language Technologies `Plan TL' and the SecretarÃa de Estado de Digitalización e Inteligencia Artificial (SEDIA). DV is supported by MINECO (TIN2017-85160-C2-1-R), by Xunta de Galicia (ED431C 2020/11), by Centro de Investigación de Galicia `CITIC' (European Regional Development Fund-Galicia 2014-2020 Program, ED431G 2019/01), and by a 2020 Leonardo Grant for Researchers and Cultural Creators from the BBVA FoundationXunta de Galicia; ED431C 2020/11Xunta de Galicia; ED431G 2019/0
Introduction to the second international symposium of platial information science
People ‘live’ and constitute places every day through recurrent practices and experience. Our everyday lives, however, are complex, and so are places. In contrast to abstract space, the way people experience places includes a range of aspects like physical setting, meaning, and emotional attachment. This inherent complexity requires researchers to investigate the concept of place from a variety of viewpoints. The formal representation of place – a major goal in GIScience related to place – is no exception and can only be successfully addressed if we consider geographical, psychological, anthropological, sociological, cognitive, and other perspectives. This year’s symposium brings together place-based researchers from different disciplines to discuss the current state of platial research. Therefore, this volume contains contributions from a range of fields including geography, psychology, cognitive science, linguistics, and cartography
Recommended from our members
Large-scale Affective Computing for Visual Multimedia
In recent years, Affective Computing has arisen as a prolific interdisciplinary field for engineering systems that integrate human affections. While human-computer relationships have long revolved around cognitive interactions, it is becoming increasingly important to account for human affect, or feelings or emotions, to avert user experience frustration, provide disability services, predict virality of social media content, etc. In this thesis, we specifically focus on Affective Computing as it applies to large-scale visual multimedia, and in particular, still images, animated image sequences and video streams, above and beyond the traditional approaches of face expression and gesture recognition. By taking a principled psychology-grounded approach, we seek to paint a more holistic and colorful view of computational affect in the context of visual multimedia. For example, should emotions like 'surprise' and `fear' be assumed to be orthogonal output dimensions? Or does a 'positive' image in one culture's view elicit the same feelings of positivity in another culture? We study affect frameworks and ontologies to define, organize and develop machine learning models with such questions in mind to automatically detect affective visual concepts.
In the push for what we call "Big Affective Computing," we focus on two dimensions of scale for affect -- scaling up and scaling out -- which we propose are both imperative if we are to scale the Affective Computing problem successfully. Intuitively, simply increasing the number of data points corresponds to "scaling up". However, less intuitive, is when problems like Affective Computing "scale out," or diversify. We show that this latter dimension of introducing data variety, alongside the former of introducing data volume, can yield particular insights since human affections naturally depart from traditional Machine Learning and Computer Vision problems where there is an objectively truthful target. While no one might debate a picture of a 'dog' should be tagged as a 'dog,' but not all may agree that it looks 'ugly'. We present extensive discussions on why scaling out is critical and how it can be accomplished while in the context of large-volume visual data.
At a high-level, the main contributions of this thesis include:
Multiplicity of Affect Oracles:
Prior to the work in this thesis, little consideration has been paid to the affective label generating mechanism when learning functional mappings between inputs and labels. Throughout this thesis but first in Chapter 2, starting in Section 2.1.2, we make a case for a conceptual partitioning of the affect oracle governing the label generation process in Affective Computing problems resulting a multiplicity of oracles, whereas prior works assumed there was a single universal oracle. In Chapter 3, the differences between intended versus expressed versus induced versus perceived emotion are discussed, where we argue that perceived emotion is particularly well-suited for scaling up because it reduces the label variance due to its more objective nature compared to other affect states. And in Chapter 4 and 5, a division of the affect oracle along cultural lines with manifestations along both language and geography is explored. We accomplish all this without sacrificing the 'scale up' dimension, and tackle significantly larger volume problems than prior comparable visual affective computing research.
Content-driven Visual Affect Detection:
Traditionally, in most Affective Computing work, prediction tasks use psycho-physiological signals from subjects viewing the stimuli of interest, e.g., a video advertisement, as the system inputs. In essence, this means that the machine learns to label a proxy signal rather than the stimuli itself. In this thesis, with the rise of strong Computer Vision and Multimedia techniques, we focus on the learning to label the stimuli directly without a human subject provided biometric proxy signal (except in the unique circumstances of Chapter 7). This shift toward learning from the stimuli directly is important because it allows us to scale up with much greater ease given that biometric measurement acquisition is both low-throughput and somewhat invasive while stimuli are often readily available. In addition, moving toward learning directly from the stimuli will allow researchers to precisely determine which low-level features in the stimuli are actually coupled with affect states, e.g., which set of frames caused viewer discomfort rather a broad sense that a video was discomforting. In Part I of this thesis, we illustrate an emotion prediction task with a psychology-grounded affect representation. In particular, in Chapter 3, we develop a prediction task over semantic emotional classes, e.g., 'sad,' 'happy' and 'angry,' using animated image sequences given annotations from over 2.5 million users. Subsequently, in Part II, we develop visual sentiment and adjective-based semantics models from million-scale digital imagery mined from a social multimedia platform.
Mid-level Representations for Visual Affect:
While discrete semantic emotions and sentiment are classical representations of affect with decades of psychology grounding, the interdisciplinary nature of Affective Computing, now only about two decades old, allows for new avenues of representation. Mid-level representations have been proposed in numerous Computer Vision and Multimedia problems as an intermediary, and often more computable, step toward bridging the semantic gap between low-level system inputs and high-level label semantic abstractions. In Part II, inspired by this work, we adapt it for vision-based Affective Computing and adopt a semantic construct called adjective-noun pairs. Specifically, in Chapter 4, we explore the use of such adjective-noun pairs in the context of a social multimedia platform and develop a multilingual visual sentiment ontology with over 15,000 affective mid-level visual concepts across 12 languages associated with over 7.3 million images and representations from over 235 countries, resulting in the largest affective digital image corpus in both depth and breadth to date. In Chapter 5, we develop computational methods to predict such adjective-noun pairs and also explore their usefulness in traditional sentiment analysis but with a previously unexplored cross-lingual perspective. And in Chapter 6, we propose a new learning setting called 'cross-residual learning' building off recent successes in deep neural networks, and specifically, in residual learning; we show that cross-residual learning can be used effectively to jointly learn across even multiple related tasks in object detection (noun), more traditional affect modeling (adjectives), and affective mid-level representations (adjective-noun pairs), giving us a framework for better grounding the adjective-noun pair bridge in both vision and affect simultaneously
HiER 2015. Proceedings des 9. Hildesheimer Evaluierungs- und Retrievalworkshop
Die Digitalisierung formt unsere Informationsumwelten. Disruptive Technologien dringen verstärkt und immer schneller in unseren Alltag ein und verändern unser Informations- und Kommunikationsverhalten. Informationsmärkte wandeln sich. Der 9. Hildesheimer Evaluierungs- und Retrievalworkshop HIER 2015 thematisiert die Gestaltung und Evaluierung von Informationssystemen vor dem Hintergrund der sich beschleunigenden Digitalisierung. Im Fokus stehen die folgenden Themen: Digital Humanities, Internetsuche und Online Marketing, Information Seeking und nutzerzentrierte Entwicklung, E-Learning
Over-reliance on English hinders cognitive science
English is the dominant language in the study of human cognition and behavior: the individuals studied by cognitive scientists, as well as most of the scientists themselves, are frequently English speakers. However, English differs from other languages in ways that have consequences for the whole of the cognitive sciences, reaching far beyond the study of language itself. Here, we review an emerging body of evidence that highlights how the particular characteristics of English and the linguistic habits of English speakers bias the field by both warping research programs (e.g., overemphasizing features and mechanisms present in English over others) and overgeneralizing observations from English speakers’ behaviors, brains, and cognition to our entire species. We propose mitigating strategies that could help avoid some of these pitfalls
- …