    On the Processing and Analysis of Microtexts: From Normalization to Semantics

    This is an extended abstract of the conference presentation.
    [Abstract] User-generated content published on microblogging social platforms constitutes an invaluable source of information for diverse purposes: health surveillance, business intelligence, political analysis, etc. We present an overview of our work in the field of microtext processing, covering the entire pipeline from input preprocessing to high-level text mining applications.
    Funding: Ministerio de Economía Industria y Competitividad (FFI2014-51978-C2-2-R, TIN2017-85160-C2-1-R, TIN2017-85160-C2-2-R, BES-2015-073768, FFI2014-51978-C2-1-R); Xunta de Galicia (ED431D 2017/1).

    Language and Linguistics in a Complex World Data, Interdisciplinarity, Transfer, and the Next Generation. ICAME41 Extended Book of Abstracts

    This is a collection of papers, work-in-progress reports, and other contributions that were part of the ICAME41 digital conference.

    Descubriendo temas en Twitter sobre el brote del COVID-19 en España

    [Abstract] In this work, we apply topic modeling to study what users have been discussing on Twitter during the beginning of the COVID-19 pandemic. In particular, we explore a period that includes three distinct phases of the COVID-19 crisis in Spain: the pre-crisis time, the outbreak, and the beginning of the lockdown. To do so, we first collect a large corpus of Spanish tweets and clean them. Then, we cluster the tweets into topics using a Latent Dirichlet Allocation model, and define generative and discriminative routes to extract the most relevant keywords and sentences for each topic. Finally, we provide an exhaustive qualitative analysis of how these topics correspond to the situation in Spain at different stages of the crisis.
    Funding: MMAT has been partially funded by Barcelona Supercomputing Center (BSC) through the Spanish Plan for advancement of Language Technologies 'Plan TL' and the Secretaría de Estado de Digitalización e Inteligencia Artificial (SEDIA). DV is supported by MINECO (TIN2017-85160-C2-1-R), by Xunta de Galicia (ED431C 2020/11), by Centro de Investigación de Galicia 'CITIC' (European Regional Development Fund-Galicia 2014-2020 Program, ED431G 2019/01), and by a 2020 Leonardo Grant for Researchers and Cultural Creators from the BBVA Foundation. Xunta de Galicia; ED431C 2020/11. Xunta de Galicia; ED431G 2019/0

    Introduction to the second international symposium of platial information science

    People ‘live’ and constitute places every day through recurrent practices and experience. Our everyday lives, however, are complex, and so are places. In contrast to abstract space, the way people experience places includes a range of aspects like physical setting, meaning, and emotional attachment. This inherent complexity requires researchers to investigate the concept of place from a variety of viewpoints. The formal representation of place – a major goal in GIScience related to place – is no exception and can only be successfully addressed if we consider geographical, psychological, anthropological, sociological, cognitive, and other perspectives. This year’s symposium brings together place-based researchers from different disciplines to discuss the current state of platial research. Therefore, this volume contains contributions from a range of fields including geography, psychology, cognitive science, linguistics, and cartography.

    HiER 2015. Proceedings des 9. Hildesheimer Evaluierungs- und Retrievalworkshop

    Digitalization shapes our information environments. Disruptive technologies are entering our everyday lives ever more intensively and rapidly, changing how we handle information and communicate. Information markets are changing. The 9th Hildesheim Evaluation and Retrieval Workshop, HIER 2015, addresses the design and evaluation of information systems against the background of accelerating digitalization. The focus is on the following topics: digital humanities, internet search and online marketing, information seeking and user-centered development, and e-learning.

    Over-reliance on English hinders cognitive science

    English is the dominant language in the study of human cognition and behavior: the individuals studied by cognitive scientists, as well as most of the scientists themselves, are frequently English speakers. However, English differs from other languages in ways that have consequences for the whole of the cognitive sciences, reaching far beyond the study of language itself. Here, we review an emerging body of evidence that highlights how the particular characteristics of English and the linguistic habits of English speakers bias the field by both warping research programs (e.g., overemphasizing features and mechanisms present in English over others) and overgeneralizing observations from English speakers’ behaviors, brains, and cognition to our entire species. We propose mitigating strategies that could help avoid some of these pitfalls.