13 research outputs found
Overview of the GermEval 2020 shared task on Swiss German language identification
In this paper, we present the findings of the Shared Task on Swiss German Language Identification organised as part of the 7th edition of GermEval, co-locatedwith SwissText and KONVENS 2020
Human-in-the-Loop Hate Speech Classification in a Multilingual Context
The shift of public debate to the digital sphere has been accompanied by a rise in online hate speech. While many promising approaches for hate speech classification have been pro- posed, studies often focus only on a single language, usually English, and do not address three key concerns: post-deployment perfor- mance, classifier maintenance and infrastruc- tural limitations. In this paper, we introduce a new human-in-the-loop BERT-based hate speech classification pipeline and trace its de- velopment from initial data collection and an- notation all the way to post-deployment. Our classifier, trained using data from our original corpus of over 422k examples, is specifically developed for the inherently multilingual set- ting of Switzerland and outperforms with its F1 score of 80.5 the currently best-performing BERT-based multilingual classifier by 5.8 F1 points in German and 3.6 F1 points in French. Our systematic evaluations over a 12-month period further highlight the vital importance of continuous, human-in-the-loop classifier main- tenance to ensure robust hate speech classifica- tion post-deployment
ZHAW-InIT : social media geolocation at VarDial 2020
We describe our approaches for the Social Media Geolocation (SMG) task at the VarDial Evaluation Campaign 2020. The goal was to predict geographical location (latitudes and longitudes) given an input text. There were three subtasks corresponding to German-speaking Switzerland (CH), Germany and Austria (DE-AT), and Croatia, Bosnia and Herzegovina, Montenegro and Serbia (BCMS). We submitted solutions to all subtasks but focused our development efforts on the CH subtask, where we achieved third place out of 16 submissions with a median distance of 15.93 km and had the best result of 14 unconstrained systems. In the DE-AT subtask, we ranked sixth out of ten submissions (fourth of 8 unconstrained systems) and for BCMS we achieved fourth place out of 13 submissions (second of 11 unconstrained systems)
Multimodal Emotion Recognition among Couples from Lab Settings to Daily Life using Smartwatches
Couples generally manage chronic diseases together and the management takes
an emotional toll on both patients and their romantic partners. Consequently,
recognizing the emotions of each partner in daily life could provide an insight
into their emotional well-being in chronic disease management. The emotions of
partners are currently inferred in the lab and daily life using self-reports
which are not practical for continuous emotion assessment or observer reports
which are manual, time-intensive, and costly. Currently, there exists no
comprehensive overview of works on emotion recognition among couples.
Furthermore, approaches for emotion recognition among couples have (1) focused
on English-speaking couples in the U.S., (2) used data collected from the lab,
and (3) performed recognition using observer ratings rather than partner's
self-reported / subjective emotions. In this body of work contained in this
thesis (8 papers - 5 published and 3 currently under review in various
journals), we fill the current literature gap on couples' emotion recognition,
develop emotion recognition systems using 161 hours of data from a total of
1,051 individuals, and make contributions towards taking couples' emotion
recognition from the lab which is the status quo, to daily life. This thesis
contributes toward building automated emotion recognition systems that would
eventually enable partners to monitor their emotions in daily life and enable
the delivery of interventions to improve their emotional well-being.Comment: PhD Thesis, 2022 - ETH Zuric
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020
Welcome to EVALITA 2020! EVALITA is the evaluation campaign of Natural Language Processing and Speech Tools for Italian. EVALITA is an initiative of the Italian Association for Computational Linguistics (AILC, http://www.ai-lc.it) and it is endorsed by the Italian Association for Artificial Intelligence (AIxIA, http://www.aixia.it) and the Italian Association for Speech Sciences (AISV, http://www.aisv.it)
Geographic information extraction from texts
A large volume of unstructured texts, containing valuable geographic information, is available online. This information – provided implicitly or explicitly – is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although large progress has been achieved in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data, to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss the recent advances, new ideas, and concepts but also identify research gaps in geographic information extraction
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020
Welcome to EVALITA 2020! EVALITA is the evaluation campaign of Natural Language Processing and Speech Tools for Italian. EVALITA is an initiative of the Italian Association for Computational Linguistics (AILC, http://www.ai-lc.it) and it is endorsed by the Italian Association for Artificial Intelligence (AIxIA, http://www.aixia.it) and the Italian Association for Speech Sciences (AISV, http://www.aisv.it)