Speech Emotion Diarization: Which Emotion Appears When?
Speech Emotion Recognition (SER) typically operates at the utterance level.
Emotions conveyed through speech, however, are better treated as discrete
speech events with definite temporal boundaries than as attributes of the
entire utterance. To reflect this fine-grained nature of speech emotions, we
propose a new task: Speech Emotion Diarization (SED). Just as Speaker
Diarization answers the question "Who speaks when?", Speech Emotion
Diarization answers the question "Which emotion appears when?". To facilitate
performance evaluation and establish a common benchmark for researchers, we
introduce the Zaion Emotion Dataset (ZED), an openly accessible speech emotion
dataset that includes non-acted emotions recorded in real-life conditions,
along with manually annotated boundaries of emotion segments within each
utterance. We provide competitive baselines and open-source the code and
pre-trained models.
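As a minimal sketch of what an SED system's output and scoring might look like: the segment representation, the default "neutral" label, and the frame-level error rate below are illustrative assumptions, not the paper's defined metric.

```python
# Illustrative Speech Emotion Diarization output and scoring.
# Assumptions (not from the paper): segments are (start, end, label)
# spans, and the error is a simple frame-level mismatch rate.

from dataclasses import dataclass

@dataclass
class EmotionSegment:
    start: float  # seconds
    end: float    # seconds
    label: str    # e.g. "neutral", "angry"

def frame_labels(segments, duration, hop=0.01, default="neutral"):
    """Rasterize labeled segments onto fixed-size frames of `hop` seconds."""
    n = int(round(duration / hop))
    labels = [default] * n
    for seg in segments:
        lo = int(round(seg.start / hop))
        hi = min(int(round(seg.end / hop)), n)
        for i in range(lo, hi):
            labels[i] = seg.label
    return labels

def frame_error_rate(reference, hypothesis, duration, hop=0.01):
    """Fraction of frames where the predicted emotion disagrees with reference."""
    ref = frame_labels(reference, duration, hop)
    hyp = frame_labels(hypothesis, duration, hop)
    return sum(r != h for r, h in zip(ref, hyp)) / len(ref)

# Example: a 3 s utterance with an angry burst in the middle.
ref = [EmotionSegment(0.8, 1.6, "angry")]
hyp = [EmotionSegment(0.9, 1.8, "angry")]
print(f"frame error rate: {frame_error_rate(ref, hyp, duration=3.0):.3f}")
```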
AnuraSet: A dataset for benchmarking Neotropical anuran calls identification in passive acoustic monitoring
Global change is predicted to induce shifts in anuran acoustic behavior,
which can be studied through passive acoustic monitoring (PAM). Understanding
changes in calling behavior requires identifying anuran species, which is
challenging given the particular characteristics of Neotropical soundscapes.
In this paper, we introduce a large-scale multi-species dataset of anuran
amphibian calls recorded via PAM, comprising 27 hours of expert annotations
for 42 different species from two Brazilian biomes. We provide open access to
the dataset, including the raw recordings, experimental setup code, and a
benchmark with a baseline model for the fine-grained categorization problem.
Additionally, we highlight the challenges of the dataset to encourage machine
learning researchers to tackle anuran call identification in support of
conservation policy. All our experiments and resources can be found in our
GitHub repository: https://github.com/soundclim/anuraset
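To make the shape of such a benchmark concrete, here is a minimal loading sketch in Python. The metadata columns, directory layout, and sampling rate are illustrative assumptions; the actual experimental setup code lives in the GitHub repository above.

```python
# Hypothetical loader for a PAM call-identification dataset like AnuraSet.
# Assumed (not the real layout): a metadata CSV with columns
# "filename", "species", "start", "end", and WAVs under data/audio/.

import numpy as np
import pandas as pd
import librosa

SR = 22050  # assumed sampling rate for feature extraction

def load_clip(row, audio_dir="data/audio"):
    """Load one annotated call segment and return a log-mel spectrogram."""
    path = f"{audio_dir}/{row.filename}"
    y, sr = librosa.load(path, sr=SR, offset=row.start,
                         duration=row.end - row.start)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
    return librosa.power_to_db(mel, ref=np.max)

meta = pd.read_csv("data/metadata.csv")
print(f"{len(meta)} annotated segments, {meta.species.nunique()} species")
features = [load_clip(row) for row in meta.itertuples()]
```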
Human-centred artificial intelligence for mobile health sensing: challenges and opportunities
Advances in wearable sensing and mobile computing have enabled the collection of health and well-being data outside of traditional laboratory and hospital settings, paving the way for a new era of mobile health. Meanwhile, artificial intelligence (AI) has made significant strides in various domains, demonstrating its potential to revolutionize healthcare. Devices can now help diagnose diseases, predict heart irregularities, and support human cognition. However, applying machine learning (ML) to mobile health sensing poses unique challenges: noisy sensor measurements, high-dimensional data, sparse and irregular time series, data heterogeneity, privacy concerns, and resource constraints. Despite wide recognition of the value of mobile sensing, leveraging these datasets has lagged behind other areas of ML, and obtaining quality annotations and ground truth for such data is often expensive or impractical. While recent large-scale longitudinal studies have shown promise in leveraging wearable sensor data for health monitoring and prediction, they also introduce new challenges for data modelling. This paper explores the challenges and opportunities of human-centred AI for mobile health, focusing on key sensing modalities such as audio, location, and activity tracking. We discuss the limitations of current approaches and propose potential solutions.
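One of the challenges named above, sparse and irregular time series, has a common first preprocessing step: resampling onto a regular grid while keeping an explicit missingness mask, so a model can distinguish "no reading" from a real value. A minimal sketch follows; the heart-rate signal, one-minute grid, and carry-forward imputation are illustrative assumptions, not the paper's method.

```python
# Sketch: regularizing irregular wearable readings with a missingness mask.

import numpy as np
import pandas as pd

# Irregularly timestamped heart-rate readings (simulated).
rng = np.random.default_rng(0)
times = pd.to_datetime("2024-01-01") + pd.to_timedelta(
    np.sort(rng.uniform(0, 3600, size=40)), unit="s")
hr = pd.Series(rng.normal(70, 8, size=40), index=times, name="heart_rate")

# Resample to a regular 1-minute grid; empty bins become NaN.
grid = hr.resample("1min").mean()
mask = grid.notna().astype(int)      # 1 = observed, 0 = missing
filled = grid.ffill().bfill()        # carry forward, back-fill the leading gap

model_input = pd.DataFrame({"hr": filled, "observed": mask})
print(model_input.head())
```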