17 research outputs found

Real-time acoustic event classification in urban environments using low-cost devices

Author: Vidaña Vila Ester
Publication venue: Blanquerna - Universitat Ramon Llull
Publication date: 06/04/2022

In modern, ever-evolving society, noise has become a daily threat to a worrying share of the population. Overexposure to high noise levels may interfere with day-to-day activities and could bring severe health side effects such as annoyance, cognitive impairment in children, or cardiovascular diseases. Some studies point out that it is not only the noise level that matters but also the type of sound that citizens are exposed to; that is, not all acoustic events have the same impact on the population. With the technologies currently used to track noise levels, by both private and public administrations, it is hard to automatically identify which sounds are most present in the most polluted areas. In fact, to assess citizen complaints, technicians are typically sent to the area in question to evaluate whether the complaint is relevant. Due to the high number of complaints generated every day (especially in highly populated areas), the development of Wireless Acoustic Sensor Networks (WASN) that automatically monitor the noise pollution of a given area has become a research trend. Currently, most networks deployed in cities measure only the equivalent noise level by means of expensive but highly accurate hardware and cannot identify the noise sources present at each spot. Given the high price of these sensors, nodes are typically placed in specific locations and do not monitor wide areas. The purpose of this thesis is to address an important challenge still latent in this field: to acoustically monitor large-scale areas in real time and in a scalable and cost-efficient way. In this regard, the city centre of Barcelona has been selected as a reference use case to conduct this research. First, this dissertation starts with an accurate analysis of a 6-hour annotated dataset corresponding to the soundscape of a specific area of the city (l'Eixample). Next, a scalable distributed architecture that uses low-cost computing devices to recognize acoustic events is presented. To validate the feasibility of this approach, a deep learning algorithm running on top of this architecture has been implemented to classify 10 different acoustic categories. As the sensing nodes of the proposed system are arranged so as to take advantage of physical redundancy (that is, more than one node may hear the same acoustic event), data has been gathered at four spots in the city centre of Barcelona respecting the sensor topology. Finally, as real-world events tend to occur simultaneously, the deep learning algorithm has been enhanced to support multilabel (i.e., polyphonic) classification. Results show that, with the proposed system architecture, it is possible to classify acoustic events in real time. Overall, the contributions of this research are the following: (1) the design of a low-cost, scalable WASN able to monitor large-scale areas and (2) the development of a real-time classification algorithm able to run on the designed sensing nodes.

Tesis Doctorals en Xarxa

Western Mediterranean wetlands bird species classification: evaluating small-footprint deep learning approaches on a new annotated dataset

Author: Gómez-Gómez Juan
Sevillano Xavier
Vidaña-Vila Ester
Publication date: 12/07/2022

The deployment of an expert system running over a wireless acoustic sensor network made up of bioacoustic monitoring devices that recognise bird species from their sounds would enable the automation of many tasks of ecological value, including the analysis of bird population composition or the detection of endangered species in areas of environmental interest. Endowing these devices with accurate audio classification capabilities is possible thanks to the latest advances in artificial intelligence, among which deep learning techniques excel. However, a key issue in making bioacoustic devices affordable is the use of small-footprint deep neural networks that can be embedded in resource- and battery-constrained hardware platforms. For this reason, this work presents a critical comparative analysis between two heavy, large-footprint deep neural networks (VGG16 and ResNet50) and a lightweight alternative, MobileNetV2. Our experimental results reveal that MobileNetV2 achieves an average F1-score less than 5% lower than ResNet50 (0.789 vs. 0.834), performing better than VGG16 with a footprint nearly 40 times smaller. Moreover, to compare the models, we have created and made public the Western Mediterranean Wetland Birds dataset, consisting of 201.6 minutes and 5,795 audio excerpts of 20 endemic bird species of the Aiguamolls de l'Empordà Natural Park. Comment: 17 pages, 8 figures, 3 tables

arXiv.org e-Print Archive
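
The abstract of the entry above reports two F1-scores (0.789 for MobileNetV2 and 0.834 for ResNet50) but does not say whether "less than 5% lower" is meant as an absolute or a relative difference. The following minimal sketch simply computes both readings from the two reported numbers; the function and constant names are illustrative and do not come from the paper.

```python
# F1-scores as reported in the abstract above.
F1_MOBILENETV2 = 0.789
F1_RESNET50 = 0.834


def absolute_gap(baseline, candidate):
    """Difference in F1 on the 0-1 scale."""
    return baseline - candidate


def relative_gap(baseline, candidate):
    """Difference expressed as a fraction of the baseline score."""
    return (baseline - candidate) / baseline


abs_gap = absolute_gap(F1_RESNET50, F1_MOBILENETV2)
rel_gap = relative_gap(F1_RESNET50, F1_MOBILENETV2)

print(f"absolute gap: {abs_gap:.3f}")  # absolute gap: 0.045
print(f"relative gap: {rel_gap:.1%}")  # relative gap: 5.4%
```

Read as an absolute difference, the gap is 0.045 (4.5 points on a 0-100 scale), which matches the "less than 5%" wording; read relative to ResNet50, it is about 5.4%.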

Album cover art image generation with Generative Adversarial Networks

Author: Navarro Joan
Stoppa Felipe Perez
Vidaña-Vila Ester
Publication date: 09/12/2022

Generative Adversarial Networks (GANs) were introduced by Goodfellow in 2014 and have since become popular for building generative artificial intelligence models. However, such networks have numerous drawbacks, including long training times, sensitivity to hyperparameter tuning, a wide choice of loss and optimization functions, and difficulties such as mode collapse. Current applications of GANs include generating photo-realistic human faces, animals and objects. However, I wanted to explore the artistic ability of GANs in more detail by using existing models and learning from them. This dissertation covers the basics of neural networks and works its way up to the particular aspects of GANs, together with experimentation on and modification of existing available models, from least to most complex. The intention is to see whether state-of-the-art GANs (specifically StyleGAN2) can generate album cover art and whether it is possible to tailor the covers by genre. This was attempted by first familiarizing myself with three existing GAN architectures, including the state-of-the-art StyleGAN2. The StyleGAN2 code was used to train a model on a dataset of 80K album cover images, and the model was then used to style images by picking curated images and mixing their styles.

arXiv.org e-Print Archive

Utilizando analítica del aprendizaje en una clase invertida: Experiencia de uso en la asignatura de Sistemas Digitales y Microprocesadores

Author: Amo Daniel
Canaleta Llampallas Xavier
Martínez Carme
Navarro Joan
Vidaña-Vila Ester
Publication venue: Asociación de Enseñantes Universitarios de la Informática (AENUI)
Publication date: 01/01/2018

The flipped classroom model allows the teacher to implement active learning activities in face-to-face sessions, leaving students to work on the most theoretical aspects of the syllabus during their own study time. One of the main concerns with this model is that students can easily get distracted while preparing these sessions on their own, due to the lack of teacher supervision. In fact, it is very difficult to know how much, how hard, or to what extent students have worked on the provided materials before the class starts. For this reason, many current implementations of the flipped classroom approach spend a considerable part of the face-to-face sessions verifying that students have reached the minimum level needed to conduct the class properly, which unavoidably reduces the time available for the activities that are truly valuable in this learning model. The purpose of this work is to present a tool based on learning analytics that gives the teaching staff objective and automatic information about the interactions between each student and the PDF documents uploaded to a learning management system (e.g., Moodle). In addition, the results of a proof of concept applied to the Digital Systems and Microprocessors subject are presented. The preliminary academic results obtained are aligned with the satisfaction of the teaching staff when implementing, for the first time in this subject, the flipped classroom approach.

Repositorio Institucional de la Universidad de Alicante

Cows vocalization and behavioral characterization during eutocic and dystocic calvings

Author: Alsina Pagès Rosa M.
Duboc Leticia
Freixes Marc
Guevara Raúl
Larrondo Cristian
Llonch Pol
Mainau Eva
Malé Jordi
Miranda Joana
Vidaña Vila Ester
Publication venue: Universidad de León
Publication date: 11/09/2023

Oral session 3. [EN] Calving is a painful and stressful event for dairy cows. Continuous monitoring can provide quick and accurate assistance to the cow, reducing stress and pain and preventing calving difficulties (dystocia). Vocalizations can provide information on cow welfare problems, such as pain. The aims of the current study were: (1) to characterize cows' vocalizations before and during calving and (2) to determine the relationship between cow vocalizations and pain-related behavior in eutocic and dystocic calvings.

Leon University (Spain)

Might cows have accents? Acoustic characterization of calves vocalizations from two different geographical locations

Author: Alsina Pagès Rosa M.
Cano Carmen
Carulla Patricia
Duboc Leticia
Freixes Marc
Guevara Guillermo
Larrondo Cristian
Llonch Pol
Mainau Eva
Malé Jordi
Miranda Joana
Vidaña Vila Ester
Publication venue: Universidad de León
Publication date: 11/09/2023

Oral session 2. [EN] The development of artificial intelligence algorithms and monitoring technologies has led to the increased use of sensors in animal production. Animal production stakeholders have emphasized the importance of non-invasive methods that provide accurate information without compromising the physical integrity of the animals. Animal vocalizations offer an opportunity to capture data of biological relevance without animal manipulation.

Leon University (Spain)

Prediction of the acoustic comfort of a dwelling based on automatic sound event detection

Author: Alsina-Pagès Rosa Ma
Bonet-Solà Daniel
Vidaña-Vila Ester
Publication venue: De Gruyter
Publication date: 01/12/2023

There is increasing concern about noise pollution around the world. As a first step towards tackling the problem of deteriorated urban soundscapes, this article aims to develop a tool that automatically evaluates the soundscape quality of dwellings based on the acoustic events obtained from short videos recorded on-site. A sound event classifier based on a convolutional neural network has been used to detect the sounds present in those videos. Once the events are detected, our approach proceeds in two steps. First, the detected acoustic events are used as inputs to a binary assessment system that applies logistic regression to predict whether the user's perception of the soundscape (and, therefore, the soundscape quality estimator) is categorized as "comfortable" or "uncomfortable". Additionally, an Acoustic Comfort Index (ACI) on a scale of 1–5 is estimated with a linear regression model. The system achieves an accuracy above 80% in predicting the subjective opinion of citizens based only on the sound events automatically detected on their balconies. The ultimate goal is to be able to predict an ACI at new locations using solely a 30-s video as input. The tool could offer data-driven insights to map the annoyance or pleasantness of the acoustic environment for people, and could support administrations in mitigating noise pollution and enhancing urban living conditions, contributing to improved well-being and community engagement.

Directory of Open Access Journals

Multilabel Acoustic Event Classification Using Real-World Urban Data and Physical Redundancy of Sensors

Author: Alsina-Pagès Rosa Ma
Navarro Joan
Stowell Dan
Vidaña-Vila Ester
Publication venue: MDPI AG
Publication date: 01/11/2021

Many people living in urban environments nowadays are overexposed to noise, which results in adverse effects on their health. Thus, urban sound monitoring has emerged as a powerful tool that might enable public administrations to automatically identify and quantify noise pollution, and identifying multiple simultaneous acoustic sources in these environments in a reliable and cost-effective way has become a hot research topic. The purpose of this paper is to propose a two-stage classifier able to identify, in real time, a set of up to 21 urban acoustic events that may occur simultaneously (i.e., multilabel), taking advantage of the physical redundancy of acoustic sensors in a wireless acoustic sensor network. The first stage of the proposed system consists of a multilabel deep neural network that makes a classification for each 4-s window. The second stage intelligently aggregates the first-stage classification results of four neighboring nodes to determine the final classification result. Experiments conducted with real-world data and up to three different computing devices show that the system is able to provide classification results in less than 1 s and that it performs well when classifying the most common events in the dataset. The results of this research may help civic organisations to obtain actionable noise monitoring information from automatic systems.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Tilburg University Repository
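
The abstract of the entry above describes a second stage that combines the per-window multilabel outputs of four neighboring nodes, but it does not state the aggregation rule. As a rough illustration only, the sketch below assumes the simplest possible rule: average each label's score across nodes and keep the labels whose mean reaches a threshold. The function name, the label names, the scores and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def aggregate_nodes(node_scores, threshold=0.5):
    """Combine per-node multilabel scores for one time window.

    node_scores: list of dicts mapping label -> score in [0, 1],
                 one dict per node; a missing label counts as 0.0.
    Returns the sorted labels whose mean score is >= threshold.
    NOTE: mean-and-threshold is an assumed rule for illustration,
    not the aggregation method used in the paper.
    """
    labels = set().union(*node_scores)
    active = []
    for label in sorted(labels):
        avg = mean(scores.get(label, 0.0) for scores in node_scores)
        if avg >= threshold:
            active.append(label)
    return active


# Hypothetical first-stage outputs of four neighboring nodes
# for the same 4-s window.
window = [
    {"car": 0.9, "siren": 0.4},
    {"car": 0.8, "siren": 0.7},
    {"car": 0.6, "siren": 0.6},
    {"car": 0.7, "siren": 0.1, "dog": 0.9},
]

print(aggregate_nodes(window))  # ['car']
```

Averaging is only one option; a majority vote over per-node decisions or a learned second-stage model would fit the same interface.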

Real-Time Distributed Architecture for Remote Acoustic Elderly Monitoring in Residential-Scale Ambient Assisted Living Scenarios

Author: Ester Vidaña-Vila
Joan Navarro
Marcos Hervás
Rosa Ma Alsina-Pagès
Publication venue: MDPI AG
Publication date: 01/08/2018

Ambient Assisted Living (AAL) has become a powerful alternative for improving the quality of life of elderly and partially dependent people in their own living environments. In this regard, tele-care and remote surveillance AAL applications have emerged as a hot research topic in this domain. These services aim to infer the patients' status by means of centralized architectures that collect data from a set of sensors deployed in their living environment. However, when the size of the scenario and the number of patients to be monitored increase (e.g., residential areas, retirement homes), these systems typically struggle to process all the associated data and provide a reasonable output in real time. The purpose of this paper is to present a fog-inspired distributed architecture to collect, analyze and identify up to nine acoustic events that represent abnormal behavior or dangerous health conditions in large-scale scenarios. Specifically, the proposed platform collects data from a set of wireless acoustic sensors and runs an automatic two-stage audio event classification process to decide whether or not to trigger an alarm. Experiments conducted on a labeled dataset of 7116 s, based on the priorities of the Fundació Ave Maria health experts, have obtained an overall accuracy of 94.6%.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

A two-stage approach to automatically detect and classify woodpecker (Fam. Picidae) sounds

Author: Alsina-Pagès Rosa María
Navarro Joan
Ramírez García Álvaro
Vidaña-Vila Ester
Publication venue: Elsevier BV
Publication date: 03/04/2020

Inventorying and monitoring which bird species inhabit a specific area gives rich and reliable information regarding its conservation status and other meaningful biological parameters. Typically, this surveying process is carried out manually by ornithologists and birdwatchers who spend long periods of time in the areas of interest trying to identify which species occur. Such a methodology relies on the experts' own knowledge, experience, and visual and hearing skills, which results in an expensive, subjective and error-prone process. The purpose of this paper is to present a computing-friendly system able to automatically detect and classify woodpecker acoustic signals from a real-world environment. More specifically, the proposed architecture features a two-stage Learning Classifier System that uses (1) Mel Frequency Cepstral Coefficients and Zero Crossing Rate to detect bird sounds over environmental noise, and (2) Linear Predictive Cepstral Coefficients, Perceptual Linear Predictive Coefficients and Mel Frequency Cepstral Coefficients to identify the bird species and sound type (i.e., vocal sounds such as advertising calls, excitement calls, call notes and drumming events) associated with that bird sound. Experiments conducted on a data set of the known woodpecker species belonging to the Picidae family that live in the Iberian Peninsula have resulted in an overall accuracy of 94.02%, which endorses the feasibility of this proposal and encourages practitioners to work in this direction.

Docta Complutense
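
Of the features named in the abstract of the entry above, the Zero Crossing Rate is the simplest to illustrate. The sketch below assumes a frame given as a plain list of samples and normalizes the count by the number of consecutive sample pairs; this is one common convention, and the abstract does not specify which variant or frame length the authors used. The function name is illustrative.

```python
def zero_crossing_rate(frame):
    """Fraction of consecutive sample pairs whose signs differ.

    frame: sequence of audio samples (ints or floats).
    A sample equal to zero is treated as non-negative.
    Returns 0.0 for frames with fewer than two samples.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1
        for prev, cur in zip(frame, frame[1:])
        if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


# A rapidly alternating frame crosses zero at every step,
# while a frame that stays positive never does.
print(zero_crossing_rate([1, -1, 1, -1]))  # 1.0
print(zero_crossing_rate([1, 2, 3]))       # 0.0
```

Noisy or broadband frames tend to give a higher rate than low-pitched tonal ones, which is why the feature is often paired with spectral features in a detection stage.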