Search CORE

27,028 research outputs found

Robust sound event detection in bioacoustic sensor networks

Author: Bello Juan Pablo
Farnsworth Andrew
Kelling Steve
Lostanlen Vincent
Salamon Justin
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2019
Field of study

Bioacoustic sensors, sometimes known as autonomous recording units (ARUs), can record sounds of wildlife over long periods of time in scalable and minimally invasive ways. Deriving per-species abundance estimates from these sensors requires detection, classification, and quantification of animal vocalizations as individual acoustic events. Yet, variability in ambient noise, both over time and across sensors, hinders the reliability of current automated systems for sound event detection (SED), such as convolutional neural networks (CNN) in the time-frequency domain. In this article, we develop, benchmark, and combine several machine listening techniques to improve the generalizability of SED models across heterogeneous acoustic environments. As a case study, we consider the problem of detecting avian flight calls from a ten-hour recording of nocturnal bird migration, recorded by a network of six ARUs in the presence of heterogeneous background noise. Starting from a CNN yielding state-of-the-art accuracy on this task, we introduce two noise adaptation techniques, respectively integrating short-term (60 milliseconds) and long-term (30 minutes) context. First, we apply per-channel energy normalization (PCEN) in the time-frequency domain, which applies short-term automatic gain control to every subband in the mel-frequency spectrogram. Secondly, we replace the last dense layer in the network by a context-adaptive neural network (CA-NN) layer. Combining them yields state-of-the-art results that are unmatched by artificial data augmentation alone. We release a pre-trained version of our best performing system under the name of BirdVoxDetect, a ready-to-use detector of avian flight calls in field recordings.Comment: 32 pages, in English. Submitted to PLOS ONE journal in February 2019; revised August 2019; published October 201

arXiv.org e-Print Archive

Directory of Open Access Journals

Semi-automatic semantic enrichment of raw sensor data

Author: Jones Gareth J.F.
Legeay Nicolas
O'Connor Noel E.
Roantree Mark
Smeaton Alan F.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007
Field of study

One of the more recent sources of large volumes of generated data is sensor devices, where dedicated sensing equipment is used to monitor events and happenings in a wide range of domains, including monitoring human biometrics. In recent trials to examine the effects that key moments in movies have on the human body, we fitted fitted with a number of biometric sensor devices and monitored them as they watched a range of dierent movies in groups. The purpose of these experiments was to examine the correlation between humans' highlights in movies as observed from biometric sensors, and highlights in the same movies as identified by our automatic movie analysis techniques. However,the problem with this type of experiment is that both the analysis of the video stream and the sensor data readings are not directly usable in their raw form because of the sheer volume of low-level data values generated both from the sensors and from the movie analysis. This work describes the semi-automated enrichment of both video analysis and sensor data and the mechanism used to query the data in both centralised environments, and in a peer-to-peer architecture when the number of sensor devices grows to large numbers. We present and validate a scalable means of semi-automating the semantic enrichment of sensor data, thereby providing a means of large-scale sensor management

Crossref

Irish Universities

DCU Online Research Access Service

DeepCough: A Deep Convolutional Neural Network in A Wearable Cough Detection System

Author: Amoh Justice
Odame Kofi
Publication venue
Publication date: 08/09/2015
Field of study

In this paper, we present a system that employs a wearable acoustic sensor and a deep convolutional neural network for detecting coughs. We evaluate the performance of our system on 14 healthy volunteers and compare it to that of other cough detection systems that have been reported in the literature. Experimental results show that our system achieves a classification sensitivity of 95.1% and a specificity of 99.5%.Comment: BioCAS-201

arXiv.org e-Print Archive

Crossref

Deep Neural Networks for the Recognition and Classification of Heart Murmurs Using Neuromorphic Auditory Sensors

Author: Domínguez Morales Juan Pedro
Domínguez Morales Manuel Jesús
Jiménez Fernández Ángel Francisco
Jiménez Moreno Gabriel
Publication venue: IEEE Computer Society
Publication date: 01/01/2017
Field of study

Auscultation is one of the most used techniques for detecting cardiovascular diseases, which is one of the main causes of death in the world. Heart murmurs are the most common abnormal finding when a patient visits the physician for auscultation. These heart sounds can either be innocent, which are harmless, or abnormal, which may be a sign of a more serious heart condition. However, the accuracy rate of primary care physicians and expert cardiologists when auscultating is not good enough to avoid most of both type-I (healthy patients are sent for echocardiogram) and type-II (pathological patients are sent home without medication or treatment) errors made. In this paper, the authors present a novel convolutional neural network based tool for classifying between healthy people and pathological patients using a neuromorphic auditory sensor for FPGA that is able to decompose the audio into frequency bands in real time. For this purpose, different networks have been trained with the heart murmur information contained in heart sound recordings obtained from nine different heart sound databases sourced from multiple research groups. These samples are segmented and preprocessed using the neuromorphic auditory sensor to decompose their audio information into frequency bands and, after that, sonogram images with the same size are generated. These images have been used to train and test different convolutional neural network architectures. The best results have been obtained with a modified version of the AlexNet model, achieving 97% accuracy (specificity: 95.12%, sensitivity: 93.20%, PhysioNet/CinC Challenge 2016 score: 0.9416). This tool could aid cardiologists and primary care physicians in the auscultation process, improving the decision making task and reducing type-I and type-II errors.Ministerio de Economía y Competitividad TEC2016-77785-

idUS. Depósito de Investigación Universidad de Sevilla

SALSA: A Novel Dataset for Multimodal Group Behavior Analysis

Author: Alameda-Pineda Xavier
Batrinca Ligia
Lanz Oswald
Lepri Bruno
Ricci Elisa
Sebe Nicu
Staiano Jacopo
Subramanian Ramanathan
Publication venue
Publication date: 23/06/2015
Field of study

Studying free-standing conversational groups (FCGs) in unstructured social settings (e.g., cocktail party ) is gratifying due to the wealth of information available at the group (mining social networks) and individual (recognizing native behavioral and personality traits) levels. However, analyzing social scenes involving FCGs is also highly challenging due to the difficulty in extracting behavioral cues such as target locations, their speaking activity and head/body pose due to crowdedness and presence of extreme occlusions. To this end, we propose SALSA, a novel dataset facilitating multimodal and Synergetic sociAL Scene Analysis, and make two main contributions to research on automated social interaction analysis: (1) SALSA records social interactions among 18 participants in a natural, indoor environment for over 60 minutes, under the poster presentation and cocktail party contexts presenting difficulties in the form of low-resolution images, lighting variations, numerous occlusions, reverberations and interfering sound sources; (2) To alleviate these problems we facilitate multimodal analysis by recording the social interplay using four static surveillance cameras and sociometric badges worn by each participant, comprising the microphone, accelerometer, bluetooth and infrared sensors. In addition to raw data, we also provide annotations concerning individuals' personality as well as their position, head, body orientation and F-formation information over the entire event duration. Through extensive experiments with state-of-the-art approaches, we show (a) the limitations of current methods and (b) how the recorded multiple cues synergetically aid automatic analysis of social interactions. SALSA is available at http://tev.fbk.eu/salsa.Comment: 14 pages, 11 figure

arXiv.org e-Print Archive

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

University of Canberra Research Repository