
    Audio Surveillance of Roads: A System for Detecting Anomalous Sounds

    In recent decades, several systems based on video analysis have been proposed for automatically detecting accidents on roads to ensure a quick intervention of emergency teams. However, in some situations the visual information is insufficient or not reliable enough, whereas microphones and audio event detectors can significantly improve the overall reliability of surveillance systems. In this paper, we propose a novel method for detecting road accidents by analyzing audio streams to identify hazardous situations such as tire skidding and car crashes. Our method is based on a two-layer representation of an audio stream: at the low level, the system extracts a set of features that capture the discriminant properties of the events of interest; at the high level, a representation based on a bag-of-words approach is exploited to detect both short and sustained events. The deployment architecture for using the system in real environments is discussed, together with an experimental analysis carried out on a data set made publicly available for benchmarking purposes. The obtained results confirm the effectiveness of the proposed approach.
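The two-layer idea in this abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the low-level features are stand-in 2-D vectors, the codebook is given rather than learned, and the names `bag_of_words` and `sliding_histograms` are hypothetical.

```python
import numpy as np

def bag_of_words(frames, codebook):
    """Normalized histogram of nearest-codeword assignments for one window."""
    # Squared Euclidean distance from every frame to every codeword.
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)  # index of the closest codeword per frame
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def sliding_histograms(frames, codebook, window, hop):
    """One histogram per window, so short and sustained events both leave a trace."""
    starts = range(0, len(frames) - window + 1, hop)
    return np.array([bag_of_words(frames[s:s + window], codebook) for s in starts])

# Toy data (assumed): three codewords and four low-level feature frames.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
frames = np.array([[0.1, 0.0], [0.9, 1.1], [1.0, 0.9], [4.8, 5.2]])
hist = bag_of_words(frames, codebook)
windows = sliding_histograms(frames, codebook, window=2, hop=1)
```

In such a scheme, a classifier would be trained on the window histograms; the window length then controls how well brief events (a crash) versus sustained ones (skidding) are represented.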

    Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019

    Sound event localization and detection is a novel area of research that emerged from the combined interest in analyzing the acoustic scene in terms of the spatial and temporal activity of sounds of interest. This paper presents an overview of the first international evaluation on sound event localization and detection, organized as a task of the DCASE 2019 Challenge. A large-scale realistic dataset of spatialized sound events was generated for the challenge, to be used for training learning-based approaches and for evaluating the submissions on an unlabeled subset. The overview presents in detail how the systems were evaluated and ranked, and the characteristics of the best-performing systems. Common strategies in terms of input features, model architectures, training approaches, exploitation of prior knowledge, and data augmentation are discussed. Since ranking in the challenge was based on evaluating localization and event classification performance individually, part of the overview focuses on presenting metrics for the joint measurement of the two, together with a reevaluation of submissions using these new metrics. The new analysis reveals submissions that performed better on the joint task of detecting the correct type of event close to its original location than some of the submissions that were ranked higher in the challenge. Consequently, the ranking of submissions that performed strongly when evaluated separately on detection or localization, but not jointly on both, was affected negatively.
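To make the notion of a joint metric concrete, here is a simplified sketch in which a prediction counts as correct only if it has the right class in the right frame and lies within an angular threshold of the reference. It is an illustration under assumptions, not the metric defined in the overview: the 20-degree threshold, the tuple layout, and the function names are all hypothetical.

```python
import math

def angular_distance_deg(az1, el1, az2, el2):
    """Great-circle angle in degrees between two (azimuth, elevation) directions."""
    a1, e1, a2, e2 = map(math.radians, (az1, el1, az2, el2))
    cos_d = (math.sin(e1) * math.sin(e2)
             + math.cos(e1) * math.cos(e2) * math.cos(a1 - a2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_d))))

def location_aware_f1(refs, preds, max_angle_deg=20.0):
    """F1 where a true positive needs matching frame, label, and nearby direction.

    refs and preds are lists of (frame, label, azimuth_deg, elevation_deg).
    """
    unmatched = list(refs)
    tp = 0
    for frame, label, az, el in preds:
        for ref in unmatched:
            close = angular_distance_deg(az, el, ref[2], ref[3]) <= max_angle_deg
            if ref[0] == frame and ref[1] == label and close:
                unmatched.remove(ref)  # each reference is matched at most once
                tp += 1
                break
    fp = len(preds) - tp
    fn = len(refs) - tp
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy example: the car is localized well, the speech event is not.
refs = [(0, "car", 10.0, 0.0), (0, "speech", 90.0, 0.0)]
preds = [(0, "car", 15.0, 0.0), (0, "speech", -90.0, 0.0)]
score = location_aware_f1(refs, preds)
```

A system that labels both events correctly would get a perfect detection score here, yet the joint score drops because one event is placed on the wrong side; this is the kind of effect that can reorder a ranking.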

    Dual shots detection

    This work describes the identification of a special kind of acoustic event, namely dual and single gunshots, against a traffic background. Recognizing dangerous sounds may help to prevent abnormal or criminal activities near public transport stations. We therefore developed and evaluated a methodology for detecting dual shots in a noisy background. For this purpose, we investigated various feature extraction methods and combinations of different feature sets. These approaches were evaluated with a widely used classification technique based on Hidden Markov Models.
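HMM-based classification of this kind usually trains one model per class and picks the class whose model assigns the highest likelihood to the observed sequence. The sketch below shows that decision rule with the scaled forward algorithm. It assumes discrete observation symbols and hand-set toy parameters; the abstract does not state the features or model topology used, and the class names and probabilities here are hypothetical.

```python
import numpy as np

def log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM via the scaled forward algorithm."""
    log_p = 0.0
    alpha = start * emit[:, obs[0]]
    for t in range(len(obs)):
        if t > 0:
            # Propagate state probabilities, then weight by the emission.
            alpha = (alpha @ trans) * emit[:, obs[t]]
        scale = alpha.sum()
        log_p += np.log(scale)
        alpha = alpha / scale  # rescale to avoid numerical underflow
    return log_p

def classify(obs, models):
    """Return the label of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Toy two-state models over symbols 0 (quiet frame) and 1 (impulsive frame).
start = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1], [0.1, 0.9]])
models = {
    "shot": (start, trans, np.array([[0.2, 0.8], [0.1, 0.9]])),
    "background": (start, trans, np.array([[0.9, 0.1], [0.8, 0.2]])),
}
label = classify([1, 1, 0, 1], models)
```

In practice the emission probabilities would be learned from labeled recordings, and continuous features would typically replace the binary symbols used here.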

    Sound Event Localization, Detection, and Tracking by Deep Neural Networks

    In this thesis, we present novel sound representations and classification methods for the task of sound event localization, detection, and tracking (SELDT). The human auditory system has evolved to localize multiple sound events, recognize them, and track their motion individually in an acoustic environment. This ability makes humans context-aware and enables them to interact with their surroundings naturally. Developing similar methods for machines will provide an automatic description of the social and human activities around them and enable machines to be context-aware in a similar way. Such methods can be employed to help the hearing impaired visualize sounds, for robot navigation, and to monitor biodiversity, the home, and cities. A real-life acoustic scene is complex in nature, with multiple sound events that overlap temporally and spatially, including stationary and moving events with varying angular velocities. Additionally, each individual sound event class can vary widely: for example, different cars have different horns, and within the same car model the duration and temporal structure of the horn sound depend on the driver. Performing SELDT robustly in such overlapping and dynamic sound scenes is challenging for machines. Hence we propose to investigate the SELDT task in this thesis with a data-driven approach using deep neural networks (DNNs). The sound event detection (SED) task requires detecting the onset and offset times of individual sound events and their corresponding labels. In this regard, we propose to use spatial and perceptual features extracted from multichannel audio for SED with two different DNNs: recurrent neural networks (RNNs) and convolutional recurrent neural networks (CRNNs). We show that using multichannel audio features improves SED performance for overlapping sound events in comparison to traditional single-channel audio features. The proposed novel features and methods produced state-of-the-art performance for the real-life SED task and won the IEEE AASP DCASE challenge consecutively in 2016 and 2017. Sound event localization is the task of spatially locating the position of individual sound events. Traditionally, this has been approached using parametric methods. In this thesis, we propose a CRNN for detecting the azimuth and elevation angles of multiple temporally overlapping sound events. This is the first DNN-based method performing localization in the complete azimuth and elevation space. In contrast to parametric methods, which require the number of active sources to be known, the proposed method learns this information directly from the input data and estimates the respective spatial locations. Further, the proposed CRNN is shown to be more robust than parametric methods in reverberant scenarios. Finally, the detection and localization tasks are performed jointly using a CRNN. This method additionally tracks the spatial location over time, thus producing the SELDT results. This is the first DNN-based SELDT method and is shown to perform on par with stand-alone baselines for SED, localization, and tracking. The proposed SELDT method is evaluated on nine datasets that represent anechoic and reverberant sound scenes, stationary and moving sources with varying velocities, different numbers of overlapping sound events, and different microphone array formats. The results show that the SELDT method can track multiple overlapping sound events that are both spatially stationary and moving.

    Aplicações de Métodos de Sensoriamento de Vibração Baseados em Técnicas [Applications of Vibration Sensing Methods Based on Techniques]

    Advisors: Fabiano Fruett, Claudio Floridia. Doctoral thesis, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: Distributed optical fiber sensors have been increasingly employed for monitoring several parameters, such as temperature, vibration, strain, magnetic field, and current. Compared to conventional techniques, these sensors are advantageous due to their small dimensions, light weight, immunity to electromagnetic interference, high adaptability, robustness in hazardous environments, less complex data multiplexing, the feasibility of being embedded into structures with minimal invasion, and the capability to extract high-resolution data from long perimeters using a single optical fiber and to detect multiple events along the fiber. In particular, distributed acoustic sensors (DAS) based on optical time domain reflectometry (OTDR) are of high interest because they can be used in applications such as structural health monitoring (SHM) and perimeter surveillance. Through the frequency analysis of a structure, for instance an aircraft, a bridge, a building, or even machines in a workshop, it is possible to evaluate its condition and detect damage and failures at an early stage. OTDR-based solutions for vibration monitoring can also be adapted, with minimal setup modifications, to detect intrusion in a perimeter, which is useful for the surveillance of military facilities, laboratories, and power plants, and for homeland security. The same OTDR technique can be used as a non-destructive diagnostic tool to evaluate vibrations and disturbances both on small structures, with a resolution of some tens of centimeters, and on large structures or perimeters, with a spatial resolution of some meters and a reach of hundreds of kilometers. Another useful feature of this optical-fiber-based solution is that it can be combined with high-performance digital signal processing techniques, enabling fast disturbance detection and location, real-time intrusion pattern recognition, and fast data multiplexing of structure surfaces for SHM applications. The main goal of this thesis is to make use of these features to employ DAS techniques as key enabling technologies for several applications. In this work, phase-OTDR techniques were studied, and the main contributions of the thesis focused on bringing innovative solutions and validations for SHM and surveillance applications. This PhD included a sandwich period at Acreo AB, Stockholm, Sweden, where experimental tests were performed, and it was part of the 42nd CISB/Saab/CNPq Call.
    Doctorate; Electronics, Microelectronics and Optoelectronics; Doctor in Electrical Engineering; 202816/2015-0; CAPES; CNP
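The frequency analysis mentioned in this abstract can be illustrated with a short sketch that finds the dominant vibration frequency in a sampled trace. This is a generic FFT example under assumptions, not the thesis's processing chain: the synthetic signal, the 1 kHz sample rate, and the function name `dominant_frequency` are hypothetical.

```python
import numpy as np

def dominant_frequency(signal, sample_rate):
    """Frequency (Hz) of the strongest spectral peak in a real-valued trace."""
    # Remove the mean and apply a Hann window to limit spectral leakage.
    windowed = (signal - signal.mean()) * np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs[spectrum.argmax()]

# Synthetic vibration trace (assumed): a strong 50 Hz tone plus a weaker 120 Hz tone.
sample_rate = 1000.0
t = np.arange(1000) / sample_rate
trace = np.sin(2 * np.pi * 50 * t) + 0.2 * np.sin(2 * np.pi * 120 * t)
peak = dominant_frequency(trace, sample_rate)  # close to 50 Hz for this trace
```

For monitoring, one would track how such peaks shift over time at each position along the fiber, since a change in a structure's resonant frequencies can indicate damage.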

    Incremental multiclass open-set audio recognition

    Incremental learning aims to learn new classes as they emerge while maintaining performance on previously known classes. It acquires useful information from incoming data to update the existing models. Open-set recognition, however, requires the ability to recognize examples from known classes and reject examples from new or unknown classes. There are two main challenges here. First, new class discovery: the algorithm must not only recognize known classes but also detect unknown ones. Second, model extension: after the new classes are identified, the model needs to be updated. To address these challenges, we introduce incremental open-set multiclass support vector machine algorithms that can classify examples from seen and unseen classes, using incremental learning to extend the current model with new classes without entirely retraining the system. Comprehensive evaluations are carried out on both open-set recognition and incremental learning. For open-set recognition, we adopt the openness test, which examines effectiveness under a varying number of known and unknown labels. For incremental learning, we adapt the model to detect a single novel class in each incremental phase and update the model with unknown classes. Experimental results show promising performance for the proposed methods compared with some representative previous methods.
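The two challenges named in this abstract, rejecting unknown examples and extending the model without full retraining, can be sketched with a much simpler stand-in classifier. The example below uses nearest class means with a distance threshold, which is explicitly not the paper's support vector machine algorithm; the class name `OpenSetNearestMean`, the threshold value, and the toy features are all hypothetical.

```python
import numpy as np

class OpenSetNearestMean:
    """Nearest-class-mean classifier with rejection and incremental class addition."""

    def __init__(self, reject_distance):
        self.reject_distance = reject_distance
        self.means = {}  # label -> mean feature vector

    def add_class(self, label, examples):
        # Model extension: only the new class is fitted, old classes are untouched.
        self.means[label] = np.asarray(examples, dtype=float).mean(axis=0)

    def predict(self, x):
        x = np.asarray(x, dtype=float)
        distances = {label: np.linalg.norm(x - mean) for label, mean in self.means.items()}
        best = min(distances, key=distances.get)
        # New class discovery: anything too far from every known class is rejected.
        return best if distances[best] <= self.reject_distance else "unknown"

model = OpenSetNearestMean(reject_distance=2.0)
model.add_class("siren", [[0.0, 0.0], [0.0, 1.0]])
model.add_class("horn", [[5.0, 5.0], [6.0, 5.0]])

before = model.predict([10.0, -10.0])   # far from both known classes
model.add_class("glass", [[10.0, -10.0], [11.0, -9.0]])
after = model.predict([10.0, -10.0])    # now close to the newly added class
```

The rejection threshold plays the role that a decision boundary plays in an open-set SVM: it bounds the region in which the model is willing to claim a known class.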