    Foreground-Background Ambient Sound Scene Separation

    Ambient sound scenes typically comprise multiple short events occurring on top of a somewhat stationary background. We consider the task of separating these events from the background, which we call foreground-background ambient sound scene separation. We propose a deep learning-based separation framework with a suitable feature normalization scheme and an optional auxiliary network capturing the background statistics, and we investigate its ability to handle the great variety of sound classes encountered in ambient sound scenes, which have often not been seen in training. To do so, we create single-channel foreground-background mixtures using isolated sounds from the DESED and AudioSet datasets, and we conduct extensive experiments with mixtures of seen or unseen sound classes at various signal-to-noise ratios. Our experimental findings demonstrate the generalization ability of the proposed approach.
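Creating mixtures at a controlled signal-to-noise ratio, as described above, amounts to scaling the foreground so its power relative to the background matches the target. A minimal sketch (the abstract does not specify the paper's exact mixing protocol, so the scaling convention below is an assumption):

```python
import numpy as np

def mix_at_snr(foreground, background, snr_db):
    """Scale `foreground` so the foreground-to-background power ratio
    equals `snr_db`, then sum the two signals into one mixture.

    Illustrative only: how the paper normalizes the DESED/AudioSet
    clips before mixing is not stated in the abstract.
    """
    p_fg = np.mean(foreground ** 2)
    p_bg = np.mean(background ** 2)
    # Gain that brings the foreground power to the target SNR
    # relative to the (unmodified) background.
    gain = np.sqrt(p_bg / p_fg * 10.0 ** (snr_db / 10.0))
    return gain * foreground + background
```

Sweeping `snr_db` over a range (e.g. -6 to +6 dB) yields the "various signal-to-noise ratios" conditions used in such experiments.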

    Deep Learning for Audio Signal Processing

    Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e., audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified. (Comment: 15 pages, 2 PDF figures.)
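The log-mel spectrogram named above as a dominant feature representation can be computed from a raw waveform with nothing beyond NumPy: frame the signal, take a windowed FFT, apply a triangular mel filterbank, and take the log. A self-contained sketch (frame/hop/band sizes are illustrative defaults, not values from the review):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters with centers evenly spaced on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):
            fb[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i, k] = (right - k) / max(right - center, 1)
    return fb

def log_mel_spectrogram(x, sr=16000, n_fft=512, hop=256, n_mels=40):
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2      # (frames, bins)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T    # (frames, mels)
    return np.log(mel + 1e-10)                           # log compression
```

Libraries such as librosa provide equivalent (and faster) implementations; the point here is only that the representation is a fixed, differentiable-free front end, in contrast to the learned raw-waveform front ends the review also covers.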

    Olympic Coast National Marine Sanctuary Area to be Avoided (ATBA) Education and Monitoring Program

    The National Marine Sanctuaries Act (16 U.S.C. 1431, as amended) gives the Secretary of Commerce the authority to designate discrete areas of the marine environment as National Marine Sanctuaries and provides the authority to promulgate regulations to provide for the conservation and management of these marine areas. The waters of the Outer Washington Coast were recognized for their high natural resource and human use values and placed on the National Marine Sanctuary Program Site Evaluation List in 1983. In 1988, Congress directed NOAA to designate the Olympic Coast National Marine Sanctuary (Pub. L. 100-627). The Sanctuary, designated in May 1994, worked with the U.S. Coast Guard to request the International Maritime Organization designate an Area to be Avoided (ATBA) on the Olympic Coast. The IMO defines an ATBA as "a routeing measure comprising an area within defined limits in which either navigation is particularly hazardous or it is exceptionally important to avoid casualties and which should be avoided by all ships, or certain classes of ships" (IMO, 1991). This ATBA was adopted in December 1994 by the Maritime Safety Committee of the IMO, “in order to reduce the risk of marine casualty and resulting pollution and damage to the environment of the Olympic Coast National Marine Sanctuary”, (IMO, 1994). The ATBA went into effect in June 1995 and advises operators of vessels carrying petroleum and/or hazardous materials to maintain a 25-mile buffer from the coast. Since that time, Olympic Coast National Marine Sanctuary (OCNMS) has created an education and monitoring program with the goal of ensuring the successful implementation of the ATBA. The Sanctuary enlisted the aid of the U.S. and Canadian coast guards, and the marine industry to educate mariners about the ATBA and to use existing radar data to monitor compliance. Sanctuary monitoring efforts have targeted education on tank vessels observed transiting the ATBA. 
OCNMS's monitoring efforts allow quantitative evaluation of this voluntary measure. Finally, the tools developed to monitor the ATBA are also used for the more general purpose of monitoring vessel traffic within the Sanctuary. While the Olympic Coast National Marine Sanctuary does not currently regulate vessel traffic, such regulations are within the scope of the Sanctuary's Final Environmental Impact Statement/Management Plan. Sanctuary staff participate in ongoing maritime and environmental safety initiatives and continually seek opportunities to mitigate risks from marine shipping. (PDF contains 44 pages.)

    Improving Sound Event Detection In Domestic Environments Using Sound Separation

    Performing sound event detection on real-world recordings often implies dealing with overlapping target sound events and non-target sounds, also referred to as interference or noise. Until now these problems were mainly tackled at the classifier level. We propose to use sound separation as a pre-processing step for sound event detection. In this paper we start from a sound separation model trained on the Free Universal Sound Separation dataset and the DCASE 2020 task 4 sound event detection baseline. We explore different methods to combine separated sound sources and the original mixture within the sound event detection system. Furthermore, we investigate the impact of adapting the sound separation model to the sound event detection data on both sound separation and sound event detection performance.
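One way to "combine separated sound sources and the original mixture", as the abstract puts it, is late fusion: run the detector on each separated source and on the mixture, then merge the frame-wise class probabilities. The rule below (max over sources, blended with the mixture) is a plausible sketch, not the paper's specific method:

```python
import numpy as np

def fuse_sed_predictions(preds_sources, preds_mixture, weight=0.5):
    """Late fusion of frame-wise SED probabilities.

    preds_sources: (n_sources, n_frames, n_classes) probabilities,
                   one slice per separated source.
    preds_mixture: (n_frames, n_classes) probabilities on the mixture.
    weight:        blend factor (hypothetical; would be tuned on a
                   development set).
    """
    # A class is active if it is active in ANY separated source.
    per_source = np.max(preds_sources, axis=0)
    return weight * per_source + (1.0 - weight) * preds_mixture
```

Keeping the mixture branch guards against separation artifacts: if the separator degrades a source, the mixture prediction still contributes.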

    A Hybrid Approach with Multi-channel I-Vectors and Convolutional Neural Networks for Acoustic Scene Classification

    In Acoustic Scene Classification (ASC) two major approaches have been followed. While one utilizes engineered features such as mel-frequency cepstral coefficients (MFCCs), the other uses learned features that are the outcome of an optimization algorithm. I-vectors are the result of a modeling technique that usually takes engineered features as input. It has been shown that standard MFCCs extracted from monaural audio signals lead to i-vectors that exhibit poor performance, especially on indoor acoustic scenes. At the same time, Convolutional Neural Networks (CNNs) are well known for their ability to learn features by optimizing their filters. They have been applied to ASC and have shown promising results. In this paper, we first propose a novel multi-channel i-vector extraction and scoring scheme for ASC, improving their performance on indoor and outdoor scenes. Second, we propose a CNN architecture that achieves promising ASC results. Further, we show that i-vectors and CNNs capture complementary information from acoustic scenes. Finally, we propose a hybrid system for ASC using multi-channel i-vectors and CNNs by utilizing a score fusion technique. Using our method, we participated in the ASC task of the DCASE-2016 challenge. Our hybrid approach achieved 1st rank among 49 submissions, substantially improving the previous state of the art.
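A standard realization of the score fusion step mentioned above is a weighted sum of per-system scores after normalization. The sketch below assumes z-score normalization and a single blend weight `alpha` (the paper's actual fusion rule and weights are not given in the abstract):

```python
import numpy as np

def zscore(scores):
    # Normalize so systems with different score scales are comparable.
    return (scores - scores.mean()) / (scores.std() + 1e-9)

def fuse_scores(ivector_scores, cnn_scores, alpha=0.5):
    """Weighted-sum fusion of two classifiers' scene scores.

    ivector_scores, cnn_scores: (n_clips, n_classes) score matrices.
    alpha: hypothetical blend weight, tuned on a development set.
    The predicted scene per clip is the argmax of the fused row.
    """
    return alpha * zscore(ivector_scores) + (1 - alpha) * zscore(cnn_scores)
```

Fusion helps here precisely because, as the abstract notes, i-vectors and CNNs capture complementary information: their errors are not fully correlated, so the fused scores separate classes better than either system alone.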