
    Music Source Separation in the Waveform Domain

    Source separation for music is the task of isolating contributions, or stems, from different instruments recorded individually and arranged together to form a song. Such components include voice, bass, drums, and other accompaniments. Contrary to many audio synthesis tasks, where the best performance is achieved by models that directly generate the waveform, the state of the art in music source separation is to compute masks on the magnitude spectrum. In this paper, we first show that an adaptation of Conv-Tasnet (Luo & Mesgarani, 2019), a waveform-to-waveform model for speech source separation, significantly beats the state of the art on MusDB, the standard benchmark for multi-instrument source separation. Second, we observe that Conv-Tasnet follows a masking approach on the input signal, which has the potential drawback of removing parts of the relevant source without the capacity to reconstruct them. We propose Demucs, a new waveform-to-waveform model with an architecture closer to models for audio generation and more capacity in the decoder. Experiments on the MusDB dataset show that Demucs beats previously reported results in terms of signal-to-distortion ratio (SDR), although it scores lower than Conv-Tasnet. Human evaluations show that Demucs has significantly higher quality (as assessed by mean opinion score) than Conv-Tasnet, but slightly more contamination from other sources, which explains the difference in SDR. Additional experiments with a larger dataset suggest that the gap in SDR between Demucs and Conv-Tasnet shrinks, showing that our approach is promising.
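As a concrete illustration of the magnitude-spectrum masking approach that this abstract contrasts with waveform models, the sketch below applies an oracle ratio mask to a toy two-tone mixture and scores the result with the signal-to-distortion ratio, SDR = 10·log10(‖s‖ÂČ / ‖s − Ɲâ€–ÂČ). All signals, frame sizes, and helper names here are synthetic illustrations, not part of Demucs or Conv-Tasnet.

```python
import numpy as np

def stft(x, n=512):
    # Non-overlapping rectangular frames: exactly invertible for an identity mask.
    frames = x[: len(x) // n * n].reshape(-1, n)
    return np.fft.rfft(frames, axis=1)

def istft(spec, n=512):
    return np.fft.irfft(spec, n=n, axis=1).ravel()

def sdr(ref, est):
    # Signal-to-distortion ratio in dB.
    return 10 * np.log10(np.sum(ref**2) / np.sum((ref - est) ** 2))

sr = 8000
t = np.arange(sr * 2) / sr
s1 = np.sin(2 * np.pi * 440 * t)        # stand-in "vocals": 440 Hz tone
s2 = 0.5 * np.sin(2 * np.pi * 110 * t)  # stand-in "bass": 110 Hz tone
mix = s1 + s2

S1, S2, M = stft(s1), stft(s2), stft(mix)
# Oracle ratio mask from the true source magnitudes.
mask = np.abs(S1) / (np.abs(S1) + np.abs(S2) + 1e-8)
est1 = istft(mask * M)

print(round(sdr(s1[: len(est1)], est1), 1))
```

Note the limitation the abstract points out: the estimate is always `mask * M` with the mask between 0 and 1, so energy the mask suppresses can only be removed, never reconstructed, which is the motivation for giving the decoder more capacity in Demucs.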

    Earth resources, a continuing bibliography with indexes

    This bibliography lists 541 reports, articles, and other documents introduced into the NASA scientific and technical information system. Emphasis is placed on the use of remote sensing and geophysical instrumentation in spacecraft and aircraft to survey and inventory natural resources and urban areas. Subject matter is grouped according to agriculture and forestry, environmental changes and cultural resources, geodesy and cartography, geology and mineral resources, hydrology and water management, data processing and distribution systems, instrumentation and sensors, and economic analysis.

    Music Source Separation Using Deep Neural Networks

    In recent years, Sound Source Separation (SSS) has been one of the most active fields within signal processing. The design of such algorithms seeks to recreate the human ability to identify individual sound sources. In the music field, efforts are being made to isolate the main instruments from a single stereo audio mixture. The goal of these algorithms is to extract multiple audio files containing specific instruments, such as bass, voice, or drums. This project focuses on analysing existing systems based on neural networks and their performance. In addition, it examines the structure of the Open-Unmix algorithm in depth and tries to improve its results.
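Spectrogram-masking separators in the Open-Unmix family predict a magnitude estimate per stem; a common post-processing step is to normalise the per-stem estimates so the soft masks sum to one across stems, which guarantees the extracted stems add back up to the mixture. A minimal numpy sketch with purely synthetic spectrograms (none of the shapes or values come from Open-Unmix itself):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy complex mixture spectrogram (frames x frequency bins) and three
# "network" magnitude estimates, stand-ins for vocals / bass / drums.
mix = rng.normal(size=(64, 257)) + 1j * rng.normal(size=(64, 257))
est_mags = rng.uniform(0.1, 1.0, size=(3, 64, 257))

# Soft masks normalised across stems, Wiener-filter style.
masks = est_mags / est_mags.sum(axis=0, keepdims=True)
stems = masks * mix  # complex stem spectrograms, one per instrument

# Because the masks sum to one, the stems sum back to the mixture.
print(np.allclose(stems.sum(axis=0), mix))
```

This additivity is a convenient sanity check when comparing systems: any energy missing from one stem must appear in another, rather than being silently dropped.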

    Earth resources: A continuing bibliography, issue 46

    This bibliography lists 467 reports, articles, and other documents introduced into the NASA scientific and technical information system between April 1 and June 30, 1985. Emphasis is placed on the use of remote sensing and geophysical instrumentation in spacecraft and aircraft to survey and inventory natural resources and urban areas. Subject matter is grouped according to agriculture and forestry, environmental changes and cultural resources, geodesy and cartography, geology and mineral resources, hydrology and water management, data processing and distribution systems, instrumentation and sensors, and economic analysis.

    A 32-Channel Time-Multiplexed Artifact-Aware Neural Recording System

    This paper presents a low-power, low-noise microsystem for the recording of neural local field potentials or intracranial electroencephalographic signals. It features 32 time-multiplexed channels at the electrode interface and offers the possibility to spatially delta-encode the data, taking advantage of the large correlation between signals captured from nearby channels. The circuit also implements a mixed-signal, voltage-triggered auto-ranging algorithm that attenuates large interferers in the digital domain while preserving neural information, effectively increasing the system's dynamic range and avoiding the onset of saturation. A prototype, fabricated in a standard 180 nm CMOS process, has been experimentally verified in vitro with cultures of primary cortical neurons from mice. The system shows an integrated input-referred noise of 1.4 ”Vrms in the 0.5–200 Hz band, for a spot noise of about 85 nV/√Hz. It draws 1.5 ”W per channel from a 1.2 V supply and achieves 71 dB + 26 dB of dynamic range when the artifact-aware auto-ranging mechanism is enabled, without penalising other critical specifications such as crosstalk between channels or common-mode and power-supply rejection ratios.
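The spatial delta encoding mentioned in this abstract exploits the fact that neighbouring electrodes see nearly the same signal, so channel-to-channel differences have a much smaller range than the raw samples and need fewer bits. A minimal sketch with assumed, illustrative signal values (not the chip's actual data path):

```python
import numpy as np

rng = np.random.default_rng(1)
# 32 channels sampled at one instant: a large shared field potential
# plus small per-electrode differences (values are illustrative, in ”V).
common = 80.0
samples = common + rng.normal(scale=1.0, size=32)

# Spatial delta encoding: transmit channel 0 as-is, then only the
# channel-to-channel differences.
encoded = np.empty_like(samples)
encoded[0] = samples[0]
encoded[1:] = np.diff(samples)

# Decoding is a cumulative sum; correlated channels give small deltas.
decoded = np.cumsum(encoded)
print(np.allclose(decoded, samples))
```

Because the deltas span only the uncorrelated residual rather than the full signal range, they can be quantised with fewer bits, reducing the data rate of the time-multiplexed readout.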
    • 

    corecore