Music Source Separation in the Waveform Domain
Source separation for music is the task of isolating contributions, or stems, from different instruments recorded individually and arranged together to form a song. Such components include voice, bass, drums and any other accompaniment. Contrary to many audio synthesis tasks, where the best performance is achieved by models that directly generate the waveform, the state of the art in music source separation is to compute masks on the magnitude spectrum. In this paper, we first show that an adaptation of Conv-Tasnet (Luo & Mesgarani, 2019), a waveform-to-waveform model for speech source separation, significantly beats the state of the art on the MusDB dataset, the standard benchmark for multi-instrument source separation. Second, we observe that Conv-Tasnet follows a masking approach on the input signal, which has the potential drawback of removing parts of the relevant source without the capacity to reconstruct them. We propose Demucs, a new waveform-to-waveform model with an architecture closer to models for audio generation and more capacity in the decoder. Experiments on the MusDB dataset show that Demucs beats previously reported results in terms of signal-to-distortion ratio (SDR), though it remains below Conv-Tasnet. Human evaluations show that Demucs has significantly higher quality (as assessed by mean opinion score) than Conv-Tasnet, but slightly more contamination from other sources, which explains the difference in SDR. Additional experiments with a larger dataset suggest that the gap in SDR between Demucs and Conv-Tasnet shrinks, showing that our approach is promising.
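The spectral-masking approach the abstract contrasts with waveform models can be illustrated in a few lines. The toy sketch below separates two synthetic "stems" by applying an oracle ratio mask to the mixture's STFT; all signals, frequencies and parameters are invented for the example, and a real system would predict the mask with a network rather than use the true source magnitudes:

```python
import numpy as np
from scipy.signal import stft, istft

fs = 8000
t = np.arange(fs) / fs
# Two toy "stems": a low tone and a quieter high tone
s1 = np.sin(2 * np.pi * 220 * t)
s2 = 0.5 * np.sin(2 * np.pi * 2500 * t)
mix = s1 + s2

_, _, S1 = stft(s1, fs)
_, _, S2 = stft(s2, fs)
_, _, M = stft(mix, fs)

# Oracle ratio mask built from the true source magnitudes
mask = np.abs(S1) / (np.abs(S1) + np.abs(S2) + 1e-8)
_, est = istft(mask * M, fs)
est = est[: len(s1)]

def sdr(ref, est):
    """Signal-to-distortion ratio in dB."""
    return 10 * np.log10(np.sum(ref ** 2) / np.sum((ref - est) ** 2))

print(f"mix SDR: {sdr(s1, mix):.1f} dB, masked SDR: {sdr(s1, est):.1f} dB")
```

Because the mask removes time-frequency bins dominated by the other source, it can only attenuate, never reconstruct, which is exactly the drawback the paper attributes to masking approaches.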
Earth resources, a continuing bibliography with indexes
This bibliography lists 541 reports, articles and other documents introduced into the NASA scientific and technical information system. Emphasis is placed on the use of remote sensing and geophysical instrumentation in spacecraft and aircraft to survey and inventory natural resources and urban areas. Subject matter is grouped according to agriculture and forestry, environmental changes and cultural resources, geodesy and cartography, geology and mineral resources, hydrology and water management, data processing and distribution systems, instrumentation and sensors, and economic analysis.
Music Source Separation Using Deep Neural Networks
In recent years, Sound Source Separation (SSS) has been one of the most active fields within signal processing. The design of such algorithms seeks to recreate the human ability to identify individual sound sources. In the music field, efforts are being made to isolate the main instruments from a single file containing a stereo audio mixture. The goal of these algorithms is to extract multiple audio files with specific instruments, such as bass, voice or drums. This project focuses on analyzing existing systems based on neural networks and their performance. In addition, it studies the structure of the Open-Unmix algorithm in depth and tries to improve its results.
Earth resources: A continuing bibliography, issue 46
This bibliography lists 467 reports, articles and other documents introduced into the NASA scientific and technical information system between April 1 and June 30, 1985. Emphasis is placed on the use of remote sensing and geophysical instrumentation in spacecraft and aircraft to survey and inventory natural resources and urban areas. Subject matter is grouped according to agriculture and forestry, environmental changes and cultural resources, geodesy and cartography, geology and mineral resources, hydrology and water management, data processing and distribution systems, instrumentation and sensors, and economic analysis.
Fully-passive switched-capacitor techniques for high performance SAR ADC design
In recent years, the SAR ADC has become more and more popular in various low-power applications, such as wireless sensors and low-energy radios, due to its circuit simplicity, high power efficiency, and scaling compatibility. However, its speed is limited by its successive-approximation procedure, and its power efficiency degrades greatly as the ADC resolution goes beyond 10 bits. To address these issues, this thesis proposes to embed two techniques into a conventional SAR ADC: 1) compressive sensing (CS) and 2) noise shaping (NS). The realization of both techniques is based on fully-passive switched-capacitor techniques.
CS is a recently emerging sampling paradigm stating that the sparsity of a signal can be exploited to reduce the ADC sampling rate below the Nyquist rate. Unlike conventional CS frameworks, which require dedicated analog CS encoders, this thesis proposes a fully-passive CS-SAR ADC architecture that requires only minor modifications to a conventional SAR ADC. Two chips were fabricated in a 0.13 ”m process to prove the concept. One chip is a single-channel CS-SAR ADC that can reduce the ADC conversion rate by 4 times, thus reducing the ADC power by 4 times. In many wireless sensing applications, multiple ADCs are required to sense multi-channel signals, as in multi-lead ECG sensing and parallel neural recording. The other chip is therefore a multi-channel CS-SAR ADC that can simultaneously convert 4-channel signals at a sampling rate equal to one channel's Nyquist rate. At 0.8 V and 1 MS/s, both chips achieve an effective Walden FoM of around 5 fJ/conversion-step.
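The compressive-sensing principle behind the CS-SAR ADC can be demonstrated in software: a k-sparse signal is recovered from far fewer random measurements than Nyquist samples. The sketch below uses invented dimensions (a 4x measurement reduction, echoing the chip's conversion-rate reduction) and orthogonal matching pursuit as a stand-in reconstruction algorithm; it is not the thesis's actual signal chain:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 256, 64, 4          # signal length, measurements (n/4), sparsity

# A k-sparse test signal
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)

# Random measurement operator: m << n samples, mimicking a sub-Nyquist front end
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ x

def omp(Phi, y, k):
    """Orthogonal matching pursuit: greedily pick atoms that explain the residual."""
    support, r = [], y.copy()
    for _ in range(k):
        support.append(int(np.argmax(np.abs(Phi.T @ r))))
        cols = Phi[:, support]
        coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
        r = y - cols @ coef
    x_hat = np.zeros(Phi.shape[1])
    x_hat[support] = coef
    return x_hat

x_hat = omp(Phi, y, k)
print("relative error:", np.linalg.norm(x - x_hat) / np.linalg.norm(x))
```

With a Gaussian measurement matrix and m well above the order of k·log(n/k), recovery of the support is exact with overwhelming probability, which is what lets the ADC run at a fraction of the Nyquist rate.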
This thesis also proposes a novel NS SAR ADC architecture that is simple, robust, and low power for high-resolution applications. Compared to conventional ΔΣ ADCs, it replaces the power-hungry active integrator with a passive integrator that requires only one switch and two capacitors. Compared to previous 1st-order NS SAR ADC works, it achieves the best NS performance and can easily be extended to 2nd order. A 1st-order 10-bit NS SAR ADC was fabricated in a 0.13 ”m process. Through NS, the SNDR increases by 6 dB each time the OSR is doubled, achieving a 12-bit ENOB at OSR = 8. An improved version, a 2nd-order 9-bit NS SAR ADC, was designed and simulated in a 40 nm process. Its SNDR increases by 10 dB each time the OSR is doubled, achieving a 14-bit ENOB at OSR = 16. At a bandwidth of 312.5 kHz, the Schreier FoM is 181 dB and the Walden FoM is 12.5 fJ/conversion-step, proving that the proposed NS SAR ADC architecture can achieve high resolution and high power efficiency simultaneously.
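A minimal behavioral model makes the noise-shaping idea concrete. The sketch below uses a first-order error-feedback loop (a software stand-in, not the thesis's passive-integrator circuit; tone frequency, resolution and OSR are invented) to show how quantization error is high-passed out of the signal band:

```python
import numpy as np

def quantize(u, bits=9):
    """Plain uniform mid-tread quantizer over [-1, 1]."""
    lsb = 2.0 / (2 ** bits)
    return lsb * np.round(np.asarray(u) / lsb)

def ns_quantize(x, bits=9):
    """First-order error feedback: subtracting the previous quantization
    error from the next input gives the error a (1 - z^-1) high-pass NTF."""
    e, out = 0.0, np.empty_like(x)
    for i, u in enumerate(x):
        v = quantize(u - e, bits)   # quantize with the last error fed back
        e = v - (u - e)             # new quantization error
        out[i] = v
    return out

fs, n = 1_000_000, 4096
t = np.arange(n) / fs
x = 0.5 * np.sin(2 * np.pi * 997 * t)   # low-frequency test tone

plain = quantize(x) - x                 # error without shaping
shaped = ns_quantize(x) - x             # error with shaping

def inband_power(err, osr=8):
    """Error power in the lowest 1/osr of the spectrum."""
    E = np.fft.rfft(err)
    return np.sum(np.abs(E[: len(E) // osr]) ** 2)

gain_db = 10 * np.log10(inband_power(plain) / inband_power(shaped))
print(f"in-band error reduction at OSR=8: {gain_db:.1f} dB")
```

The total error power is unchanged; the loop merely redistributes it toward high frequencies, which is why the measured SNDR grows as the OSR is increased and the out-of-band error is filtered away.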
A 32-Channel Time-Multiplexed Artifact-Aware Neural Recording System
This paper presents a low-power, low-noise microsystem for the recording of neural local field potentials or intracranial electroencephalographic signals. It features 32 time-multiplexed channels at the electrode interface and offers the possibility to spatially delta-encode the data to take advantage of the large correlation between signals captured from nearby channels. The circuit also implements a mixed-signal voltage-triggered auto-ranging algorithm that attenuates large interferers in the digital domain while preserving neural information. This effectively increases the system dynamic range and avoids the onset of saturation. A prototype, fabricated in a standard 180 nm CMOS process, has been experimentally verified in vitro with cellular cultures of primary cortical neurons from mice. The system shows an integrated input-referred noise in the 0.5–200 Hz band of 1.4 ”Vrms, for a spot noise of about 85 nV/√Hz. The system draws 1.5 ”W per channel from a 1.2 V supply and obtains a 71 dB + 26 dB dynamic range when the artifact-aware auto-ranging mechanism is enabled, without penalising other critical specifications such as crosstalk between channels or common-mode and power-supply rejection ratios.
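The spatial delta encoding exploited by the recording system can be sketched as follows. The toy data below (a shared field component plus small per-channel deviations; channel counts and signal statistics are invented) shows why transmitting inter-channel differences needs fewer bits, while the decoder recovers the raw samples exactly with a cumulative sum:

```python
import numpy as np

rng = np.random.default_rng(2)
n_ch, n_samp = 32, 1000

# Nearby electrodes see a common field potential plus a small local deviation
common = np.cumsum(rng.standard_normal(n_samp))
x = np.rint(common + 0.1 * rng.standard_normal((n_ch, n_samp))).astype(np.int32)

# Spatial delta encoding: send channel 0 raw, then neighbour-to-neighbour differences
delta = np.vstack([x[:1], np.diff(x, axis=0)])

# Decoder: cumulative sum down the channel axis restores the raw data exactly
decoded = np.cumsum(delta, axis=0)

def bits_needed(a):
    """Signed word width required to represent all values in a."""
    return int(np.ceil(np.log2(np.abs(a).max() + 1))) + 1

print("raw:", bits_needed(x), "bits/sample ->",
      "delta:", bits_needed(delta[1:]), "bits/sample")
```

Because the common component cancels in each difference, the deltas occupy a much smaller dynamic range than the raw samples, and the encoding is lossless.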