166 research outputs found

    APHONIC: Adaptive thresholding for noise cancellation in smart mobile environments

    We propose APHONIC (AdaPtive tHreshOlding for NoIse Cancellation), a single-channel, adaptive threshold selection technique for binary mask construction in smart mobile environments. Using this mask, we introduce two noise cancellation techniques that perform robustly in the presence of real-world interfering signals typically encountered by mobile users: a violin busker, a subway, and a busy city square. We demonstrate that when the power of the time-frequency components of a mobile user's voice does not significantly overlap with the components of the interference signal, the threshold learning and noise cancellation techniques significantly improve the Signal-to-Interference Ratio (SIR) and the Signal-to-Distortion Ratio (SDR) of the recovered voice. When a mobile user's speech is mixed with music or with the sounds of a city square or subway station, the speech energy is captured by a few large-magnitude coefficients, and APHONIC improves the SIR by more than 20 dB and the SDR by up to 5 dB. The robustness of the threshold selection step and the noise cancellation algorithms is evaluated using environments typically experienced by mobile phone users. Listening tests indicate that the interference signal is no longer audible in the denoised signals. We outline how this approach could be used in many mobile voice-driven applications.
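To make the idea of a thresholded binary mask concrete, here is a minimal sketch in Python. It is not APHONIC's threshold-learning rule, which the abstract does not specify. The quantile-based threshold, the function names, and the toy 8x8 coefficient grid are all assumptions chosen for illustration. The sketch only shows the general mechanism: when the speech energy sits in a few large-magnitude time-frequency coefficients, a magnitude threshold keeps those coefficients and zeroes the rest.

```python
import numpy as np

def quantile_threshold(coeffs: np.ndarray, keep_fraction: float) -> float:
    """Hypothetical threshold rule: keep roughly the top `keep_fraction` of magnitudes."""
    return float(np.quantile(np.abs(coeffs), 1.0 - keep_fraction))

def binary_mask(coeffs: np.ndarray, threshold: float) -> np.ndarray:
    """Return a 0/1 mask that keeps coefficients with magnitude >= threshold."""
    return (np.abs(coeffs) >= threshold).astype(float)

# Toy time-frequency grid: sparse "speech" plus low-level dense "interference".
rng = np.random.default_rng(0)
speech = np.zeros((8, 8), dtype=complex)
speech[2, 3] = 10.0
speech[5, 1] = -8.0j
speech[6, 6] = 6.0 + 6.0j
interference = 0.1 * (rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8)))
mixture = speech + interference

# Assumed here: we know roughly 3 of the 64 coefficients carry speech.
tau = quantile_threshold(mixture, keep_fraction=3 / 64)
mask = binary_mask(mixture, tau)
recovered = mixture * mask  # masked mixture, ready for an inverse transform

def sir_db(target: np.ndarray, interferer: np.ndarray) -> float:
    """Signal-to-Interference Ratio in dB from the two energy totals."""
    return float(10.0 * np.log10(np.sum(np.abs(target) ** 2) / np.sum(np.abs(interferer) ** 2)))

sir_before = sir_db(speech, interference)
sir_after = sir_db(speech * mask, interference * mask)
```

In this toy setting the masked signal retains only the three speech-dominated cells, so `sir_after` exceeds `sir_before`. With real audio the coefficients would come from a time-frequency transform of the recording, and the threshold would have to be learned from the mixture alone.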

    The Synchronized Short-Time-Fourier-Transform: Properties and Definitions for Multichannel Source Separation.

    This paper proposes the use of a synchronized linear transform, the synchronized short-time-Fourier-transform (sSTFT), for time-frequency analysis of anechoic mixtures. We address the shortcomings of the commonly used time-frequency linear transform in multichannel settings, namely the classical short-time-Fourier-transform (cSTFT). We propose a series of desirable properties for the linear transform used in a multichannel source separation scenario: stationary invertibility, relative delay, relative attenuation, and finally delay invariant relative windowed-disjoint orthogonality (DIRWDO). Multisensor source separation techniques that operate in the time-frequency domain have an inherent error unless consideration is given to the multichannel properties proposed in this paper. The sSTFT preserves these relationships for multichannel data. The crucial innovation of the sSTFT is to locally synchronize the analysis to the observations as opposed to a global clock. Improvement in separation performance can be achieved because assumed properties of the time-frequency transform are satisfied when it is appropriately synchronized. Numerical experiments show the sSTFT improves instantaneous subsample relative parameter estimation in low noise conditions and achieves good synthesis.
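For context, the relative delay and relative attenuation named in this abstract are commonly defined through the textbook two-sensor anechoic mixing model. The sketch below uses that standard model. The symbols $a_j$, $\delta_j$, $X_1$, and $X_2$ are assumptions for illustration and may not match the paper's notation.

```latex
% Two-sensor anechoic mixture of N sources (illustrative notation)
x_1(t) = \sum_{j=1}^{N} s_j(t), \qquad
x_2(t) = \sum_{j=1}^{N} a_j \, s_j(t - \delta_j)

% Where source j dominates the time-frequency point (\tau, \omega),
% a windowed transform satisfies only approximately
\frac{X_2(\tau, \omega)}{X_1(\tau, \omega)} \approx a_j \, e^{-i \omega \delta_j}
```

The second relation is exact for an unwindowed Fourier transform of a single source, but only approximate once a fixed analysis window is applied to both channels on a common clock.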

    Effect of System Load on Video Service Metrics

    Model selection, in order to learn the mapping between the kernel metrics of a machine in a server cluster and a service quality metric on a client's machine, has been addressed by directly applying Linear Regression (LR) to the observations. The popularity of the LR approach is due to: 1) its implementation efficiency; 2) its low computational complexity; and 3) its generally accurate fit to the data. LR can, however, produce misleading results if the LR model does not characterize the system: this deception is due in part to its accuracy. In the client-server service modeling literature, LR is applied to the server and client metrics without treating the load on the system as the cause for the excitation of the system. By contrast, in this paper we propose a generative model for the server and client metrics and a hierarchical model to explain the mapping between them, which is cognizant of the effects of the load on the system. Evaluations using real traces support the following conclusions: the system load accounts for ≥ 50% of the energy of a high proportion of the client and server metric traces, so modeling the load is crucial; the load signal is localized in the frequency domain, so we can remove the load by deconvolution; and there is a significant phase shift between the kernel and the service-level metrics, which, coupled with the load, heavily biases the results obtained from out-of-the-box LR without any system identification pre-processing.

    Iterative Separation of Note Events from Single-Channel Polyphonic Recordings

    This thesis is concerned with the separation of audio sources from single-channel polyphonic musical recordings using the iterative estimation and separation of note events. Each event is defined as a section of audio containing largely harmonic energy identified as coming from a single sound source. Multiple events can be clustered to form separated sources. This solution is a model-based algorithm that can be applied to a large variety of audio recordings without requiring previous training stages. The proposed system embraces two principal stages. The first one considers the iterative detection and separation of note events from within the input mixture. In every iteration, the pitch trajectory of the predominant note event is automatically selected from an array of fundamental frequency estimates and used to guide the separation of the event's spectral content using two different methods: time-frequency masking and time-domain subtraction. A residual signal is then generated and used as the input mixture for the next iteration. After convergence, the second stage considers the clustering of all detected note events into individual audio sources. Performance evaluation is carried out at three different levels. Firstly, the accuracy of the note-event-based multipitch estimator is compared with that of the baseline algorithm used in every iteration to generate the initial set of pitch estimates. Secondly, the performance of the semi-supervised source separation process is compared with that of another semi-automatic algorithm. Finally, a listening test is conducted to assess the audio quality and naturalness of the separated sources when they are used to create stereo mixes from monaural recordings. Future directions for this research focus on the application of the proposed system to other music-related tasks. 
    Also, a preliminary optimisation-based approach is presented as an alternative method for the separation of overlapping partials, and as a high resolution time-frequency representation for digital signals.

    Constructing Time-Frequency Dictionaries for Source Separation via Time-Frequency Masking and Source Localisation

    We describe a new localisation and source separation algorithm based on the accurate construction of time-frequency spatial signatures, and we present a technique for constructing these signatures with the required accuracy. This algorithm for multi-channel source separation and localisation allows arbitrary placement of microphones yet achieves good performance. We demonstrate the efficacy of the technique using source location estimates and compare estimated time-frequency masks with the ideal 0 dB mask.
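As a reminder of the benchmark mentioned here, the ideal 0 dB mask is usually defined as the binary mask that keeps a time-frequency point when the target source is stronger than the sum of all other sources at that point. The notation below ($S_j$ for the target, $Y_j$ for the combined interference) is assumed for illustration and is not taken from the paper.

```latex
% Combined interference seen by source j
Y_j(\tau, \omega) = \sum_{k \neq j} S_k(\tau, \omega)

% Ideal 0 dB binary mask for source j
M_j(\tau, \omega) =
\begin{cases}
1, & 20 \log_{10} \dfrac{|S_j(\tau, \omega)|}{|Y_j(\tau, \omega)|} > 0 \\[1ex]
0, & \text{otherwise}
\end{cases}
```

Computing this mask requires the individual sources, so it serves as an oracle reference for estimated masks rather than as a practical separation method.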

    Bayesian Variational Regularisation for Dark Matter Reconstruction with Uncertainty Quantification

    Despite the great wealth of cosmological knowledge accumulated since the early 20th century, the nature of dark-matter, which accounts for ~85% of the matter content of the Universe, remains elusive. Unfortunately, though dark-matter is scientifically interesting, with implications for our fundamental understanding of the Universe, it cannot be directly observed. Instead, dark-matter may be inferred from e.g. the optical distortion (lensing) of distant galaxies which, at linear order, manifests as a perturbation to the apparent magnitude (convergence) and ellipticity (shearing). Ensemble observations of the shear are collected and leveraged to construct estimates of the convergence, which can directly be related to the universal dark-matter distribution. Imminent stage IV surveys are forecast to accrue an unprecedented quantity of cosmological information; a discriminative partition of which is accessible through the convergence, and is disproportionately concentrated at high angular resolutions, where the echoes of cosmological evolution under gravity are most apparent. Capitalising on advances in probability concentration theory, this thesis merges the paradigms of Bayesian inference and optimisation to develop hybrid convergence inference techniques which are scalable, statistically principled, and operate over the Euclidean plane, celestial sphere, and 3-dimensional ball. Such techniques can quantify the plausibility of inferences at one-millionth the computational overhead of competing sampling methods. These Bayesian techniques are applied to the hotly debated Abell-520 merging cluster, concluding that observational catalogues contain insufficient information to determine the existence of dark-matter self-interactions. Further, these techniques were applied to all public lensing catalogues, recovering the then largest global dark-matter mass-map.
    The primary methodological contributions of this thesis depend only on posterior log-concavity, paving the way towards a potentially revolutionary, complete hybridisation with artificial intelligence techniques. These next-generation techniques are the first to operate over the full 3-dimensional ball, laying the foundations for statistically principled universal dark-matter cartography, and the cosmological insights such advances may provide.