19,842 research outputs found
An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony
In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the users signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short Time Fourier Transform domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the in- coming far-end user’s speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address Double-Talk Detection (DTD) for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal that is produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on doubletalk. Using a standard evaluation technique, the proposed algorithm is shown to have comparable detection performance to an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, with the proposed DTD algorithm generating minimal false doubletalk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non- minimum phase Room Impulse Response (RIR). We describe the process by which percep- tually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other approach modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique
An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony
In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the users signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short Time Fourier Transform domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the in- coming far-end user’s speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address Double-Talk Detection (DTD) for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal that is produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on doubletalk. Using a standard evaluation technique, the proposed algorithm is shown to have comparable detection performance to an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, with the proposed DTD algorithm generating minimal false doubletalk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non- minimum phase Room Impulse Response (RIR). We describe the process by which percep- tually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other approach modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique
Spread spectrum-based video watermarking algorithms for copyright protection
Merged with duplicate record 10026.1/2263 on 14.03.2017 by CS (TIS)Digital technologies know an unprecedented expansion in the last years. The consumer can
now benefit from hardware and software which was considered state-of-the-art several years
ago. The advantages offered by the digital technologies are major but the same digital
technology opens the door for unlimited piracy. Copying an analogue VCR tape was certainly
possible and relatively easy, in spite of various forms of protection, but due to the analogue
environment, the subsequent copies had an inherent loss in quality. This was a natural way of
limiting the multiple copying of a video material. With digital technology, this barrier
disappears, being possible to make as many copies as desired, without any loss in quality
whatsoever. Digital watermarking is one of the best available tools for fighting this threat.
The aim of the present work was to develop a digital watermarking system compliant with the
recommendations drawn by the EBU, for video broadcast monitoring. Since the watermark
can be inserted in either spatial domain or transform domain, this aspect was investigated and
led to the conclusion that wavelet transform is one of the best solutions available. Since
watermarking is not an easy task, especially considering the robustness under various attacks
several techniques were employed in order to increase the capacity/robustness of the system:
spread-spectrum and modulation techniques to cast the watermark, powerful error correction
to protect the mark, human visual models to insert a robust mark and to ensure its invisibility.
The combination of these methods led to a major improvement, but yet the system wasn't
robust to several important geometrical attacks. In order to achieve this last milestone, the
system uses two distinct watermarks: a spatial domain reference watermark and the main
watermark embedded in the wavelet domain. By using this reference watermark and techniques
specific to image registration, the system is able to determine the parameters of the attack and
revert it. Once the attack was reverted, the main watermark is recovered. The final result is a
high capacity, blind DWr-based video watermarking system, robust to a wide range of attacks.BBC Research & Developmen
VI Workshop on Computational Data Analysis and Numerical Methods: Book of Abstracts
The VI Workshop on Computational Data Analysis and Numerical Methods (WCDANM) is going to be held on June 27-29, 2019, in the Department of Mathematics of the University of Beira Interior (UBI), CovilhĂŁ, Portugal and it is a unique opportunity to disseminate scientific research related to the areas of Mathematics in general, with particular relevance to the areas of Computational Data Analysis and Numerical Methods in theoretical and/or practical field, using new techniques, giving especial emphasis to applications in Medicine, Biology, Biotechnology, Engineering, Industry, Environmental Sciences, Finance, Insurance, Management and Administration. The meeting will provide a forum for discussion and debate of ideas with interest to the scientific community in general. With this meeting new scientific collaborations among colleagues, namely new collaborations in Masters and PhD projects are expected. The event is open to the entire scientific community (with or without communication/poster)
Review of Top Quark Physics Results
As the heaviest known fundamental particle, the top quark has taken a central
role in the study of fundamental interactions. Production of top quarks in
pairs provides an important probe of strong interactions. The top quark mass is
a key fundamental parameter which places a valuable constraint on the Higgs
boson mass and electroweak symmetry breaking. Observations of the relative
rates and kinematics of top quark final states constrain potential new physics.
In many cases, the tests available with study of the top quark are both
critical and unique. Large increases in data samples from the Fermilab Tevatron
have been coupled with major improvements in experimental techniques to produce
many new precision measurements of the top quark. The first direct evidence for
electroweak production of top quarks has been obtained, with a resulting direct
determination of . Several of the properties of the top quark have been
measured. Progress has also been made in obtaining improved limits on potential
anomalous production and decay mechanisms. This review presents an overview of
recent theoretical and experimental developments in this field. We also provide
a brief discussion of the implications for further efforts.Comment: 119 pages, 55 figure
Structure Functions of the Nucleon and their Interpretation
The current status of measurements of the nucleon structure functions and
their understanding is reviewed. The fixed target experiments E665, CCFR and
NMC and the HERA experiments H1 and ZEUS are discussed in some detail. The
extraction of parton momentum distribution functions from global fits is
described, with particular attention paid to much improved information on the
gluon momentum distribution. The status of alpha_s measurements from deep
inelastic data is reviewed. Models and non-perturbative approaches for the
parton input distributions are outlined. The impact on the phenomenology of QCD
of the data at very low values of the Bjorken x variable is discussed in
detail. Recent advances in the understanding of the transition from deep
inelastic scattering to photoproduction are summarised. Some brief comments are
made on the recent HERA measurements of the ep NC and CC cross-sections at very
high Q2.Comment: 196 pages, 79 figures, uses ijmpa.sty and psfig.tex (included
- …