148 research outputs found
A Novel Efficient Algorithm for Voice Gender Conversion
Realistic Voice Gender Conversion (VGC) requires
independent scaling of the glottal (pitch) and vocal tract
(formant) related features of the input speech signal. We
present a VGC algorithm which has two novel features.
Firstly, an efficient frequency scaling algorithm is presented.
Secondly, we use this to scale all frequencies in the input
signal by the desired formant scaling factor. We then
deconvolve the glottal contribution using a standard linear
predictive analysis and frequency scale it further such that the
desired pitch scaling factor is equal to the product of the two
frequency scaling factors. Finally, we resynthesize the
converted speech. The female-to-male results were excellent,
while the male-to-female results sounded synthetic.
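
The relationship between the two scaling stages described above can be sketched in a few lines. This is only an illustration of the stated constraint (the pitch factor equals the product of the two frequency scaling factors), not the authors' implementation; the function name and the numeric values in the usage line are hypothetical.

```python
def split_scaling_factors(pitch_factor, formant_factor):
    """Return (first_stage, second_stage) frequency scaling factors.

    first_stage is applied to every frequency in the input signal.
    second_stage is applied only to the glottal contribution, so the
    pitch is scaled overall by first_stage * second_stage.
    """
    if pitch_factor <= 0 or formant_factor <= 0:
        raise ValueError("scaling factors must be positive")
    first_stage = formant_factor
    second_stage = pitch_factor / formant_factor
    return first_stage, second_stage


# Hypothetical values: lower the pitch by half and the formants by 20%.
first, second = split_scaling_factors(pitch_factor=0.5, formant_factor=0.8)
```

The formants are only touched by the first stage, so they end up scaled by the formant factor alone, while the glottal contribution is scaled by both stages.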
Using Podcasts to support Communication Skills Development: A Case Study for Content Preferences among Postgraduate Research Students
The need for the integration of generic skills training into structured PhD programmes is widely accepted. However, effective integration of such training requires flexible delivery mechanisms which facilitate self-paced and independent learning. A video recording was made of an eminent speaker delivering a 1-h live presentation to a group of 15 first-year science and engineering PhD research students. The topic of the presentation was inter-disciplinary professional communication skills. Following the presentation, the video recording was post-processed into seven alternative podcast formats. These podcast formats included a typed transcription, a full audio recording, a full video recording, presentation slides with embedded speech, etc. The choice of podcast formats was based on ease-of-production by a typical computer-literate academic and ease-of-use by a typical computer-literate student. At a subsequent session, the seven podcast formats were shown to the 15 students and a survey was carried out to assess their reactions to the various formats. The survey results (quantitative and qualitative) were analysed to provide insight into the students' preferences in relation to podcast formats. The students expressed a clear preference for summary key-point slides with explanatory voice-over by the original speaker.
A novel approach to Acoustic Echo cancellation
In this paper, a novel approach to single-microphone Acoustic Echo Cancellation (AEC) is presented. This approach performs AEC by employing techniques developed for monaural sound source separation. It is shown that the AEC problem can be cast in a monaural sound source separation framework, and that significant echo suppression can be achieved through this framework. The new approach is evaluated through experiments on simulated data.
Multi-Channel Audio Time-Scale Modification
Phase vocoder based approaches to audio time-scale modification introduce a reverberant artefact into the time-scaled output. Recent techniques have been developed to reduce the presence of this artefact; however, these techniques introduce additional issues when applied to multi-channel recordings. This paper addresses these issues by collectively analysing all channels prior to time-scaling each individual channel.
Sub-band Independent Subspace Analysis for Drum Transcription
While Independent Subspace Analysis provides a means of separating sound sources from a single-channel signal, making it an effective tool for drum transcription, it does have a number of problems. Not least of these is that the amount of information required to allow separation of sound sources varies from signal to signal. To overcome this indeterminacy and improve the robustness of transcription, an extension of Independent Subspace Analysis to include sub-band processing is proposed. The use of this approach is demonstrated by its application in a simple drum transcription algorithm.
Comparison of Signal Reconstruction Methods for the Azimuth Discrimination and Resynthesis Algorithm
The Azimuth Discrimination and Resynthesis algorithm (ADRess) has been shown to produce high quality sound
source separation results for intensity panned stereo recordings. There are, however, artifacts such as phasiness
which become apparent in the separated signals under certain conditions. This is largely due to the fact that only the
magnitude spectra for the separated sources are estimated. Each source is then resynthesised using the phase
information obtained from the original mixture. This paper describes the nature and origin of the associated artifacts
and proposes alternative techniques for resynthesising the separated signals. A comparison of each technique is then
presented.
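
The baseline resynthesis step described above (estimated source magnitudes combined with the mixture's phase) can be sketched as follows. This is a minimal per-frame illustration under the assumption that one frame of complex mixture bins and matching source magnitude estimates are already available; the function name is hypothetical, and the alternative techniques the paper proposes are not shown.

```python
import cmath


def resynthesise_with_mixture_phase(source_magnitudes, mixture_bins):
    """Build complex spectrum bins for one separated source.

    Each output bin takes its magnitude from the source estimate and its
    phase from the corresponding bin of the original mixture.
    """
    if len(source_magnitudes) != len(mixture_bins):
        raise ValueError("magnitude and mixture frames must have equal length")
    return [
        cmath.rect(magnitude, cmath.phase(mixture_bin))
        for magnitude, mixture_bin in zip(source_magnitudes, mixture_bins)
    ]
```

Because the borrowed phase belongs to the whole mixture rather than to the individual source, the result is only an approximation of the true source spectrum, which is consistent with the artifacts the abstract attributes to this step.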
GSE1 Postgraduate Information Literacy and Communication Skills Training-Project Orientated Delivery
Module presented by the Faculty of Science and Engineering in cooperation with the Subject Librarian from Learning Teaching and Research Development in the Library.
The original format was four three-hour workshops plus 18 hours of self-paced course work using interdisciplinary written and verbal skills training tasks.
The task: translate a peer-reviewed journal paper so that it communicates to an interdisciplinary audience, and prepare a short presentation.
It was enhanced with the introduction of a peer-learning component by dividing participants into four groups of eight.
A project orientated and problem based learning (POPBL) approach has been shown to work well in the delivery of educational outcomes and also in the delivery of Information Literac
Real-time Sound Source Separation: Azimuth Discrimination and Resynthesis
We present a real-time sound source separation algorithm which performs the task of source separation based on the
lateral displacement of a source within the stereo field. The algorithm exploits the use of the pan pot as a means to
achieve image localisation within stereophonic recordings. As such, only an interaural intensity difference exists
between left and right channels for a single source. Gain scaling and phase cancellation techniques are used in the
frequency domain to expose frequency dependent nulls across the azimuth plane. The position of these nulls, in
conjunction with magnitude estimation and grouping techniques, is then used to resynthesise separated sources.
Results obtained from real recordings show that for music, this algorithm outperforms current source separation
schemes.
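
The gain scaling and phase cancellation idea can be sketched for a single analysis frame. The sketch assumes the simplest case, a source whose left-channel bins equal a gain in [0, 1] times the right-channel bins; the function name, the gain grid and its resolution are illustrative assumptions and not details taken from the paper.

```python
def azimuth_nulls(left_bins, right_bins, resolution=100):
    """For each frequency bin, find the gain at which the null occurs.

    The right channel is scaled by each candidate gain g in [0, 1] and
    subtracted from the left channel. The gain that minimises the
    residual magnitude |L(k) - g * R(k)| marks the null for bin k.
    """
    gains = [i / resolution for i in range(resolution + 1)]
    nulls = []
    for left, right in zip(left_bins, right_bins):
        residuals = [abs(left - gain * right) for gain in gains]
        nulls.append(gains[residuals.index(min(residuals))])
    return nulls
```

Bins whose nulls fall at the same gain can then be grouped as belonging to one pan position. Sources panned toward the other side would need the roles of the two channels swapped, which this sketch leaves out.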
An Efficient Phasiness Reduction Technique for Moderate Audio Time-scale Modification
Phase vocoder approaches to time-scale modification of audio introduce a reverberant/phasy artifact into the time-scaled output due to a loss in phase coherence between short-time Fourier transform (STFT) bins. Recent improvements to the phase vocoder have reduced the presence of this artifact; however, it remains a problem. A method of time-scaling is presented that results in a further reduction in phasiness, for moderate time-scale factors, by taking advantage of some flexibility that exists in the choice of phase required to maintain horizontal phase coherence between related STFT bins. Furthermore, the approach leads to a reduction in computational load within the range of time-scaling factors for which phasiness is reduced.
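
For context, horizontal phase coherence in a conventional phase vocoder is maintained by propagating each bin's phase from one synthesis frame to the next. The sketch below shows that textbook update for a single bin only. It is not the method proposed in the paper, and all names and parameters are illustrative assumptions.

```python
import math


def princarg(phase):
    """Wrap a phase value into the interval [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi


def propagate_phase(prev_in, curr_in, prev_out, bin_index, fft_size,
                    hop_analysis, hop_synthesis):
    """Return the synthesis phase of one STFT bin for the current frame."""
    # Centre frequency of the bin in radians per sample.
    omega = 2 * math.pi * bin_index / fft_size
    # Deviation of the measured phase advance from the advance expected
    # at the bin centre frequency, wrapped to the principal value.
    deviation = princarg(curr_in - prev_in - omega * hop_analysis)
    # Estimated instantaneous frequency in radians per sample.
    inst_freq = omega + deviation / hop_analysis
    # Advance the previous synthesis phase over the synthesis hop.
    return prev_out + inst_freq * hop_synthesis
```

The ratio of `hop_synthesis` to `hop_analysis` sets the time-scale factor in this formulation.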