Modelling and extraction of fundamental frequency in speech signals
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. One of the most important parameters of speech is the fundamental frequency of vibration of voiced sounds. The auditory sensation of the fundamental frequency is known as the pitch. Depending on the tonal/non-tonal category of the language, the fundamental frequency conveys intonation, pragmatics and meaning. In addition, the fundamental frequency and intonation carry speaker gender, age, identity, speaking style and emotional state. Accurate estimation of the fundamental frequency is critically important for the functioning of speech processing applications such as speech coding, speech recognition, speech synthesis and voice morphing. This thesis contributes to pitch estimation research in three distinct ways: (1) an investigation of the impact of the window length on pitch estimation error, (2) an investigation of the use of higher-order moments and (3) an investigation of an analysis-synthesis method for selecting the best pitch value among N proposed candidates. Experimental evaluations show that the length of the speech window has a major impact on the accuracy of pitch estimation. Depending on the similarity criterion and the order of the statistical moment, a window length of 37 to 80 ms gives the least error. In order to avoid the excessive delay that results from using a longer window, a method is proposed where the current short window is concatenated with the previous frames to form a longer signal window for pitch extraction.
The use of second-order and higher-order moments, and the magnitude difference function, as similarity criteria was explored and compared. A novel method of calculating moments is introduced in which the signal is split, i.e. rectified, into positive- and negative-valued samples. The moments for the positive and negative parts of the signal are computed separately and combined. The new method of calculating moments from positive and negative parts, together with the higher-order criteria, provides competitive results. A challenging issue in pitch estimation is the determination of the best candidate from the N extrema of the similarity criterion. The analysis-synthesis method proposed in this thesis selects the pitch candidate that provides the best reproduction (synthesis) of the harmonic spectrum of the original speech. The synthesis method must be such that the distortion increases with increasing error in the estimate of the fundamental frequency. To this end, a new method of spectral synthesis is proposed using an estimate of the spectral envelope and harmonically spaced asymmetric Gaussian pulses as excitation. The N-best method provides a consistent reduction in pitch estimation error. The methods described in this thesis result in a significant improvement in pitch accuracy and outperform the benchmark YIN method.
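The magnitude difference function mentioned in the abstract is one of the standard similarity criteria for pitch extraction. As a rough illustration (not the thesis's rectified-moment or N-best method), a minimal AMDF-based pitch estimator might look like the following sketch, where the function name, frame length and search range are all illustrative assumptions:

```python
import numpy as np

def amdf_pitch(frame, fs, fmin=80.0, fmax=400.0):
    """Estimate pitch via the Average Magnitude Difference Function (AMDF).

    The AMDF at lag k averages |x[n] - x[n-k]|; for periodic voiced
    speech it dips near multiples of the pitch period, so the lag of
    the deepest valley in the admissible range is taken as the period.
    """
    lag_min = int(fs / fmax)          # shortest admissible period
    lag_max = int(fs / fmin)          # longest admissible period
    amdf = np.array([
        np.mean(np.abs(frame[k:] - frame[:-k]))
        for k in range(lag_min, lag_max + 1)
    ])
    best_lag = lag_min + int(np.argmin(amdf))
    return fs / best_lag

# A synthetic 125 Hz "voiced" frame of 40 ms at 8 kHz:
fs = 8000
t = np.arange(int(0.040 * fs)) / fs
frame = np.sin(2 * np.pi * 125 * t) + 0.3 * np.sin(2 * np.pi * 250 * t)
print(round(amdf_pitch(frame, fs)))  # prints 125
```

A longer analysis window makes the valley estimate more reliable at the cost of latency, which is the trade-off the thesis's window-concatenation scheme addresses.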
Detail Enhancing Denoising of Digitized 3D Models from a Mobile Scanning System
The acquisition process of digitizing a large-scale environment produces an enormous amount of raw geometry data. This data is corrupted by system noise, which leads to 3D surfaces that are not smooth and details that are distorted. Any scanning system has noise associated with the scanning hardware, both digital quantization errors and measurement inaccuracies, but a mobile scanning system has additional system noise introduced by the pose estimation of the hardware during data acquisition. The combined system noise generates data that is not handled well by existing noise reduction and smoothing techniques.
This research is focused on enhancing the 3D models acquired by mobile scanning systems used to digitize large-scale environments. These digitization systems combine a variety of sensors – including laser range scanners, video cameras, and pose estimation hardware – on a mobile platform for the quick acquisition of 3D models of real world environments. The data acquired by such systems are extremely noisy, often with significant details being on the same order of magnitude as the system noise. By utilizing a unique 3D signal analysis tool, a denoising algorithm was developed that identifies regions of detail and enhances their geometry, while removing the effects of noise on the overall model.
The developed algorithm can be useful for a variety of digitized 3D models, not just those produced by mobile scanning systems. The challenges faced in this study were the need for fully automatic processing in the enhancement algorithm, and the need to fill a gap in the area of 3D model analysis in order to reduce the effect of system noise on the 3D models. In this context, our main contributions are the automation and integration of a data enhancement method not well known to the computer vision community, and the development of a novel 3D signal decomposition and analysis tool. The new technologies featured in this document are intuitive extensions of existing methods to new dimensionality and applications. The research as a whole has been applied to detail-enhancing denoising of scanned data from a mobile range scanning system, and results from both synthetic and real models are presented.
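The goal described above, smoothing away noise without flattening genuine geometric detail, is commonly approached with bilateral-style filters. The sketch below (a generic illustration, not the thesis's decomposition tool) applies a bilateral filter to a 2D height field standing in for scan geometry; the function name and parameter values are assumptions:

```python
import numpy as np

def bilateral_filter(z, spatial_sigma=1.0, range_sigma=0.1, radius=2):
    """Detail-preserving smoothing of a 2D height field.

    Each sample is replaced by a weighted average of its neighbours,
    where the weight decays both with spatial distance and with the
    difference in height, so sharp features (large height jumps)
    receive little smoothing while flat noisy regions are averaged.
    """
    h, w = z.shape
    out = np.empty_like(z)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial_w = np.exp(-(xs**2 + ys**2) / (2 * spatial_sigma**2))
    for i in range(h):
        for j in range(w):
            i0, i1 = max(i - radius, 0), min(i + radius + 1, h)
            j0, j1 = max(j - radius, 0), min(j + radius + 1, w)
            patch = z[i0:i1, j0:j1]
            sw = spatial_w[i0 - i + radius:i1 - i + radius,
                           j0 - j + radius:j1 - j + radius]
            rw = np.exp(-(patch - z[i, j])**2 / (2 * range_sigma**2))
            wgt = sw * rw
            out[i, j] = np.sum(wgt * patch) / np.sum(wgt)
    return out

# A noisy step edge: the filter averages the flat regions but leaves
# the step (a "detail") nearly untouched.
rng = np.random.default_rng(0)
z = np.zeros((20, 20))
z[:, 10:] = 1.0
noisy = z + 0.05 * rng.standard_normal(z.shape)
filtered = bilateral_filter(noisy)
```

The range weight is what distinguishes this from plain Gaussian smoothing: across the step the height difference makes `rw` vanish, so the edge does not blur.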
An Adaptive Parameterisation Method for Shape Optimisation Using Adjoint Sensitivities.
Adjoint methods are the most efficient approach to computing design sensitivities, as the entire gradient vector of a single objective function is obtained in a single adjoint system solve. This in turn opens up a wide range of possibilities for parameterising the shape. Most shape parameterisation methods require manual set-up, which typically results in a restricted design space. In this work, two parameterisation methods that can be derived automatically from existing information are extended to include an adaptive design space in shape optimisation.
The node-based method derives the parameterisation directly from the computational mesh employed for simulation, and normal displacements of the surface grid nodes are taken as design variables. This method offers the richest design space for shape optimisation. However, it requires an additional surface regularisation method to annihilate high-frequency shape modes. Hence the best achievable design depends on the amount of smoothing applied to the design surface. An improved adaptive explicit surface regularisation method is proposed in this thesis to capture superior shape modes in the design process.
The NSPCC approach takes CAD descriptions as input and perturbs the control points of the NURBS boundary representation to modify the shape. An adaptive NSPCC method is proposed in which the optimisation begins with a coarser design space and adapts to a finer parameterisation during the design process. Driven by adjoint sensitivity information, the control points on the design surfaces are adaptively enriched using a knot insertion algorithm without modifying the shape. Both parameterisation methods are coupled in the adjoint-based shape optimisation process to reduce the total pressure loss of a turbine blade internal cooling channel. Based on analyses of the quality of the optima and the rate of convergence of the design process, the adaptive NSPCC method outperforms both the adaptive node-based and the static NSPCC approach.
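The explicit surface regularisation step described above, damping high-frequency shape modes in node-based sensitivities, can be illustrated with a simple non-adaptive Laplacian smoother on a 1D surface. The function name, sweep count and damping factor are illustrative assumptions, not the thesis's improved adaptive scheme:

```python
import numpy as np

def smooth_sensitivities(g, n_sweeps=10, eps=0.4):
    """Explicit Laplacian smoothing of a 1D surface sensitivity field.

    Each sweep adds eps times the discrete Laplacian, which strongly
    damps high-frequency (oscillatory) components of the gradient
    while leaving the smooth low-frequency content almost unchanged,
    so a node-based shape update does not introduce non-smooth modes.
    Endpoints are held fixed (Dirichlet).
    """
    s = g.copy()
    for _ in range(n_sweeps):
        interior = s[1:-1] + eps * (s[:-2] - 2 * s[1:-1] + s[2:])
        s = np.concatenate(([s[0]], interior, [s[-1]]))
    return s

# A smooth gradient plus a high-frequency oscillation on 101 nodes:
x = np.linspace(0.0, 1.0, 101)
g = np.sin(np.pi * x) + 0.2 * np.sin(40 * np.pi * x)
g_smooth = smooth_sensitivities(g)
# The low-frequency sin(pi x) survives; the oscillation is damped away.
```

The amount of smoothing (here `n_sweeps` and `eps`) is exactly the tuning knob the abstract refers to: too little leaves oscillatory shapes, too much restricts the reachable design space.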
Real-time Ultrasound Signals Processing: Denoising and Super-resolution
Ultrasound (US) acquisition is widespread in the biomedical field, due to its low cost, portability, and non-invasiveness for the patient. The processing and analysis of US signals, such as images, 2D videos, and volumetric images, allows the physician to monitor the evolution of the patient's disease and supports diagnosis and treatment (e.g., surgery). US images are affected by speckle noise, generated by the overlap of US waves. Furthermore, low-resolution images are acquired when a high acquisition frequency is applied to accurately characterise the behaviour of anatomical features that change quickly over time. Denoising and super-resolution of US signals are relevant to improving both the visual evaluation of the physician and the performance and accuracy of processing methods such as segmentation and classification. The main requirements for the processing and analysis of US signals are real-time execution, preservation of anatomical features, and reduction of artefacts. In this context, we present a novel framework for the real-time denoising of US 2D images based on deep learning and high-performance computing, which reduces noise while preserving anatomical features in real-time execution. We extend our framework to the denoising of arbitrary US signals, such as 2D videos and 3D images, by incorporating denoising algorithms that account for spatio-temporal signal properties into an image-to-image deep learning model. As a building block of this framework, we propose a novel denoising method belonging to the class of low-rank approximations, which learns and predicts the optimal thresholds of the Singular Value Decomposition.
While previous denoising work trades off computational cost against effectiveness, the proposed framework matches the results of the best denoising algorithms in terms of noise removal, anatomical feature preservation, and conservation of geometric and texture properties, in a real-time execution that respects industrial constraints. The framework reduces artefacts (e.g., blurring) and preserves the spatio-temporal consistency among frames/slices; it is also general with respect to the denoising algorithm, anatomical district, and noise intensity. Then, we introduce a novel framework for the real-time reconstruction of the non-acquired scan lines through an interpolating method; a deep learning model improves the results of the interpolation to match the target (i.e., high-resolution) image. We improve the accuracy of the prediction of the reconstructed lines through the design of the network architecture and the loss function. In the context of signal approximation, we introduce our kernel-based sampling method for the reconstruction of 2D and 3D signals defined on regular and irregular grids, with an application to US 2D and 3D images. Our method improves on previous work in terms of sampling quality, approximation accuracy, and geometry reconstruction, at a slightly higher computational cost. For both denoising and super-resolution, we evaluate compliance with the real-time requirement of US applications in the medical domain and provide a quantitative evaluation of denoising and super-resolution methods on US and synthetic images. Finally, we discuss the role of denoising and super-resolution as pre-processing steps for segmentation and predictive analysis of breast pathologies.
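The low-rank building block, thresholding the Singular Value Decomposition, can be sketched directly. In the sketch below the threshold is a hand-picked constant, whereas the framework described above learns and predicts it; the synthetic "image" and all names are assumptions for illustration:

```python
import numpy as np

def svd_denoise(img, threshold):
    """Low-rank denoising: zero out singular values below a threshold.

    Noise spreads its energy across all singular values, while
    structured image content concentrates in the largest few, so
    hard-thresholding the spectrum and reconstructing yields a
    denoised low-rank approximation of the image.
    """
    u, s, vt = np.linalg.svd(img, full_matrices=False)
    s_thresh = np.where(s >= threshold, s, 0.0)
    return u @ np.diag(s_thresh) @ vt

# A rank-2 synthetic "image" plus additive noise:
rng = np.random.default_rng(0)
a = np.outer(np.linspace(0.0, 1.0, 64), np.ones(64))
b = np.outer(np.sin(np.linspace(0.0, 3.0, 64)), np.linspace(1.0, 0.0, 64))
clean = a + b
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
denoised = svd_denoise(noisy, threshold=1.0)
```

Choosing the threshold is the hard part in practice (too low keeps noise, too high erases anatomy), which is precisely what motivates learning it per image.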
Prediction of nonlinear nonstationary time series data using a digital filter and support vector regression
Volatility is a key parameter when measuring the size of the errors made in modelling returns and other nonlinear, nonstationary time series data. The Autoregressive Integrated Moving-Average (ARIMA) model is a linear time series process, whilst for nonlinear systems the Generalised Autoregressive Conditional Heteroskedasticity (GARCH) and Markov Switching GARCH (MS-GARCH) models have been widely applied. In statistical learning theory, Support Vector Regression (SVR) plays an important role in predicting nonlinear and nonstationary time series data. We propose a new model class that combines a novel derivative Empirical Mode Decomposition (EMD) with an averaging intrinsic mode function (aIMF) and a novel multiclass SVR using mean reversion and the coefficient of variation (CV) to predict financial data, i.e. EUR-USD exchange rates. The proposed aIMF is capable of smoothing and reducing noise, whereas the novel multiclass SVR model can predict exchange rates. Our simulation results show that our model significantly outperforms the state-of-the-art ARIMA, GARCH, MS-GARCH, Markov Switching Regression (MSR) and Markov chain Monte Carlo (MCMC) regression models.
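The pipeline above, a signal-smoothing front end feeding a kernel regressor, can be loosely sketched as follows. This is not the authors' aIMF or multiclass SVR: the smoother is a plain moving average and the regressor is RBF kernel ridge regression standing in for SVR, with all names and parameter values assumed:

```python
import numpy as np

def moving_average(x, w=5):
    """Crude smoother standing in for the averaged-IMF denoising step."""
    return np.convolve(x, np.ones(w) / w, mode="valid")

def fit_kernel_ridge(X, y, gamma=1.0, lam=1e-3):
    """RBF kernel ridge regression, a simpler kernel-method stand-in
    for Support Vector Regression: solve (K + lam I) alpha = y."""
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    return X, alpha

def predict(model, Xq, gamma=1.0):
    X, alpha = model
    d = (np.sum(Xq**2, axis=1)[:, None] + np.sum(X**2, axis=1)[None, :]
         - 2 * Xq @ X.T)
    return np.exp(-gamma * d) @ alpha

# A noisy mean-reverting series; predict x[t] from the p previous values.
rng = np.random.default_rng(1)
n, p = 400, 5
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.9 * x[t - 1] + 0.1 * rng.standard_normal()
xs = moving_average(x)                     # denoising front end
X = np.array([xs[t - p:t] for t in range(p, len(xs))])
y = xs[p:]
model = fit_kernel_ridge(X[:-50], y[:-50])
pred = predict(model, X[-50:])
rmse = np.sqrt(np.mean((pred - y[-50:])**2))
```

The structure mirrors the abstract's claim: smoothing first makes the series easier to predict, and the kernel method then captures the nonlinear lag dependence.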
Error norms for the adaptive solution of the Navier-Stokes equations
The adaptive solution of the Navier-Stokes equations depends upon the successful interaction of three key elements: (1) the ability to flexibly select grid length scales in composite grids, (2) the ability to efficiently control residual error in composite grids, and (3) the ability to define reliable, convenient error norms to guide the grid adjustment and optimize the residual levels relative to the local truncation errors. An initial investigation was conducted to explore how to develop these key elements. Conventional error assessment methods were defined, and defect and deferred correction methods were surveyed. The one-dimensional potential equation was used as a multigrid test bed to investigate how to achieve successful interaction of these three key elements.
Scaling Multidimensional Inference for Big Structured Data
In information technology, big data is a collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications [151]. In a world of increasing sensor modalities, cheaper storage, and more data-oriented questions, we are quickly passing the limits of tractable computation using traditional statistical analysis methods. Methods which often show great results on simple data have difficulties processing complicated multidimensional data. Accuracy alone can no longer justify unwarranted memory use and computational complexity. Improving the scaling properties of these methods for multidimensional data is the only way to keep them relevant. In this work we explore methods for improving the scaling properties of parametric and nonparametric models. Namely, we exploit the structure of the data to lower the complexity of a specific family of problems. The two types of structure considered in this work are distributed optimization with separable constraints (Chapters 2-3) and scaling Gaussian processes for multidimensional lattice input (Chapters 4-5). By improving the scaling of these methods, we can expand their use to a wide range of applications which were previously intractable and open the door to new research questions.
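One standard trick for Gaussian processes on a multidimensional lattice (the setting of the later chapters) is Kronecker factorisation of the kernel matrix. The sketch below shows the generic technique under assumed names and parameters, not necessarily the exact construction used in this work:

```python
import numpy as np

def rbf(x, gamma=10.0):
    """RBF kernel matrix over a 1D set of inputs."""
    return np.exp(-gamma * (x[:, None] - x[None, :])**2)

def kron_gp_posterior_mean(x1, x2, Y, noise=1e-2, gamma=10.0):
    """GP posterior mean on a full 2D lattice using Kronecker algebra.

    For lattice inputs the kernel factorises as K = K1 (x) K2, so the
    O(N^3) solve with N = n1*n2 reduces to two small eigensolves:
    (K + s I)^{-1} y is applied in the joint eigenbasis by dividing by
    the outer product of the per-axis eigenvalues plus the noise,
    using only n1 x n1 and n2 x n2 matrix products.
    """
    K1, K2 = rbf(x1, gamma), rbf(x2, gamma)
    l1, Q1 = np.linalg.eigh(K1)
    l2, Q2 = np.linalg.eigh(K2)
    Yt = Q1.T @ Y @ Q2                       # rotate into eigenbasis
    Yt = Yt / (np.outer(l1, l2) + noise)     # divide by eigenvalues
    alpha = Q1 @ Yt @ Q2.T                   # rotate back: alpha = (K+sI)^-1 y
    return K1 @ alpha @ K2                   # posterior mean K alpha

# A smooth surface on a 30 x 25 lattice, observed with noise:
x1 = np.linspace(0.0, 1.0, 30)
x2 = np.linspace(0.0, 1.0, 25)
Z = np.sin(2 * np.pi * x1)[:, None] * np.cos(2 * np.pi * x2)[None, :]
rng = np.random.default_rng(0)
Y = Z + 0.1 * rng.standard_normal(Z.shape)
mean = kron_gp_posterior_mean(x1, x2, Y)
```

The full kernel for this lattice is 750 x 750 and is never formed; the per-axis factors are 30 x 30 and 25 x 25, which is the scaling gain that makes large lattices tractable.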