925 research outputs found

    Objective and Subjective Evaluation of Wideband Speech Quality

    Traditional landline and cellular communications use a bandwidth of 300-3400 Hz for transmitting speech. This narrow bandwidth impacts the quality, intelligibility and naturalness of transmitted speech. There is an impending change within the telecommunication industry towards using wider-bandwidth speech, but the enlarged bandwidth also introduces a few challenges in speech processing. Echo and noise are two challenging issues in wideband telephony, due to increased perceptual sensitivity by users. Subjective and/or objective measurements of speech quality are important in benchmarking speech processing algorithms and evaluating the effect of parameters like noise, echo, and delay in wideband telephony. Subjective measures include ratings of speech quality by listeners, whereas objective measures compute a metric based on the reference and degraded speech samples. While subjective quality ratings are the "gold standard", they are also time- and resource-consuming. An objective metric that correlates highly with subjective data is attractive, as it can act as a substitute for subjective quality scores in gauging the performance of different algorithms and devices. This thesis reports results from a series of experiments on subjective and objective speech quality evaluation for wideband telephony applications. First, a custom wideband noise reduction database was created that contained speech samples corrupted by different background noises at different signal-to-noise ratios (SNRs) and processed by six different noise reduction algorithms. Comprehensive subjective evaluation of this database revealed an interaction between the algorithm performance, noise type and SNR. Several auditory-based objective metrics, such as the Loudness Pattern Distortion (LPD) measure based on the Moore-Glasberg auditory model, were evaluated in predicting the subjective scores. 
In addition, the performance of Bayesian Multivariate Regression Splines (BMLS) was evaluated in terms of mapping the scores calculated by the objective metrics to the true quality scores. The combination of LPD and BMLS resulted in high correlation with the subjective scores and was used as a substitute for them when fine-tuning the noise reduction algorithms. Second, the effect of echo and delay on wideband speech was evaluated in both listening and conversational contexts, through both subjective and objective measures. A database containing speech samples corrupted by echo with different delay and frequency response characteristics was created, and was later used to collect subjective quality ratings. The LPD-BMLS objective metric was then validated using the subjective scores. Third, to evaluate the effect of echo and delay in a conversational context, a real-time simulator was developed. Pairs of subjects conversed over the simulated system and rated the quality of their conversations, which were degraded by different amounts of echo and delay. The quality scores were analysed, and the LPD+BMLS combination was found to be effective in predicting subjective impressions of quality for condition-averaged data.
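
The idea of checking an objective metric against condition-averaged subjective ratings can be sketched as follows. This is a minimal illustration only: the record layout, the function names `condition_average` and `pearson`, and the use of the Pearson coefficient are assumptions, not details taken from the thesis.

```python
import math
from collections import defaultdict

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def condition_average(records):
    """Average scores per test condition.

    records: iterable of (condition, objective_score, subjective_score).
    Returns two lists (objective means, subjective means), ordered by condition.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for cond, obj, subj in records:
        acc = sums[cond]
        acc[0] += obj
        acc[1] += subj
        acc[2] += 1
    conds = sorted(sums)
    obj_means = [sums[c][0] / sums[c][2] for c in conds]
    subj_means = [sums[c][1] / sums[c][2] for c in conds]
    return obj_means, subj_means

# Hypothetical toy data: (condition label, objective metric, listener rating).
records = [
    ("noise_0dB", 1.0, 2.0), ("noise_0dB", 3.0, 4.0),
    ("noise_10dB", 4.0, 5.0),
    ("noise_20dB", 6.0, 7.0),
]
obj_means, subj_means = condition_average(records)
r = pearson(obj_means, subj_means)
```

Averaging per condition before correlating removes listener-to-listener scatter, which is why condition-averaged correlations are usually higher than per-sample ones.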

    An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony

    In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the user's signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short-Time Fourier Transform (STFT) domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the incoming far-end user's speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address DTD for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal that is produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on double-talk. 
Using a standard evaluation technique, the proposed algorithm is shown to have comparable detection performance to an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, with the proposed DTD algorithm generating minimal false double-talk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non-minimum phase Room Impulse Response (RIR). We describe the process by which perceptually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other approach modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique.
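
The decomposition onto a union of two fixed nonnegative bases can be sketched roughly as below. This is a toy illustration under several assumptions that are not from the thesis: the bases are simply given (rather than built from far-end speech or trained), the cost is Euclidean with the standard multiplicative update, the near-end estimate is recovered with a soft mask, and all function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def solve_activations(v, w, n_iter=200, eps=1e-12):
    """Estimate H >= 0 in V ~= W @ H with W held fixed.

    Uses the multiplicative update for the Euclidean cost:
    H <- H * (W^T V) / (W^T W H + eps).
    """
    wt = transpose(w)
    wtv = matmul(wt, v)
    wtw = matmul(wt, w)
    h = [[1.0] * len(v[0]) for _ in range(len(w[0]))]
    for _ in range(n_iter):
        wtwh = matmul(wtw, h)
        h = [[h[k][t] * wtv[k][t] / (wtwh[k][t] + eps)
              for t in range(len(h[0]))] for k in range(len(h))]
    return h

def extract_speaker(v, w_echo, w_speech, eps=1e-12):
    """Split magnitude spectrogram V (bins x frames) into echo and near-end parts."""
    # Union of the two bases: concatenate their columns row by row.
    w = [row_e + row_s for row_e, row_s in zip(w_echo, w_speech)]
    h = solve_activations(v, w)
    k_echo = len(w_echo[0])
    echo_hat = matmul(w_echo, h[:k_echo])
    speech_hat = matmul(w_speech, h[k_echo:])
    # Soft mask (an assumption): share of each bin explained by the speech model.
    near_end = [[v[f][t] * speech_hat[f][t] / (speech_hat[f][t] + echo_hat[f][t] + eps)
                 for t in range(len(v[0]))] for f in range(len(v))]
    return near_end, echo_hat, h

# Toy example: echo lives in bins 0-1, near-end speech in bins 2-3.
w_echo = [[1.0], [1.0], [0.0], [0.0]]
w_speech = [[0.0], [0.0], [1.0], [1.0]]
v = [[2.0, 0.0, 1.0], [2.0, 0.0, 1.0], [0.0, 3.0, 1.0], [0.0, 3.0, 1.0]]
near_end, echo_hat, h = extract_speaker(v, w_echo, w_speech)
```

Because both bases stay fixed and only the activations are re-estimated for each block, nothing has to adapt to the room, which is consistent with the abstract's remark that performance is available immediately and survives room changes.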

    Conservation biology and genomics of a flagship endangered species: the Mauritian pink pigeon Nesoenas mayeri

    The reduction of natural populations is a major conservation problem and the main cause of increased extinction risk. I investigated conservation issues that limit the growth of the pink pigeon population, including infection with Trichomonas gallinae, biased sex ratios and the reduction in both reproductive fitness and life history traits. In particular, I examined the association between these issues in the pink pigeon and genome-wide genetic variation using 45,841 single nucleotide polymorphisms (SNPs) generated using restriction site associated DNA sequencing (RAD-seq). The average observed and expected heterozygosity (Ho and He) were low, at 0.27 and 0.28 respectively. Rapid genetic loss has increased inbreeding depression in the population. The effective population size was found to be particularly small in one subpopulation, Ile aux Aigrettes. Examining genome-wide heterozygosity for both immune and non-immune genes between males and females showed that males have a higher level of gene variation than females, which may explain the male-biased sex ratio in fledglings that this study found to be significant. A significant negative association was found between genome-wide heterozygosity and infection with Trichomonas gallinae. The longevity and body weight of adult birds and fledgling success showed a significant positive relationship with the level of genome-wide heterozygosity. Reproductive success, in terms of the number of nests, eggs laid and hatched young that died before fledging, did not show any significant relationship with the level of genome-wide heterozygosity. However, using genome-wide association studies (GWAS), this study identified a genomic region close to the progesterone receptor gene (PRG) that potentially affects egg-laying in the pink pigeon. 
The findings of this thesis suggest an association between the problems limiting the growth of the pink pigeon population and a reduction in genome-wide variation, suggesting that the pink pigeon may be entering a vortex that may drive the species to extinction and thus emphasising the urgent need for conservation management to avoid its extinction.
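
For readers unfamiliar with Ho and He, the following minimal sketch shows how the two quantities are commonly computed for biallelic SNPs. The genotype coding (0/1/2 alternate-allele counts, `None` for missing calls), the function names, and the plain 2pq form of He without a small-sample correction are assumptions; the thesis may have used a different estimator or software.

```python
def locus_heterozygosity(genotypes):
    """Return (Ho, He) for one biallelic SNP.

    genotypes: alternate-allele counts per individual (0, 1 or 2); None = missing.
    Ho is the fraction of heterozygous individuals; He is 2pq from allele frequencies.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        raise ValueError("no called genotypes at this locus")
    ho = sum(1 for g in called if g == 1) / n
    p = sum(called) / (2 * n)          # alternate-allele frequency
    he = 2 * p * (1 - p)
    return ho, he

def mean_heterozygosity(loci):
    """Average Ho and He over a list of loci (each a list of genotypes)."""
    pairs = [locus_heterozygosity(g) for g in loci]
    mean_ho = sum(ho for ho, _ in pairs) / len(pairs)
    mean_he = sum(he for _, he in pairs) / len(pairs)
    return mean_ho, mean_he

# Hypothetical toy genotypes for two SNPs in four birds.
loci = [
    [0, 1, 1, 2],
    [0, 0, 1, None],
]
mean_ho, mean_he = mean_heterozygosity(loci)
```

Ho below He at a locus indicates fewer heterozygotes than random mating would predict, which is one common signal of inbreeding.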

    Study of multicomponent T2* relaxation in brain tumours

    Integrated master's thesis in Biomedical Engineering and Biophysics (Medical Signals and Images), Universidade de Lisboa, Faculdade de Ciências, 2020. In recent decades, the development and refinement of imaging modalities has revolutionised the treatment of patients with a wide range of pathologies. Numerous scientific studies have followed from these techniques with the aim of aiding the diagnosis, prognosis and monitoring of disease. This is the case for cancer, which according to the World Health Organization (WHO) is the second leading cause of death worldwide. Brain tumours result from the uncontrolled growth of cells, a process that can originate directly in the brain or result from invasion by cells from other tissues of the body, also known as metastasis. They can be classified as benign or malignant according to the aggressiveness criteria defined by the WHO on a scale of I-IV. High subjectivity among physicians in classifying brain tumours led to a recent reformulation of the differentiation criteria, which now cover both phenotypic and genotypic parameters. Patient survival is highly dependent on the tumour type, its stage at diagnosis, and the medical assessment that determines the treatment to be applied. The medical examinations usually performed include magnetic resonance imaging (MRI) and positron emission tomography (PET). At the Institute of Neuroscience and Medicine – 4 (INM-4) of the research centre in Jülich, a hybrid scanner allows the simultaneous acquisition of PET and MRI images. This thesis does not cover the study of PET, but the metabolic information from the simultaneously acquired images is used to generate tumour masks, which are used throughout this work. 
MRI is a non-invasive modality based on the application of an external magnetic field and the emission of time-varying radiofrequency waves. It has the advantages of not using ionising radiation and of providing excellent contrast between different tissue types. Image contrast depends on the relative weight given to specific parameters during acquisition. The ability to adapt the contrast to what one wishes to visualise or study makes this an essential imaging modality in both clinical practice and scientific research. The use of weighted images, also known as qualitative MRI, has some limitations of its own. The decision on which treatment to apply is often subjective and may differ between groups of physicians. On the other hand, the development of sequences that allow series of images to be acquired introduced the estimation of specific, scanner-independent parameters used to assess different tissues quantitatively, which is termed quantitative MRI (qMRI). By modelling the relaxation curves, qMRI can help detect small pathologies in brain tissue that may not be observable in the weighted images used in clinical practice. Despite its great advantages, qMRI is mostly a research area, and its clinical adoption is still limited by the longer acquisition time needed to acquire the extra images and by the need for estimation methods with extreme accuracy and precision. For this study, 33 patients (aged between 27 and 76) with a suspected brain tumour or recurrence of previously treated tumours underwent simultaneous MRI and PET acquisition. Exponential relaxation models were studied using T2*-weighted images acquired with a multi-echo gradient-echo (mGRE) sequence. 
T2* makes it possible to characterise magnetic field inhomogeneities associated with local concentrations of paramagnetic molecules, such as iron. However, other factors contribute to the faster exponential decay, namely contributions from inhomogeneous external magnetic fields. Since these contributions carry no information about the physiology or pathophysiology of the tissue, distortions associated with inhomogeneous magnetic fields must be corrected before any processing method is applied. This work studies the influence of two methods for correcting magnetic field inhomogeneities, namely sinc correction and the voxel spread function (VSF). The sinc correction method considers the dephasing caused by the interval between slices and estimates a sinc modulation of the signal, associated with macroscopic distortions of the magnetic field, so that the signal can subsequently be reconstructed. The second method, VSF, is a mathematical algorithm that corrects the effects of the inhomogeneous magnetic field in the phase- and frequency-encoding directions. To study the influence of noise reduction, two methods were applied, namely Gaussian filtering and noise reduction based on principal component analysis (PCA). Typical qMRI data analysis assumes a single compartment per voxel, which is often too simplistic and can lead to biased estimates. In reality, a voxel may consist of multiple compartments, each with its own T2* relaxation time. Therefore, this work compares the quantitative results of a model describing a single compartment with those of a model describing multiple compartments. 
To this end, the non-negative least squares (NNLS) algorithm is used, with the advantage that it requires no prior knowledge of the number of compartments (that is, the number of T2* exponentials). However, multi-exponential relaxometry is an ill-conditioned problem in which several solutions are possible, and it is therefore extremely sensitive to noise. To overcome these difficulties and stabilise the solutions, a regularised version of the NNLS algorithm is used, which additionally requires a constraint on the regularisation to be used. To find the best range of regularisation parameters, the L-curve method was implemented. With this method, the influence of logarithmic and linear scales on the T2* solution range and on the regularisation range was also studied. Regarding T2* values, significant differences were found in the tumour regions between the mean T2* obtained with a mono-exponential approach (~58 ms) and the geometric mean T2* obtained with a multi-exponential approach (~74 ms), for all patients. High heterogeneity in T2* values is also demonstrated, not only between patients but also within the same tumour, which led to an individual study of each patient. Significant differences are reported between regions of active tumour tissue and control regions located on the side contralateral to the tumour regions (apparently healthy). In general, the proposed denoising methods showed substantial noise reductions in the images, whereas neither method for correcting magnetic field inhomogeneities proved robust; in the particular case of VSF, since good results have been published for 3D experiments, the problem may be associated with poor optimisation for 2D experiments. Water content was found to be highly dependent on the correction method used. 
Water content in tumour regions was found to be close to that found in grey matter regions, and higher than that found in oedema regions. In addition, water content in white matter showed lower values than the other regions. Finally, the relationship between water content and T2* values was studied, although no correlation was evident in the different regions of interest. This work concluded that the quantitative interpretation of T2* relaxation in brain tumours is a complicated task that demands highly accurate methods. However, owing to the heterogeneity found not only between different tumours but also within the same patient, it may be a non-invasive tool with the potential to monitor, evaluate and classify tumours.
Quantitative magnetic resonance imaging (qMRI) allows for the estimation of scanner-independent, tissue-specific parameters. By modelling the relaxation curves, qMRI can help inform on small pathological changes in the brain tissue that might not be visible in the weighted images typically used in the clinic. However, the use of this imaging technique is often limited in the clinic by the prolonged measurement times and the demand for very accurate estimation methods. To improve accuracy, noise reduction and field-inhomogeneity correction methods are of paramount importance. Additionally, the typical analysis of qMRI data assumes a single compartment per voxel, which is often oversimplified and can lead to biased estimations. This thesis addresses the effects of denoising, field-inhomogeneity correction and single-compartment vs multiple-compartment analysis in the estimation of the qMRI parameters water content and T2*. We explore these effects in the WM, GM, CSF, tumour and oedema regions of a cohort of 33 brain tumour patients. 
The images were acquired using a multiple-echo gradient-echo sequence in a hybrid MR-PET system, which allows for the identification of active tumour tissue. To survey the effects of noise reduction in the estimation of the aforementioned qMRI parameters, we apply two denoising methods, namely Gaussian filtering and principal component analysis (PCA). Furthermore, the effects of two field-inhomogeneity correction methods, in particular sinc correction and voxel spread function (VSF), are also investigated. Finally, we compare mono-exponential models to multi-exponential models and the corresponding T2* values of the different regions. In the case of the multi-exponential model, no a priori assumptions about the number of exponential components are made. Regarding T2*, large differences are found in the tumour region between the mean T2* obtained with a mono-exponential approach (~58 ms) and the geometric mean T2* obtained with a multi-exponential approach (~74 ms), across the different patients. High heterogeneity in the T2* values is found for different tumour types, as well as inside the active tumour tissue within the same patient, which led to an individual study of each patient. Significant differences are found between the T2* distributions within distinct regions of the active tumour tissue and the corresponding contralateral regions. Water content was found to be highly dependent on the correction method used. Overall, water content in the tumour is found to be close to that of GM, and higher than that of oedema. Water content in WM is lower than that of the other tissue classes. Finally, water content and geometric mean T2* values do not display significant correlations in any of the tissue classes investigated, thus offering a complementary view of the properties of tissue. 
We conclude that a quantitative interpretation of T2* relaxation in brain tumours is a very challenging task, but due to the heterogeneity found not only across the cohort but also within the active tumour tissue for each patient, it might be a potential non-invasive tool in monitoring, evaluation and grading of tumours.
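
The two T2* summaries compared in this abstract can be illustrated with a small sketch. It is deliberately simplified and rests on assumptions not found in the thesis: the mono-exponential fit is done by log-linear least squares, the multi-exponential distribution is simply given rather than estimated (the thesis uses regularised NNLS for that step, which is not reproduced here), and the function names and toy numbers are hypothetical.

```python
import math

def fit_mono_exponential(te_ms, signal):
    """Log-linear least-squares fit of S(TE) = S0 * exp(-TE / T2*).

    Returns (S0, T2* in ms). Assumes strictly positive signal values.
    """
    ys = [math.log(s) for s in signal]
    n = len(te_ms)
    mx = sum(te_ms) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in te_ms)
    sxy = sum((x - mx) * (y - my) for x, y in zip(te_ms, ys))
    slope = sxy / sxx                 # equals -1 / T2*
    s0 = math.exp(my - slope * mx)
    return s0, -1.0 / slope

def geometric_mean_t2star(t2_grid_ms, amplitudes):
    """Amplitude-weighted geometric mean of a T2* distribution."""
    total = sum(amplitudes)
    log_mean = sum(a * math.log(t) for t, a in zip(t2_grid_ms, amplitudes)) / total
    return math.exp(log_mean)

# Hypothetical voxel with two equally weighted compartments (20 ms and 80 ms).
te_ms = [5.0, 10.0, 15.0, 20.0, 25.0]
signal = [50.0 * math.exp(-t / 20.0) + 50.0 * math.exp(-t / 80.0) for t in te_ms]
s0, t2_mono = fit_mono_exponential(te_ms, signal)
t2_geo = geometric_mean_t2star([20.0, 80.0], [50.0, 50.0])
```

On a voxel with more than one compartment the single-exponential fit and the geometric mean of the distribution generally disagree, which is the kind of gap the abstract reports between the two approaches.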

    Speech Enhancement Exploiting the Source-Filter Model

    Imagining everyday life without mobile telephony is nowadays hardly possible. Calls are being made in every conceivable situation and environment. Hence, the microphone will not only pick up the user’s speech but also sound from the surroundings, which is likely to impede the understanding of the conversational partner. Modern speech enhancement systems are able to mitigate such effects, and most users are not even aware of their existence. In this thesis the development of a modern single-channel speech enhancement approach is presented, which uses the divide-and-conquer principle to combat environmental noise in microphone signals. Though initially motivated by mobile telephony applications, this approach can be applied whenever speech is to be retrieved from a corrupted signal. The approach uses the so-called source-filter model to divide the problem into two subproblems, which are then conquered by enhancing the source (the excitation signal) and the filter (the spectral envelope) separately. Both enhanced signals are then used to denoise the corrupted signal. The estimation of spectral envelopes has quite some history, and some approaches already exist for speech enhancement. However, they typically neglect the excitation signal, which makes it impossible to enhance the fine structure properly. Both individual enhancement approaches exploit benefits of the cepstral domain, which offers, e.g., advantageous mathematical properties and straightforward synthesis of excitation-like signals. We investigate traditional model-based schemes like Gaussian mixture models (GMMs), classical signal-processing-based approaches, as well as modern deep neural network (DNN)-based approaches in this thesis. The enhanced signals are not used directly to enhance the corrupted signal (e.g., to synthesize a clean speech signal) but as a so-called a priori signal-to-noise ratio (SNR) estimate in a traditional statistical speech enhancement system. 
Such a traditional system consists of a noise power estimator, an a priori SNR estimator, and a spectral weighting rule that is usually driven by the results of the aforementioned estimators and subsequently employed to retrieve the clean speech estimate from the noisy observation. As a result, the new approach obtains significantly higher noise attenuation compared to current state-of-the-art systems while maintaining a quite comparable speech component quality and speech intelligibility. In consequence, the overall quality of the enhanced speech signal turns out to be superior to that of state-of-the-art speech enhancement approaches.
Mobile telephony has become an indispensable part of modern life. Calls are made in all kinds of situations and places, and the microphone picks up not only the user's speech but also ambient noise, which can strongly affect how well the conversational partner understands. Modern systems can counteract such effects with speech enhancement algorithms, and many users are not even aware that these algorithms exist. This thesis presents the development of a single-channel speech enhancement system. The approach relies on the divide-and-conquer method to filter disturbing ambient noise out of microphone signals. This method can be applied in all cases where speech is to be extracted from noisy signals. The approach uses the source-filter model to split the original problem into two subproblems, which are then solved by enhancing the source (the excitation signal) and the filter (the spectral envelope) separately. The enhanced signals are used jointly to denoise the corrupted microphone signal. The estimation of spectral envelopes has been researched in the past and in part also applied to speech enhancement. Typically, however, the excitation signal is neglected, so that the spectral fine structure of the microphone signal cannot be enhanced. Both approaches exploit the cepstral domain, which offers, among other things, advantageous mathematical properties as well as the possibility of generating prototypes of an excitation signal. We investigate model-based approaches such as Gaussian mixture models, classical signal-processing-based solutions, and modern deep neural networks in this thesis. The signals enhanced in this way are not used directly for speech signal enhancement (e.g., speech synthesis) but as a so-called a priori signal-to-noise ratio estimate in a traditional statistical speech enhancement system. This consists of a noise power estimator, an a priori signal-to-noise ratio estimator, and a spectral weighting rule that is usually computed using the results of the two estimators. Finally, an estimate of the clean speech signal is obtained from the microphone recording. The new approach offers significantly higher attenuation of the interfering noise than the previous state of the art, while ensuring comparable quality of the speech component and comparable speech intelligibility. The overall quality of the enhanced speech signal was thus improved over the state of the art.
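
The traditional statistical framework described above (noise power estimate, a priori SNR estimate, spectral weighting rule) can be sketched for a single frequency bin as follows. The specific choices here are assumptions for illustration and are not stated in the abstract: a known, constant noise power, the classic decision-directed a priori SNR estimator, the Wiener weighting rule, and the parameter values `alpha` and `xi_min`. In the thesis, the a priori SNR estimate is instead derived from the separately enhanced excitation and envelope.

```python
def wiener_gain(xi):
    """Wiener spectral weighting rule: gain as a function of the a priori SNR xi."""
    return xi / (1.0 + xi)

def enhance_bin(noisy_power, noise_power, alpha=0.98, xi_min=10 ** (-15 / 10)):
    """Process one frequency bin over successive frames.

    noisy_power: |Y|^2 per frame; noise_power: assumed-known noise power.
    Returns a list of (gain, estimated clean-speech power) per frame.
    """
    out = []
    prev_clean_power = 0.0
    for y2 in noisy_power:
        gamma = y2 / noise_power                      # a posteriori SNR
        # Decision-directed estimate: mix the previous frame's clean-speech
        # estimate with the instantaneous (gamma - 1) term.
        xi = (alpha * prev_clean_power / noise_power
              + (1.0 - alpha) * max(gamma - 1.0, 0.0))
        xi = max(xi, xi_min)                          # floor limits musical noise
        gain = wiener_gain(xi)
        prev_clean_power = (gain ** 2) * y2
        out.append((gain, prev_clean_power))
    return out

# Toy input: five noise-only frames followed by five frames 20 dB above the noise.
frames = [1.0] * 5 + [100.0] * 5
result = enhance_bin(frames, noise_power=1.0)
gains = [g for g, _ in result]
```

In this layout the a priori SNR estimator is a swappable component, which is what lets a better estimate raise noise attenuation without changing the weighting rule.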

    Recent Advances in Signal Processing

    The signal processing task is a very critical issue in the majority of new technological inventions and challenges in a variety of applications in both science and engineering fields. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward both students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity

    Neuroimaging in Paediatric Tuberous Sclerosis Complex

    This thesis explores novel aspects of the neuroradiological features of tuberous sclerosis complex (TSC), including the relationship between these features and subsequent epilepsy and developmental outcome in later childhood. The neuroradiological features of TSC include cortical tubers, subependymal nodules (SEN) and subependymal giant cell astrocytoma (SEGA). Patients with TSC have high rates of refractory epilepsy, developmental delay and Autism Spectrum Disorder (ASD), but are affected to varying degrees, leading to a wide phenotypic spectrum. Prior studies focussing on TSC neuroradiological markers have found some associations between lesion burden and severity of neurological dysfunction, but the literature remains inconclusive. Literature regarding congenital SEGA is limited but suggests that they grow more aggressively than other SEGA. The first study in this thesis addresses this literature gap through a case series of ten congenital SEGA at a single TSC referral centre. Using serial MRI, our study found a median congenital SEGA growth rate of 1.1 mm/yr in diameter or 150 mm³/yr volumetrically, which is lower than previous reports. Congenital SEGA with volume >500 mm³ had significantly higher growth rates compared with smaller SEGA. Children with congenital SEGA tended to have more severe epilepsy, developmental disability and ASD compared to genotype-matched controls, suggesting that congenital SEGA may be a biomarker for poor neurological outcome, which is a novel finding. The second study explores whether neuroradiological features of TSC on early MRI are biomarkers for neurological and developmental outcomes. Thirty-nine MRIs acquired between 18 and 36 months of age from children with TSC were scored blindly. We found that children with drug-resistant epilepsy (DRE) had more tubers and SEN compared to children without DRE. More frontal tubers were found in children with impaired adaptive function and children with ASD. 
Therefore, tuber count and frontal tuber location on early MRI may contribute to predicting DRE, adaptive function and ASD at 5 years of age. In conclusion, this thesis determines that neuroradiological markers have an important role in understanding TSC patients’ neurological and developmental outcome at 5 years of age. Early identification of patients at risk for poor outcomes would open opportunities for early intervention and more aggressive epilepsy treatment with the aim of optimising neurological and developmental outcome.