
    A combined statistical and machine learning approach for single channel speech enhancement

    University of Minnesota Ph.D. dissertation. May 2015. Major: Electrical Engineering. Advisor: Zhi-Quan Luo. 1 computer file (PDF); ix, 116 pages. In this thesis, we study the single-channel speech enhancement problem, the goal of which is to recover a desired speech signal from a monaural noisy recording. Speech enhancement is a focal issue to study due to its widespread usage in speech-related applications, such as hearing aids, mobile communications, and speech recognition systems. Three speech enhancement algorithms are proposed. In the first algorithm, the Wiener Non-negative Matrix Factorization (WNMF), we combine traditional Wiener filtering and NMF into a single optimization problem. The objective is to minimize the mean square error, similar to Wiener filtering, and the constraints ensure that the enhanced speech is sparsely representable by the speech model learned by NMF. WNMF is novel because it utilizes NMF to capture the speech-specific structure while simultaneously leveraging it, thus improving the Wiener filtering. For the second algorithm, we propose a Sparse Gaussian Mixture Model (SGMM) that extends the traditional NMF and the Gaussian model. SGMM captures the complex structure of speech better than the traditional NMF. To control for overrepresentation of SGMM, we impose sparsity to ensure that only a few Gaussian models are simultaneously active. Computationally, this is achieved by using an l0-norm in the constraint of the maximum-likelihood (ML) estimation. The contribution of SGMM is in solving the constrained ML estimation, which has a closed-form update even with the non-convex and non-smooth l0-norm constraint. The final algorithm proposed is the Sparse NMF + Deep Neural Network (SNMF-DNN), in which we treat speech enhancement as a supervised regression problem whose goal is to estimate the optimal enhancement gain. SNMF, originally designed for source separation, is used to extract features from the noisy recording. A DNN is subsequently trained to estimate the optimal enhancement gain. Although our system is simple and does not require any sophisticated handcrafted features, we are able to demonstrate a substantial improvement in both intelligibility and enhanced speech quality.

    A circular elastic cylinder under its own weight

    An exact analysis of the deformation and stress field in a finite circular elastic cylinder under its own weight is presented, with emphasis on the end effect. The problem is formulated on the basis of the state space formalism for axisymmetric deformation of a transversely isotropic body. Upon delineating the Hamiltonian characteristics of the formulation, a rigorous solution which satisfies the end conditions is determined by using eigenfunction expansion. The results show that the end effect is significant but confined to a local region near the base, where the displacement and stress distributions are remarkably different from those according to the simplified solution that gives a uniaxial stress state. It is more pronounced in the cylinder with the bottom plane being perfectly bonded than in smooth contact with a rigid base.

    Integrin-mediated membrane blebbing is dependent on the NHE1 and NCX1 activities.

    Integrin-mediated signal transduction and membrane blebbing have been well studied as modulators of cell adhesion, spreading and migration^1-6^. However, the relationship between membrane blebbing and integrin signaling has not been explored. Here we show that integrin-ligand interaction induces membrane blebbing and a change in membrane permeability. We found that sodium-proton exchanger 1 (NHE1) and sodium-calcium exchanger 1 (NCX1) are located at the membrane blebbing sites, and that inhibition of NHE1 disrupts membrane blebbing and decreases the membrane permeability change. However, inhibition of NCX1 enhances cell blebbing and causes cell swelling, which is correlated with an intracellular sodium accumulation induced by NHE1^7^. These data suggest that sodium influx induced by NHE1 is a driving force for membrane bleb growth, while sodium efflux induced by NCX1 in reverse mode causes membrane bleb retraction. Together, these data reveal a novel function of NHE1 and NCX1 in membrane permeability change and blebbing, and provide the link between integrin signaling and membrane blebbing.

    BN-embedded monolayer graphene with tunable electronic and topological properties

    Finding an effective and controllable way to create a sizable energy gap in graphene-based systems has been a challenging topic of intensive research. We propose that the hybrid of boron nitride and graphene (h-BNC) at low BN doping serves as an ideal platform for band-gap engineering and valleytronic applications. We report a systematic first-principles study of the atomic configurations and band gap opening for energetically favorable BN patches embedded in graphene. Based on first-principles calculations, we construct a tight-binding model to simulate general doping configurations in large supercells. Unexpectedly, the calculations find a linear dependence of the band gap on the effective BN concentration at low doping, arising from an induced effective on-site energy difference at the two C sublattices as they are substituted by B and N dopants alternately. The significant and tunable band gap of a few hundred meV, with preserved topological properties of graphene and feasible sample preparation in the laboratory, presents great opportunities to realize valley physics applications in graphene systems at room temperature.

    An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks

    Speech representations learned from self-supervised learning (SSL) models have been found beneficial for various speech processing tasks. However, utilizing SSL representations usually requires fine-tuning the pre-trained models or designing task-specific downstream models and loss functions, which demands substantial memory and human labor. On the other hand, prompting in Natural Language Processing (NLP) is an efficient and widely used technique to leverage pre-trained language models (LMs). Nevertheless, such a paradigm is little studied in the speech community. We report in this paper the first exploration of the prompt tuning paradigm for speech processing tasks based on the Generative Spoken Language Model (GSLM). Experimental results show that the prompt tuning technique achieves competitive performance in speech classification tasks with fewer trainable parameters than fine-tuning specialized downstream models. We further study the technique in challenging sequence generation tasks. Prompt tuning also demonstrates its potential there, while the limitations and possible research directions are discussed in this paper. Comment: Submitted to Interspeech 202

    The Clinical COPD Questionnaire Correlated with BODE Index-A Cross-Sectional Study

    The Global initiative for Chronic Obstructive Lung Disease (GOLD) staging has been widely used to stratify the severity of COPD, while the BODE (body mass index, airflow obstruction, dyspnea, and exercise capacity) index has proven superior to FEV1 in predicting mortality, exacerbation and disease severity in patients with COPD. The Clinical COPD Questionnaire (CCQ), a questionnaire with ten items categorized into three domains (symptoms, functional state and mental state), was developed to measure the health status of COPD patients. However, little is known about the relationship between CCQ score and BODE index. We performed a prospective study of 89 patients who were clinically stable after 6 weeks of therapy for COPD symptoms, comparing their health status as assessed by CCQ, BODE index and GOLD staging. We found that the total CCQ score was correlated with BODE score (P < 0.001) and GOLD staging (P < 0.001); of the three CCQ domains, functional status correlated the most with BODE index (rS = 0.670) and GOLD staging (rS = 0.531), followed by symptoms (rS = 0.482; rS = 0.346, respectively) and mental status (rS = 0.340; rS = 0.236, respectively). Our data suggest that CCQ is a reliable and convenient alternative tool to evaluate the severity of COPD.