28 research outputs found

    FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis

    Get PDF
    Denoising diffusion probabilistic models (DDPMs) have recently achieved leading performances in many generative tasks. However, the inherited iterative sampling process costs hindered their applications to speech synthesis. This paper proposes FastDiff, a fast conditional diffusion model for high-quality speech synthesis. FastDiff employs a stack of time-aware location-variable convolutions of diverse receptive field patterns to efficiently model long-term time dependencies with adaptive conditions. A noise schedule predictor is also adopted to reduce the sampling steps without sacrificing the generation quality. Based on FastDiff, we design an end-to-end text-to-speech synthesizer, FastDiff-TTS, which generates high-fidelity speech waveforms without any intermediate feature (e.g., Mel-spectrogram). Our evaluation of FastDiff demonstrates the state-of-the-art results with higher-quality (MOS 4.28) speech samples. Also, FastDiff enables a sampling speed of 58x faster than real-time on a V100 GPU, making diffusion models practically applicable to speech synthesis deployment for the first time. We further show that FastDiff generalized well to the mel-spectrogram inversion of unseen speakers, and FastDiff-TTS outperformed other competing methods in end-to-end text-to-speech synthesis. Audio samples are available at \url{https://FastDiff.github.io/}.Comment: Accepted by IJCAI 202

    Metric Optimization for Surface Analysis in the Laplace-Beltrami Embedding Space

    Full text link
    In this paper we present a novel approach for the intrinsic mapping of anatomical surfaces and its application in brain mapping research. Using the Laplace-Beltrami eigen-system, we represent each surface with an isometry invariant embedding in a high dimensional space. The key idea in our system is that we realize surface deformation in the embedding space via the iterative optimization of a conformal metric without explicitly perturbing the surface or its embedding. By minimizing a distance measure in the embedding space with metric optimization, our method generates a conformal map directly between surfaces with highly uniform metric distortion and the ability of aligning salient geometric features. Besides pairwise surface maps, we also extend the metric optimization approach for group-wise atlas construction and multi-atlas cortical label fusion. In experimental results, we demonstrate the robustness and generality of our method by applying it to map both cortical and hippocampal surfaces in population studies. For cortical labeling, our method achieves excellent performance in a cross-validation experiment with 40 manually labeled surfaces, and successfully models localized brain development in a pediatric study of 80 subjects. For hippocampal mapping, our method produces much more significant results than two popular tools on a multiple sclerosis study of 109 subjects

    A novel selection of optimal statistical features in the DWPT domain for discrimination of ictal and seizure-free electroencephalography signals

    Get PDF
    Properly determining the discriminative features which characterize the inherent behaviors of electroencephalography (EEG) signals remains a great challenge for epileptic seizure detection. In this present study, a novel feature selection scheme based on the discrete wavelet packet decomposition and cuckoo search algorithm (CSA) was proposed. The normal as well as epileptic EEG recordings were frst decomposed into various frequency bands by means of wavelet packet decomposition, and subsequently, statistical features at all developed nodes in the wavelet packet decomposition tree were derived. Instead of using the complete set of the extracted features to construct a wavelet neural networks-based classifer, an optimal feature subset that maximizes the predictive competence of the classifer was selected by using the CSA. Experimental results on the publicly available benchmarks demonstrated that the proposed feature subset selection scheme achieved promising recognition accuracies of 98.43–100%, and the results were statistically signifcant using z-test with p value <0.0001

    Machine Learning Applications in Head and Neck Radiation Oncology: Lessons From Open-Source Radiomics Challenges

    Get PDF
    Radiomics leverages existing image datasets to provide non-visible data extraction via image post-processing, with the aim of identifying prognostic, and predictive imaging features at a sub-region of interest level. However, the application of radiomics is hampered by several challenges such as lack of image acquisition/analysis method standardization, impeding generalizability. As of yet, radiomics remains intriguing, but not clinically validated. We aimed to test the feasibility of a non-custom-constructed platform for disseminating existing large, standardized databases across institutions for promoting radiomics studies. Hence, University of Texas MD Anderson Cancer Center organized two public radiomics challenges in head and neck radiation oncology domain. This was done in conjunction with MICCAI 2016 satellite symposium using Kaggle-in-Class, a machine-learning and predictive analytics platform. We drew on clinical data matched to radiomics data derived from diagnostic contrast-enhanced computed tomography (CECT) images in a dataset of 315 patients with oropharyngeal cancer. Contestants were tasked to develop models for (i) classifying patients according to their human papillomavirus status, or (ii) predicting local tumor recurrence, following radiotherapy. Data were split into training, and test sets. Seventeen teams from various professional domains participated in one or both of the challenges. This review paper was based on the contestants' feedback; provided by 8 contestants only (47%). Six contestants (75%) incorporated extracted radiomics features into their predictive model building, either alone (n = 5; 62.5%), as was the case with the winner of the “HPV” challenge, or in conjunction with matched clinical attributes (n = 2; 25%). Only 23% of contestants, notably, including the winner of the “local recurrence” challenge, built their model relying solely on clinical data. In addition to the value of the integration of machine learning into clinical decision-making, our experience sheds light on challenges in sharing and directing existing datasets toward clinical applications of radiomics, including hyper-dimensionality of the clinical/imaging data attributes. Our experience may help guide researchers to create a framework for sharing and reuse of already published data that we believe will ultimately accelerate the pace of clinical applications of radiomics; both in challenge or clinical settings

    A.W.: Anisotropic Laplace-Beltrami eigenmaps: Bridging Reeb graphs and skeletons

    No full text
    In this paper we propose a novel approach of computing skeletons of robust topology for simply connected surfaces with boundary by constructing Reeb graphs from the eigenfunctions of an anisotropic Laplace-Beltrami operator. Our work brings together the idea of Reeb graphs and skeletons by incorporating a flux-based weight function into the Laplace-Beltrami operator. Based on the intrinsic geometry of the surface, the resulting Reeb graph is pose independent and captures the global profile of surface geometry. Our algorithm is very efficient and it only takes several seconds to compute on neuroanatomical structures such as the cingulate gyrus and corpus callosum. In our experiments, we show that the Reeb graphs serve well as an approximate skeleton with consistent topology while following the main body of conventional skeletons quite accurately. 1

    Metric optimization for surface analysis in the Laplace-Beltrami embedding space.

    No full text
    In this paper, we present a novel approach for the intrinsic mapping of anatomical surfaces and its application in brain mapping research. Using the Laplace-Beltrami eigen-system, we represent each surface with an isometry invariant embedding in a high dimensional space. The key idea in our system is that we realize surface deformation in the embedding space via the iterative optimization of a conformal metric without explicitly perturbing the surface or its embedding. By minimizing a distance measure in the embedding space with metric optimization, our method generates a conformal map directly between surfaces with highly uniform metric distortion and the ability of aligning salient geometric features. Besides pairwise surface maps, we also extend the metric optimization approach for group-wise atlas construction and multi-atlas cortical label fusion. In experimental results, we demonstrate the robustness and generality of our method by applying it to map both cortical and hippocampal surfaces in population studies. For cortical labeling, our method achieves excellent performance in a cross-validation experiment with 40 manually labeled surfaces, and successfully models localized brain development in a pediatric study of 80 subjects. For hippocampal mapping, our method produces much more significant results than two popular tools on a multiple sclerosis study of 109 subjects
    corecore