2,776 research outputs found

    Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data

    Creating synthetic voices with found data is challenging, as real-world recordings often contain various types of audio degradation. One way to address this problem is to pre-enhance the speech with an enhancement model and then use the enhanced data for text-to-speech (TTS) model training. This paper investigates the use of conditional diffusion models for generalized speech enhancement, which aims to address multiple types of audio degradation simultaneously. The enhancement is performed in the log Mel-spectrogram domain to align with the TTS training objective. Text information is introduced as an additional condition to improve model robustness. Experiments on real-world recordings demonstrate that a synthetic voice built on data enhanced by the proposed model produces higher-quality synthetic speech than voices trained on data enhanced by strong baselines. Code and pre-trained parameters of the proposed enhancement model are available at https://github.com/dmse4tts/DMSE4TTS

    Synthesis of 7-dehydrocholesterol through hexacarbonyl molybdenum catalyzed elimination reaction

    The efficiency of the hexacarbonyl molybdenum-catalyzed elimination reaction of allylic acetates has been improved by the presence of O,N-bis(trimethylsilyl)acetamide (BSA) in the reaction medium. The methodology is particularly well suited to the elimination of 7-acetoxycholesterol-3-acetate (cholesterol-3,7-diacetate), for which the product obtained was exclusively the 5,7-homoannular diene (7-dehydrocholesterol-3-acetate). Good yields (up to 70%) are achieved while decreasing side-product formation and reducing costs compared with previously used procedures. The reaction is strongly influenced by temperature: at both low and high temperatures the homoannular diene is isolated in low yield, whereas moderate temperatures give high product formation. KEY WORDS: Hexacarbonyl molybdenum, Elimination, Deacetoxylation, 7-Dehydrocholesterol, BSA. Bull. Chem. Soc. Ethiop. 2011, 25(2), 247-254

    Reap success from persistence

    The road to success is long and arduous. Almost all Nobel Prize laureates made tremendous efforts and endured countless failures before achieving their scientific breakthroughs. Hypothesis-driven, independent and critical thinking, passion, repeated experiments, repeated failures, and running in circles through the entire scientific process finally proved their hypotheses.

    An ASR-free Fluency Scoring Approach with Self-Supervised Learning

    A typical fluency scoring system relies on an automatic speech recognition (ASR) system to obtain time stamps in the input speech, either for the subsequent calculation of fluency-related features or for directly modeling speech fluency with an end-to-end approach. This paper describes a novel ASR-free approach for automatic fluency assessment using self-supervised learning (SSL). Specifically, wav2vec 2.0 is used to extract frame-level speech features, followed by K-means clustering to assign a pseudo label (cluster index) to each frame. A BLSTM-based model is trained to predict an utterance-level fluency score from the frame-level SSL features and the corresponding cluster indexes. Neither speech transcription nor time stamp information is required in the proposed system. It is ASR-free and can potentially avoid the effect of ASR errors in practice. Experimental results on non-native English databases show that the proposed approach significantly improves performance in the "open response" scenario compared to previous methods and matches the recently reported performance in the "read aloud" scenario. Comment: Accepted by ICASSP 202
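
The pseudo-labelling step described in this abstract can be illustrated with a minimal sketch. It assumes frame-level features are already available as a NumPy array; random vectors stand in for wav2vec 2.0 outputs here, and the function name `kmeans_labels`, the cluster count, and the feature dimension are hypothetical choices, not the authors' settings. The BLSTM scorer is not shown.

```python
import numpy as np

def kmeans_labels(frames, k, n_iter=20, seed=0):
    """Plain Lloyd's K-means: returns one cluster index per frame.

    frames: (num_frames, dim) array of frame-level features.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct frames.
    centroids = frames[rng.choice(len(frames), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Distance from every frame to every centroid.
        dists = np.linalg.norm(frames[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned frames.
        for j in range(k):
            members = frames[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    # Final assignment against the updated centroids.
    dists = np.linalg.norm(frames[:, None, :] - centroids[None, :, :], axis=-1)
    return dists.argmin(axis=1), centroids

# Stand-in features: two well-separated groups of 50 frames each (assumption).
rng = np.random.default_rng(1)
frames = np.concatenate([
    rng.normal(0.0, 0.1, size=(50, 4)),
    rng.normal(5.0, 0.1, size=(50, 4)),
])
labels, centroids = kmeans_labels(frames, k=2)
```

In a setup like the one the abstract describes, each frame's cluster index would then be passed to the downstream scorer alongside the frame's feature vector.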

    Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring

    Recent studies on pronunciation scoring have explored the effect of introducing phone embeddings as reference pronunciation, but mostly in an implicit manner, i.e., by adding or concatenating the reference phone embedding and the actual pronunciation of the target phone to form the phone-level pronunciation quality representation. In this paper, we propose to use linguistic-acoustic similarity to explicitly measure the deviation of non-native production from its native reference for pronunciation assessment. Specifically, the deviation is first estimated by the cosine similarity between the reference phone embedding and the corresponding acoustic embedding. Next, a phone-level goodness of pronunciation (GOP) pre-training stage is introduced to guide this similarity-based learning for better initialization of the two embeddings. Finally, a transformer-based hierarchical pronunciation scorer maps a sequence of phone embeddings and acoustic embeddings, along with their similarity measures, to the final utterance-level score. Experimental results on non-native databases suggest that the proposed system significantly outperforms the baselines, in which the acoustic and phone embeddings are simply added or concatenated. A further examination shows that the phone embeddings learned in the proposed approach are able to capture linguistic-acoustic attributes of native pronunciation as reference. Comment: Accepted by ICASSP 202
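
The similarity measure named in this abstract is ordinary cosine similarity, which can be sketched as follows. The toy embeddings, their dimension, and the function name `cosine_similarity` are illustrative assumptions; the abstract does not specify how the embeddings are produced or sized.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-8):
    """Row-wise cosine similarity between two (num_phones, dim) arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    num = np.sum(a * b, axis=-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    # eps guards against division by zero for all-zero vectors.
    return num / np.maximum(den, eps)

# Toy reference phone embeddings and acoustic embeddings (made-up values).
phone_emb = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
acoustic_emb = np.array([[2.0, 0.0], [3.0, 0.0], [-1.0, -1.0]])

# One similarity per phone: aligned, orthogonal, and opposite directions.
sims = cosine_similarity(phone_emb, acoustic_emb)
```

A value near 1 indicates that the acoustic embedding points in the same direction as its reference phone embedding; lower values indicate larger deviation.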

    Food protein-stabilized nanoemulsions as potential delivery systems for poorly water-soluble drugs: preparation, in vitro characterization, and pharmacokinetics in rats

    Nanoemulsions stabilized by traditional emulsifiers raise toxicological concerns for long-term treatment. The present work investigates the potential of food proteins as safer stabilizers for nanoemulsions to deliver hydrophobic drugs. Nanoemulsions stabilized by food proteins (soybean protein isolate, whey protein isolate, β-lactoglobulin) were prepared by high-pressure homogenization. The toxicity of the nanoemulsions was tested in Caco-2 cells using the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) viability assay. In vivo absorption in rats was also evaluated. Food protein-stabilized nanoemulsions, with small particle size and good size distribution, exhibited better stability and biocompatibility than nanoemulsions stabilized by traditional emulsifiers. Moreover, β-lactoglobulin had better emulsifying capacity and biocompatibility than the other two food proteins. The pancreatic degradation of the proteins accelerated drug release. It is concluded that an oil/water nanoemulsion system with good biocompatibility can be prepared by using food proteins as emulsifiers, allowing better and more rapid absorption of lipophilic drugs.

    A mobile prototype-based localization approach using inertial navigation and acoustic tracking for underwater

    During underwater operations, divers must determine their own trajectories using the Inertial Navigation System (INS) they carry to improve operational efficiency. However, the INS has a sensor bias that is carried through the double integration used to obtain displacement, so the divers' estimated trajectory drifts during prolonged self-guidance. To overcome this problem, other aids are needed to correct the accumulated INS error. The single-beacon Assisted Inertial Navigation (AIN) method improves the flexibility of inertial error correction while simplifying the localization equipment, which makes it suitable for correcting the cumulative INS error of divers. However, most traditional single-beacon assisted correction methods consider neither the effect of acoustic line bending on hydroacoustic ranging nor the singular or ill-conditioned coefficient matrices introduced by deviations in inertial navigation neighbor localization. To address these two shortcomings, this paper uses the acoustic velocity profile for acoustic line tracking, adopts the localization idea of Mobile Primitives (MP), and proposes an MP-based acoustic line tracking-Assisted Inertial Navigation Localization (AINL) method. The method constructs a sliding time window (STW) by treating the divers' historical positions as virtual primitives and applies nonlinear optimization for iterative search, improving the accuracy and stability of the divers' self-navigation.
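
The drift mechanism mentioned in this abstract, a sensor bias that grows once acceleration is integrated twice into displacement, can be illustrated with a small numerical sketch. The bias magnitude, sample interval, and duration below are hypothetical values chosen only for illustration, and the diver is assumed stationary so that all measured acceleration is bias.

```python
import numpy as np

def integrate_displacement(accel, dt):
    """Integrate acceleration samples twice (rectangle rule) to get displacement."""
    velocity = np.cumsum(accel) * dt
    displacement = np.cumsum(velocity) * dt
    return displacement

bias = 0.01      # constant accelerometer bias in m/s^2 (hypothetical)
dt = 0.01        # sample interval in s (hypothetical)
t_end = 100.0    # integration time in s (hypothetical)
n = int(round(t_end / dt))

# True acceleration is zero, so the measurement is the bias alone.
measured_accel = np.full(n, bias)
drift = integrate_displacement(measured_accel, dt)

# Closed-form drift for a constant bias b after time t: 0.5 * b * t**2.
analytic_drift = 0.5 * bias * t_end ** 2
```

The numerical drift tracks the closed-form value and grows quadratically with time, which is why an external correction source such as a ranging beacon is needed for prolonged dives.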

    Electric field-induced transformations in bismuth sodium titanate-based materials

    Electric field-induced transformations occur in a myriad of systems with a variegated phenomenology and have attracted widespread scientific interest due to their importance in many applications. The present review focuses on the electric field-induced transformations occurring in bismuth sodium titanate (BNT)-based materials, which are considered an important family of lead-free perovskites and represent possible alternatives to lead-based compounds for several applications. BNT-based systems are generally classified as relaxor ferroelectrics and are characterized by complex structures undergoing various electric field-driven phenomena. In this review, changes in crystal structure symmetry, domain configuration and macroscopic properties are discussed in relation to composition, temperature and electrical loading characteristics, including amplitude, frequency and DC biases. The coupling mechanisms of octahedral tilting with polarization and strain, and other microstructural features, are identified as important factors mediating the local and overall electric field-induced response. The role of field-induced transformations in electrical fatigue is discussed by highlighting the effects of ergodicity on domain evolution and fatigue resistance in bipolar and unipolar cycles. The relevance of field-induced transformations in key applications, including energy storage capacitors, actuators, electrocaloric systems and photoluminescent devices, is comprehensively discussed to identify materials design criteria. The review concludes with an outlook for future research.