
    Outstanding supercapacitive properties of Mn-doped TiO2 micro/nanostructure porous film prepared by anodization method.

    Mn-doped TiO2 micro/nanostructure porous film was prepared by anodizing a Ti-Mn alloy. The film annealed at 300 °C yields the highest areal capacitance of 1451.3 mF/cm² at a current density of 3 mA/cm² when used as a high-performance supercapacitor electrode. Areal capacitance retention is 63.7% when the current density increases from 3 to 20 mA/cm², and the capacitance retention is 88.1% after 5,000 cycles. The superior areal capacitance of the porous film is derived from the brush-like metal substrate, which could greatly increase the contact area, improve the charge transport ability at the oxide layer/metal substrate interface, and thereby significantly enhance the electrochemical activities toward high performance energy storage. Additionally, the effects of manganese content and specific surface area of the porous film on the supercapacitive performance were also investigated in this work.
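
The rate-retention figure implies an areal capacitance at 20 mA/cm² that the abstract does not state directly. A minimal sketch of that arithmetic, assuming retention is defined as the ratio of the capacitance at the higher current density to the capacitance at 3 mA/cm² (the function name is illustrative and does not come from the paper):

```python
def retained_capacitance(initial_mf_cm2: float, retention_pct: float) -> float:
    """Return the capacitance left after applying a percentage retention."""
    if not 0.0 <= retention_pct <= 100.0:
        raise ValueError("retention_pct must lie between 0 and 100")
    return initial_mf_cm2 * retention_pct / 100.0


# Values quoted in the abstract.
c_at_3 = 1451.3        # areal capacitance in mF/cm^2 at 3 mA/cm^2
rate_retention = 63.7  # percent retained when going from 3 to 20 mA/cm^2

# Implied (not reported) areal capacitance at 20 mA/cm^2.
c_at_20 = retained_capacitance(c_at_3, rate_retention)
print(f"Implied capacitance at 20 mA/cm^2: {c_at_20:.1f} mF/cm^2")
```

Under that assumption the implied value is roughly 924 mF/cm²; the paper itself should be consulted for the measured figure.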

    UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization

    Dysarthric speech reconstruction (DSR) systems aim to automatically convert dysarthric speech into normal-sounding speech. The technology eases communication with speakers affected by the neuromotor disorder and enhances their social inclusion. NED-based (Neural Encoder-Decoder) systems have significantly improved the intelligibility of the reconstructed speech as compared with GAN-based (Generative Adversarial Network) approaches, but the approach is still limited by training inefficiency caused by the cascaded pipeline and auxiliary tasks of the content encoder, which may in turn affect the quality of reconstruction. Inspired by self-supervised speech representation learning and discrete speech units, we propose a Unit-DSR system, which harnesses the powerful domain-adaptation capacity of HuBERT for training efficiency improvement and utilizes speech units to constrain the dysarthric content restoration in a discrete linguistic space. Compared with NED approaches, the Unit-DSR system only consists of a speech unit normalizer and a Unit HiFi-GAN vocoder, which is considerably simpler without cascaded sub-modules or auxiliary tasks. Results on the UASpeech corpus indicate that Unit-DSR outperforms competitive baselines in terms of content restoration, reaching a 28.2% relative average word error rate reduction when compared to original dysarthric speech, and shows robustness against speed perturbation and noise. Comment: Accepted to ICASSP 202
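
A relative word error rate (WER) reduction is measured against the starting WER, which makes it different from an absolute reduction in percentage points. A small sketch of both quantities follows. The WER values are hypothetical, chosen only so that the relative figure comes out at 28.2%; they are not results from the paper.

```python
def absolute_wer_reduction(wer_before: float, wer_after: float) -> float:
    """Reduction in percentage points."""
    return wer_before - wer_after


def relative_wer_reduction(wer_before: float, wer_after: float) -> float:
    """Reduction as a percentage of the starting WER."""
    if wer_before <= 0:
        raise ValueError("wer_before must be positive")
    return (wer_before - wer_after) / wer_before * 100.0


# Hypothetical WERs in percent (NOT taken from the paper).
wer_dysarthric = 60.0
wer_reconstructed = 43.08

print(f"absolute: {absolute_wer_reduction(wer_dysarthric, wer_reconstructed):.2f} points")
print(f"relative: {relative_wer_reduction(wer_dysarthric, wer_reconstructed):.1f} %")
```

With these illustrative inputs the same improvement reads as 16.92 points in absolute terms and 28.2% in relative terms, so the two kinds of figures should not be compared directly.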

    Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction

    Dysarthric speech reconstruction (DSR) aims to transform dysarthric speech into normal speech by improving its intelligibility and naturalness. This is a challenging task, especially for patients with severe dysarthria and for those speaking in complex, noisy acoustic environments. To address these challenges, we propose a novel multi-modal framework that uses visual information, e.g., lip movements, in DSR as extra clues for reconstructing highly abnormal pronunciations. The multi-modal framework consists of: (i) a multi-modal encoder to extract robust phoneme embeddings from dysarthric speech with auxiliary visual features; (ii) a variance adaptor to infer the normal phoneme duration and pitch contour from the extracted phoneme embeddings; (iii) a speaker encoder to encode the speaker's voice characteristics; and (iv) a mel-decoder to generate the reconstructed mel-spectrogram based on the extracted phoneme embeddings, prosodic features and speaker embeddings. Both objective and subjective evaluations conducted on the commonly used UASpeech corpus show that our proposed approach can achieve significant improvements over baseline systems in terms of speech intelligibility and naturalness, especially for speakers with more severe symptoms. Compared with original dysarthric speech, the reconstructed speech achieves a 42.1% absolute word error rate reduction for patients with more severe dysarthria levels. Comment: Accepted by ICASSP 202

    Overall PSD and Fractal Characteristics of Tight Oil Reservoirs: A Case Study of Lucaogou Formation in Junggar Basin, China

    The Lucaogou tight oil reservoir, located in the Junggar Basin in northwestern China, is a typical tight oil reservoir. Complex lithology leads to a wide pore size distribution (PSD), ranging from several nanometers to hundreds of micrometers. To better understand the PSD and fractal features of the Lucaogou tight oil reservoir, experiments including scanning electron microscopy (SEM), rate-controlled mercury injection (RMI) and pressure-controlled mercury injection (PMI) were performed on six samples of different lithology. The results indicate that four types of pores exist in the Lucaogou tight oil reservoir: dissolution pores, clay-dominated pores, microfractures and inter-granular pores. A combination of PMI and RMI was proposed to calculate the overall PSD of tight oil reservoirs; the overall pore radius of the Lucaogou tight oil reservoir ranges from 3.6 nm to 500 µm. The fractal analysis was carried out based on the PMI data. Fractal dimension (Fd) values varied between 2.843 and 2.913 with a mean value of 2.88. Fd increases with decreasing quartz content and increasing clay mineral content. Samples from tight oil reservoirs with a smaller average pore radius have a more complex pore structure. Fractal dimension shows negative correlations with porosity and permeability. In addition, the fractal characteristics of different tight reservoirs were compared and analyzed.

    StyleSpeech: Self-supervised Style Enhancing with VQ-VAE-based Pre-training for Expressive Audiobook Speech Synthesis

    The expressive quality of synthesized speech for audiobooks is limited by generalized model architecture and unbalanced style distribution in the training data. To address these issues, in this paper, we propose a self-supervised style enhancing method with VQ-VAE-based pre-training for expressive audiobook speech synthesis. Firstly, a text style encoder is pre-trained with a large amount of unlabeled text-only data. Secondly, a spectrogram style extractor based on VQ-VAE is pre-trained in a self-supervised manner, with plenty of audio data that covers complex style variations. Then a novel architecture with two encoder-decoder paths is specially designed to model the pronunciation and high-level style expressiveness respectively, with the guidance of the style extractor. Both objective and subjective evaluations demonstrate that our proposed method can effectively improve the naturalness and expressiveness of the synthesized speech in audiobook synthesis, especially for the role and out-of-domain scenarios. Comment: Accepted to ICASSP 202

    The Efficacy of Chinese Herbal Medicine as an Adjunctive Therapy for Advanced Non-small Cell Lung Cancer: A Systematic Review and Meta-analysis

    Many published studies reflect the growing application of complementary and alternative medicine, particularly the use of Chinese herbal medicine (CHM) in combination with conventional cancer therapy for advanced non-small cell lung cancer (NSCLC), but its efficacy remains largely unexplored. The purpose of this study is to evaluate the efficacy of CHM combined with conventional chemotherapy (CT) in the treatment of advanced NSCLC. Publications in 11 electronic databases were extensively searched, and 24 trials were included for analysis. A total of 2,109 patients were enrolled in these studies, of whom 1,064 received CT combined with CHM and 1,039 received CT alone (six patients dropped out, and their group assignment was not reported). Compared to CT alone, CHM combined with CT significantly increased the one-year survival rate (RR = 1.36, 95% CI = 1.15-1.60, p = 0.0003). In addition, the combined therapy significantly increased immediate tumor response (RR = 1.36, 95% CI = 1.19-1.56, p < 1.0E-5) and improved the Karnofsky performance score (KPS) (RR = 2.90, 95% CI = 1.62-5.18, p = 0.0003). Combined therapy markedly reduced nausea and vomiting at toxicity grades III-IV (RR = 0.24, 95% CI = 0.12-0.50, p = 0.0001) and prevented the decline in hemoglobin and platelets in patients under CT at toxicity grades I-IV (RR = 0.64, 95% CI = 0.51-0.80, p < 0.0001). Moreover, the herbs that are frequently used in NSCLC patients were identified. This systematic review suggests that CHM as an adjuvant therapy can reduce CT toxicity, improve survival rate, enhance immediate tumor response, and improve KPS in advanced NSCLC patients. However, due to the lack of large-scale randomized clinical trials among the included studies, further larger-scale trials are needed. © 2013 Li et al.
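
For readers unfamiliar with the RR and 95% CI notation used above, the sketch below computes a risk ratio and its confidence interval for a single two-arm trial using the standard log-transform (Katz) method. The event counts are hypothetical and are not taken from the review, and the sketch does not reproduce the pooling across trials that a meta-analysis performs.

```python
import math


def risk_ratio_ci(events_t: int, n_t: int, events_c: int, n_c: int, z: float = 1.96):
    """Risk ratio of treatment vs. control with a log-method confidence interval.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    if min(events_t, events_c) <= 0 or events_t > n_t or events_c > n_c:
        raise ValueError("event counts must be positive and no larger than group sizes")
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical single trial: 60/100 one-year survivors with CT + CHM, 44/100 with CT alone.
rr, lower, upper = risk_ratio_ci(60, 100, 44, 100)
print(f"RR = {rr:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1.0 indicates a statistically significant difference between the arms at the chosen level, which is how the intervals quoted in the abstract are read.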

    TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment

    Recently, AIGC image quality assessment (AIGCIQA), which aims to assess the quality of AI-generated images (AIGIs) from a human perception perspective, has emerged as a new topic in computer vision. Unlike common image quality assessment tasks, where images are derived from original ones distorted by noise, blur, compression, etc., images in AIGCIQA tasks are typically generated by generative models using text prompts. Considerable efforts have been made in the past years to advance AIGCIQA. However, most existing AIGCIQA methods regress predicted scores directly from individual generated images, overlooking the information contained in the text prompts of these images. This oversight partially limits the performance of these AIGCIQA methods. To address this issue, we propose a text-image encoder-based regression (TIER) framework. Specifically, we process the generated images and their corresponding text prompts as inputs, utilizing a text encoder and an image encoder to extract features from these text prompts and generated images, respectively. To demonstrate the effectiveness of our proposed TIER method, we conduct extensive experiments on several mainstream AIGCIQA databases, including AGIQA-1K, AGIQA-3K, and AIGCIQA2023. The experimental results indicate that our proposed TIER method generally demonstrates superior performance compared to the baseline in most cases. Comment: 12 pages, 8 figures. arXiv admin note: text overlap with arXiv:2312.0589