859 research outputs found

    A Study of Low-Resource Speech Commands Recognition based on Adversarial Reprogramming

    In this study, we propose a novel adversarial reprogramming (AR) approach for low-resource spoken command recognition (SCR), and build an AR-SCR system. The AR procedure aims to modify the acoustic signals (from the target domain) to repurpose a pretrained SCR model (from the source domain). To resolve the label mismatches between source and target domains, and to further improve the stability of AR, we propose a novel similarity-based label mapping technique to align classes. In addition, the transfer learning (TL) technique is combined with the original AR process to improve the model adaptation capability. We evaluate the proposed AR-SCR system on three low-resource SCR datasets, including Arabic, Lithuanian, and dysarthric Mandarin speech. Experimental results show that with a pretrained acoustic model (AM) trained on a large-scale English dataset, the proposed AR-SCR system outperforms the current state-of-the-art results on Arabic and Lithuanian speech commands datasets, with only a limited amount of training data. Comment: Submitted to ICASSP 202

    NeuralFuse: Learning to Improve the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes

    Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains a notable issue. Lowering the supply voltage is an effective strategy for reducing energy consumption. However, aggressively scaling down the supply voltage can lead to accuracy degradation due to random bit flips in static random access memory (SRAM) where model parameters are stored. To address this challenge, we introduce NeuralFuse, a novel add-on module that addresses the accuracy-energy tradeoff in low-voltage regimes by learning input transformations to generate error-resistant data representations. NeuralFuse protects DNN accuracy in both nominal and low-voltage scenarios. Moreover, NeuralFuse is easy to implement and can be readily applied to DNNs with limited access, such as non-configurable hardware or remote access to cloud-based APIs. Experimental results demonstrate that, at a 1% bit error rate, NeuralFuse can reduce SRAM memory access energy by up to 24% while improving accuracy by up to 57%. To the best of our knowledge, this is the first model-agnostic approach (i.e., no model retraining) to address low-voltage-induced bit errors. The source code is available at https://github.com/IBM/NeuralFuse

    Measuring the Quality of Financial Electronic Payment System: Combined with Fuzzy AHP and Fuzzy TOPSIS

    This study applies fuzzy AHP combined with fuzzy TOPSIS to identify the key factors that foster the success of current third-party online payment platforms. The quality measurements are organized into four categories and eleven sub-categories. The combined method is applied to calculate the weighted averages of all categories and sub-categories in order to measure the quality of third-party online payment platforms. This study finds that “safety quality” is the most emphasized category, “system quality” is the second, “communication quality” is the third, and “service quality” is the least emphasized.

    Vocal cord dysfunction diagnosed by four-dimensional dynamic volume computed tomography in patients with difficult-to-treat asthma: A case series

    Patients with asthma may also have vocal cord dysfunction (VCD), which leads to poor control of the asthma. Once patients are diagnosed with difficult-to-treat asthma with poor control, VCD should be excluded or treated accordingly. The gold standard for diagnosis of VCD is laryngoscopy. However, this procedure is invasive and may not be suitable for patients with difficult-to-treat asthma. Four-dimensional (4D) dynamic volume computed tomography (CT) is a noninvasive method for quantification of laryngeal movement, and can serve as an alternative for the diagnosis of VCD. Herein, we present a case series of five patients with difficult-to-treat asthma who were diagnosed with VCD by 4D dynamic volume CT. Clinicians should be alert to the possibility of VCD when poor control is noted in patients with asthma. Early diagnosis by noninvasive 4D dynamic volume CT can decrease excessive doses of inhaled corticosteroids.

    Impact of metabolic syndrome on postoperative outcomes of transsphenoidal pituitary surgery: analysis of U.S. nationwide inpatient sample data 2005–2018

    Introduction: Transsphenoidal surgery (TSS) is the preferred surgical method for most pituitary adenomas owing to high efficacy and low mortality. This study aimed to evaluate the influence of metabolic syndrome (MetS) on postoperative outcomes of TSS for pituitary adenoma. Methods: This population-based, retrospective observational study extracted data of adults 20-79 y receiving TSS for pituitary adenoma from the US Nationwide Inpatient Sample (NIS) between 2005-2018. Primary outcomes were pituitary-related complications, poor outcomes (i.e., in-hospital mortality or unfavorable discharge), prolonged length of stay (LOS), and patient safety indicators (PSIs). Univariate and multivariate regressions were performed to determine the associations between study variables and outcomes. Results: 19,076 patients (representing a 93,185 US in-patient population) were included, among which 2,109 (11.1%) patients had MetS. After adjustment, pre-existing MetS was not significantly associated with presence of pituitary-related complications and poor outcomes. In contrast, MetS was significantly associated with an increased risk for prolonged LOS (adjusted OR (aOR) = 1.19; 95% CI: 1.05-1.34), PSIs (aOR = 1.31; 95% CI: 1.07-1.59) and greater hospital costs (adjusted β = 8.63 thousand USD; 95% CI: 4.98-12.29). Among pituitary-related complications, MetS was independently associated with increased risk of cerebrospinal fluid (CSF) rhinorrhea (aOR = 1.22, 95% CI: 1.01, 1.47) but lowered diabetes insipidus (aOR = 0.83, 95% CI: 0.71, 0.97). Discussion: MetS does not pose excessive risk of in-hospital mortality or unfavorable discharge. However, MetS independently predicted having PSIs, prolonged LOS, greater hospital costs, and CSF rhinorrhea. Study findings may help clinicians achieve better risk stratification before TSS.

    VoiceBank-2023: A Multi-Speaker Mandarin Speech Corpus for Constructing Personalized TTS Systems for the Speech Impaired

    Services of personalized TTS systems for the Mandarin-speaking speech impaired are rarely mentioned. Taiwan started the VoiceBanking project in 2020, aiming to build a complete set of services to deliver personalized Mandarin TTS systems to amyotrophic lateral sclerosis patients. This paper reports the corpus design, corpus recording, data purging and correction for the corpus, and evaluations of the developed personalized TTS systems, for the VoiceBanking project. The developed corpus is named the VoiceBank-2023 speech corpus after its release year. The corpus contains 29.78 hours of utterances with prompts of short paragraphs and common phrases spoken by 111 native Mandarin speakers. The corpus is labeled with information about gender, degree of speech impairment, types of users, transcription, SNRs, and speaking rates. VoiceBank-2023 is available on request for non-commercial use, and all parties are welcome to join the VoiceBanking project to improve the services for the speech impaired. Comment: submitted to 26th International Conference of the ORIENTAL-COCOSD