A Study of Low-Resource Speech Commands Recognition based on Adversarial Reprogramming
In this study, we propose a novel adversarial reprogramming (AR) approach for
low-resource spoken command recognition (SCR), and build an AR-SCR system. The
AR procedure aims to modify the acoustic signals (from the target domain) to
repurpose a pretrained SCR model (from the source domain). To solve the label
mismatches between source and target domains, and further improve the stability
of AR, we propose a novel similarity-based label mapping technique to align
classes. In addition, the transfer learning (TL) technique is combined with the
original AR process to improve the model adaptation capability. We evaluate the
proposed AR-SCR system on three low-resource SCR datasets, including Arabic,
Lithuanian, and dysarthric Mandarin speech. Experimental results show that with
a pretrained acoustic model (AM) trained on a large-scale English dataset, the proposed AR-SCR
system outperforms the current state-of-the-art results on Arabic and
Lithuanian speech commands datasets, with only a limited amount of training
data. Comment: Submitted to ICASSP 202
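The abstract does not spell out how the similarity-based label mapping is computed, so the following is only a minimal sketch of one plausible reading: each target class is assigned the source classes whose embeddings are most similar to it, and the source-model probabilities of those classes are averaged into a target score. The cosine measure, the embeddings, the class names (`cmd_a`, `cmd_b`), and the function names are all assumptions made for illustration, not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_label_map(source_emb, target_emb, k=1):
    """Map each target class to its k most similar source classes.

    The embeddings are hypothetical per-class vectors; the abstract
    does not say which representation the similarity is computed on.
    """
    mapping = {}
    for t_name, t_vec in target_emb.items():
        ranked = sorted(source_emb,
                        key=lambda s: cosine(source_emb[s], t_vec),
                        reverse=True)
        mapping[t_name] = ranked[:k]
    return mapping

def target_scores(source_probs, mapping):
    """Average the source-class probabilities mapped to each target class."""
    return {t: sum(source_probs[s] for s in srcs) / len(srcs)
            for t, srcs in mapping.items()}

# Toy example with made-up 2-D class embeddings.
source_emb = {"yes": [1.0, 0.0], "no": [0.0, 1.0], "up": [0.7, 0.7]}
target_emb = {"cmd_a": [0.9, 0.1], "cmd_b": [0.1, 0.9]}
mapping = build_label_map(source_emb, target_emb, k=1)
scores = target_scores({"yes": 0.6, "no": 0.3, "up": 0.1}, mapping)
```

With `k` greater than 1, several source classes vote for one target class, which is one way a many-to-one mapping could be expressed.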
NeuralFuse: Learning to Improve the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes
Deep neural networks (DNNs) have become ubiquitous in machine learning, but
their energy consumption remains a notable issue. Lowering the supply voltage
is an effective strategy for reducing energy consumption. However, aggressively
scaling down the supply voltage can lead to accuracy degradation due to random
bit flips in static random access memory (SRAM) where model parameters are
stored. To address this challenge, we introduce NeuralFuse, a novel add-on
module that addresses the accuracy-energy tradeoff in low-voltage regimes by
learning input transformations to generate error-resistant data
representations. NeuralFuse protects DNN accuracy in both nominal and
low-voltage scenarios. Moreover, NeuralFuse is easy to implement and can be
readily applied to DNNs with limited access, such as non-configurable hardware
or remote access to cloud-based APIs. Experimental results demonstrate that, at
a 1% bit error rate, NeuralFuse can reduce SRAM memory access energy by up to
24% while improving accuracy by up to 57%. To the best of our knowledge, this
is the first model-agnostic approach (i.e., no model retraining) to address
low-voltage-induced bit errors. The source code is available at
https://github.com/IBM/NeuralFuse
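To make the fault model concrete, the sketch below simulates random bit flips in stored parameters at a given bit error rate. It illustrates only the error the abstract describes, not NeuralFuse itself, and it assumes weights stored as unsigned 8-bit integers with independent flips per bit; the function name `flip_bits` and its parameters are hypothetical.

```python
import random

def flip_bits(weights, bit_error_rate, bits=8, seed=0):
    """Return a copy of `weights` with each stored bit flipped independently.

    Assumptions (not from the abstract): weights are unsigned integers in
    [0, 2**bits), and every bit flips with probability `bit_error_rate`.
    """
    rng = random.Random(seed)  # seeded so a run is reproducible
    corrupted = []
    for w in weights:
        mask = 0
        for b in range(bits):
            if rng.random() < bit_error_rate:
                mask |= 1 << b  # mark bit b for flipping
        corrupted.append(w ^ mask)  # XOR flips the marked bits
    return corrupted

# A 1% bit error rate, as in the abstract's headline setting.
noisy = flip_bits([0, 255, 18, 200], bit_error_rate=0.01)
```

A flip in a high-order bit changes a weight far more than one in a low-order bit, which is why even a small error rate can degrade accuracy noticeably.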
Measuring the Quality of Financial Electronic Payment System: Combined with Fuzzy AHP and Fuzzy TOPSIS
The study aims to apply Fuzzy AHP in TOPSIS to identify the key factors that foster the success of current third-party online payment platforms. This study organized the quality measurements into four categories and eleven sub-categories. The AHP in TOPSIS is applied to calculate the weighted averages of all categories and sub-categories to measure the quality of third-party online payment platforms. This study finds that “safety quality” is the most emphasized category, “system quality” is the second, “communication quality” is the third, and “service quality” is the least emphasized.
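As a rough illustration of how category weights feed a TOPSIS ranking, here is a minimal sketch of crisp (non-fuzzy) TOPSIS. It is a simplification of the fuzzy variant the study uses, and the weights and platform ratings below are made-up values that only mirror the reported order of emphasis (safety, system, communication, service); none of them come from the study.

```python
import math

def topsis(matrix, weights, benefit):
    """Crisp TOPSIS closeness scores, one per alternative (row).

    matrix  : ratings, rows = alternatives, columns = criteria
    weights : criterion weights (e.g. produced by AHP)
    benefit : True where larger is better, False where smaller is better
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    cols = [[r[j] for r in v] for j in range(n)]
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for r in v:
        d_pos = math.dist(r, ideal)  # distance to the ideal solution
        d_neg = math.dist(r, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Hypothetical weights: safety > system > communication > service.
weights = [0.4, 0.3, 0.2, 0.1]
# Hypothetical ratings of three platforms on the four categories.
ratings = [[9, 8, 7, 6], [5, 5, 5, 5], [3, 4, 2, 1]]
scores = topsis(ratings, weights, benefit=[True] * 4)
```

A score closer to 1 means the platform lies nearer the ideal solution; the fuzzy variant replaces the crisp ratings and weights with fuzzy numbers before the same distance comparison.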
Vocal cord dysfunction diagnosed by four-dimensional dynamic volume computed tomography in patients with difficult-to-treat asthma: A case series
Patients with asthma may also have vocal cord dysfunction (VCD), which leads to poor control of the asthma. Once patients are diagnosed with difficult-to-treat asthma with poor control, VCD should be excluded or treated accordingly. The gold standard for diagnosis of VCD is laryngoscopy. However, this procedure is invasive and may not be suitable for patients with difficult-to-treat asthma. Four-dimensional (4D) dynamic volume computed tomography (CT) is a noninvasive method for quantification of laryngeal movement, and can serve as an alternative for the diagnosis of VCD. Herein, we present a series of five patients with difficult-to-treat asthma who were diagnosed with VCD by 4D dynamic volume CT. Clinicians should be alert to the possibility of VCD when poor control is noted in patients with asthma. Early diagnosis by noninvasive 4D dynamic volume CT can decrease excessive doses of inhaled corticosteroids.
Impact of metabolic syndrome on postoperative outcomes of transsphenoidal pituitary surgery: analysis of U.S. nationwide inpatient sample data 2005–2018
Introduction: Transsphenoidal surgery (TSS) is the preferred surgical method for most pituitary adenomas owing to high efficacy and low mortality. This study aimed to evaluate the influence of metabolic syndrome (MetS) on postoperative outcomes of TSS for pituitary adenoma. Methods: This population-based, retrospective observational study extracted data of adults aged 20-79 years receiving TSS for pituitary adenoma from the US Nationwide Inpatient Sample (NIS) between 2005 and 2018. Primary outcomes were pituitary-related complications, poor outcomes (i.e., in-hospital mortality or unfavorable discharge), prolonged length of stay (LOS), and patient safety indicators (PSIs). Univariate and multivariate regressions were performed to determine the associations between study variables and outcomes. Results: 19,076 patients (representing a US inpatient population of 93,185) were included, among whom 2,109 (11.1%) had MetS. After adjustment, pre-existing MetS was not significantly associated with the presence of pituitary-related complications or poor outcomes. In contrast, MetS was significantly associated with an increased risk of prolonged LOS (adjusted OR (aOR) = 1.19; 95% CI: 1.05-1.34), PSIs (aOR = 1.31; 95% CI: 1.07-1.59), and greater hospital costs (adjusted β = 8.63 thousand USD; 95% CI: 4.98-12.29). Among pituitary-related complications, MetS was independently associated with an increased risk of cerebrospinal fluid (CSF) rhinorrhea (aOR = 1.22; 95% CI: 1.01-1.47) but a lower risk of diabetes insipidus (aOR = 0.83; 95% CI: 0.71-0.97). Discussion: MetS does not pose excessive risk of in-hospital mortality or unfavorable discharge. However, MetS independently predicted PSIs, prolonged LOS, greater hospital costs, and CSF rhinorrhea. Study findings may help clinicians achieve better risk stratification before TSS.
VoiceBank-2023: A Multi-Speaker Mandarin Speech Corpus for Constructing Personalized TTS Systems for the Speech Impaired
Services of personalized TTS systems for the Mandarin-speaking speech
impaired are rarely mentioned. Taiwan started the VoiceBanking project in 2020,
aiming to build a complete set of services to deliver personalized Mandarin TTS
systems to amyotrophic lateral sclerosis patients. This paper reports the
corpus design, corpus recording, data purging and correction for the corpus,
and evaluations of the developed personalized TTS systems, for the VoiceBanking
project. The developed corpus is named the VoiceBank-2023 speech corpus
after its release year. The corpus contains 29.78 hours of utterances with
prompts of short paragraphs and common phrases spoken by 111 native Mandarin
speakers. The corpus is labeled with information about gender, degree of speech
impairment, types of users, transcription, SNRs, and speaking rates. The
VoiceBank-2023 is available by request for non-commercial use and welcomes all
parties to join the VoiceBanking project to improve the services for the speech
impaired. Comment: submitted to 26th International Conference of the ORIENTAL-COCOSD