277 research outputs found

    A Mandarin Voice Organizer Based on a Template-Matching Speech Recognizer


    A Nano-Cheese-Cutter to Directly Measure Interfacial Adhesion of Freestanding Nano-Fibers

    A nano-cheese-cutter is fabricated to directly measure the adhesion between two freestanding nano-fibers. A single electrospun fiber is attached to the free end of an atomic force microscope cantilever, while a similar fiber is prepared in the same way on a mica substrate, oriented orthogonally. An external load is applied to deform the two fibers into complementary V-shapes, and the force measurement allows the elastic modulus to be determined. At a critical tensile load, “pull-off” occurs when the adhering fibers spontaneously detach from each other, yielding the interfacial adhesion energy. Loading-unloading cycles are performed to investigate repeated adhesion-detachment and surface degradation.

    Training strategy for a lightweight countermeasure model for automatic speaker verification

    The countermeasure (CM) model is developed to protect Automatic Speaker Verification (ASV) systems from spoofing attacks and to prevent the resulting leakage of personal information. For practicality and security reasons, the CM model is usually deployed on edge devices, which have more limited computing resources and storage space than cloud-based systems. This work proposes training strategies for a lightweight CM model for ASV, using generalized end-to-end (GE2E) pre-training and adversarial fine-tuning to improve performance, and applying knowledge distillation (KD) to reduce the size of the CM model. In the evaluation phase of the ASVspoof 2021 Logical Access task, the lightweight ResNetSE model reaches a min t-DCF of 0.2695 and an EER of 3.54%. Compared to the teacher model, the lightweight student model uses only 22.5% of the parameters and 21.1% of the multiply-and-accumulate operations of the teacher model.
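    The knowledge-distillation step mentioned above can be illustrated with a standard KD objective: the student is trained against a temperature-softened copy of the teacher's outputs plus the hard bona fide/spoof labels. The sketch below is a minimal PyTorch example assuming this standard formulation; the temperature, mixing weight, and tensor names are illustrative assumptions, not values or code from the paper.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Minimal knowledge-distillation objective (illustrative, not the paper's code).

    T (temperature) and alpha (mixing weight) are assumed hyperparameters.
    """
    # Soft term: KL divergence between temperature-softened student and teacher outputs.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard term: ordinary cross-entropy against the bona fide / spoof labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Example: binary bona fide / spoof classification over a batch of 8 trials.
s = torch.randn(8, 2, requires_grad=True)
t = torch.randn(8, 2)
y = torch.randint(0, 2, (8,))
kd_loss(s, t, y).backward()
```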

    Personalized Audio Quality Preference Prediction

    This paper proposes to use both audio input and subject information to predict a listener's personalized preference between two audio segments that share the same content but differ in quality. A siamese network is used to compare the inputs and predict the preference. Several different structures for each side of the siamese network are investigated; an LDNet with PANNs' CNN6 as the encoder and a multi-layer perceptron block as the decoder yields the largest improvement over a baseline model that uses only audio input, raising the overall accuracy from 77.56% to 78.04%. Experimental results also show that using all the subject information, including age, gender, and the specifications of headphones or earphones, is more effective than using only a part of it.
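    As a rough illustration of the siamese comparison described above, the PyTorch sketch below applies a shared encoder to both audio segments and concatenates the subject information before a small decoder. The placeholder MLP encoder, feature dimensions, and two-class output layout are assumptions for illustration; the paper's actual model uses an LDNet with PANNs' CNN6 as the encoder, which is not reproduced here.

```python
import torch
import torch.nn as nn

class SiamesePreferenceNet(nn.Module):
    """Toy siamese comparator for two quality versions of the same content."""

    def __init__(self, audio_dim=128, subject_dim=8, hidden=64):
        super().__init__()
        # Shared branch applied to both audio segments (stand-in for the real encoder).
        self.encoder = nn.Sequential(
            nn.Linear(audio_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden)
        )
        # Decoder combines both embeddings with subject info (age, gender, device specs).
        self.decoder = nn.Sequential(
            nn.Linear(2 * hidden + subject_dim, hidden), nn.ReLU(), nn.Linear(hidden, 2)
        )

    def forward(self, audio_a, audio_b, subject):
        ea, eb = self.encoder(audio_a), self.encoder(audio_b)
        return self.decoder(torch.cat([ea, eb, subject], dim=-1))

# Usage: logits over {prefer A, prefer B} for a batch of 4 paired segments.
model = SiamesePreferenceNet()
logits = model(torch.randn(4, 128), torch.randn(4, 128), torch.randn(4, 8))
```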

    Multimodal Transformer Distillation for Audio-Visual Synchronization

    Audio-visual synchronization aims to determine whether the mouth movements and speech in a video are synchronized. VocaLiST reaches state-of-the-art performance by incorporating multimodal Transformers to model audio-visual interaction information. However, it requires high computing resources, making it impractical for real-world applications. This paper proposes the MTDVocaLiST model, which is trained with our proposed multimodal Transformer distillation (MTD) loss. The MTD loss enables MTDVocaLiST to deeply mimic the cross-attention distribution and value-relation in the Transformer of VocaLiST. Our proposed method is effective in two aspects. From the distillation-method perspective, the MTD loss outperforms other strong distillation baselines. From the distilled model's performance perspective: 1) MTDVocaLiST outperforms the similar-size SOTA models SyncNet and PM by 15.69% and 3.39%, respectively; 2) MTDVocaLiST reduces the model size of VocaLiST by 83.52% while maintaining similar performance.
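    The two ingredients named above, mimicking the teacher's cross-attention distribution and its value-relation, can be sketched as a distillation loss as follows. This is a schematic PyTorch example under assumed tensor shapes; the KL/MSE choices and scaling are illustrative and do not reproduce the paper's exact MTD loss.

```python
import torch
import torch.nn.functional as F

def mtd_style_loss(student_attn, teacher_attn, student_value, teacher_value):
    """Schematic Transformer-distillation loss (illustrative only).

    student_attn / teacher_attn: cross-attention weights, shape (batch, heads, Tq, Tk).
    student_value / teacher_value: value tensors, shape (batch, heads, Tk, d).
    """
    # 1) Mimic the teacher's cross-attention distribution (KL over the key axis).
    attn_loss = F.kl_div(
        torch.log(student_attn.clamp_min(1e-8)), teacher_attn, reduction="batchmean"
    )

    # 2) Mimic the value-relation: scaled pairwise similarity among value vectors.
    def value_relation(v):
        return F.softmax(v @ v.transpose(-1, -2) / v.size(-1) ** 0.5, dim=-1)

    rel_loss = F.mse_loss(value_relation(student_value), value_relation(teacher_value))
    return attn_loss + rel_loss
```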

    Ontology-based Fuzzy Markup Language Agent for Student and Robot Co-Learning

    An intelligent robot agent based on a domain ontology, a machine learning mechanism, and Fuzzy Markup Language (FML) for student and robot co-learning is presented in this paper. The machine-human co-learning model is established to help students learn mathematical concepts according to their learning ability and performance. Meanwhile, the robot acts as a teacher's assistant that co-learns with children in the class. The FML-based knowledge base and rule base are embedded in the robot so that teachers can get feedback from the robot on whether students are making progress. Students' learning performance is then inferred from the difficulty of the learning content and the students' ability, concentration level, and teamwork spirit in the class. Experimental results show that learning with the robot is helpful for disadvantaged and below-basic children. Moreover, the accuracy of the intelligent FML-based agent for student learning increases after applying the machine learning mechanism.
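    To give a flavor of the fuzzy-rule inference described above, the toy Python sketch below scores learning performance from the four factors (difficulty, ability, concentration, teamwork) using triangular membership functions and two hand-written rules. The membership functions, rules, and consequent values are invented for illustration and are not the paper's FML knowledge base or rule base.

```python
def tri(x, a, b, c):
    """Triangular membership function that rises from a, peaks at b, and falls to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def learning_performance(difficulty, ability, concentration, teamwork):
    """Toy fuzzy inference over inputs in [0, 1]; rules and consequents are made up."""
    # Rule 1: IF ability is high AND concentration is high THEN performance is high (0.9).
    r1 = min(tri(ability, 0.4, 1.0, 1.6), tri(concentration, 0.4, 1.0, 1.6))
    # Rule 2: IF difficulty is high AND teamwork is low THEN performance is low (0.3).
    r2 = min(tri(difficulty, 0.4, 1.0, 1.6), tri(teamwork, -0.6, 0.0, 0.6))
    weights, outputs = [r1, r2], [0.9, 0.3]
    total = sum(weights)
    # Weighted average of rule consequents (a simple defuzzification step).
    return sum(w * o for w, o in zip(weights, outputs)) / total if total else 0.5

print(learning_performance(difficulty=0.7, ability=0.8, concentration=0.9, teamwork=0.5))
```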

    A novel mutation in the WFS1 gene identified in a Taiwanese family with low-frequency hearing impairment

    Background: Wolfram syndrome gene 1 (WFS1) accounts for most familial nonsyndromic low-frequency sensorineural hearing loss (LFSNHL), which is characterized by sensorineural hearing loss at and below 2000 Hz. The current study aimed to contribute to our understanding of the molecular basis of LFSNHL in an affected Taiwanese family. Methods: The Taiwanese family with LFSNHL was phenotypically characterized using audiologic examination and pedigree analysis. Genetic characterization was performed by direct sequencing of WFS1 and mutation analysis. Results: Pure-tone audiometry confirmed that the affected family members had bilateral sensorineural hearing loss at or below 2000 Hz. The hearing loss threshold of the affected members showed no progression, a characteristic consistent with a mutation in the WFS1 gene located at the DFNA6/14/38 locus. Pedigree analysis showed an autosomal dominant inheritance pattern with full penetrance. Among several polymorphisms, a missense mutation, Y669H (2005T>C), in exon 8 of WFS1 was identified in members of the Taiwanese family diagnosed with LFSNHL but not in any of the control subjects. Conclusion: We discovered a novel heterozygous missense mutation in exon 8 of WFS1 (i.e., Y669H) that is likely responsible for the LFSNHL phenotype in this particular Taiwanese family.