864 research outputs found

    Abstractive spoken document summarization using hierarchical model with multi-stage attention diversity optimization

    Abstractive summarization is a standard task for written documents, such as news articles. Applying summarization schemes to spoken documents is more challenging, especially in situations involving human interactions, such as meetings. Here, utterances tend not to form complete sentences and sometimes contain little information. Moreover, speech disfluencies will be present, as well as recognition errors for automated systems. For current attention-based sequence-to-sequence summarization systems, these additional challenges can yield a poor attention distribution over the spoken document words and utterances, impacting performance. In this work, we propose a multi-stage method based on a hierarchical encoder-decoder model that explicitly models the utterance-level attention distribution at training time and enforces diversity at inference time using a unigram diversity term. Furthermore, multitask learning tasks including dialogue act classification and extractive summarization are incorporated. The performance of the system is evaluated on the AMI meeting corpus. The inclusion of both training and inference diversity terms improves performance, outperforming current state-of-the-art systems in terms of ROUGE scores. Additionally, the impact of ASR errors, as well as performance on the multitask learning tasks, is evaluated.

    Robust excitation-based features for Automatic Speech Recognition

    In this paper we investigate the use of noise-robust features characterizing the speech excitation signal as complementary features to the usually considered vocal tract based features for automatic speech recognition (ASR). The features are tested in a state-of-the-art Deep Neural Network (DNN) based hybrid acoustic model for speech recognition. The suggested excitation features expand the set of excitation features previously considered for ASR, with the expectation that they help to better discriminate the broad phonetic classes (e.g., fricatives, nasals, vowels). Relative improvements in the word error rate are observed in the AMI meeting transcription system, with greater gains (about 5%) if PLP features are combined with the suggested excitation features. For Aurora 4, significant improvements are observed as well. Combining the suggested excitation features with filter banks, a word error rate of 9.96% is achieved. This is the author accepted manuscript. The final version is available from IEEE via http://dx.doi.org/10.1109/ICASSP.2015.717885

    Non-native children's automatic speech recognition: The INTERSPEECH 2020 shared task ALTA systems

    Automatic spoken language assessment (SLA) is a challenging problem due to the large variations in learner speech combined with limited resources. These issues are even more problematic when considering children learning a language, with higher levels of acoustic and lexical variability, and of code-switching, compared to adult data. This paper describes the ALTA system for the INTERSPEECH 2020 Shared Task on Automatic Speech Recognition for Non-Native Children’s Speech. The data for this task consists of examination recordings of Italian school children aged 9-16, ranging in ability from minimal, to basic, to limited but effective command of spoken English. A variety of systems were developed using the limited training data available (49 hours). State-of-the-art acoustic models and language models were evaluated, including a diversity of lexical representations, handling of code-switching and learner pronunciation errors, and grade-specific models. The best single system achieved a word error rate (WER) of 16.9% on the evaluation data. By combining multiple diverse systems, including both grade-independent and grade-specific models, the error rate was reduced to 15.7%. This combined system was the best performing submission for both the closed and open tasks.
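
The word error rates quoted in the abstract above can be put in context with a small sketch. It assumes the conventional definition of WER (word-level edit distance divided by the number of reference words); the abstract does not describe its scoring tool, so the function names `word_error_rate` and `relative_reduction` are hypothetical and the example sentences are invented for illustration. Only the two figures 16.9 and 15.7 come from the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: simple whitespace tokenisation and unit edit costs.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative error reduction of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline


# Invented example: one substitution and one deletion over six reference words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")

# Figures reported in the abstract: best single system vs. combined system.
rel = relative_reduction(16.9, 15.7)
print(f"example WER: {example_wer:.3f}")
print(f"relative WER reduction from combination: {rel:.1%}")
```

Under this definition, the drop from 16.9% to 15.7% is 1.2 points absolute, or roughly 7% relative.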

    Language Model Combination and Adaptation Using Weighted Finite State Transducers

    In speech recognition systems, language models (LMs) are often constructed by training and combining multiple n-gram models. They can either be used to represent different genres or tasks found in diverse text sources, or capture stochastic properties of different linguistic symbol sequences, for example, syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimal change to decoding tools is needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant error rate gains of 7.3% relative were obtained on a state-of-the-art broadcast audio recognition task using a history-dependently adapted multi-level LM modelling both syllable and word sequences.

    Impact of ASR performance on spoken grammatical error detection

    Computer assisted language learning (CALL) systems aid learners to monitor their progress by providing scoring and feedback on language assessment tasks. Free speaking tests allow assessment of what a learner has said, as well as how they said it. For these tasks, Automatic Speech Recognition (ASR) is required to generate transcriptions of a candidate’s responses; the quality of these transcriptions is crucial to provide reliable feedback in downstream processes. This paper considers the impact of ASR performance on Grammatical Error Detection (GED) for free speaking tasks, as an example of providing feedback on a learner’s use of English. The performance of an advanced deep-learning based GED system, initially trained on written corpora, is used to evaluate the influence of ASR errors. One consequence of these errors is that grammatical errors can result from incorrect transcriptions as well as learner errors; this may yield confusing feedback. To mitigate the effect of these errors, and reduce erroneous feedback, ASR confidence scores are incorporated into the GED system. By additionally adapting the written text GED system to the speech domain, using ASR transcriptions, significant gains in performance can be achieved. Analysis of the GED performance for different grammatical error types and across grade is also presented.

    Nitric oxide release from antimicrobial peptide hydrogels for wound healing

    Nitric oxide (NO) is an endogenously produced molecule that has been implicated in several wound healing mechanisms. Its topical delivery may improve healing in acute or chronic wounds. In this study an antimicrobial peptide was synthesized which self-assembled upon a pH shift, forming a hydrogel. The peptide was chemically functionalized to incorporate a NO-donor moiety on lysine residues. The extent of the reaction was measured by ninhydrin assay and the NO release rate was quantified via the Griess reaction method. The resulting compound was evaluated for its antimicrobial activity against Escherichia coli, and its effect on collagen production by fibroblasts was assessed. Time-kill curves point to an initial increase in bactericidal activity of the functionalized peptide, and collagen production by human dermal fibroblasts incubated with the NO-functionalized peptide showed a dose-dependent increase in the presence of the NO donor within a range of 0–20 µM. This work was financed by FEDER (Fundo Europeu de Desenvolvimento Regional) funds via COMPETE 2020 (Operacional Programme for Competitiveness and Internationalisation (POCI), Portugal 2020), and by Portuguese funds through FCT (Fundação para a Ciência e a Tecnologia/ Ministério da Ciência, Tecnologia e Ensino Superior) in the framework of the projects “Institute for Research and Innovation in Health Sciences” (POCI-01-0145-FEDER-007274) and PTDC/QUI-QFI/29914/2017, as well as through the grant SFRH/BD/84914/2012. Thanks to FCT also for supporting Research Unit LAQV-REQUIMTE through the project UID/QUI/5006/2013.

    Evaluation of a Telerehabilitation System for Community-Based Rehabilitation

    The use of web-based portals, while increasing in popularity in the fields of medicine and research, is rarely reported on in community-based rehabilitation programs. A program within the Pennsylvania Office of Vocational Rehabilitation’s Hiram G. Andrews Center, the Cognitive Skills Enhancement Program (CSEP), sought to enhance organization of program and participant information and communication between part- and full-time employees, supervisors and consultants. A telerehab system was developed consisting of (1) a web-based portal to support a variety of clinical activities and (2) the Versatile Integrated System for Telerehabilitation (VISyTER) video-conferencing system to support the collaboration and delivery of rehabilitation services remotely. This descriptive evaluation examines the usability of the telerehab system incorporating both the portal and VISyTER. Telerehab system users include CSEP staff members from three geographical locations, employed by two institutions. The IBM After-Scenario Questionnaire (ASQ) and Post-Study System Usability Questionnaire (PSSUQ), the Telehealth Usability Questionnaire (TUQ), and two demographic surveys were administered to gather both objective and subjective information. Results showed generally high levels of usability. Users commented that the telerehabilitation system improved communication, increased access to information, improved speed of completing tasks, and had an appealing interface. Areas where users would like to see improvements, including ease of accessing/editing documents and searching for information, are discussed.

    Clinical and Genetic Analysis of Children with Kartagener Syndrome

    Primary ciliary dyskinesia (PCD) is a rare autosomal recessive disorder characterized by dysfunction of motile cilia causing ineffective mucus clearance and organ laterality defects. In this study, two unrelated Portuguese children with strong PCD suspicion underwent extensive clinical and genetic assessments by whole-exome sequencing (WES), as well as ultrastructural analysis of cilia by transmission electron microscopy (TEM), to identify their genetic etiology. These analyses confirmed the diagnosis of Kartagener syndrome (KS) (PCD with situs inversus). Patient-1 showed a predominant absence of the inner dynein arms, with two disease-causing variants in the CCDC40 gene. Patient-2 showed the absence of both dynein arms, and WES disclosed two novel high-impact variants in the DNAH5 gene and two missense variants in the DNAH7 gene, all possibly deleterious. Moreover, in Patient-2, functional data revealed a reduction of gene expression and protein mislocalization in both genes' products. Our work calls researchers' attention to the complexity of PCD and to the possibility of gene interactions modelling the PCD phenotype. Further, it is demonstrated that even for well-known PCD genes, novel pathogenic variants could have importance for a PCD/KS diagnosis, reinforcing the difficulty of providing genetic counselling and prenatal diagnosis to families. RP was funded by a PhD grant from the National Foundation for Science and Technology (FCT) (Ref.: PD/BD/105767/2014). This work was also supported by the Institutions of the authors and in part by FCT/UMIB (Pest-OE/SAU/UI0215/2014).