56 research outputs found

    Probabilistic insertion, deletion and substitution error correction using Markov inference in next generation sequencing reads

    Error correction of noisy reads obtained from high-throughput DNA sequencers is an important problem, since read quality significantly affects downstream analyses such as the detection of genetic variation and the complexity and success of sequence assembly. Most current error correction algorithms can recover only substitution errors. This work presents Pindel, an algorithm that simultaneously corrects insertion, deletion and substitution errors in reads from next generation DNA sequencing platforms. Pindel models the sequencer output as emissions of an appropriately defined Hidden Markov Model (HMM) and corrects each read to the corresponding maximum likelihood path using an appropriately modified Viterbi algorithm. When compared with Karect and Fiona, the top two current algorithms capable of correcting insertion, deletion and substitution errors, Pindel exhibits superior accuracy across a range of datasets.
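    The idea of correcting a read to its maximum likelihood HMM path can be illustrated with a textbook Viterbi decoder. The sketch below is not Pindel's modified algorithm: it handles substitutions only, and the two-letter alphabet, the alternating transition model and all probabilities are hypothetical values chosen purely for illustration.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely hidden-state path for the observations."""
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best final state back to the start.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Hypothetical toy model: the true sequence tends to alternate A/C,
# and the sequencer reports the true base with probability 0.8.
states = ["A", "C"]
log_start = {s: math.log(0.5) for s in states}
log_trans = {p: {s: math.log(0.1 if p == s else 0.9) for s in states}
             for p in states}
log_emit = {s: {o: math.log(0.8 if s == o else 0.2) for o in states}
            for s in states}

noisy_read = list("ACCCA")  # the middle base is treated as a substitution error
corrected = viterbi(noisy_read, states, log_start, log_trans, log_emit)
print("".join(corrected))
```

    Handling insertions and deletions, as the abstract describes, would require additional states or transitions beyond this substitution-only sketch.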

    SEVEN: Deep Semi-supervised Verification Networks

    Verification determines whether two samples belong to the same class or not, and has important applications such as face and fingerprint verification, where thousands or millions of categories are present but each category has scarce labeled examples, presenting two major challenges for existing deep learning models. We propose a deep semi-supervised model named SEmi-supervised VErification Network (SEVEN) to address these challenges. The model consists of two complementary components. The generative component addresses the lack of supervision within each category by learning general salient structures from a large amount of data across categories. The discriminative component exploits the learned general features to mitigate the lack of supervision within categories, and also directs the generative component to find more informative structures of the whole data manifold. The two components are tied together in SEVEN to allow end-to-end training. Extensive experiments on four verification tasks demonstrate that SEVEN significantly outperforms other state-of-the-art deep semi-supervised techniques when labeled data are in short supply. Furthermore, SEVEN is competitive with fully supervised baselines trained with a larger amount of labeled data, which indicates the importance of the generative component in SEVEN. Comment: 7 pages, 2 figures, accepted to the 2017 International Joint Conference on Artificial Intelligence (IJCAI-17)

    Developing an instrument for Iranian EFL learners' listening comprehension problems and listening strategies

    The literature on listening strategies for EFL learners focuses on teaching listening strategies, with little attention paid to learners' listening comprehension problems. No local research has been conducted on the nature of Iranian tertiary level students' EFL listening comprehension problems or strategies. Therefore, no instrument is available to investigate these constructs. This paper reports the findings of a study that attempted to develop and test an instrument to help researchers identify students’ specific listening problems and listening strategy repertoire. The instrument was developed by integrating and validating the available instruments in the related literature. The two developed questionnaires were the Listening Comprehension Problems Questionnaire (LCPQ) and the Listening Strategy Use Questionnaire (LSUQ). Problems related to designing and testing this instrument are shared, and the modifications made to it are presented. The instrument is expected to be useful for researchers interested in studying EFL listening in a similar setting.

    Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition

    In this paper, we propose an efficient and accurate streaming speech recognition model based on the FastConformer architecture. We adapted the FastConformer architecture for streaming applications by (1) constraining both the look-ahead and past contexts in the encoder, and (2) introducing an activation caching mechanism that enables the non-autoregressive encoder to operate autoregressively during inference. The proposed model is designed to eliminate the accuracy disparity between training and inference that is common in many streaming models. Furthermore, our proposed encoder works with various decoder configurations, including Connectionist Temporal Classification (CTC) and RNN-Transducer (RNNT) decoders. Additionally, we introduced a hybrid CTC/RNNT architecture that uses a shared encoder with both a CTC and an RNNT decoder to boost accuracy and save computation. We evaluate the proposed model on the LibriSpeech dataset and a multi-domain large-scale dataset and demonstrate that it achieves better accuracy with lower latency and inference time than a conventional buffered streaming model baseline. We also show that training a model with multiple latencies can achieve better accuracy than single-latency models while supporting multiple latencies with a single model. Our experiments also show that the hybrid architecture not only speeds up the convergence of the CTC decoder but also improves the accuracy of streaming models compared to single-decoder models. Comment: Shorter version accepted to ICASSP 202
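    The caching idea in the abstract can be illustrated on a much smaller scale. The sketch below assumes a single causal 1-D convolution stands in for the encoder (a deliberate simplification, not the FastConformer implementation); the function name `causal_conv`, the kernel and the chunk size are all hypothetical. It shows that carrying the last few inputs forward as a cache lets chunk-by-chunk processing reproduce the full-sequence output exactly.

```python
def causal_conv(signal, kernel, cache=None):
    """Causal 1-D convolution over one chunk; returns (outputs, new_cache)."""
    context = len(kernel) - 1  # number of past inputs each output depends on
    if cache is None:
        cache = [0.0] * context  # start of stream: zero left-padding
    padded = list(cache) + list(signal)
    outputs = [
        sum(w * padded[t + i] for i, w in enumerate(kernel))
        for t in range(len(signal))
    ]
    # Keep the most recent `context` inputs for the next chunk.
    new_cache = padded[len(padded) - context:] if context else []
    return outputs, new_cache

signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
kernel = [0.5, 0.25, 1.0]  # hypothetical weights; the last one applies to the current input

# Offline: process the whole sequence at once.
full_output, _ = causal_conv(signal, kernel)

# Streaming: process chunks of two samples, carrying the cache forward.
streamed_output, cache = [], None
for start in range(0, len(signal), 2):
    chunk_output, cache = causal_conv(signal[start:start + 2], kernel, cache)
    streamed_output.extend(chunk_output)

print(streamed_output == full_output)
```

    Because the streamed and offline computations apply the same operations to the same values, this toy layer behaves identically in both modes, which is the property the abstract attributes to its cache-based inference.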

    Comparing mindfulness-based group therapy with treatment as usual for opioid dependents: A pilot randomized clinical trial study protocol

    Background: In response to the high burden of opioid abuse in Iran, the Ministry of Health has launched a large-scale opioid maintenance treatment program, delivered through a network of certified drug treatment centers. To promote opioid pharmacotherapies, there is an urgent need to develop and introduce evidence-based psychosocial interventions into the network. Patients and Methods: This is a randomized clinical trial (RCT) to investigate the feasibility and effectiveness of adding mindfulness-based group therapy to opioid pharmacotherapies as compared to opioid pharmacotherapies alone. The primary outcomes were treatment retention and the percentage of weekly morphine, methamphetamine, and benzodiazepine negative tests. Discussion: This is the first RCT that explores the effectiveness of mindfulness-based relapse prevention group therapy among opioid-dependent clients in Iran. The feasibility of group therapy and the comparison of outcomes in the intervention and control groups should be discussed in the outcome article. © 2015, Mazandaran University of Medical Sciences