
56 research outputs found

Probabilistic insertion, deletion and substitution error correction using Markov inference in next generation sequencing reads

Author: Noroozi Vahid
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2016

Error correction of noisy reads obtained from high-throughput DNA sequencers is an important problem, since read quality significantly affects downstream analyses such as the detection of genetic variation and the complexity and success of sequence assembly. Most current error correction algorithms can correct only substitution errors. This work presents Pindel, an algorithm that simultaneously corrects insertion, deletion and substitution errors in reads from next generation DNA sequencing platforms. Pindel models the sequencer output as emissions of an appropriately defined Hidden Markov Model (HMM) and corrects each read to the corresponding maximum likelihood path using an appropriately modified Viterbi algorithm. When compared with Karect and Fiona, the top two current algorithms capable of correcting insertion, deletion and substitution errors, Pindel exhibits superior accuracy across a range of datasets.

Digital Repository @ Iowa State University (ISU)
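
The abstract above describes correcting reads to the maximum likelihood path of an HMM with a Viterbi algorithm. The following is a minimal sketch of standard Viterbi decoding on a toy model, not Pindel's actual HMM or its modified algorithm: the cyclic transition structure, the 0.97/0.01 probabilities, and the names `viterbi` and `correct_read` are all assumptions made for illustration, and the toy model handles only substitution errors, not insertions or deletions.

```python
import math

BASES = "ACGT"

# Assumed toy parameters (not from the thesis): the hidden sequence tends to
# follow the cycle A -> C -> G -> T -> A, and each base is read correctly
# with probability 0.97.
NEXT_BASE = {"A": "C", "C": "G", "G": "T", "T": "A"}
START_P = {b: 0.25 for b in BASES}
TRANS_P = {p: {s: 0.97 if NEXT_BASE[p] == s else 0.01 for s in BASES} for p in BASES}
EMIT_P = {s: {o: 0.97 if s == o else 0.01 for o in BASES} for s in BASES}


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state path for the observations."""
    if not observations:
        return []
    # Each column maps state -> (best log-probability ending here, previous state).
    columns = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), None)
        for s in states
    }]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prev, score = max(
                ((p, columns[-1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        columns.append(column)
    # Backtrack from the best final state to recover the full path.
    state = max(states, key=lambda s: columns[-1][s][0])
    path = [state]
    for column in reversed(columns[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


def correct_read(read):
    """Hypothetical helper: decode a read under the toy model above."""
    return "".join(viterbi(read, BASES, START_P, TRANS_P, EMIT_P))


print(correct_read("ACGTACTT"))  # prints ACGTACGT
```

In this toy setting, one emission mismatch costs less than the two unlikely transitions needed to explain the read as observed, so the decoder replaces the seventh base. Handling insertions and deletions would require additional states or transitions, which this sketch does not attempt.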

SEVEN: Deep Semi-supervised Verification Networks

Authors: Bahaadini Sara; Noroozi Vahid; Xie Sihong; Yu Philip S.; Zheng Lei
Publication date: 14/06/2017

Verification determines whether two samples belong to the same class or not, and has important applications such as face and fingerprint verification, where thousands or millions of categories are present but each category has scarce labeled examples, presenting two major challenges for existing deep learning models. We propose a deep semi-supervised model named SEmi-supervised VErification Network (SEVEN) to address these challenges. The model consists of two complementary components. The generative component addresses the lack of supervision within each category by learning general salient structures from a large amount of data across categories. The discriminative component exploits the learned general features to mitigate the lack of supervision within categories, and also directs the generative component to find more informative structures of the whole data manifold. The two components are tied together in SEVEN to allow end-to-end training. Extensive experiments on four verification tasks demonstrate that SEVEN significantly outperforms other state-of-the-art deep semi-supervised techniques when labeled data are in short supply. Furthermore, SEVEN is competitive with fully supervised baselines trained with a larger amount of labeled data, which indicates the importance of the generative component in SEVEN.
Comment: 7 pages, 2 figures, accepted to the 2017 International Joint Conference on Artificial Intelligence (IJCAI-17)

arXiv.org e-Print Archive

Crossref

Developing an instrument for Iranian EFL learners' listening comprehension problems and listening strategies

Authors: Nimehchisalem Vahid; Noroozi Sara; Tam Shu Sim; Zareian Gholamreza
Publication venue: 'Australian International Academic Centre'
Publication date: 01/06/2014

The literature on listening strategies for EFL learners focuses on teaching listening strategies, with little attention to learners' listening comprehension problems. No local research has been conducted on the nature of Iranian tertiary level students' EFL listening comprehension problems or strategies. Therefore, no instrument is available to investigate these constructs. This paper reports the findings of a study that attempted to develop and test an instrument to help researchers identify students’ specific listening problems and listening strategy repertoire. The instrument was developed by integrating and validating the available instruments in the related literature. The two developed questionnaires were the Listening Comprehension Problems Questionnaire (LCPQ) and the Listening Strategy Use Questionnaire (LSUQ). Problems related to designing and testing this instrument are shared, and the modifications made to it are presented. The instrument is expected to be useful for researchers interested in studying EFL listening in a similar setting.

Australian International Academic Centre: AIAC Journals

Universiti Putra Malaysia Institutional Repository

Directory of Open Access Journals

Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition

Authors: Balam Jagadeesh; Ginsburg Boris; Kumar Ankur; Majumdar Somshubra; Noroozi Vahid
Publication date: 02/05/2024

In this paper, we propose an efficient and accurate streaming speech recognition model based on the FastConformer architecture. We adapt the FastConformer architecture for streaming applications by (1) constraining both the look-ahead and past contexts in the encoder, and (2) introducing an activation caching mechanism that enables the non-autoregressive encoder to operate autoregressively during inference. The proposed model is designed to eliminate the accuracy disparity between training and inference time, which is common in many streaming models. Furthermore, our proposed encoder works with various decoder configurations, including Connectionist Temporal Classification (CTC) and RNN-Transducer (RNNT) decoders. Additionally, we introduce a hybrid CTC/RNNT architecture which uses a shared encoder with both a CTC and an RNNT decoder to boost accuracy and save computation. We evaluate the proposed model on the LibriSpeech dataset and a multi-domain large-scale dataset and demonstrate that it achieves better accuracy with lower latency and inference time than a conventional buffered streaming model baseline. We also show that training a model with multiple latencies can achieve better accuracy than single-latency models while supporting multiple latencies with a single model. Our experiments also show that the hybrid architecture not only speeds up the convergence of the CTC decoder but also improves the accuracy of streaming models compared to single-decoder models.
Comment: Shorter version accepted to ICASSP 202

arXiv.org e-Print Archive
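
The abstract above mentions activation caching as a way to make chunked streaming inference match offline computation. The following is a rough sketch of that general idea on a single causal 1-D convolution, which stands in for an encoder layer with limited past context. It is not the paper's implementation: the functions `causal_conv` and `stream`, the weights, and the chunk size are hypothetical and chosen only for illustration.

```python
def causal_conv(chunk, weights, cache=None):
    """Causal 1-D convolution over one chunk; returns (outputs, new_cache).

    `cache` holds the last len(weights) - 1 inputs seen so far, so the layer
    never needs to reprocess earlier chunks.
    """
    k = len(weights)
    if cache is None:
        cache = [0.0] * (k - 1)  # zero left-padding at the start of the stream
    padded = cache + list(chunk)
    outputs = [
        sum(w * padded[t + i] for i, w in enumerate(weights))
        for t in range(len(chunk))
    ]
    # Keep only the most recent k - 1 inputs for the next call.
    new_cache = padded[len(padded) - (k - 1):]
    return outputs, new_cache


def stream(signal, weights, chunk_size):
    """Process the signal chunk by chunk, carrying the cache between calls."""
    cache, outputs = None, []
    for start in range(0, len(signal), chunk_size):
        out, cache = causal_conv(signal[start:start + chunk_size], weights, cache)
        outputs.extend(out)
    return outputs


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
weights = [0.5, 0.25, 0.25]
offline, _ = causal_conv(signal, weights)
assert stream(signal, weights, chunk_size=2) == offline
```

Because each chunk sees exactly the same left context it would see in a single offline pass, the streamed and offline outputs are identical in this sketch, which is the kind of train/inference consistency the abstract refers to.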

Comparing mindfulness-based group therapy with treatment as usual for opioid dependents: A pilot randomized clinical trial study protocol

Authors: Bowen S.; Gharraee B.; Habibi M.; Imani S.; Noroozi A.; Vahid M.K.A.
Publication date: 01/01/2015

Background: In response to the high burden of opioid abuse in Iran, the Ministry of Health has launched a large-scale opioid maintenance treatment program, delivered through a network of certified drug treatment centers. To promote opioid pharmacotherapies, there is an urgent need to develop and introduce evidence-based psychosocial interventions into the network. Patients and Methods: This is a randomized clinical trial (RCT) to investigate the feasibility and effectiveness of adding mindfulness-based group therapy to opioid pharmacotherapies as compared to opioid pharmacotherapies alone. The primary outcomes were treatment retention and the percentage of weekly tests negative for morphine, methamphetamine, and benzodiazepine. Discussion: This is the first RCT that explores the effectiveness of mindfulness-based relapse prevention group therapy among opioid-dependent clients in Iran. The feasibility of group therapy and the comparison of outcomes in the intervention and control groups should be discussed in the outcome article. © 2015, Mazandaran University of Medical Sciences

eprints Iran University of Medical Sciences