
    FastRemap: A Tool for Quickly Remapping Reads between Genome Assemblies

    A genome read data set can be quickly and efficiently remapped from one reference to another similar reference (e.g., between two reference versions or two similar species) using a variety of tools, e.g., the commonly-used CrossMap tool. With the explosion of available genomic data sets and references, high-performance remapping tools will be even more important for keeping up with the computational demands of genome assembly and analysis. We provide FastRemap, a fast and efficient tool for remapping reads between genome assemblies. FastRemap provides up to a 7.82× speedup (6.47×, on average) and uses as little as 61.7% (80.7%, on average) of the peak memory consumption compared to the state-of-the-art remapping tool, CrossMap. FastRemap is written in C++. The source code and user manual are freely available at: github.com/CMU-SAFARI/FastRemap. Docker image available at: https://hub.docker.com/r/alkanlab/fast. Also available in Bioconda at: https://anaconda.org/bioconda/fastremap-bio. Comment: FastRemap is open source and all scripts needed to replicate the results in this paper can be found at https://github.com/CMU-SAFARI/FastRema

    RawHash: Enabling Fast and Accurate Real-Time Analysis of Raw Nanopore Signals for Large Genomes

    Nanopore sequencers generate electrical raw signals in real-time while sequencing long genomic strands. These raw signals can be analyzed as they are generated, providing an opportunity for real-time genome analysis. An important feature of nanopore sequencing, Read Until, can eject strands from sequencers without fully sequencing them, which provides opportunities to computationally reduce the sequencing time and cost. However, existing works utilizing Read Until either 1) require powerful computational resources that may not be available for portable sequencers or 2) lack scalability for large genomes, rendering them inaccurate or ineffective. We propose RawHash, the first mechanism that can accurately and efficiently perform real-time analysis of nanopore raw signals for large genomes using a hash-based similarity search. To enable this, RawHash ensures the signals corresponding to the same DNA content lead to the same hash value, regardless of the slight variations in these signals. RawHash achieves an accurate hash-based similarity search via an effective quantization of the raw signals such that signals corresponding to the same DNA content have the same quantized value and, subsequently, the same hash value. We evaluate RawHash on three applications: 1) read mapping, 2) relative abundance estimation, and 3) contamination analysis. Our evaluations show that RawHash is the only tool that can provide high accuracy and high throughput for analyzing large genomes in real-time. When compared to the state-of-the-art techniques, UNCALLED and Sigmap, RawHash provides 1) 25.8x and 3.4x better average throughput and 2) an average speedup of 32.1x and 2.1x in the mapping time, respectively. Source code is available at https://github.com/CMU-SAFARI/RawHash
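The quantize-then-hash idea described in the RawHash abstract can be illustrated with a small sketch. This is not RawHash's implementation: the bucket width `step`, the window length `k`, the use of SHA-1, and the function names are all assumptions chosen for illustration. The sketch only shows why slightly different signal values can end up with identical hash values once they are quantized into the same buckets.

```python
import hashlib


def quantize(values, step=0.5):
    """Map each signal value to an integer bucket of width `step`.

    `step` is a hypothetical parameter; values that differ slightly
    but fall in the same bucket get the same quantized value.
    """
    return [int(v // step) for v in values]


def window_hashes(values, k=3, step=0.5):
    """Hash every window of `k` consecutive quantized values (assumed scheme)."""
    q = quantize(values, step)
    hashes = []
    for i in range(len(q) - k + 1):
        key = ",".join(str(x) for x in q[i:i + k]).encode()
        hashes.append(hashlib.sha1(key).hexdigest()[:8])
    return hashes


# Two noisy measurements of the same (made-up) underlying signal.
signal_a = [1.02, 1.61, 2.13, 0.55]
signal_b = [1.07, 1.58, 2.20, 0.60]

# Both quantize to the same buckets, so their window hashes match.
print(window_hashes(signal_a) == window_hashes(signal_b))
```

A fixed-width bucket scheme like this one splits values that straddle a bucket boundary, so a real system would need a more careful quantization than this toy version.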

    A Framework for Designing Efficient Deep Learning-Based Genomic Basecallers

    Nanopore sequencing generates noisy electrical signals that need to be converted into a standard string of DNA nucleotide bases using a computational step called basecalling. The accuracy and speed of basecalling have critical implications for all later steps in genome analysis. Many researchers adopt complex deep learning-based models to perform basecalling without considering the compute demands of such models, which leads to slow, inefficient, and memory-hungry basecallers. Therefore, there is a need to reduce the computation and memory cost of basecalling while maintaining accuracy. Our goal is to develop a comprehensive framework for creating deep learning-based basecallers that provide high efficiency and performance. We introduce RUBICON, a framework to develop hardware-optimized basecallers. RUBICON consists of two novel machine-learning techniques that are specifically designed for basecalling. First, we introduce the first quantization-aware basecalling neural architecture search (QABAS) framework to specialize the basecalling neural network architecture for a given hardware acceleration platform while jointly exploring and finding the best bit-width precision for each neural network layer. Second, we develop SkipClip, the first technique to remove the skip connections present in modern basecallers to greatly reduce resource and storage requirements without any loss in basecalling accuracy. We demonstrate the benefits of RUBICON by developing RUBICALL, the first hardware-optimized basecaller that performs fast and accurate basecalling. Compared to the fastest state-of-the-art basecaller, RUBICALL provides a 3.96x speedup with 2.97% higher accuracy. We show that RUBICON helps researchers develop hardware-optimized basecallers that are superior to expert-designed models

    RUBICON: a framework for designing efficient deep learning-based genomic basecallers

    Nanopore sequencing generates noisy electrical signals that need to be converted into a standard string of DNA nucleotide bases using a computational step called basecalling. The performance of basecalling has critical implications for all later steps in genome analysis. Therefore, there is a need to reduce the computation and memory cost of basecalling while maintaining accuracy. We present RUBICON, a framework to develop efficient hardware-optimized basecallers. We demonstrate the effectiveness of RUBICON by developing RUBICALL, the first hardware-optimized mixed-precision basecaller that performs efficient basecalling, outperforming the state-of-the-art basecallers. We believe RUBICON offers a promising path to develop future hardware-optimized basecallers.

    High rate of colistin and fosfomycin resistance among carbapenemase-producing Enterobacteriaceae in Turkey

    As the problem of carbapenem-resistant Enterobacteriaceae (CRE) grows, older antimicrobial agents such as colistin and fosfomycin are being used to treat these infections. In this study, a total of 147 multidrug-resistant (MDR) or extensively drug-resistant (XDR) strains of CRE were tested using the broth microdilution method for colistin and the agar dilution method for fosfomycin. The study included Klebsiella pneumoniae (91.16%), Escherichia coli (7.48%), Enterobacter cloacae (0.68%), and Serratia marcescens (0.68%). All these strains produce various types of carbapenemase, including OXA-48, NDM, and KPC. The carbapenemase mechanisms identified were OXA-48 (78.23%), NDM (2.04%), and KPC (0.68%) alone, or the combinations OXA-48 and NDM (10.88%) and OXA-48 and KPC (0.68%). Of the strains, 76.19% were resistant to colistin and 67.35% were resistant to fosfomycin. A total of 21 out of 35 colistin-susceptible strains were found to be susceptible to fosfomycin. This study showed that the resistance rates for colistin and fosfomycin are high. The MDR and XDR strains of CRE are spreading in our region, and thus a monitoring system for CRE should be implemented. Moreover, the applicability of antimicrobial stewardship programs should be increased in all inpatient and outpatient settings.

    TargetCall: Eliminating the Wasted Computation in Basecalling via Pre-Basecalling Filtering

    Basecalling is an essential step in nanopore sequencing analysis where the raw signals of nanopore sequencers are converted into nucleotide sequences, i.e., reads. State-of-the-art basecallers employ complex deep learning models to achieve high basecalling accuracy. This makes basecalling computationally inefficient and memory-hungry, bottlenecking the entire genome analysis pipeline. However, for many applications, the majority of reads do not match the reference genome of interest (i.e., target reference) and thus are discarded in later steps in the genomics pipeline, wasting the basecalling computation. To overcome this issue, we propose TargetCall, the first fast and widely-applicable pre-basecalling filter to eliminate the wasted computation in basecalling. TargetCall's key idea is to discard reads that will not match the target reference (i.e., off-target reads) prior to basecalling. TargetCall consists of two main components: (1) LightCall, a lightweight neural network basecaller that produces noisy reads; and (2) Similarity Check, which labels each of these noisy reads as on-target or off-target by matching them to the target reference. TargetCall filters out all off-target reads before basecalling, and the highly-accurate but slow basecalling is performed only on the raw signals whose noisy reads are labeled as on-target. Our thorough experimental evaluations using both real and simulated data show that TargetCall 1) improves the end-to-end basecalling performance of the state-of-the-art basecaller by 3.31x while maintaining high (98.88%) sensitivity in keeping on-target reads, 2) maintains high accuracy in downstream analysis, 3) precisely filters out up to 94.71% of off-target reads, and 4) achieves better performance, sensitivity, and generality compared to prior works. We freely open-source TargetCall at https://github.com/CMU-SAFARI/TargetCall
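The two-stage structure described in the TargetCall abstract (a cheap noisy basecall, a similarity check, then accurate basecalling only for on-target reads) can be sketched as a simple control flow. The function `filter_then_basecall` and the three callables passed to it are hypothetical names, and the toy stand-ins below (string slicing and a substring test) replace the neural networks and the read-to-reference matching of the real tool. They are there only to make the sketch runnable.

```python
def filter_then_basecall(signals, light_call, is_on_target, full_call):
    """Run the expensive basecaller only on reads labeled on-target.

    signals:      dict mapping read id -> raw signal
    light_call:   cheap basecaller producing a noisy read (assumed callable)
    is_on_target: similarity check against the target reference (assumed callable)
    full_call:    accurate but slow basecaller (assumed callable)
    """
    results = {}
    for read_id, signal in signals.items():
        noisy_read = light_call(signal)
        if is_on_target(noisy_read):
            # Only on-target signals pay the cost of full basecalling.
            results[read_id] = full_call(signal)
    return results


# Toy stand-ins: "signals" are lowercase strings and the reference is a short string.
reference = "ACGTACGTTGCA"
signals = {"r1": "acgtacgt", "r2": "ggggcccc"}

called = filter_then_basecall(
    signals,
    light_call=lambda s: s[:4].upper(),        # crude, partial "noisy read"
    is_on_target=lambda read: read in reference,
    full_call=lambda s: s.upper(),             # pretend high-accuracy basecall
)
print(called)
```

In this toy run only `r1` reaches the expensive step, which is the source of the saved computation the abstract describes.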

    RowPress: Amplifying Read Disturbance in Modern DRAM Chips

    Memory isolation is critical for system reliability, security, and safety. Unfortunately, read disturbance can break memory isolation in modern DRAM chips. For example, RowHammer is a well-studied read-disturb phenomenon where repeatedly opening and closing (i.e., hammering) a DRAM row many times causes bitflips in physically nearby rows. This paper experimentally demonstrates and analyzes another widespread read-disturb phenomenon, RowPress, in real DDR4 DRAM chips. RowPress breaks memory isolation by keeping a DRAM row open for a long period of time, which disturbs physically nearby rows enough to cause bitflips. We show that RowPress amplifies DRAM's vulnerability to read-disturb attacks by significantly reducing the number of row activations needed to induce a bitflip by one to two orders of magnitude under realistic conditions. In extreme cases, RowPress induces bitflips in a DRAM row when an adjacent row is activated only once. Our detailed characterization of 164 real DDR4 DRAM chips shows that RowPress 1) affects chips from all three major DRAM manufacturers, 2) gets worse as DRAM technology scales down to smaller node sizes, and 3) affects a different set of DRAM cells from RowHammer and behaves differently from RowHammer as temperature and access pattern changes. We demonstrate in a real DDR4-based system with RowHammer protection that 1) a user-level program induces bitflips by leveraging RowPress while conventional RowHammer cannot do so, and 2) a memory controller that adaptively keeps the DRAM row open for a longer period of time based on access pattern can facilitate RowPress-based attacks. To prevent bitflips due to RowPress, we describe and evaluate a new methodology that adapts existing RowHammer mitigation techniques to also mitigate RowPress with low additional performance overhead. 
We open source all our code and data to facilitate future research on RowPress. Comment: Extended version of the paper "RowPress: Amplifying Read Disturbance in Modern DRAM Chips" at the 50th Annual International Symposium on Computer Architecture (ISCA), 202

    Spontaneous pregnancy with dehydroepiandrosterone at advanced age: three case reports and a literature review

    It has been reported recently that dehydroepiandrosterone (DHEA) supplementation in older patients with low ovarian reserve increases the response to infertility treatment. Three women aged >35 years with low ovarian reserve parameters, including high FSH and low AFC, who were treated with DHEA for various time intervals are reported here. They conceived spontaneously after a few months of treatment, which resulted in healthy newborns. Although there are not many randomized controlled trials on the value of DHEA treatment in the infertile patient population, previous case reports support that DHEA increases spontaneous pregnancy rates. Considering the healthy live births at the relatively older age of the mentioned cases, DHEA might also be improving oocyte quality as an additional impact. [Translated from the Turkish abstract:] In recent years, dehydroepiandrosterone has been found to increase the response to infertility treatment in patients with low ovarian reserve. This case series reviews three cases of spontaneous pregnancy that reached term in good health after short-term dehydroepiandrosterone use in patients older than 35 years with poor ovarian reserve, demonstrated by high day-3 FSH and low antral follicle count, together with the current literature on the subject. Although there are no very large randomized studies of dehydroepiandrosterone, the case reports in the literature and the cases presented here indicate that this androgen increases the chance of spontaneous pregnancy in patients with low ovarian reserve. That the pregnancies ended in healthy live births despite advanced maternal age also supports the idea that it improves oocyte quality.

    ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-Efficient Genome Analysis

    Profile hidden Markov models (pHMMs) are widely employed in various bioinformatics applications to identify similarities between biological sequences, such as DNA or protein sequences. In pHMMs, sequences are represented as graph structures. These probabilities are subsequently used to compute the similarity score between a sequence and a pHMM graph. The Baum-Welch algorithm, a prevalent and highly accurate method, utilizes these probabilities to optimize and compute similarity scores. However, the Baum-Welch algorithm is computationally intensive, and existing solutions offer either software-only or hardware-only approaches with fixed pHMM designs. We identify an urgent need for a flexible, high-performance, and energy-efficient HW/SW co-design to address the major inefficiencies in the Baum-Welch algorithm for pHMMs. We introduce ApHMM, the first flexible acceleration framework designed to significantly reduce both computational and energy overheads associated with the Baum-Welch algorithm for pHMMs. ApHMM tackles the major inefficiencies in the Baum-Welch algorithm by 1) designing flexible hardware to accommodate various pHMM designs, 2) exploiting predictable data dependency patterns through on-chip memory with memoization techniques, 3) rapidly filtering out negligible computations using a hardware-based filter, and 4) minimizing redundant computations. ApHMM achieves substantial speedups of 15.55x - 260.03x, 1.83x - 5.34x, and 27.97x when compared to CPU, GPU, and FPGA implementations of the Baum-Welch algorithm, respectively. ApHMM outperforms state-of-the-art CPU implementations in three key bioinformatics applications: 1) error correction, 2) protein family search, and 3) multiple sequence alignment, by 1.29x - 59.94x, 1.03x - 1.75x, and 1.03x - 1.95x, respectively, while improving their energy efficiency by 64.24x - 115.46x, 1.75x, and 1.96x, respectively. Comment: Accepted to ACM TAC

    Lot quality survey: an appealing method for rapid evaluation of vaccine coverage in developing countries – experience in Turkey

    Background: Vaccine-preventable diseases cause significant morbidity and mortality worldwide and in developing countries in particular. Information on coverage and reasons for non-vaccination is vital to enhance overall vaccination activities. Of the several survey techniques available for investigating vaccination coverage in a given setting, the Lot Quality Technique (LQT) remains appealing and could be used in developing countries by local health personnel of district or rural health authorities to evaluate their performance in vaccination and many other health-related programs. This study aimed to evaluate vaccination coverage using LQT in a selected semi-urban setting in Turkey.
    Methods: An LQT-based cross-sectional study was conducted in Kecioren District on a representative sample of residents aged 12–23 months in order to evaluate coverage for routine childhood vaccines, to identify health units with coverage below 75%, and to investigate reasons for non-vaccination.
    Results: Based on self-reports, coverage for BCG, diphtheria-pertussis-tetanus (DPT-3), oral polio-3, hepatitis-3, and measles vaccines ranged between 94–99%. Coverage for measles was below 75% in five lots. The relatively high educational and socioeconomic status of parents in the study group alone could not minimize the "considerable" risk of vaccine-preventable diseases in the District and dictates a continuity of efforts for improving vaccination rates, with special emphasis on measles. We believe that administrative methods should be backed up by household surveys to strengthen vaccination monitoring and that families should be trained and motivated to have their children fully vaccinated according to the recommended schedule and in a timely manner.
    Conclusion: This study identified vaccine coverage for seven routine vaccines completed before the age of 24 months as well as the areas requiring special attention in vaccination services. The LQT, years after its introduction to health-related research, remains an appealing technique for rapid evaluation of the extent of a variety of local health concerns in developing countries, in rural areas in particular, and is very efficient in determining performance of individual subunits in a given service area. Training of local health personnel on use of the LQT could expedite response to local health problems and could even motivate them in conducting their own surveys tailored to their professional interests.