FastRemap: A Tool for Quickly Remapping Reads between Genome Assemblies
A genome read data set can be quickly and efficiently remapped from one
reference to another similar reference (e.g., between two reference versions or
two similar species) using a variety of tools, e.g., the commonly-used CrossMap
tool. With the explosion of available genomic data sets and references,
high-performance remapping tools will be even more important for keeping up
with the computational demands of genome assembly and analysis.
We provide FastRemap, a fast and efficient tool for remapping reads between
genome assemblies. FastRemap provides up to a 7.82x speedup
(6.47x, on average) and uses as low as 61.7% (80.7%, on average) of the
peak memory consumption compared to the state-of-the-art remapping tool,
CrossMap.
FastRemap is written in C++. The source code and user manual are freely
available at: github.com/CMU-SAFARI/FastRemap. Docker image available at:
https://hub.docker.com/r/alkanlab/fast. Also available in Bioconda at:
https://anaconda.org/bioconda/fastremap-bio.
Comment: FastRemap is open source and all scripts needed to replicate the
results in this paper can be found at https://github.com/CMU-SAFARI/FastRema
RawHash: Enabling Fast and Accurate Real-Time Analysis of Raw Nanopore Signals for Large Genomes
Nanopore sequencers generate electrical raw signals in real-time while
sequencing long genomic strands. These raw signals can be analyzed as they are
generated, providing an opportunity for real-time genome analysis. An important
feature of nanopore sequencing, Read Until, can eject strands from sequencers
without fully sequencing them, which provides opportunities to computationally
reduce the sequencing time and cost. However, existing works utilizing Read
Until either 1) require powerful computational resources that may not be
available for portable sequencers or 2) lack scalability for large genomes,
rendering them inaccurate or ineffective.
We propose RawHash, the first mechanism that can accurately and efficiently
perform real-time analysis of nanopore raw signals for large genomes using a
hash-based similarity search. To enable this, RawHash ensures the signals
corresponding to the same DNA content lead to the same hash value, regardless
of the slight variations in these signals. RawHash achieves an accurate
hash-based similarity search via an effective quantization of the raw signals
such that signals corresponding to the same DNA content have the same quantized
value and, subsequently, the same hash value.
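As a rough illustration of the quantize-then-hash idea described above, the following Python sketch buckets signal values into a small number of levels and hashes windows of consecutive quantized values. The function names, the number of levels, the value range, the window length, and the sample signals are all illustrative assumptions, not RawHash's actual parameters or implementation.

```python
from typing import List


def quantize(values: List[float], levels: int = 8,
             lo: float = -2.0, hi: float = 2.0) -> List[int]:
    """Map each value to one of `levels` equal-width buckets over [lo, hi]."""
    width = (hi - lo) / levels
    out = []
    for v in values:
        clipped = min(max(v, lo), hi)        # clamp outliers into the range
        bucket = int((clipped - lo) / width)
        out.append(min(bucket, levels - 1))  # hi itself falls in the last bucket
    return out


def window_hashes(quantized: List[int], k: int = 4) -> List[int]:
    """Hash every window of k consecutive quantized values."""
    return [hash(tuple(quantized[i:i + k]))
            for i in range(len(quantized) - k + 1)]


# Two slightly different readings of the same hypothetical signal.
signal_a = [0.10, 1.20, -0.70, 0.60, -1.40]
signal_b = [0.14, 1.16, -0.74, 0.63, -1.37]

# True when both readings produce identical hash values.
same = window_hashes(quantize(signal_a)) == window_hashes(quantize(signal_b))
```

In this toy setup the small differences between the two readings disappear after quantization, so both produce the same hashes. Values that sit close to a bucket boundary could still land in different buckets, which is why the choice of quantization matters.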
We evaluate RawHash on three applications: 1) read mapping, 2) relative
abundance estimation, and 3) contamination analysis. Our evaluations show that
RawHash is the only tool that can provide high accuracy and high throughput for
analyzing large genomes in real-time. When compared to the state-of-the-art
techniques, UNCALLED and Sigmap, RawHash provides 1) 25.8x and 3.4x better
average throughput and 2) an average speedup of 32.1x and 2.1x in the mapping
time, respectively.
Source code is available at https://github.com/CMU-SAFARI/RawHash
A Framework for Designing Efficient Deep Learning-Based Genomic Basecallers
Nanopore sequencing generates noisy electrical signals that need to be
converted into a standard string of DNA nucleotide bases using a computational
step called basecalling. The accuracy and speed of basecalling have critical
implications for all later steps in genome analysis. Many researchers adopt
complex deep learning-based models to perform basecalling without considering
the compute demands of such models, which leads to slow, inefficient, and
memory-hungry basecallers. Therefore, there is a need to reduce the computation
and memory cost of basecalling while maintaining accuracy. Our goal is to
develop a comprehensive framework for creating deep learning-based basecallers
that provide high efficiency and performance. We introduce RUBICON, a framework
to develop hardware-optimized basecallers. RUBICON consists of two novel
machine-learning techniques that are specifically designed for basecalling.
First, we introduce the first quantization-aware basecalling neural
architecture search (QABAS) framework to specialize the basecalling neural
network architecture for a given hardware acceleration platform while jointly
exploring and finding the best bit-width precision for each neural network
layer. Second, we develop SkipClip, the first technique to remove the skip
connections present in modern basecallers to greatly reduce resource and
storage requirements without any loss in basecalling accuracy. We demonstrate
the benefits of RUBICON by developing RUBICALL, the first hardware-optimized
basecaller that performs fast and accurate basecalling. Compared to the fastest
state-of-the-art basecaller, RUBICALL provides a 3.96x speedup with 2.97%
higher accuracy. We show that RUBICON helps researchers develop
hardware-optimized basecallers that are superior to expert-designed models
RUBICON: a framework for designing efficient deep learning-based genomic basecallers
Nanopore sequencing generates noisy electrical signals that need to be converted into a standard string of DNA nucleotide bases using a computational step called basecalling. The performance of basecalling has critical implications for all later steps in genome analysis. Therefore, there is a need to reduce the computation and memory cost of basecalling while maintaining accuracy. We present RUBICON, a framework to develop efficient hardware-optimized basecallers. We demonstrate the effectiveness of RUBICON by developing RUBICALL, the first hardware-optimized mixed-precision basecaller that performs efficient basecalling, outperforming the state-of-the-art basecallers. We believe RUBICON offers a promising path to develop future hardware-optimized basecallers.
High rate of colistin and fosfomycin resistance among carbapenemase-producing Enterobacteriaceae in Turkey
As infections caused by carbapenem-resistant Enterobacteriaceae (CRE) become more common, older antimicrobial agents such as colistin and fosfomycin are used to treat them. In this study, susceptibility was tested with the broth microdilution method for colistin and the agar dilution method for fosfomycin in a total of 147 multidrug-resistant (MDR) or extensively drug-resistant (XDR) strains of CRE. The study included Klebsiella pneumoniae (91.16%), Escherichia coli (7.48%), Enterobacter cloacae (0.68%), and Serratia marcescens (0.68%). All these strains produce various types of carbapenemase, including OXA-48, NDM, and KPC. Some of these strains also have three different carbapenemase mechanisms, including OXA-48 (78.23%), NDM (2.04%), and KPC (0.68%) or OXA-48 and NDM (10.88%), or OXA-48 and KPC (0.68%). About 76.19% and 67.35% of the strains were resistant to colistin and fosfomycin, respectively. A total of 21 out of 35 colistin-susceptible strains were found to be susceptible to fosfomycin. This study showed that the resistance rates for colistin and fosfomycin are high. The MDR and XDR strains of CRE are spreading in our region, and thus a monitoring system for CRE is needed. Moreover, the applicability of antimicrobial stewardship programs should be increased in all inpatient and outpatient settings.
TargetCall: Eliminating the Wasted Computation in Basecalling via Pre-Basecalling Filtering
Basecalling is an essential step in nanopore sequencing analysis where the
raw signals of nanopore sequencers are converted into nucleotide sequences,
i.e., reads. State-of-the-art basecallers employ complex deep learning models
to achieve high basecalling accuracy. This makes basecalling
computationally inefficient and memory-hungry, bottlenecking the entire genome
analysis pipeline. However, for many applications, the majority of reads do not
match the reference genome of interest (i.e., target reference) and thus are
discarded in later steps in the genomics pipeline, wasting the basecalling
computation. To overcome this issue, we propose TargetCall, the first fast and
widely-applicable pre-basecalling filter to eliminate the wasted computation in
basecalling. TargetCall's key idea is to discard reads that will not match the
target reference (i.e., off-target reads) prior to basecalling. TargetCall
consists of two main components: (1) LightCall, a lightweight neural network
basecaller that produces noisy reads; and (2) Similarity Check, which labels
each of these noisy reads as on-target or off-target by matching them to the
target reference. TargetCall filters out all off-target reads before
basecalling; and the highly-accurate but slow basecalling is performed only on
the raw signals whose noisy reads are labeled as on-target. Our thorough
experimental evaluations using both real and simulated data show that
TargetCall 1) improves the end-to-end basecalling performance of the
state-of-the-art basecaller by 3.31x while maintaining high (98.88%)
sensitivity in keeping on-target reads, 2) maintains high accuracy in
downstream analysis, 3) precisely filters out up to 94.71% of off-target reads,
and 4) achieves better performance, sensitivity, and generality compared to
prior works. We freely open-source TargetCall at
https://github.com/CMU-SAFARI/TargetCall
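The two-stage flow described above (a cheap noisy basecall, a similarity check, then accurate basecalling only for on-target signals) can be sketched as a simple filter. This is a minimal sketch, not TargetCall's code: the function `filter_then_basecall`, its parameters, and the toy stand-ins for the basecallers and the similarity check are all hypothetical.

```python
from typing import Callable, Iterable, List


def filter_then_basecall(
    raw_signals: Iterable[str],
    light_call: Callable[[str], str],      # fast, noisy basecaller (stand-in)
    is_on_target: Callable[[str], bool],   # similarity check against the target
    accurate_call: Callable[[str], str],   # slow, accurate basecaller (stand-in)
) -> List[str]:
    """Run the accurate basecaller only on signals whose noisy read is on-target."""
    reads = []
    for signal in raw_signals:
        noisy_read = light_call(signal)
        if is_on_target(noisy_read):
            reads.append(accurate_call(signal))
        # off-target signals are dropped before the expensive step
    return reads


# Toy stand-ins: "signals" are strings and the similarity check is a substring test.
target_reference = "acgtacgt"
signals = ["ACGT", "TTTT", "GTAC"]

kept = filter_then_basecall(
    signals,
    light_call=lambda s: s.lower(),
    is_on_target=lambda read: read in target_reference,
    accurate_call=lambda s: s.upper(),
)
```

Here the second signal never reaches `accurate_call`, which is the computation the filter is meant to save.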
RowPress: Amplifying Read Disturbance in Modern DRAM Chips
Memory isolation is critical for system reliability, security, and safety.
Unfortunately, read disturbance can break memory isolation in modern DRAM
chips. For example, RowHammer is a well-studied read-disturb phenomenon where
repeatedly opening and closing (i.e., hammering) a DRAM row many times causes
bitflips in physically nearby rows.
This paper experimentally demonstrates and analyzes another widespread
read-disturb phenomenon, RowPress, in real DDR4 DRAM chips. RowPress breaks
memory isolation by keeping a DRAM row open for a long period of time, which
disturbs physically nearby rows enough to cause bitflips. We show that RowPress
amplifies DRAM's vulnerability to read-disturb attacks by significantly
reducing the number of row activations needed to induce a bitflip by one to two
orders of magnitude under realistic conditions. In extreme cases, RowPress
induces bitflips in a DRAM row when an adjacent row is activated only once. Our
detailed characterization of 164 real DDR4 DRAM chips shows that RowPress 1)
affects chips from all three major DRAM manufacturers, 2) gets worse as DRAM
technology scales down to smaller node sizes, and 3) affects a different set of
DRAM cells from RowHammer and behaves differently from RowHammer as temperature
and access pattern change.
We demonstrate in a real DDR4-based system with RowHammer protection that 1)
a user-level program induces bitflips by leveraging RowPress while conventional
RowHammer cannot do so, and 2) a memory controller that adaptively keeps the
DRAM row open for a longer period of time based on access pattern can
facilitate RowPress-based attacks. To prevent bitflips due to RowPress, we
describe and evaluate a new methodology that adapts existing RowHammer
mitigation techniques to also mitigate RowPress with low additional performance
overhead. We open source all our code and data to facilitate future research on
RowPress.
Comment: Extended version of the paper "RowPress: Amplifying Read Disturbance
in Modern DRAM Chips" at the 50th Annual International Symposium on Computer
Architecture (ISCA), 202
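Spontaneous pregnancy with dehydroepiandrosterone at advanced age: three case reports and a literature review (İleri yaşta dehidroepiandrosteron ile spontan gebelik: üç olgu sunumu ve literatür derlemesi)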
It has been reported recently that dehydroepiandrosterone (DHEA) supplementation in older patients with low ovarian reserve increases the response to infertility treatment. Three women aged >35 years with low ovarian reserve parameters, including high FSH and low AFC, who were treated with DHEA for various time intervals are reported here. They conceived spontaneously after a few months of treatment, which resulted in healthy newborns. Although there are not many randomized controlled trials on the value of DHEA treatment in the infertile patient population, previous case reports support that DHEA increases spontaneous pregnancy rates. Considering the healthy live births at the relatively older age of the mentioned cases, DHEA might also be improving oocyte quality as an additional effect.
In recent years, dehydroepiandrosterone has been found to increase the response to infertility treatment in patients with low ovarian reserve. This case series reviews three patients older than 35 years with poor ovarian reserve, demonstrated by high day-3 FSH and low antral follicle count, in whom short-term dehydroepiandrosterone use was followed by spontaneous pregnancies that reached term in good health, together with the current literature on the subject. Although there are no very large randomized studies of dehydroepiandrosterone, the case reports in the literature and the cases presented here demonstrate that this androgen increases the chance of spontaneous pregnancy in patients with low ovarian reserve. That the pregnancies resulted in healthy live births despite advanced maternal age also supports the idea that it improves oocyte quality.
ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-Efficient Genome Analysis
Profile hidden Markov models (pHMMs) are widely employed in various
bioinformatics applications to identify similarities between biological
sequences, such as DNA or protein sequences. In pHMMs, sequences are
represented as graph structures whose states and transitions are assigned
probabilities. These probabilities are subsequently used to
compute the similarity score between a sequence and a pHMM graph. The
Baum-Welch algorithm, a prevalent and highly accurate method, utilizes these
probabilities to optimize and compute similarity scores. However, the
Baum-Welch algorithm is computationally intensive, and existing solutions offer
either software-only or hardware-only approaches with fixed pHMM designs. We
identify an urgent need for a flexible, high-performance, and energy-efficient
HW/SW co-design to address the major inefficiencies in the Baum-Welch algorithm
for pHMMs.
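Baum-Welch training repeatedly runs forward and backward passes over the model to score sequences. As a minimal sketch of the forward pass only, shown for a tiny generic HMM rather than a full pHMM, the Python below computes the probability of an observed sequence. The state names, symbols, and all probability values are made-up assumptions for illustration and have no connection to ApHMM's implementation.

```python
from typing import Dict, List


def forward_likelihood(
    observations: List[str],
    start: Dict[str, float],
    trans: Dict[str, Dict[str, float]],
    emit: Dict[str, Dict[str, float]],
) -> float:
    """Return P(observations | model) for a non-empty observation list."""
    states = list(start)
    # alpha[s]: probability of the observations seen so far, ending in state s
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for symbol in observations[1:]:
        alpha = {
            s: emit[s][symbol] * sum(alpha[p] * trans[p][s] for p in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical two-state model: "M" (match-like) and "I" (insertion-like).
start = {"M": 0.9, "I": 0.1}
trans = {"M": {"M": 0.8, "I": 0.2}, "I": {"M": 0.5, "I": 0.5}}
emit = {"M": {"A": 0.7, "C": 0.3}, "I": {"A": 0.5, "C": 0.5}}

likelihood = forward_likelihood(["A", "C"], start, trans, emit)
```

Each step multiplies and sums over every pair of states, so the cost grows with both the sequence length and the square of the number of states, which is one reason this computation is expensive at scale.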
We introduce ApHMM, the first flexible acceleration framework designed to
significantly reduce both computational and energy overheads associated with
the Baum-Welch algorithm for pHMMs. ApHMM tackles the major inefficiencies in
the Baum-Welch algorithm by 1) designing flexible hardware to accommodate
various pHMM designs, 2) exploiting predictable data dependency patterns
through on-chip memory with memoization techniques, 3) rapidly filtering out
negligible computations using a hardware-based filter, and 4) minimizing
redundant computations.
ApHMM achieves substantial speedups of 15.55x - 260.03x, 1.83x - 5.34x, and
27.97x when compared to CPU, GPU, and FPGA implementations of the Baum-Welch
algorithm, respectively. ApHMM outperforms state-of-the-art CPU implementations
in three key bioinformatics applications: 1) error correction, 2) protein
family search, and 3) multiple sequence alignment, by 1.29x - 59.94x, 1.03x -
1.75x, and 1.03x - 1.95x, respectively, while improving their energy efficiency
by 64.24x - 115.46x, 1.75x, and 1.96x.
Comment: Accepted to ACM TAC
Lot quality survey: an appealing method for rapid evaluation of vaccine coverage in developing countries – experience in Turkey
Background: Vaccine-preventable diseases cause significant morbidity and mortality worldwide and in developing countries in particular. Information on coverage and reasons for non-vaccination is vital to enhance overall vaccination activities. Of the several survey techniques available for investigating vaccination coverage in a given setting, the Lot Quality Technique (LQT) remains appealing and could be used in developing countries by local health personnel of district or rural health authorities to evaluate their performance in vaccination and many other health-related programs. This study aimed to evaluate vaccination coverage using LQT in a selected semi-urban setting in Turkey.
Methods: An LQT-based cross-sectional study was conducted in Kecioren District on a representative sample of residents aged 12–23 months in order to evaluate coverage for routine childhood vaccines, to identify health units with coverage below 75%, and to investigate reasons for non-vaccination.
Results: Based on self-reports, coverage for BCG, diphtheria-pertussis-tetanus (DPT-3), oral polio-3, hepatitis-3, and measles vaccines ranged between 94–99%. Coverage for measles was below 75% in five lots. The relatively high educational and socioeconomic status of parents in the study group alone could not minimize the "considerable" risk of vaccine-preventable diseases in the District and dictates a continuity of efforts for improving vaccination rates, with special emphasis on measles. We believe that administrative methods should be backed up by household surveys to strengthen vaccination monitoring and that families should be trained and motivated to have their children fully vaccinated according to the recommended schedule and in a timely manner.
Conclusion: This study identified vaccine coverage for seven routine vaccines completed before the age of 24 months as well as the areas requiring special attention in vaccination services. The LQT, years after its introduction to health-related research, remains an appealing technique for rapid evaluation of the extent of a variety of local health concerns in developing countries, in rural areas in particular, and is very efficient in determining performance of individual subunits in a given service area. Training of local health personnel on use of the LQT could expedite response to local health problems and could even motivate them in conducting their own surveys tailored to their professional interests.
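Lot quality surveys classify each lot (here, a health unit) from a small sample by comparing the number of unvaccinated children found against a decision value. The Python sketch below illustrates that generic decision rule and the binomial misclassification risks behind it. The sample size of 20, the decision value of 3, and the 95% upper coverage level are hypothetical choices for illustration, not the values used in this study; only the 75% lower threshold comes from the abstract.

```python
from math import comb


def prob_at_most(d: int, n: int, p: float) -> float:
    """P(X <= d) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(d + 1))


def accept_lot(unvaccinated_found: int, decision_value: int) -> bool:
    """Accept a lot when its sample has at most `decision_value` unvaccinated children."""
    return unvaccinated_found <= decision_value


n, d = 20, 3  # hypothetical sample size per lot and decision value

# Chance that a lot with only 75% true coverage (25% unvaccinated) is accepted.
risk_accept_low = prob_at_most(d, n, 0.25)

# Chance that a lot with 95% true coverage (5% unvaccinated) is rejected.
risk_reject_high = 1 - prob_at_most(d, n, 0.05)
```

Raising the decision value lowers the chance of rejecting a well-performing lot but raises the chance of accepting a poorly performing one, so the pair `(n, d)` is chosen to balance the two risks.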