11 research outputs found

    Embedded CMOS Basecalling for Nanopore DNA Sequencing

    DNA sequencing is undergoing a profound evolution into a mobile technology. Unfortunately the effort needed to process the data emerging from this new sequencing technology requires a compute power only available to traditional desktop or cloud-based machines. To empower the full potential of portable DNA solutions a means of efficiently carrying out their computing needs in an embedded format will certainly be required. This thesis presents the design of a custom fixed-point VLSI hardware implementation of an HMM-based multi-channel DNA sequence processor. A 4096 state (6-mer nanopore sensor) basecalling architecture is designed in a 32-nm CMOS technology with the ability to process 1 million DNA base pairs per second per channel. Over a 100 mm^2 silicon footprint the design could process the equivalent of one human genome every 30 seconds at a power consumption of around 5 W
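    The figures quoted in this abstract can be checked with a little arithmetic. The sketch below is not taken from the thesis: it only enumerates the 4^6 = 4096 six-mer states that an HMM basecaller of this kind would track, and estimates how many 1 Mbase/s channels a 30-second genome would need. The genome size (about 3.1 billion bases) and the single-pass coverage are assumptions, and the names `states`, `successors` and `channels_needed` are illustrative.

```python
from itertools import product

BASES = "ACGT"
K = 6  # k-mer length stated in the abstract (6-mer nanopore sensor)

# Each possible 6-mer is one hidden state: 4**6 = 4096 states.
states = ["".join(p) for p in product(BASES, repeat=K)]


def successors(kmer):
    """States reachable by a one-base step: drop the first base, append a new one."""
    return [kmer[1:] + b for b in BASES]


# ASSUMPTION (not from the abstract): ~3.1e9 bases per human genome, read once.
GENOME_BASES = 3.1e9
RATE_PER_CHANNEL = 1e6  # bases per second per channel, from the abstract
TARGET_SECONDS = 30     # one genome every 30 seconds, from the abstract

# Number of parallel channels needed to meet the stated throughput.
channels_needed = GENOME_BASES / (RATE_PER_CHANNEL * TARGET_SECONDS)

if __name__ == "__main__":
    print(len(states), "states")
    print(successors("ACGTAC"))
    print(round(channels_needed), "channels (under the assumptions above)")
```

    Under these assumptions the design would need on the order of a hundred parallel channels, which fits the multi-channel architecture the abstract describes.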

    Hardware Accelerated DNA Sequencing

    DNA sequencing technology is quickly evolving. The latest developments exploit nanopore sensing and microelectronics to realize real-time, hand-held devices. A critical limitation in these portable sequencing machines is the requirement of powerful data processing consoles, a need incompatible with portability and wide deployment. This thesis proposes a first step towards addressing this problem: the construction of specialized computing modules (hardware accelerators) that can execute the required computations in real-time, within a small footprint, and at a fraction of the power needed by conventional computers. Such a hardware accelerator, in FPGA form, is introduced and optimized specifically for the basecalling function of the DNA sequencing pipeline. Key basecalling computations are identified and ported to custom FPGA hardware. Remaining basecalling operations are maintained in a traditional CPU which maintains constant communications with its FPGA accelerator over the PCIe bus. Measured results demonstrated a 137X basecalling speed improvement over CPU-only methods while consuming 17X less power than a CPU-only method

    An In-Memory Architecture for High-Performance Long-Read Pre-Alignment Filtering

    With the recent move towards sequencing of accurate long reads, finding solutions that support efficient analysis of these reads becomes more necessary. The long execution time required for sequence alignment of long reads negatively affects genomic studies relying on sequence alignment. Although pre-alignment filtering as an extra step before alignment was recently introduced to mitigate sequence alignment for short reads, these filters do not work as efficiently for long reads. Moreover, even with efficient pre-alignment filters, the overall end-to-end (i.e., filtering + original alignment) execution time of alignment for long reads remains high, while the filtering step is now a major portion of the end-to-end execution time. Our paper makes three contributions. First, it identifies data movement of sequences between memory units and computing units as the main source of inefficiency for pre-alignment filters of long reads. This is because although filters reject many of these long sequencing pairs before they get to the alignment stage, they still incur a huge cost in time and energy consumption for the large data transferred between memory and processor. Second, this paper introduces an adaptation of a short-read pre-alignment filtering algorithm suitable for long reads. We call this LongGeneGuardian. Finally, it presents FilterFuse as an architecture that supports LongGeneGuardian inside the memory. FilterFuse exploits the Computation-In-Memory computing paradigm, eliminating the cost of data movement in LongGeneGuardian. Our evaluations show that FilterFuse improves the execution time of filtering by 120.47x for long reads compared to the State-of-the-Art (SoTA) filter, SneakySnake. FilterFuse also improves the end-to-end execution time of sequence alignment by up to 49.14x and 5207.63x compared to SneakySnake with a SoTA aligner and a SoTA aligner alone, respectively

    Evaluation of nanopore-based sequencing technology for gene marker based analysis of complex microbial communities. Method development for accurate 16S rRNA gene amplicon sequencing

    Nucleic acid sequencing can provide a detailed overview of microbial communities in comparison with standard plate-culture methods. Expansion of high-throughput sequencing (HTS) technologies and reduction in analysis costs has allowed for detailed exploration of various habitats with use of amplicon, metagenomics, and metatranscriptomics approaches. However, due to the capital cost of HTS platforms and requirements for batch analysis, genomics-based studies are still not being used as a standard method for the comprehensive examination of environmental or clinical samples for microbial characterization. This research project investigated the potential of a novel nanopore-based sequencing platform from Oxford Nanopore Technologies (ONT) for rapid and accurate analysis of various environmentally complex samples. ONT is an emerging company that developed the first-ever portable nanopore-based sequencing platform called MinION™. The portability and miniaturised size of the device give an immense opportunity for de-centralised, in-field, and real-time analysis of environmental and clinical samples. Nonetheless, benchmarking of this new technology against the current gold-standard platform (i.e., Illumina sequencers) is necessary to evaluate nanopore data and understand its benefits and limitations. The focus of this study is on the evaluation of nanopore sequencing data: read quality, sequencing errors, alignment quality but also bacterial community structure. For this reason, mock bacterial community samples were generated, sequenced and analysed with use of multiple bioinformatics approaches. Furthermore, this study developed sophisticated library preparation and data analysis methods to enable high-accuracy analysis of amplicon libraries from complex microbial communities for sequencing on the nanopore platform. 
In addition, the best performing library preparation and data analysis methods were used for analysis of environmental samples and compared to high-quality Illumina metagenomics data. This work opens a new possibility for accurate, in-field amplicon analysis of complex samples with the use of MinION™ and for the development of autonomous biosensing technology for culture-free detection of pathogenic and non-pathogenic microorganisms in water, soil, food, drinks or blood

    Genome and transcriptome architecture in Pyrococcus furiosus

    Archaea nowadays are acknowledged for representing the second domain of life and for playing significant roles in the Earth's biogeochemical cycles. Before their initial discovery by Carl Woese and colleagues in the late 1970s, Archaea had not been recognised and were erroneously confused with look-alike Bacteria under the microscope for decades. Since their classification as the third primary “kingdom” in 1990, not only has their position in the universal tree of life changed, defining the archaeal ancestry of Eukaryotes, but the knowledge about their ecology, diversity, evolution and molecular principles has also been extended tremendously. Notably, it has been revealed that on the molecular level, Archaea share remarkably striking characteristics with both Eukarya and Bacteria, with transcription as one of the prime examples. Here, we have primarily been interested in the genome and transcriptome architecture, the regulatory roles of transcription factors and post-transcriptional mechanisms in the hyperthermophilic model archaeon Pyrococcus furiosus. To obtain the most accurate and informative background for further studies, we re-sequenced the culture collection strain DSM 3638 employing state-of-the-art hybrid Illumina and PacBio DNA sequencing and extensively expanded the annotation on the transcript level by using a differential RNA sequencing approach. Digestion of all non-5′-triphosphorylated transcripts by a Terminator-exonuclease allowed us to specifically enrich primary transcripts. The redefinition of the transcriptional landscape of P. furiosus included the genome-wide detection of transcription start sites, promoter architectures, sense- and antisense-RNAs. Interestingly, we discovered bidirectional transcription from symmetric promoters as an extensive source of antisense transcription, which is presumably a widespread feature of archaeal transcription. 
Additionally, we could prove that despite the relatively high abundance of insertion sequences in the 2 Mbp genome, the handling of a lab culture for two years did not lead to genomic rearrangements. Although we did not specifically challenge the genomic integrity, this still suggests that the genome is more stable than previously anticipated, which is an essential prerequisite for the comparability and feasibility of future genome-wide studies in P. furiosus. For rapid and cost-efficient re-sequencing of archaeal strains, we established 3rd generation long-read Nanopore sequencing technology in the lab, which allowed us to sequence the lab strain with high consensus accuracy. Next, we established a protocol for direct RNA sequencing in prokaryotes using the Nanopore technology, which is currently the only option for single-molecule sequencing of transcripts in their native context. The plethora of transcriptional and post-transcriptional events and features are usually tackled by short-read sequencing approaches that specifically have to be tailored to the respective research question by making adaptations to the library preparation protocol or by chemical treatment. In contrast, we evaluated the potential of native RNA sequencing to address multiple transcriptomic features simultaneously in bacterial (Escherichia coli) and archaeal (Haloferax volcanii, P. furiosus) model organisms. Performing meta-data and single-molecule analysis we could (re-)annotate large transcriptional units and map transcription boundaries. Besides, we showed that long reads are a valuable tool for heterogeneous 3′-end detection and that diverse termination mechanisms occur in Archaea. Next, we used the single-molecule potential of Nanopore reads for the identification of previously known and unknown intermediates in the poorly understood rRNA maturation pathway in Archaea. 
Moreover, we were able to detect RNA base modifications in the form of systematic basecalling errors and shifts in the ionic current, which allowed us to follow the relative temporal order of KsgA-dependent di-methylation and N4-cytidine acetylation in mature and precursor 16S rRNAs in archaeal species. Third, using the new reference genome of P. furiosus, we performed an integrative RNA-seq and ChIP-seq based approach to decipher the function of the transcriptional regulator CopR during copper detoxification in P. furiosus. To get a global view on the transcriptomic response and find components of the CopR-regulon, we performed differential gene expression analysis and ChIP-seq analysis after copper shock. We discovered that CopR, which is essential in copper detoxification, binds to the upstream regions of highly copper-induced genes that all share a common palindromic motif. Additionally, negative-stain transmission electron microscopy and image analysis by 2D class averaging revealed that CopR binds to DNA in an octameric conformation similar to other factors of the Lrp family. Finally, we proposed a model for allosteric regulation of CopR upon copper-binding and revealed different layers of copper detoxification in P. furiosus. The findings of the studies that make up this thesis contribute to a deeper understanding of basic and regulatory principles of transcription in Archaea and update the genomic and transcriptomic landscape of P. furiosus. Also, the application of Nanopore-based native RNA sequencing not only represents a significant extension of the transcriptomic toolbox in prokaryotes but also provided us with a wealth of information, especially regarding transcriptional and post-transcriptional events during rRNA maturation

    Manipulation of the Electrical Double Layer for Control and Sensing in a Solid State Nanopore

    Nanopores have been explored with the goal of achieving non-functionalized, sub-molecular sensors, primarily with the purpose of producing fast, low-cost DNA sequencers. Because of the nanoscale volume within the nanopore structure, it is possible to isolate individual molecular and sub-molecular analytes. Nanopore DNA sequencing has remained elusive due to high noise levels and the challenge of obtaining single-nucleotide resolution. However, the complete electrical double layer within the nanopore is a key feature of fluid-nanopore interaction and has been neglected in previous studies. By exploring interactions with the electrical double layer in various nanopore systems, we characterize the material, electrical, and solution dependent properties of this structure and develop a new sensing technique. The overall goals of this project are development of a theoretically complete and useful model of the electrical double layer in a nanopore, development of a nanopore device capable of detecting and manipulating the electrical double layer, characterization of active nanofluidic control, and detection of molecular and double layer properties. By considering extensive numerical models along with experimental evaluation of the nanopore devices, we characterize the fluidic and sensor properties of the electrical double layer in a nanopore. The ability to interact with the electrochemical and structural properties of the fluid within a nanopore offers new avenues for molecular detection and manipulation. We find that the energetic balance between the nanopore surface potential and the distribution of charged species within the electrical double layer is the key relationship governing the operation of this type of device. A method of active control of the ionic conductance through the nanopore was developed, with complete gating and on-state modulation. 
A molecular sensing technique was developed by correlating changes to the electrochemical potential of the solution to the physical properties of molecular analytes. The theoretical and practical limits of the nanopore sensor were tested by implementing a new type of nanopore DNA sequencer. High accuracy DNA sequences were produced by combining the double layer potential and ionic current channels in parallel, along with extensive application of signal theory, digital signal processing, and machine learning techniques

    A choline-releasing glycerophosphodiesterase essential for phosphatidylcholine biosynthesis and blood stage development in the malaria parasite.

    The malaria parasite Plasmodium falciparum synthesizes significant amounts of phospholipids to meet the demands of replication within red blood cells. De novo phosphatidylcholine (PC) biosynthesis via the Kennedy pathway is essential, requiring choline that is primarily sourced from host serum lysophosphatidylcholine (lysoPC). LysoPC also acts as an environmental sensor to regulate parasite sexual differentiation. Despite these critical roles for host lysoPC, the enzyme(s) involved in its breakdown to free choline for PC synthesis are unknown. Here we show that a parasite glycerophosphodiesterase (PfGDPD) is indispensable for blood stage parasite proliferation. Exogenous choline rescues growth of PfGDPD-null parasites, directly linking PfGDPD function to choline incorporation. Genetic ablation of PfGDPD reduces choline uptake from lysoPC, resulting in depletion of several PC species in the parasite, whilst purified PfGDPD releases choline from glycerophosphocholine in vitro. Our results identify PfGDPD as a choline-releasing glycerophosphodiesterase that mediates a critical step in PC biosynthesis and parasite survival

    Leva-LAMP: Sustainable use of levamisole through elucidation of genetic markers of resistance and molecular diagnostics in Haemonchus contortus

    Haemonchus contortus is a gastrointestinal parasitic nematode primarily infecting small ruminants. It is globally distributed, and a major cause of production losses and animal health concerns. Control largely relies on the use of broad spectrum anthelmintics, however, the effectiveness of many of these drugs is declining due to widespread anthelmintic resistance. One of the broad spectrum anthelmintics available to control H. contortus is levamisole (LEV), a cholinergic agonist which, when bound to the nematode acetylcholine receptor (AChR), causes paralysis in the worm. Resistance to LEV is, at the time of writing, less widespread worldwide than to other major drug classes, such as benzimidazoles (BZs) and macrocyclic lactones (MLs). However, unavailability of effective molecular diagnostics, due to the lack of fully resolved and validated molecular markers of LEV resistance in H. contortus and over reliance on the faecal egg count reduction test, which has inherent inaccuracies, and is poor at detecting emerging resistance, underline the urgent need to improve resistance monitoring. Understanding of genetic markers of LEV resistance is, therefore, key to improving the diagnosis of anthelmintic resistance and maintaining the sustainability of LEV. Therefore, the goal of this PhD was to elucidate and validate genetic markers of LEV resistance, and develop proof-of-concept molecular diagnostic assays. First, using whole genome sequencing data (WGS) from a controlled genetic cross (from a resistant and a susceptible parental line) pre- and post-LEV treatment, two quantitative trait loci (QTL) were identified. A single non-synonymous SNP variant, S168T, was identified in acr-8, a gene present within the major QTL, and previously shown to be essential for conferring LEV sensitivity to the H. contortus AChR. 
Extensive analysis of WGS data and single worm genotyping demonstrated that the presence of the S168T variant was predictive of a LEV resistant phenotype in all LEV resistant laboratory and field isolates examined, and absent in all susceptible isolates. Further SNP variants were also identified in LEV resistance associated genes in a second QTL on chromosome IV, which likely constitute minor markers of LEV resistance. Concurrently, putative markers of LEV resistance previously detailed in the literature were examined, but found not to constitute effective markers of LEV resistance in the isolates examined. A pilot study to validate a next-generation amplicon sequencing panel was also undertaken, with preliminary validation complete for S168T, and several additional minor markers of LEV resistance. Following on from these results, several molecular diagnostic assays with significant translational potential were developed and validated for the detection of the S168T variant in H. contortus. A two-step nested allele specific (AS)-PCR was optimised and demonstrated for the detection of the S168T variant in both laboratory and field isolates. Two emerging novel loop-mediated isothermal amplification (LAMP) technologies were then evaluated for detection of the S168T variant. Loop-primer enzymatic cleavage (LEC)-LAMP showed promising results discriminating between the resistant and susceptible alleles, with the added capacity for multiplexing. LEC-LAMP was also preliminarily adapted to point-of-care (POC) detection via a lateral flow (LF) platform. The results generated in this PhD have laid the foundation necessary to establish reliable and accurate laboratory based molecular diagnostics for the detection of LEV resistance in H. contortus, with potential for fast and low-cost application of these assays at the POC