Haplotype Assembly and Small Variant Calling using Emerging Sequencing Technologies

Author: Edge Peter Joseph
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Short read DNA sequencing technologies from Illumina have made sequencing a human genome significantly more affordable, greatly accelerating studies of biological function and the association of genetic variants to disease. These technologies are frequently used to detect small genetic variants such as single nucleotide variants (SNVs) using a reference genome. However, short read sequencing technologies have several limitations. First, the human genome is diploid and short reads contain limited information for assembling haplotypes, or the sequences of alleles on homologous chromosomes. Moreover, there is significant input DNA required, which poses challenges for analyzing single cells. Further, there is limited ability to detect genetic variants inside long duplicated sequences that occur in the genome. As a result, there has been widespread development of novel methods to overcome these deficiencies using short reads. These include clone based sequencing, linked read sequencing, and proximity ligation sequencing, as well as various single cell sequencing methods. There are also entirely new sequencing technologies from Pacific Biosciences and Oxford Nanopore Technologies that produce significantly longer reads. While these emerging methods and technologies demonstrate improvements compared to short reads, they also have properties and error modalities that pose unique computational challenges. Moreover, there is a shortage of bioinformatics methods for accurate small variant detection and haplotype assembly using these approaches compared to short reads. This dissertation aims to address this problem with the introduction of several new algorithms for highly accurate haplotype assembly and SNV calling. First, it introduces HapCUT2, an algorithm that can rapidly assemble haplotypes using a broad range of sequencing technologies. 
Second, it introduces an algorithm for variant calling and haplotyping using SISSOR, a recently introduced microfluidics-based technology for sequencing single cells. Finally, it introduces Longshot, an algorithm for detecting and phasing SNVs using error-prone long read technologies. In each case, the algorithms are benchmarked using multiple real whole-genome sequencing datasets and are found to be highly accurate. The methods introduced in this dissertation contribute to the goal of sequencing diploid genomes accurately and completely for a broad range of scientific and clinical purposes.

eScholarship - University of California

On the design of clone-based haplotyping

Publication venue: BioMed Central
Publication date: 12/09/2013

Springer - Publisher Connector

Acute Myeloid Leukemia

Author: Fortina P; Kricka LJ; Londin E; Park JY
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Acute myeloid leukemia (AML) is the most common type of leukemia. The Cancer Genome Atlas Research Network has demonstrated the increasing genomic complexity of AML. In addition, the network has facilitated our understanding of the molecular events leading to this deadly form of malignancy, for which the prognosis has not improved over past decades. AML is a highly heterogeneous disease, and cytogenetic and molecular analysis of its various chromosome aberrations, including deletions, duplications, aneuploidy, balanced reciprocal translocations and fusion of transcription factor genes and tyrosine kinases, has led to better understanding and identification of subgroups of AML with different prognoses. Furthermore, molecular classification based on mRNA expression profiling has facilitated identification of novel subclasses and defined high- and poor-risk AML based on specific molecular signatures. However, despite increased understanding of AML genetics, the outcome for AML patients, whose number is likely to rise as the population ages, has not changed significantly. Until it does, further investigation of the genomic complexity of the disease and advances in drug development are needed. In this review, leading AML clinicians and research investigators provide an up-to-date understanding of the molecular biology of the disease, addressing advances in diagnosis, classification, prognostication and therapeutic strategies that may have significant promise and impact on overall patient survival.

Archivio della ricerca- Università di Roma La Sapienza

Bacteria homologous to Aeromonas capable of microcystin degradation

Author: Dziadek J.; Gągała I.; Jaskulska A.; Jurczak Tomasz; Mankiewicz-Boczek Joanna; Pawełczyk J.
Publication venue: Walter de Gruyter GmbH
Publication date: 20/08/2014

Water blooms dominated by cyanobacteria are capable of producing hepatotoxins known as microcystins. These toxins are dangerous to people and to the environment. Therefore, for a better understanding of the biological termination of this increasingly common phenomenon, bacteria with the potential to degrade cyanobacteria-derived hepatotoxins and the degradative activity of culturable bacteria were studied. Based on the presence of the mlrA gene, bacteria with a homology to the Sphingopyxis and Stenotrophomonas genera were identified as those presenting potential for microcystins degradation directly in the water samples from the Sulejów Reservoir (SU, Central Poland). However, this biodegrading potential has not been confirmed in in vitro experiments. The degrading activity of the culturable isolates from the water studied was determined in more than 30 bacterial mixes. An analysis of the biodegradation of the microcystin-LR (MC-LR) together with an analysis of the phylogenetic affiliation of bacteria demonstrated for the first time that bacteria homologous to the Aeromonas genus were able to degrade the mentioned hepatotoxin, although the mlrA gene was not amplified. The maximal removal efficiency of MC-LR was 48%. This study demonstrates a new aspect of interactions between the microcystin-containing cyanobacteria and bacteria from the Aeromonas genus. The authors would like to acknowledge the European Cooperation in Science and Technology, COST Action ES 1105 “CYANOCOST - Cyanobacterial blooms and toxins in water resources: Occurrence, impacts and management” for adding value to this study through networking and knowledge sharing with European experts and researchers in the field. The Sulejów Reservoir is a part of the Polish National Long-Term Ecosystem Research Network and the European LTER site.

Repozytorium Uniwersytetu Łódzkiego (University of Lodz Repository)

Computational Methods for Sequencing and Analysis of Heterogeneous RNA Populations

Author: Glebova Olga
Publication venue: ScholarWorks @ Georgia State University
Publication date: 15/12/2016

Next-generation sequencing (NGS) and mass spectrometry technologies bring unprecedented throughput, scalability and speed, facilitating the studies of biological systems. These technologies make it possible to sequence and analyze heterogeneous RNA populations rather than single sequences. In particular, they provide the opportunity to implement massive viral surveillance and transcriptome quantification. However, in order to fully exploit the capabilities of NGS technology we need to develop computational methods able to analyze billions of reads for assembly and characterization of sampled RNA populations. In this work we present novel computational methods for cost- and time-effective analysis of sequencing data from viral and RNA samples. In particular, we describe: i) computational methods for transcriptome reconstruction and quantification; ii) a method for mass spectrometry data analysis; iii) a combinatorial pooling method; iv) computational methods for analysis of intra-host viral populations.

ScholarWorks @ Georgia State University

A Characterization of the DNA Data Storage Channel

Author: Grass Robert N.; Heckel Reinhard; Mikutis Gediminas
Publication date: 08/03/2018

Owing to its longevity and enormous information density, DNA, the molecule encoding biological information, has emerged as a promising archival storage medium. However, due to technological constraints, data can only be written onto many short DNA molecules that are stored in an unordered way, and can only be read by sampling from this DNA pool. Moreover, imperfections in writing (synthesis), reading (sequencing), storage, and handling of the DNA, in particular amplification via PCR, lead to a loss of DNA molecules and induce errors within the molecules. In order to design DNA storage systems, a qualitative and quantitative understanding of the errors and the loss of molecules is crucial. In this paper, we characterize those error probabilities by analyzing data from our own experiments as well as from experiments of two different groups. We find that errors within molecules are mainly due to synthesis and sequencing, while imperfections in handling and storage lead to a significant loss of sequences. The aim of our study is to help guide the design of future DNA data storage systems by providing a quantitative and qualitative understanding of the DNA data storage channel.

arXiv.org e-Print Archive

Repository for Publications and Research Data

Sensitive quantification of clonal evolution in Acute Myeloid Leukemia

Author: Richter Daniel
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 01/12/2022

Digitale Hochschulschriften der LMU

On the design of clone-based haplotyping

Author: Aach John; Bafna Vineet; Byrne Susan; Church George; Lee Jehyuk; Liu Rui; Lo Christine; Lucchesi Carolina; Robasky Kimberly; Zhang Kun
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Background: Haplotypes are important for assessing genealogy and disease susceptibility of individual genomes, but are difficult to obtain with routine sequencing approaches. Experimental haplotype reconstruction based on assembling fragments of individual chromosomes is promising, but with variable yields due to incompletely understood parameter choices. Results: We parameterize the clone-based haplotyping problem in order to provide theoretical and empirical assessments of the impact of different parameters on haplotype assembly. We confirm the intuition that long clones help link together heterozygous variants and thus improve haplotype length. Furthermore, given the length of the clones, we address how to choose the other parameters, including number of pools, clone coverage and sequencing coverage, so as to maximize haplotype length. We model the problem theoretically and show empirically the benefits of using larger clones with a moderate number of pools and sequencing coverage. In particular, using 140 kb BAC clones, we construct haplotypes for a personal genome and assemble haplotypes with N50 values greater than 2.6 Mb. These assembled haplotypes are longer and at least as accurate as haplotypes of existing clone-based strategies, whether in vivo or in vitro. Conclusions: Our results provide practical guidelines for the development and design of clone-based methods to achieve long range, high-resolution and accurate haplotypes.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Harvard University - DASH

PubMed Central

eScholarship - University of California

The application of genomic technologies to cancer and companion diagnostics.

Author: Hadfield James
Publication date: 01/09/2014

This thesis describes work undertaken by the author between 1996 and 2014. Genomics is the study of the genome, although it is also often used as a catchall phrase and applied to the transcriptome (study of RNAs) and methylome (study of DNA methylation). As cancer is a disease of the genome, the rapid advances in genomic technology, specifically microarrays and next generation sequencing, are creating a wave of change in our understanding of its molecular pathology. Molecular pathology and personalised medicine are being driven by discoveries in genomics, and genomics is being driven by the development of faster, better and cheaper genome sequencing. The next decade is likely to see significant changes in the way cancer is managed for individual cancer patients as next generation sequencing enters the clinic. In chapter 3 I discuss how ERBB2 amplification testing for breast cancer is currently dominated by immunohistochemistry (a single-gene test); and present the development, by the author, of a semi-quantitative PCR test for ERBB2 amplification. I also show that estimating ERBB2 amplification from microarray copy-number analysis of the genome is possible. In chapter 4 I present a review of microarray comparison studies, and outline the case for careful and considered comparison of technologies when selecting a platform for use in a research study. Similar, indeed more stringent, care needs to be applied when selecting a platform for use in a clinical test. In chapter 5 I present co-authored work on the development of amplicon and exome methods for the detection and quantitation of somatic mutations in circulating tumour DNA, and demonstrate the impact this can have in understanding tumour heterogeneity and evolution during treatment. I also demonstrate how next-generation sequencing technologies may allow multiple genetic abnormalities to be analysed in a single test, and in low cellularity tumours and/or heterogeneous cancers.
Keywords: Genome, exome, transcriptome, amplicon, next-generation sequencing, differential gene expression, RNA-seq, ChIP-seq, microarray, ERBB2, companion diagnostic

University of East Anglia digital repository

A novel genotyping approach to improve transfusion support for patients with HLA and/or HPA alloantibodies

Author: Davey Susan Rosemary

Patients who require platelet transfusion support but have become sensitised to Human Leucocyte Antigens (HLA) or Human Platelet Antigens (HPA) require suitably matched or selected products to avoid adverse transfusion reactions resulting from antibodies reacting with the transfused product. Provision of compatible products for these patients is often challenging, and requires significant resources from the blood service. This study set out to develop and implement next generation sequencing (NGS) technology to enhance the HLA and HPA definition of both platelet donors and recipients. An NGS based method was designed and developed for high throughput, allele level HLA class I genotyping and used to evaluate the impact of NGS technology on the selection of platelet donors using HLA epitope matching (HEM). In addition, an alternative NGS approach was designed to simultaneously sequence the six genes that code for glycoproteins expressing HPA in order to define all known HPA systems in both donor and patient samples. Allele level HLA-A, -B and -C genotypes were generated for 519 platelet donors by NGS. A critical evaluation of algorithms used to predict alleles from low to medium resolution HLA types demonstrated that NGS was more accurate when determining HLA epitopes for the selection of platelets by HEM. The HLA genotyping data obtained was used to establish previously undefined HLA allele and haplotype frequencies at third field resolution in the English platelet donor population. This thesis also includes the first reported NGS based method for the simultaneous genotyping of HPA-1 to HPA-29, with the additional capability of novel HPA detection.
NGS has been shown to significantly improve the definition of both HLA and HPA genetic systems and will provide a number of future benefits for laboratories and the patients they support, including provision of well matched transfusion products, the detection of rare or novel polymorphisms and increased knowledge of HLA and HPA frequencies.

UWE Bristol Research Repository