
    A Gibbs sampling strategy applied to the mapping of ambiguous short-sequence tags

    Motivation: Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is widely used in biological research. ChIP-seq experiments yield many ambiguous tags that can be mapped with equal probability to multiple genomic sites. Such ambiguous tags are typically eliminated from consideration, resulting in a potential loss of important biological information.

    Deep Sequencing Data Analysis: Challenges and Solutions


    Probabilistic resolution of multi-mapping reads in massively parallel sequencing data using MuMRescueLite

    Multi-mapping sequence tags are a significant impediment to short-read sequencing platforms. These tags are routinely omitted from further analysis, leading to experimental bias and reduced coverage. Here, we present MuMRescueLite, a low-resource version of the MuMRescue software, which has been used by several next-generation sequencing projects to probabilistically reincorporate multi-mapping tags into mapped short-read data.

    Development of Transcriptome Sequencing and Its Application to the Analysis of the Transcriptome of Corynebacterium glutamicum

    Pfeifer-Sancar K. Entwicklung der Transkriptomsequenzierung und Anwendung zur Analyse des Transkriptoms von Corynebacterium glutamicum. Bielefeld: Universität Bielefeld; 2014

    Differential Gene Expression of Dorsal Pictorial Ornaments and Pigmentation in Skates (Rajidae, Chondrichthyes)

    Approximately twenty years have passed since the beginning of concentrated investigations on the evolution and ecology of skates. The evidence generated thus far suggests that this monophyletic group has experienced multiple, parallel adaptive radiations at a regional scale. This background was the guiding light of the work described in this thesis, in which two main themes were developed. The first focused on the investigation of the Raja miraletus L. species complex through the analysis of genetic variation derived from both mtDNA and nuDNA. The results presented herein indicated restricted gene flow and different degrees of divergence between the South African and Mediterranean samples. Despite the high species diversity characterising the family, most Rajidae show a stable gross morphology and peculiar dorsal pigmentation patterns, which may have been implicated in cryptic speciation. Nonetheless, the adaptive value and the genetic basis of these traits remain poorly investigated. To fill this gap, this thesis also describes the application of RNA-sequencing technology to recently diverged skate species with sibling and sister phylogenetic relationships. The second goal of this research was therefore to investigate the molecular basis of pigmentation in five non-model species. To this end, the transcriptome profiling of different skin tissues was performed using the Illumina platform, whereas longer sequencing data were obtained from multiple R. miraletus organs using the Ion Torrent technology. After the assembly of a reference transcriptome and the mapping of Illumina reads, a differential gene expression analysis between skin tissues across species was performed, revealing the expression of transcripts mainly related to metabolic processes and catalytic activity, in which pigmentary genes appeared to be involved.
This work could be considered the basis for future studies aiming to disentangle how pigmentary traits evolved in skates and other chondrichthyans.

    The Marvelous World of tRNAs: From Accurate Mapping to Chemical Modifications

    Since the discovery of transfer RNAs (tRNAs) as decoders of the genetic code, life science has been transformed. In particular, once the importance of tRNAs in protein synthesis had been established, researchers recognized that the functionality of tRNAs in cellular regulation extends beyond this paradigm. A strong impetus for these discoveries came from advances in large-scale RNA sequencing (RNA-seq) and increasingly sophisticated algorithms. Sequencing tRNAs is challenging both experimentally and in terms of the subsequent computational analysis. In RNA-seq data analysis, mapping tRNA reads to a reference genome is an error-prone task. This is particularly true because chemical modifications introduce systematic reverse transcription errors, while at the same time the genomic loci are only approximately identical due to the post-transcriptional maturation of tRNAs. Additionally, their multi-copy nature complicates the precise assignment of a read to its true genomic origin. In the course of the thesis, a computational workflow was established to enable accurate mapping of tRNA reads. The developed method removes most of the mapping artifacts introduced by simpler mapping schemes, as demonstrated using both simulated and human RNA-seq data. Subsequently, the resulting mapping profiles can be used for reliable identification of specific chemical tRNA modifications with a false discovery rate of only 2%. For that purpose, computational analysis methods were developed that facilitate the sensitive detection and even classification of most tRNA modifications based on their mapping profiles. This comprised both untreated RNA-seq data of various species and treated data of Bacillus subtilis designed to display modifications as a specific read-out in the mapping profile. The discussion focuses on sources of artifacts that complicate the profiling of tRNA modifications and strategies to overcome them. 
Exemplary studies on the modification pattern of different human tissues and the developmental stages of Dictyostelium discoideum were carried out. These suggested regulatory functions of tRNA modifications in development and during cell differentiation. The main experimental difficulties of tRNA sequencing are caused by extensive, stable secondary structures and the presence of chemical modifications. Current RNA-seq methods do not sample the entire tRNA pool, lose short tRNA fragments, or lack specificity for tRNAs. Within this thesis, the benchmarking and improvement of LOTTE-seq, a method for specific selection of tRNAs for high-throughput sequencing, showed that the method solves the experimental challenges and avoids the disadvantages of previous tRNA-seq protocols. Applying the accurate tRNA mapping strategy to LOTTE-seq and other tRNA-specific RNA-seq methods demonstrated that the content of mature tRNAs is highest in LOTTE-seq data, ranging from 90% in Spinacia oleracea to 100% in D. discoideum. Additionally, the thesis addressed the fact that tRNAs are multi-copy genes that undergo concerted evolution, which keeps sequences of paralogous genes effectively identical. Therefore, it is impossible to distinguish orthologs from paralogs by sequence similarity alone. Synteny, the maintenance of relative genomic positions, is helpful to disambiguate evolutionary relationships in this situation. During this thesis, a workflow was developed for synteny-based orthology identification of tRNA genes. The workflow is based on the use of pre-computed genome-wide multiple sequence alignment blocks as anchors to establish syntenic conservation of sequence intervals. Syntenic clusters of concertedly evolving genes of different tRNA families are then subdivided and processed by cograph editing to recover their duplication histories. 
A useful outcome of this study is that it highlights the technical problems and difficulties associated with an accurate analysis of the evolution of multi-copy genes. To showcase the method, the evolution of tRNAs in primates and fruit flies was reconstructed. In the last decade, a number of reports have described novel aspects of tRNAs in terms of the diversity of their genes. For example, nuclear-encoded mitochondrial-derived tRNAs (nm-tRNAs) have been reported, whose presence provokes intriguing questions about their functionality. Within this thesis, an annotation strategy was developed that led to the identification of 335 and 43 novel nm-tRNAs in human and mouse, respectively. Interestingly, downstream analyses showed that the localization of several nm-tRNAs in introns and the over-representation of conserved RNA-binding sites of proteins involved in splicing suggest a potential regulatory function of intronic nm-tRNAs in splicing.

    Mechanisms controlling mRNA processing and translation: decoding the regulatory layers defining gene expression through RNA sequencing

    The work described in this thesis focuses on the mechanisms that give rise to alternative mRNAs and their alternative translation into proteins. Each of the described studies is based on a specific set of high-throughput RNA sequencing technologies. An overview of the available RNA sequencing methods, together with an introduction to the different regulatory layers that define the expression of a gene, is presented in Chapter 1. Our work in Chapter 2 and Chapter 3 investigates the process of alternative polyadenylation. Chapter 2 shows the role of alternative polyadenylation in the context of oculopharyngeal muscular dystrophy. Chapter 3 describes genetic variants associated with alternative polyadenylation. Chapter 4 focuses on mechanisms controlling protein synthesis (translation) during skeletal muscle differentiation, highlighting changes in the use of alternative translation initiation sites. In Chapter 5 we investigated the interdependence between alternative regulatory events in gene expression. In this study, based on single-molecule full-length RNA sequencing, we demonstrated coordination and interdependence between alternative transcription initiation, alternative splicing, and alternative polyadenylation. Finally, Chapter 6 connects fundamental research in the RNA field with clinical care, describing new diagnostic and therapeutic approaches.

    Grand Celebration: 10th Anniversary of the Human Genome Project

    In 1990, scientists began working together on one of the largest biological research projects ever proposed. The project proposed to sequence the three billion nucleotides in the human genome. The Human Genome Project took 13 years and was completed in April 2003, at a cost of approximately three billion dollars. It was a major scientific achievement that forever changed the understanding of our own nature. The sequencing of the human genome was in many ways a triumph for technology as much as it was for science. From the Human Genome Project, powerful technologies have been developed (e.g., microarrays and next generation sequencing) and new branches of science have emerged (e.g., functional genomics and pharmacogenomics), paving new ways for advancing genomic research and medical applications of genomics in the 21st century. The investigations have provided new tests and drug targets, as well as insights into the basis of human development and the diagnosis and treatment of cancer and several mysterious human diseases. This genomic revolution is prompting a new era in medicine, which brings both challenges and opportunities. Parallel to the promising advances over the last decade, the study of the human genome has also revealed how complicated human biology is, and how much remains to be understood. The legacy of the understanding of our genome has just begun. To celebrate the 10th anniversary of the essential completion of the Human Genome Project, in April 2013 Genes launched this Special Issue, which highlights the recent scientific breakthroughs in human genomics, with a collection of papers written by authors who are leading experts in the field.