
    Exploring genome wide bisulfite sequencing for DNA methylation analysis in livestock: a technical assessment

    Recent advances made in “omics” technologies are contributing to a revolution in livestock selection and breeding practices. Epigenetic mechanisms, including DNA methylation, are important determinants of the control of gene expression in mammals. DNA methylation research will help our understanding of how environmental factors contribute to phenotypic variation in complex production and health traits. High-throughput sequencing is a vital tool for the comprehensive analysis of DNA methylation, and bisulfite-based strategies coupled with DNA sequencing allow for quantitative, site-specific methylation analysis at the genome level or genome wide. Reduced representation bisulfite sequencing (RRBS) and, more recently, whole genome bisulfite sequencing (WGBS) have proven to be effective techniques for studying DNA methylation in both humans and mice. Here we report the development of RRBS and WGBS for use in sheep, the first application of this technology in livestock species. Important technical issues associated with these methodologies, including fragment size selection and sequencing depth, are examined and discussed. The authors acknowledge an AgResearch AR&C grant for funding and Teagasc for providing a short-term overseas training award.
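    As a rough illustration of the quantitative, site-specific analysis described above, the sketch below (not taken from the paper; positions and counts are hypothetical) shows how per-site methylation levels are typically derived from bisulfite-converted reads: unmethylated cytosines are read as thymine after conversion, so the methylation level at a CpG site is the fraction of covering reads that still report C.

```python
# Illustrative sketch (hypothetical positions and counts): after bisulfite
# conversion, unmethylated C is sequenced as T while methylated C stays C,
# so methylation at a CpG site is the fraction of covering reads reporting C.

def methylation_level(c_count: int, t_count: int) -> float:
    """Fraction of reads supporting methylation at one CpG site."""
    depth = c_count + t_count
    return c_count / depth if depth else float("nan")

# Hypothetical pileup: {genomic position: (reads with C, reads with T)}
pileup = {1024: (18, 2), 2048: (3, 27), 4096: (0, 0)}

for pos, (c, t) in pileup.items():
    print(f"site {pos}: depth={c + t}, methylation={methylation_level(c, t):.2f}")
```

    Note how sites with low or zero depth yield unreliable or undefined estimates, which is why sequencing depth is one of the technical issues the abstract highlights.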

    Evaluation of points of improvement in NGS data analysis

    DNA sequencing is a fundamental technique in molecular biology that allows the exact sequence of nucleotides in a DNA sample to be read. Over the past decades, DNA sequencing has seen significant advances, evolving from manual and laborious techniques to modern high-throughput techniques. Despite these advances, the interpretation and analysis of sequencing data continue to present challenges. Artificial intelligence (AI), and in particular machine learning, has emerged as an essential tool to address these challenges. The application of AI in the sequencing pipeline refers to the use of algorithms and models to automate, optimize, and improve the precision of the sequencing process and its subsequent analysis. The Sanger sequencing method, introduced in the 1970s, was one of the first to be widely used. Although effective, this method is slow and not suitable for sequencing large amounts of DNA, such as entire genomes. With the arrival of next-generation sequencing (NGS) in the 21st century, greater speed and efficiency in obtaining genomic data have been achieved. However, the exponential increase in the amount of data produced has created a bottleneck in its analysis and interpretation.

    Counting absolute number of molecules using unique molecular identifiers

    Advances in molecular biology have made it easy to identify different DNA or RNA species and to copy them. Identification of nucleic acid species can be accomplished by reading the DNA sequence; currently, millions of molecules can be sequenced in a single day using massively parallel sequencing. Efficient copying of DNA molecules of arbitrary sequence was made possible by molecular cloning and the polymerase chain reaction. Differences in the relative abundance of a large number of different sequences between two or more samples can in turn be measured using microarray hybridization and/or tag sequencing. However, determining the relative abundance of two different species and/or the absolute number of molecules present in a single sample has proven much more challenging. This is because it is hard to detect individual molecules without copying them, and even harder to make a defined number of copies of molecules. We show here that this limitation can be overcome by using unique molecular identifiers (UMIs), which make each molecule in the sample distinct.
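    A minimal sketch of the counting idea, assuming a toy table of (gene, UMI) pairs rather than the authors' actual data: reads that are PCR copies of the same molecule share a UMI, so the number of distinct UMIs per gene estimates the absolute number of starting molecules.

```python
# Illustrative sketch (toy data): collapsing reads by unique molecular
# identifier (UMI). PCR duplicates of one molecule carry the same UMI, so
# distinct UMIs per gene estimate the absolute number of starting molecules.

from collections import defaultdict

# Hypothetical (gene, UMI) pairs extracted from sequenced reads.
reads = [
    ("GAPDH", "ACGTAC"), ("GAPDH", "ACGTAC"),   # same UMI: one molecule
    ("GAPDH", "TTGCAA"),
    ("ACTB",  "GGATCC"), ("ACTB",  "GGATCC"), ("ACTB", "CCTAGG"),
]

read_counts = defaultdict(int)
umis_per_gene = defaultdict(set)
for gene, umi in reads:
    read_counts[gene] += 1
    umis_per_gene[gene].add(umi)

for gene in umis_per_gene:
    print(f"{gene}: {read_counts[gene]} reads -> {len(umis_per_gene[gene])} molecules")
```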

    The Source of the Data Flood: Sequencing Technologies

    Where does this huge amount of data come from? What are the costs of producing it? The answers to these questions lie in the impressive development of sequencing technologies, which have opened up many research opportunities and challenges, some of which are described in this issue. DNA sequencing is the process of “reading” a DNA fragment (referred to as a “read”) and determining the exact order of DNA bases (the four possible nucleotides: adenine, guanine, cytosine, and thymine) that compose a given DNA strand. Research in biology and medicine has been revolutionised and accelerated by advances in DNA and RNA sequencing biotechnologies.
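    To make the definition concrete, the snippet below (with a made-up read) treats a read as a string over A, C, G, and T and computes its base composition and GC content, one of the simplest analyses performed on sequencing reads.

```python
# Illustrative sketch: a read is a string over the four bases A, C, G, T.
# One of the simplest analyses is counting base composition and GC content.

from collections import Counter

read = "ACGTACGTTTGCAACGGATCC"          # made-up read
composition = Counter(read)
gc_content = (composition["G"] + composition["C"]) / len(read)

print(dict(composition))
print(f"GC content: {gc_content:.2%}")
```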

    Next-generation sequencing technologies and applications for human genetic history and forensics

    Rapid advances in the development of sequencing technologies in recent years have enabled an increasing number of applications in biology and medicine. Here, we review key technical aspects of the preparation of DNA templates for sequencing, the biochemical reaction principles and assay formats underlying next-generation sequencing systems, methods for imaging and base calling, quality control, and bioinformatic approaches for sequence alignment, variant calling, and assembly. We also discuss some of the most important advances that the new sequencing technologies have brought to the fields of human population genetics, human genetic history, and forensic genetics.
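    As one concrete example of the quality-control step mentioned above, the sketch below decodes per-base Phred quality scores from a FASTQ quality string, assuming the common Phred+33 encoding; the record itself is invented.

```python
# Illustrative sketch (invented record): decoding per-base Phred quality
# scores from a FASTQ quality string, assuming the common Phred+33 encoding.
# Q = ord(char) - 33, and the base-calling error probability is 10 ** (-Q / 10).

quality_string = "IIIIHHGF#"            # 'I' encodes Q=40, '#' encodes Q=2

for i, ch in enumerate(quality_string):
    q = ord(ch) - 33
    p_error = 10 ** (-q / 10)
    print(f"base {i}: Q={q}, P(error)={p_error:.4f}")
```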

    High Performance Computing for DNA Sequence Alignment and Assembly

    Recent advances in DNA sequencing technology have dramatically increased the scale and scope of DNA sequencing. These data are used for a wide variety of important biological analyses, including genome sequencing, comparative genomics, transcriptome analysis, and personalized medicine, but are complicated by the volume and complexity of the data involved. Given the massive size of these datasets, computational biology must draw on the advances of high performance computing. Two fundamental computations in computational biology are read alignment and genome assembly. Read alignment maps short DNA sequences to a reference genome to discover conserved and polymorphic regions of the genome. Genome assembly computes the sequence of a genome from many short DNA sequences. Both computations benefit from recent advances in high performance computing to efficiently process the huge datasets involved, including using highly parallel graphics processing units (GPUs) as high performance desktop processors, and using the MapReduce framework coupled with cloud computing to parallelize computation across large compute grids. This dissertation demonstrates how these technologies can be used to accelerate these computations by orders of magnitude and to make otherwise infeasible computations practical.
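    The sketch below is not from the dissertation; it only illustrates the MapReduce pattern referred to above, applied to k-mer counting (a core step in genome assembly), with a hypothetical set of reads and k = 4: each worker maps a chunk of reads to partial k-mer counts, and a reduce step merges them.

```python
# Illustrative sketch (hypothetical reads, k = 4): the MapReduce pattern
# applied to k-mer counting, a core step in genome assembly. Workers map
# chunks of reads to partial k-mer counts; a reduce step merges them.

from collections import Counter
from multiprocessing import Pool

K = 4

def map_chunk(reads):
    """Map step: count all length-K substrings (k-mers) in one chunk of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - K + 1):
            counts[read[i:i + K]] += 1
    return counts

def reduce_counts(partials):
    """Reduce step: merge the partial k-mer counts from all workers."""
    total = Counter()
    for part in partials:
        total.update(part)
    return total

if __name__ == "__main__":
    reads = ["ACGTACGTGG", "CGTACGTGGA", "TTGCAACGTA", "ACGTGGACGT"]
    chunks = [reads[:2], reads[2:]]              # split the work across workers
    with Pool(processes=2) as pool:
        partial_counts = pool.map(map_chunk, chunks)
    print(reduce_counts(partial_counts).most_common(5))
```

    The same map/reduce split is what lets frameworks such as Hadoop distribute the counting over a cluster instead of two local worker processes.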