55 research outputs found

    Methods for Epigenetic Analyses from Long-Read Sequencing Data

    Epigenetics, particularly the study of DNA methylation, is a cornerstone field for our understanding of human development and disease. DNA methylation has been included in the "hallmarks of cancer" due to its important function as a biomarker and its contribution to carcinogenesis and cancer cell plasticity. Long-read sequencing technologies, such as the Oxford Nanopore Technologies platform, have advanced the study of structural variations while at the same time allowing direct measurement of DNA methylation on the same reads. This has opened up new avenues of analysis, such as long-range allele-specific methylation analysis, methylation analysis on structural variations, or relating nearby epigenetic modalities on the same read to one another. Basecalling and methylation calling of Nanopore reads are computationally expensive tasks that require complex machine learning architectures. Read-level methylation calls require different approaches to data management and analysis than those developed for methylation frequencies measured from short-read technologies or array data. Because read- and genome-associated DNA methylation calls are two-dimensional and include methylation caller uncertainties, they are much more costly to store than one-dimensional methylation frequencies. Methods for storage, retrieval, and analysis of such data therefore require careful consideration. Downstream analysis tasks, such as methylation segmentation or differential methylation calling, could benefit from read information and allow uncertainty propagation. These avenues had not been considered in existing tools. In my work, I explored the potential of long-read DNA methylation analysis and tackled some of the challenges of data management and downstream analysis using state-of-the-art software architecture and machine learning methods.
I defined a storage standard for reference-anchored and read-assigned DNA methylation calls, including methylation calling uncertainties and read annotations such as haplotype or sample information. This storage container is defined as a schema for the Hierarchical Data Format version 5 (HDF5), includes an index for rapid access to genomic coordinates, and is optimized for parallel computing with even load balancing. It further includes a Python API for creation, modification, and data access, including convenience functions for the extraction of important quality statistics via a command line interface. Furthermore, I developed software solutions for the segmentation and differential methylation testing of DNA methylation calls from Nanopore sequencing. This implementation takes advantage of the performance benefits provided by my high-performance storage container. It includes a Bayesian methylome segmentation algorithm which allows for the consensus instance segmentation of multiple sample- and/or haplotype-assigned DNA methylation profiles, while considering methylation calling uncertainties. Based on this segmentation, the software can then perform differential methylation testing and provides a large number of options for statistical testing and multiple testing correction. I benchmarked all tools on both simulated and publicly available real data, and show the performance benefits compared to previously existing and concurrently developed solutions. Next, I applied the methods to a cancer study on a chromothriptic cancer sample from a patient with Sonic Hedgehog Medulloblastoma. Here I report regulatory genomic regions differentially methylated before and after treatment, allele-specific methylation in the tumor, as well as methylation on chromothriptic structures. Finally, I developed specialized methylation callers for the combined DNA methylation profiling of CpG, GpC, and context-free adenine methylation. These callers can be used to measure chromatin accessibility in a NOMe-seq-like setup, showing the potential of long-read sequencing for the profiling of transcription factor co-binding. In conclusion, this thesis presents and subsequently benchmarks new algorithmic and infrastructural solutions for the analysis of DNA methylation data from long-read sequencing.
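
The contrast drawn above between two-dimensional read-level calls and one-dimensional methylation frequencies can be illustrated with a minimal Python sketch. It is not the storage container or API described in the abstract. The `Call` record, the `site_frequencies` function, the log-likelihood-ratio sign convention, and the `min_abs_llr` cutoff are all hypothetical names and assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Call(NamedTuple):
    """One read-level methylation call (hypothetical record layout)."""
    read_id: str
    chrom: str
    pos: int
    llr: float  # log-likelihood ratio; > 0 favours "methylated" (assumed convention)


def site_frequencies(calls, min_abs_llr=2.0):
    """Collapse read-level calls into per-site methylation frequencies.

    Calls with |llr| below min_abs_llr are treated as uncertain and skipped,
    which is one simple way to use the caller's uncertainty.
    """
    counts = defaultdict(lambda: [0, 0])  # (chrom, pos) -> [methylated, total]
    for call in calls:
        if abs(call.llr) < min_abs_llr:
            continue
        entry = counts[(call.chrom, call.pos)]
        entry[0] += call.llr > 0  # True counts as 1
        entry[1] += 1
    return {site: m / n for site, (m, n) in sorted(counts.items())}


# Made-up example data: three reads covering two sites.
calls = [
    Call("read1", "chr1", 100, 3.1),
    Call("read2", "chr1", 100, -4.0),
    Call("read3", "chr1", 100, 0.5),  # uncertain call, dropped by the cutoff
    Call("read1", "chr1", 150, 2.5),
    Call("read2", "chr1", 150, 5.0),
]
freqs = site_frequencies(calls)
```

The aggregation discards the read identity, so per-read information such as haplotype assignment is no longer available afterwards; this is the information that a read-level storage format retains.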

    THE PROTECTION AND PRESERVATION OF ARCHIVAL MATERIAL IN POZEGA EXEMPLIFIELD LEGACY PH. D. JOSEPH BUTURCA

    The State Archives in Požega holds part of the legacy of Josip Buturac, senior archivist, doctor of historical sciences, university professor, and priest. This paper deals with the protection and preservation of the archival records of a person who represents a valuable source as a promoter of the historical values of the Croats and of the times in which he lived and worked.

    Interfacing war game simulations with tactical C2 systems – dream or reality?, Journal of Telecommunications and Information Technology, 2003, nr 4

    The decision-making process in current tactical C2 systems is based on the planning process of commanders and their staff. Improving tactical decision making by interfacing war game simulations with tactical C2 systems is achievable. The commander can review the results of the simulation and subsequently modify the tactical plan. Previously, the use of “training” simulations was not a viable solution to real-world decision making due to the lengthy time required to input all of the combat entities, the unit organizations and personnel dispositions, the equipment configurations, the status of the units and equipment, and the distribution of the available supplies. Modern C2 systems have all of this information stored in the common system databases, and this information can be used to instantiate and populate the simulation through an electronic adaptation of the data structures to match the requirements of the constructive simulation. This paper describes a system approach to interfacing simulation with a C2 system to improve decision-making.

    QPCR: Application for real-time PCR data management and analysis

    BACKGROUND: Since its introduction, quantitative real-time polymerase chain reaction (qPCR) has become the standard method for quantification of gene expression. Its high sensitivity, large dynamic range, and accuracy have led to the development of numerous applications with an increasing number of samples to be analyzed. Data analysis consists of a number of steps, which have to be carried out in several different applications. Currently, no single tool is available that incorporates storage, management, and multiple methods covering the complete analysis pipeline. RESULTS: QPCR is a versatile web-based Java application that allows users to store, manage, and analyze data from relative quantification qPCR experiments. It comprises a parser to import data generated by qPCR instruments and includes a variety of analysis methods to calculate cycle-threshold and amplification efficiency values. The analysis pipeline includes technical and biological replicate handling, incorporation of sample- or gene-specific efficiency, normalization using single or multiple reference genes, inter-run calibration, and fold change calculation. Moreover, the application supports assessment of error propagation throughout all analysis steps and allows statistical tests to be conducted on biological replicates. Results can be visualized in customizable charts and exported for further investigation. CONCLUSION: We have developed a web-based system designed to enhance and facilitate the analysis of qPCR experiments. It covers the complete analysis workflow, combining parsing, analysis, and generation of charts into one single application. The system is freely available a
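
The abstract lists the pipeline steps without stating formulas. As a rough illustration only, the Python sketch below applies the widely used efficiency-corrected model of relative quantification, normalized by the geometric mean of one or more reference genes. This is an assumption about the kind of calculation involved, not the QPCR application's actual implementation (which is written in Java), and the function names and input layout are hypothetical.

```python
from math import prod
from statistics import mean


def relative_quantity(efficiency, ct_control, ct_sample):
    """Efficiency-corrected relative quantity: E ** (Ct_control - Ct_sample).

    An efficiency of 2.0 means perfect doubling per cycle.
    """
    return efficiency ** (ct_control - ct_sample)


def fold_change(target, references):
    """Fold change of a target gene, normalized to reference genes.

    `target` and each item in `references` are tuples of
    (efficiency, control_cts, sample_cts), where the Ct lists hold
    technical replicates that are averaged before use.
    """
    def rq(gene):
        efficiency, control_cts, sample_cts = gene
        return relative_quantity(efficiency, mean(control_cts), mean(sample_cts))

    reference_rqs = [rq(gene) for gene in references]
    # Geometric mean of the reference genes' relative quantities.
    normalization = prod(reference_rqs) ** (1 / len(reference_rqs))
    return rq(target) / normalization


# Made-up Ct values: the target appears two cycles earlier in the sample,
# while the single reference gene is unchanged.
target_gene = (2.0, [25.0, 25.0], [23.0, 23.0])
reference_gene = (2.0, [20.0, 20.0], [20.0, 20.0])
result = fold_change(target_gene, [reference_gene])
```

With all efficiencies set to 2.0 and a single reference gene, this reduces to the familiar 2^(-ΔΔCt) calculation. Error propagation and inter-run calibration, which the abstract also mentions, are left out of the sketch.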

    Comparison of Collocation Extraction Measures for Document Indexing

    Automatic extraction of collocations from a corpus is a well-known problem in the field of natural language processing. It is typically carried out by employing a statistical measure that indicates whether or not two words occur together more often than by chance. As there is an abundance of these measures proposed by various authors, we have compared some of them on the task of extracting collocations from a corpus of Croatian legal documents for the purpose of document indexing. We propose and evaluate extensions of these measures for collocations consisting of three words.
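
The abstract does not name the measures that were compared. One common measure of whether two words co-occur more often than chance is pointwise mutual information (PMI), sketched below in Python as an illustrative assumption. The function name `bigram_pmi` and the toy English sentence are made up for the example; the paper's corpus is Croatian legal text.

```python
from collections import Counter
from math import log2


def bigram_pmi(tokens):
    """Score each adjacent word pair with pointwise mutual information.

    PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) ), with probabilities
    estimated by relative frequency in the token list.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = len(tokens) - 1
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bigrams
        p_x = unigrams[w1] / n_unigrams
        p_y = unigrams[w2] / n_unigrams
        scores[(w1, w2)] = log2(p_xy / (p_x * p_y))
    return scores


# Toy corpus, made up for illustration.
tokens = "the supreme court ruled that the court may review the ruling".split()
scores = bigram_pmi(tokens)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy example the pair ("supreme", "court") scores higher than ("the", "court"), because "the" is frequent on its own. PMI is known to overrate rare pairs, which is one reason several measures are usually compared, as the abstract describes.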

    An ex vivo system to study cellular dynamics underlying mouse peri-implantation development

    Successful three-dimensional recapitulation of mouse peri-implantation embryonic development. Kyoto University press release. 2022-02-09. Upon implantation, mammalian embryos undergo major morphogenesis and key developmental processes such as body axis specification and gastrulation. However, limited accessibility obscures the study of these crucial processes. Here, we develop an ex vivo Matrigel-collagen-based culture to recapitulate mouse development from E4.5 to E6.0. Our system not only recapitulates embryonic growth, axis initiation, and overall 3D architecture in 49% of the cases, but its compatibility with light-sheet microscopy also enables the study of cellular dynamics through automatic cell segmentation. We find that, upon implantation, release of the increasing tension in the polar trophectoderm is necessary for its constriction and invagination. The resulting extra-embryonic ectoderm plays a key role in growth, morphogenesis, and patterning of the neighboring epiblast, which subsequently gives rise to all embryonic tissues. This 3D ex vivo system thus offers unprecedented access to peri-implantation development for in toto monitoring, measurement, and spatiotemporally controlled perturbation, revealing a mechano-chemical interplay between extra-embryonic and embryonic tissues.

    Long-read sequencing of diagnosis and post-therapy medulloblastoma reveals complex rearrangement patterns and epigenetic signatures

    Cancer genomes harbor a broad spectrum of structural variants (SVs) driving tumorigenesis, a relevant subset of which escape discovery using short-read sequencing. We employed Oxford Nanopore Technologies (ONT) long-read sequencing in a paired diagnostic and post-therapy medulloblastoma to unravel the haplotype-resolved somatic genetic and epigenetic landscape. We assembled complex rearrangements, including a 1.55-Mbp chromothripsis event, and we uncover a complex SV pattern termed templated insertion (TI) thread, characterized by short (mostly <1 kb) insertions showing prevalent self-concatenation into highly amplified structures of up to 50 kbp in size. TI threads occur in 3% of cancers, with a prevalence up to 74% in liposarcoma, and frequent colocalization with chromothripsis. We also perform long-read-based methylome profiling and discover allele-specific methylation (ASM) effects, complex rearrangements exhibiting differential methylation, and differential promoter methylation in cancer-driver genes. Our study shows the advantage of long-read sequencing in the discovery and characterization of complex somatic rearrangements.