13 research outputs found

    Spectral methods for the detection and characterization of Topologically Associated Domains

    Get PDF
    The three-dimensional (3D) structure of the genome plays a crucial role in gene expression regulation. Chromatin conformation capture technologies (Hi-C) have revealed that the genome is organized in a hierarchy of topologically associated domains (TADs), sub-TADs, and chromatin loops which is relatively stable across cell-lines and even across species. These TADs dynamically reorganize during development of disease, and exhibit cell- and conditionspecific differences. Identifying such hierarchical structures and how they change between conditions is a critical step in understanding genome regulation and disease development. Despite their importance, there are relatively few tools for identification of TADs and even fewer for identification of hierarchies. Additionally, there are no publicly available tools for comparison of TADs across datasets. These tools are necessary to conduct large-scale genome-wide analysis and comparison of 3D structure. To address the challenge of TAD identification, we developed a novel sliding window-based spectral clustering framework that uses gaps between consecutive eigenvectors for TAD boundary identification. Our method, implemented in an R package, SpectralTAD, has automatic parameter selection, is robust to sequencing depth, resolution and sparsity of Hi-C data, and detects hierarchical, biologically relevant TADs. SpectralTAD outperforms four state-of-the-art TAD callers in simulated and experimental settings. We demonstrate that TAD boundaries shared among multiple levels of the TAD hierarchy were more enriched in classical boundary marks and more conserved across cell lines and tissues. SpectralTAD is available at http://bioconductor.org/packages/SpectralTAD/. To address the problem of TAD comparison, we developed TADCompare. TADCompare is based on a spectral clustering-derived measure called the eigenvector gap, which enables a loci-by-loci comparison of TAD boundary differences between datasets. Using this measure, we introduce methods for identifying differential and consensus TAD boundaries and tracking TAD boundary changes over time. We further propose a novel framework for the systematic classification of TAD boundary changes. Colocalization- and gene enrichment analysis of different types of TAD boundary changes revealed distinct biological functionality associated with them. TADCompare is available on https://github.com/dozmorovlab/TADCompare

    Quantifying Variation in Gait Features from Wearable Inertial Sensors Using Mixed Effects Models

    Get PDF
    The emerging technology of wearable inertial sensors has shown its advantages in collecting continuous longitudinal gait data outside laboratories. This freedom also presents challenges in collecting high-fidelity gait data. In the free-living environment, without constant supervision from researchers, sensor-based gait features are susceptible to variation from confounding factors such as gait speed and mounting uncertainty, which are challenging to control or estimate. This paper is one of the first attempts in the field to tackle such challenges using statistical modeling. By accepting the uncertainties and variation associated with wearable sensor-based gait data, we shift our efforts from detecting and correcting those variations to modeling them statistically. From gait data collected on one healthy, non-elderly subject during 48 full-factorial trials, we identified four major sources of variation, and quantified their impact on one gait outcome—range per cycle—using a random effects model and a fixed effects model. The methodology developed in this paper lays the groundwork for a statistical framework to account for sources of variation in wearable gait data, thus facilitating informative statistical inference for free-living gait analysis

    Quantifying Variation in Gait Features from Wearable Inertial Sensors Using Mixed Effects Models

    Get PDF
    The emerging technology of wearable inertial sensors has shown its advantages in collecting continuous longitudinal gait data outside laboratories. This freedom also presents challenges in collecting high-fidelity gait data. In the free-living environment, without constant supervision from researchers, sensor-based gait features are susceptible to variation from confounding factors such as gait speed and mounting uncertainty, which are challenging to control or estimate. This paper is one of the first attempts in the field to tackle such challenges using statistical modeling. By accepting the uncertainties and variation associated with wearable sensor-based gait data, we shift our efforts from detecting and correcting those variations to modeling them statistically. From gait data collected on one healthy, non-elderly subject during 48 full-factorial trials, we identified four major sources of variation, and quantified their impact on one gait outcome—range per cycle—using a random effects model and a fixed effects model. The methodology developed in this paper lays the groundwork for a statistical framework to account for sources of variation in wearable gait data, thus facilitating informative statistical inference for free-living gait analysis

    HiCcompare: an R-package for joint normalization and comparison of HI-C datasets

    No full text
    Abstract Background Changes in spatial chromatin interactions are now emerging as a unifying mechanism orchestrating the regulation of gene expression. Hi-C sequencing technology allows insight into chromatin interactions on a genome-wide scale. However, Hi-C data contains many DNA sequence- and technology-driven biases. These biases prevent effective comparison of chromatin interactions aimed at identifying genomic regions differentially interacting between, e.g., disease-normal states or different cell types. Several methods have been developed for normalizing individual Hi-C datasets. However, they fail to account for biases between two or more Hi-C datasets, hindering comparative analysis of chromatin interactions. Results We developed a simple and effective method, HiCcompare, for the joint normalization and differential analysis of multiple Hi-C datasets. The method introduces a distance-centric analysis and visualization of the differences between two Hi-C datasets on a single plot that allows for a data-driven normalization of biases using locally weighted linear regression (loess). HiCcompare outperforms methods for normalizing individual Hi-C datasets and methods for differential analysis (diffHiC, FIND) in detecting a priori known chromatin interaction differences while preserving the detection of genomic structures, such as A/B compartments. Conclusions HiCcompare is able to remove between-dataset bias present in Hi-C matrices. It also provides a user-friendly tool to allow the scientific community to perform direct comparisons between the growing number of pre-processed Hi-C datasets available at online repositories. HiCcompare is freely available as a Bioconductor R package https://bioconductor.org/packages/HiCcompare/

    Additional file 1: of HiCcompare: an R-package for joint normalization and comparison of HI-C datasets

    No full text
    Supplementary materials for the paper. This PDF file contains supplemental methods (Section 1), a computation performance evaluation of HiCcompare (Section 3), additional validation of methods used in HiCcompare, and extended comparisons with diffHic and FIND (Section 6 & 7). (PDF 5878 kb

    Additional file 4: of HiCcompare: an R-package for joint normalization and comparison of HI-C datasets

    No full text
    Table of gene enrichment results for NPC vs Neuron. This excecl file contains a worksheet for the GO MF results for the gene enrichment analysis between the NPC and Neuron. The GO BP and KEGG pathway analysis did not return any significant results and thus are not included here. (XLSX 11 kb

    Additional file 3: of HiCcompare: an R-package for joint normalization and comparison of HI-C datasets

    No full text
    Table of gene enrichment results for ESC vs NPC. This excecl file contains a worksheet for the GO MF, GO BP, and KEGG pathway analysis results for the gene enrichment analysis between the ESC and NPC discussed the in the results section. (XLSX 15 kb
    corecore