17 research outputs found

    Segmentation and genome annotation algorithms

    Segmentation and genome annotation (SAGA) algorithms are widely used to understand genome activity and gene regulation. These algorithms take as input epigenomic datasets, such as chromatin immunoprecipitation-sequencing (ChIP-seq) measurements of histone modifications or transcription factor binding. They partition the genome and assign a label to each segment such that positions with the same label exhibit similar patterns of input data. SAGA algorithms discover categories of activity such as promoters, enhancers, or parts of genes without prior knowledge of known genomic elements. In this sense, they generally act in an unsupervised fashion like clustering algorithms, but with the additional simultaneous function of segmenting the genome. Here, we review the common methodological framework that underlies these methods, review variants of and improvements upon this basic framework, catalogue existing large-scale reference annotations, and discuss the outlook for future work.
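
    The partition-and-label idea described in this abstract can be illustrated with a toy hidden Markov model. The sketch below is not taken from the review or from any specific SAGA tool: the function name `viterbi_segment`, the single input track, the hand-set state means, and the `stay_prob` and `sigma` values are all assumptions made for illustration. Real tools work on many tracks and learn their parameters from the data, whereas here the parameters are fixed and only the decoding and segment-merging steps are shown.

```python
import math


def viterbi_segment(signal, means, stay_prob=0.9, sigma=1.0):
    """Toy segmentation: label each position, then merge runs into segments.

    Assumptions (illustrative only): one signal track, Gaussian emissions with
    hand-set means and a shared sigma, at least two states, non-empty signal.
    Returns half-open (start, end, label) tuples.
    """
    n_states = len(means)
    switch_prob = (1.0 - stay_prob) / (n_states - 1)

    def log_emit(x, k):
        # Gaussian log density, dropping the constant shared by all states.
        return -((x - means[k]) ** 2) / (2.0 * sigma ** 2)

    def log_trans(i, j):
        return math.log(stay_prob if i == j else switch_prob)

    # score[k] holds the best log probability of any path ending in state k.
    score = [math.log(1.0 / n_states) + log_emit(signal[0], k)
             for k in range(n_states)]
    back = []
    for x in signal[1:]:
        prev, score, pointers = score, [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: prev[i] + log_trans(i, j))
            pointers.append(best_i)
            score.append(prev[best_i] + log_trans(best_i, j) + log_emit(x, j))
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    labels = [state]
    for pointers in reversed(back):
        state = pointers[state]
        labels.append(state)
    labels.reverse()

    # Merge consecutive positions that share a label into one segment.
    segments, start = [], 0
    for pos in range(1, len(labels) + 1):
        if pos == len(labels) or labels[pos] != labels[start]:
            segments.append((start, pos, labels[start]))
            start = pos
    return segments


if __name__ == "__main__":
    track = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2, 0.1, 0.0]
    print(viterbi_segment(track, means=[0.0, 5.0]))
```

    The transition penalty is what separates this from plain per-position clustering: a single mildly unusual value is absorbed into the surrounding segment instead of creating a one-position segment of its own.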

    Additional file 2 of Choosing panels of genomics assays using submodular optimization

    No full text
    List of all assays used. File is in gzipped, tab-delimited format. Columns correspond to: (1) assay type, (2) cell type, (3) file name of file on original server, and (4) URL of server that the file was downloaded from. (TAB 300 kb)
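
    A file with this layout could be read along the following lines. This is a minimal sketch under stated assumptions: the record type `AssayRecord`, the function `read_assay_list`, and the absence of a header row are not specified by the description above and are hypothetical.

```python
import csv
import gzip
from typing import NamedTuple


class AssayRecord(NamedTuple):
    # Field names are illustrative; the order follows the four described columns.
    assay_type: str
    cell_type: str
    file_name: str
    server_url: str


def read_assay_list(path):
    """Read a gzipped, tab-delimited assay list into AssayRecord tuples.

    Assumes no header row and exactly four columns per non-empty line;
    rows with a different column count are skipped.
    """
    records = []
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) == 4:
                records.append(AssayRecord(*row))
    return records
```

    Calling `read_assay_list` on a local copy of the file would return one record per assay, which can then be grouped by `assay_type` or `cell_type` as needed.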

    Segway 2.0 Application Note Datasets

    No full text
    Learned parameters and resulting segmentation corresponding to the analyses shown in the Segway 2.0 application note.

    Directory structure:

    GMM (datasets corresponding to the mixture of Gaussians analysis)
        1-component
            traindir/
                log/ (training log likelihood progression)
                params/ (learned parameters)
            identifydir/
                segway.bed.gz (segmentation)
        3-component
            traindir/
                log/ (training log likelihood progression)
                params/ (learned parameters)
            identifydir/
                segway.bed.gz (segmentation)

    minibatch-fixed (datasets corresponding to the minibatch learning analysis)
        fixed/
            traindir/
                log/ (training and validation log likelihood progression)
                params/ (learned parameters)
        minibatch/
            traindir/
                log/ (training and validation log likelihood progression)
                params/ (learned parameters)

    TSS_prediction (datasets corresponding to the TSS prediction analysis; k = component number = 1-5, n = random start number = 1-10)
        outputs_[date]_k/
            traindir/
                log/ (training and validation log likelihood progression)
                params/ (learned parameters)
            identifydir_n/
                segway.bed.gz (segmentation)
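
    As a rough illustration of how one of the `segway.bed.gz` segmentation files might be summarised, the sketch below totals the number of bases assigned to each label. It assumes the file follows the standard BED layout (chromosome, start, end, name) with the segment label in the fourth column, and that any `track`, `browser`, or comment lines should be skipped. The function name `bases_per_label` is hypothetical and is not part of Segway.

```python
import gzip
from collections import Counter


def bases_per_label(bed_gz_path):
    """Sum segment lengths per label in a gzipped BED file.

    Assumes tab-separated BED lines with the label in column 4 and
    half-open coordinates, so each segment covers (end - start) bases.
    """
    totals = Counter()
    with gzip.open(bed_gz_path, "rt") as handle:
        for line in handle:
            # Skip header-style lines that are not segment records.
            if line.startswith(("track", "browser", "#")):
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 4:
                continue
            start, end, label = int(fields[1]), int(fields[2]), fields[3]
            totals[label] += end - start
    return totals
```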
