    Orienting Ordered Scaffolds: Complexity and Algorithms

    Despite recent progress in genome sequencing and assembly, many of the currently available assembled genomes come in draft form. Such draft genomes consist of a large number of genomic fragments (scaffolds), whose order and/or orientation (i.e., strand) in the genome are unknown. Various scaffold assembly methods attempt to determine the order and orientation of scaffolds along the genome's chromosomes. Some of these methods (e.g., those based on FISH physical mapping or chromatin conformation capture) can infer the order of scaffolds, but not necessarily their orientation. This leads to a special case of the scaffold orientation problem (i.e., deducing the orientation of each scaffold) in which the order of the scaffolds is known. We address the problem of orienting ordered scaffolds (OOS) as an optimization problem based on given weighted orientations of scaffolds and their pairs (e.g., coming from paired-end sequencing reads, long reads, or homologous relations). We formalize this problem using the notion of a scaffold graph (i.e., a graph in which vertices correspond to the assembled contigs or scaffolds and edges represent connections between them). We prove that this problem is NP-hard, and present a polynomial-time algorithm for solving its special case, where the orientation of each scaffold is imposed relative to at most two other scaffolds. We further develop a fixed-parameter tractable (FPT) algorithm for the general case of the OOS problem.

    Uncertainties in Arctic Precipitation

    We acknowledge the generous support of the National Science Foundation under grant ARC-0909525, as well as the JAMSTEC-IARC cooperative agreement. It is crucial to measure precipitation accurately in order to predict the future water budget with confidence. In our study, we aim to understand and compare precipitation datasets and the discrepancies associated with them. We divide our datasets into three classes: raw data (data that have undergone only minimal quality control); corrected products (data that have been adjusted by their respective authors); and a reanalysis dataset (a combination of observed data and model output).