382 research outputs found

    Optimal Assembly for High Throughput Shotgun Sequencing

    We present a framework for the design of optimal assembly algorithms for shotgun sequencing under the criterion of complete reconstruction. We derive a lower bound on the read length and the coverage depth required for reconstruction in terms of the repeat statistics of the genome. Building on earlier works, we design a de Bruijn graph based assembly algorithm which can achieve very close to the lower bound for repeat statistics of a wide range of sequenced genomes, including the GAGE datasets. The results are based on a set of necessary and sufficient conditions on the DNA sequence and the reads for reconstruction. The conditions can be viewed as the shotgun sequencing analogue of Ukkonen-Pevzner's necessary and sufficient conditions for Sequencing by Hybridization. Comment: 26 pages, 18 figures

    Inferring the Sign of Kinase-Substrate Interactions by Combining Quantitative Phosphoproteomics with a Literature-Based Mammalian Kinome Network

    Protein phosphorylation is a reversible post-translational modification commonly used by cell signaling networks to transmit information about the extracellular environment into intracellular organelles for the regulation of the activity and sorting of proteins within the cell. For this study we reconstructed a literature-based mammalian kinase-substrate network from several online resources. The interactions within this directed graph network connect kinases to their substrates through specific phosphosites, including kinase-kinase regulatory interactions. However, the "signs" of links, activation or inhibition of the substrate upon phosphorylation, within this network are mostly unknown. Here we show how we can infer the "signs" indirectly using data from quantitative phosphoproteomics experiments applied to mammalian cells combined with the literature-based kinase-substrate network. Our inference method was able to predict the sign for 321 links and 153 phosphosites on 120 kinases, resulting in a signed and directed subnetwork of mammalian kinase-kinase interactions. Such an approach can rapidly advance the reconstruction of cell signaling pathways and networks regulating mammalian cells. Comment: 5 pages, 3 figures, IEEE-BIBE conference

    Telescoper: de novo assembly of highly repetitive regions.

    Motivation: With advances in sequencing technology, it has become faster and cheaper to obtain short-read data from which to assemble genomes. Although there has been considerable progress in the field of genome assembly, producing high-quality de novo assemblies from short reads remains challenging, primarily because of the complex repeat structures found in the genomes of most higher organisms. The telomeric regions of many genomes are particularly difficult to assemble, though much could be gained from the study of these regions, as their evolution has not been fully characterized and they have been linked to aging. Results: In this article, we tackle the problem of assembling highly repetitive regions by developing a novel algorithm that iteratively extends long paths through a series of read-overlap graphs and evaluates them based on a statistical framework. Our algorithm, Telescoper, uses short- and long-insert libraries in an integrated way throughout the assembly process. Results on real and simulated data demonstrate that our approach can effectively resolve much of the complex repeat structures found in the telomeres of yeast genomes, especially when longer long-insert libraries are used. Availability: Telescoper is publicly available for download at sourceforge.net/p/[email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

    SMaSH: A Benchmarking Toolkit for Human Genome Variant Calling

    Motivation: Computational methods are essential to extract actionable information from raw sequencing data, and to thus fulfill the promise of next-generation sequencing technology. Unfortunately, computational tools developed to call variants from human sequencing data disagree on many of their predictions, and current methods to evaluate accuracy and computational performance are ad-hoc and incomplete. Agreement on benchmarking variant calling methods would stimulate development of genomic processing tools and facilitate communication among researchers. Results: We propose SMaSH, a benchmarking methodology for evaluating human genome variant calling algorithms. We generate synthetic datasets, organize and interpret a wide range of existing benchmarking data for real genomes, and propose a set of accuracy and computational performance metrics for evaluating variant calling methods on this benchmarking data. Moreover, we illustrate the utility of SMaSH to evaluate the performance of some leading single nucleotide polymorphism (SNP), indel, and structural variant calling algorithms. Availability: We provide free and open access online to the SMaSH toolkit, along with detailed documentation, at smash.cs.berkeley.edu

    Bag6 complex contains a minimal tail-anchor–targeting module and a mock BAG domain

    BCL2-associated athanogene cochaperone 6 (Bag6) plays a central role in cellular homeostasis in a diverse array of processes and is part of the heterotrimeric Bag6 complex, which also includes ubiquitin-like 4A (Ubl4A) and transmembrane domain recognition complex 35 (TRC35). This complex recently has been shown to be important in the TRC pathway, the mislocalized protein degradation pathway, and the endoplasmic reticulum-associated degradation pathway. Here we define the architecture of the Bag6 complex, demonstrating that both TRC35 and Ubl4A have distinct C-terminal binding sites on Bag6 defining a minimal Bag6 complex. A crystal structure of the Bag6–Ubl4A dimer demonstrates that Bag6–BAG is not a canonical BAG domain, and this finding is substantiated biochemically. Remarkably, the minimal Bag6 complex defined here facilitates tail-anchored substrate transfer from small glutamine-rich tetratricopeptide repeat-containing protein α to TRC40. These findings provide structural insight into the complex network of proteins coordinated by Bag6

    Worldwide food recall patterns over an eleven month period: A country perspective.

    Background: Following the World Health Organization Forum in November 2007, the Beijing Declaration recognized the importance of food safety along with the rights of all individuals to a safe and adequate diet. The aim of this study is to retrospectively analyze the patterns in food alert and recall by countries to identify the principal hazard generators and gatekeepers of food safety in the eleven months leading up to the Declaration. Methods: The food recall data set was collected by the Laboratory of the Government Chemist (LGC, UK) over the period from January to November 2007. Statistics were computed with a focus on reporting patterns by the 117 countries. The complexity of the recorded interrelations was depicted as a network constructed from structural properties contained in the data. The analysed network properties included degrees, weighted degrees, modularity and k-core decomposition. Network analyses of the reports, based on 'country making report' (detector) and 'country reported on' (transgressor), revealed that the network is organized around a dominant core. Results: Ten countries were reported for sixty per cent of all faulty products marketed, with the top 5 countries having received between 100 and 281 reports. Further analysis of the dominant core revealed that out of the top five transgressors three made no reports (in the order China > Turkey > Iran). The top ten detectors account for three quarters of reports, with three exceeding 300 each (Italy: 406, Germany: 340, United Kingdom: 322). Conclusion: Of the 117 countries studied, the vast majority of food reports are made by 10 countries, with EU countries predominating. The majority of the faulty foodstuffs originate in ten countries with four major producers making no reports. This pattern is very distant from that proposed by the Beijing Declaration which urges all countries to take responsibility for the provision of safe and adequate diets for their nationals.

    A Characterization of Scale Invariant Responses in Enzymatic Networks

    A ubiquitous property of biological sensory systems is adaptation: a step increase in stimulus triggers an initial change in a biochemical or physiological response, followed by a more gradual relaxation toward a basal, pre-stimulus level. Adaptation helps maintain essential variables within acceptable bounds and allows organisms to readjust themselves to an optimum and non-saturating sensitivity range when faced with a prolonged change in their environment. Recently, it was shown theoretically and experimentally that many adapting systems, both at the organism and single-cell level, enjoy a remarkable additional feature: scale invariance, meaning that the initial, transient behavior remains (approximately) the same even when the background signal level is scaled. In this work, we set out to investigate under what conditions a broadly used model of biochemical enzymatic networks will exhibit scale-invariant behavior. An exhaustive computational study led us to discover a new property of surprising simplicity and generality, uniform linearizations with fast output (ULFO), whose validity we show is both necessary and sufficient for scale invariance of enzymatic networks. Based on this study, we go on to develop a mathematical explanation of how ULFO results in scale invariance. Our work provides a surprisingly consistent, simple, and general framework for understanding this phenomenon, and results in concrete experimental predictions.