
    Back-translation for discovering distant protein homologies

    Frameshift mutations in protein-coding DNA sequences produce a drastic change in the resulting protein sequence, which prevents classic protein alignment methods from revealing the proteins' common origin. Moreover, when a large number of substitutions are additionally involved in the divergence, homology detection becomes difficult even at the DNA level. To cope with this situation, we propose a novel method to infer distant homology relations between two proteins that accounts for frameshift and point mutations that may have affected the coding sequences. We design a dynamic programming alignment algorithm over memory-efficient graph representations of the complete set of putative DNA sequences of each protein, with the goal of determining the two putative DNA sequences which have the best scoring alignment under a powerful scoring system designed to reflect the most probable evolutionary process. This allows us to uncover evolutionary information that is not captured by traditional alignment methods, which is confirmed by biologically significant examples. Comment: The 9th International Workshop on Algorithms in Bioinformatics (WABI), Philadelphia, United States of America (2009)
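To make "the complete set of putative DNA sequences of each protein" concrete, here is a minimal Python sketch of naive back-translation under the standard genetic code. It is not the paper's graph representation or its alignment algorithm; the names `CODONS_FOR`, `count_putative_dna`, and `putative_dna` are hypothetical and chosen for illustration only. The sketch only shows why the set grows quickly with protein length, which is what motivates a compact representation.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]

# Hypothetical helper table: amino acid -> list of codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(CODONS, AMINO):
    CODONS_FOR.setdefault(aa, []).append(codon)


def count_putative_dna(protein):
    """Number of DNA sequences that translate to `protein` (product of codon counts)."""
    return prod(len(CODONS_FOR[aa]) for aa in protein)


def putative_dna(protein):
    """Lazily enumerate every putative DNA sequence; only practical for short proteins."""
    for codons in product(*(CODONS_FOR[aa] for aa in protein)):
        yield "".join(codons)


if __name__ == "__main__":
    print(count_putative_dna("MLS"))
    print(list(putative_dna("MW")))
```

Because the count is a product over positions, explicit enumeration is infeasible for real proteins; a representation that shares structure per position, as the abstract describes, avoids materializing the set.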

    A Graph-based Framework for Transmission of Correlated Sources over Broadcast Channels

    In this paper we consider the communication problem that involves transmission of correlated sources over broadcast channels. We consider a graph-based framework for this information transmission problem. The system involves a source coding module and a channel coding module. In the source coding module, the sources are efficiently mapped into a nearly semi-regular bipartite graph, and in the channel coding module, the edges of this graph are reliably transmitted over a broadcast channel. We consider nearly semi-regular bipartite graphs as a discrete interface between source coding and channel coding in this multiterminal setting. We provide an information-theoretic characterization of (1) the rate of exponential growth (as a function of the number of channel uses) of the size of the bipartite graphs whose edges can be reliably transmitted over a broadcast channel and (2) the rate of exponential growth (as a function of the number of source samples) of the size of the bipartite graphs which can reliably represent a pair of correlated sources to be transmitted over a broadcast channel. Comment: 36 pages, 9 figures