Back-translation for discovering distant protein homologies
Frameshift mutations in protein-coding DNA sequences produce a drastic change
in the resulting protein sequence, which prevents classic protein alignment
methods from revealing the proteins' common origin. Moreover, when a large
number of substitutions are additionally involved in the divergence, the
homology detection becomes difficult even at the DNA level. To cope with this
situation, we propose a novel method to infer distant homology relations between
two proteins that accounts for the frameshift and point mutations that may have
affected the coding sequences. We design a dynamic programming alignment
algorithm over memory-efficient graph representations of the complete set of
putative DNA sequences of each protein, with the goal of determining the two
putative DNA sequences which have the best scoring alignment under a powerful
scoring system designed to reflect the most probable evolutionary process. This
allows us to uncover evolutionary information that is not captured by
traditional alignment methods, which is confirmed by biologically significant
examples.
Comment: The 9th International Workshop on Algorithms in Bioinformatics
(WABI), Philadelphia, United States (2009)
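To make the idea of "the complete set of putative DNA sequences of each protein" concrete, here is a minimal Python sketch of back-translation under the standard genetic code. It is not the paper's graph representation or its alignment algorithm; it only lists the candidate codons at each protein position and counts how many DNA sequences they imply, which suggests why a compact representation is needed. The names `CODONS_FOR`, `back_translation_columns`, and `count_putative_dna` are hypothetical and introduced here for illustration.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest); '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
CODONS_FOR = {}
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append(b1 + b2 + b3)


def back_translation_columns(protein):
    """Return, for each residue, the codons that could have encoded it."""
    return [CODONS_FOR[aa] for aa in protein]


def count_putative_dna(protein):
    """Count the DNA sequences that translate to `protein`.

    The count is the product of the per-residue codon choices, so it grows
    exponentially with protein length.
    """
    total = 1
    for codons in back_translation_columns(protein):
        total *= len(codons)
    return total
```

For instance, `count_putative_dna("MLS")` multiplies one codon for M by six each for L and S. Because every position offers an independent choice, storing the per-position alternatives (as in the column list above) stays linear in the protein length even though the set of full sequences does not.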
A Graph-based Framework for Transmission of Correlated Sources over Broadcast Channels
In this paper we consider the communication problem that involves
transmission of correlated sources over broadcast channels. We consider a
graph-based framework for this information transmission problem. The system
involves a source coding module and a channel coding module. In the source
coding module, the sources are efficiently mapped into a nearly semi-regular
bipartite graph, and in the channel coding module, the edges of this graph are
reliably transmitted over a broadcast channel. We consider nearly semi-regular
bipartite graphs as discrete interface between source coding and channel coding
in this multiterminal setting. We provide an information-theoretic
characterization of (1) the rate of exponential growth (as a function of the
number of channel uses) of the size of the bipartite graphs whose edges can be
reliably transmitted over a broadcast channel and (2) the rate of exponential
growth (as a function of the number of source samples) of the size of the
bipartite graphs which can reliably represent a pair of correlated sources to
be transmitted over a broadcast channel.
Comment: 36 pages, 9 figures