Sequence alignment by passing messages
BACKGROUND: Sequence alignment has become an indispensable tool in modern molecular biology research, and probabilistic sequence alignment models have been shown to provide an effective framework for building accurate sequence alignment tools. One such example is the pair hidden Markov model (pair-HMM), which has been especially popular in comparative sequence analysis for several reasons, including its effectiveness in modeling and detecting sequence homology, its simplicity, and the existence of efficient algorithms for applying the model to sequence alignment problems. However, despite these advantages, pair-HMMs also have a number of practical limitations that may degrade their alignment performance or render them unsuitable for certain alignment tasks. RESULTS: In this work, we propose a novel scheme for comparing and aligning biological sequences that can effectively address the shortcomings of traditional pair-HMMs. The proposed scheme is based on a simple message-passing approach, where messages are exchanged between neighboring symbol pairs that may be aligned in the optimal sequence alignment. The message-passing process yields probabilistic symbol alignment confidence scores, which may be used for predicting the optimal alignment that maximizes the expected number of correctly aligned symbol pairs. CONCLUSIONS: Extensive performance evaluation on protein alignment benchmark datasets shows that the proposed message-passing scheme clearly outperforms the traditional pair-HMM-based approach, in terms of both alignment accuracy and computational efficiency. Furthermore, the proposed scheme is numerically robust and amenable to massive parallelization.
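The final prediction step mentioned in this abstract can be illustrated with a small dynamic program. The sketch below is not the authors' message-passing scheme: it assumes the symbol alignment confidence scores are already available as a matrix (the names `post` and `mea_alignment` are hypothetical) and only shows how an alignment maximizing the expected number of correctly aligned symbol pairs could be recovered from such scores.

```python
def mea_alignment(post):
    """Return (score, pairs) maximizing the summed confidence of aligned pairs.

    post[i][j] is an assumed confidence (posterior probability) that symbol i
    of the first sequence aligns with symbol j of the second sequence.
    """
    n = len(post)
    m = len(post[0]) if n else 0

    # best[i][j]: best achievable score using the first i and first j symbols.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + post[i - 1][j - 1],  # align i-1 with j-1
                best[i - 1][j],                           # leave symbol i-1 unaligned
                best[i][j - 1],                           # leave symbol j-1 unaligned
            )

    # Trace back from the corner to recover the aligned index pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + post[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs


if __name__ == "__main__":
    # Made-up confidence scores for a 2-symbol and a 3-symbol sequence.
    print(mea_alignment([[0.9, 0.05, 0.0], [0.1, 0.2, 0.7]]))
```

The recursion has the same shape as a standard global alignment, except that the score of a pair is its confidence and gaps cost nothing, so the total equals the expected number of correct pairs under the assumed scores.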
Interference Alignment via Message-Passing
We introduce an iterative solution to the problem of interference alignment
(IA) over MIMO channels based on a message-passing formulation. We propose a
parameterization of the messages that enables the computation of IA precoders
by a min-sum algorithm over continuous variable spaces -- under this
parameterization, suitable approximations of the messages can be computed in
closed-form. We show that the iterative leakage minimization algorithm of
Cadambe et al. is a special case of our message-passing algorithm, obtained for
a particular schedule. Finally, we show that the proposed algorithm compares
favorably to iterative leakage minimization in terms of convergence speed, and
discuss a distributed implementation.
Comment: Submitted to the IEEE International Conference on Communications (ICC) 201
Generalized sequential tree-reweighted message passing
This paper addresses the problem of approximate MAP-MRF inference in general
graphical models. Following [36], we consider a family of linear programming
relaxations of the problem where each relaxation is specified by a set of
nested pairs of factors for which the marginalization constraint needs to be
enforced. We develop a generalization of the TRW-S algorithm [9] for this
problem, where we use a decomposition into junction chains, monotonic w.r.t.
some ordering on the nodes. This generalizes the monotonic chains in [9] in a
natural way. We also show how to deal with nested factors in an efficient way.
Experiments show an improvement over min-sum diffusion, MPLP and subgradient
ascent algorithms on a number of computer vision and natural language
processing problems.
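For readers unfamiliar with min-sum message passing, the following sketch shows the exact min-sum dynamic program on a single chain, the kind of subproblem that chain-based decompositions build on. It is not TRW-S or its generalization; the function name `min_sum_chain` and the cost tables are hypothetical, chosen only for illustration.

```python
def min_sum_chain(unary, pairwise):
    """Exact MAP labeling of a chain by forward min-sum messages.

    unary[t][k]    : cost of giving label k to node t
    pairwise[a][b] : cost of adjacent nodes taking labels a then b
    Returns (minimum energy, list of labels).
    """
    n, k = len(unary), len(unary[0])
    msg = list(unary[0])  # best cost of any prefix ending in each label
    back = []             # back[t-1][b]: best predecessor label of b at node t

    for t in range(1, n):
        new_msg, arg = [], []
        for b in range(k):
            a_best = min(range(k), key=lambda a: msg[a] + pairwise[a][b])
            new_msg.append(msg[a_best] + pairwise[a_best][b] + unary[t][b])
            arg.append(a_best)
        msg = new_msg
        back.append(arg)

    # Pick the best final label, then follow the back-pointers.
    label = min(range(k), key=lambda a: msg[a])
    energy = msg[label]
    labels = [label]
    for arg in reversed(back):
        label = arg[label]
        labels.append(label)
    labels.reverse()
    return energy, labels


if __name__ == "__main__":
    unary = [[0, 3], [3, 0], [0, 3]]   # made-up unary costs
    potts = [[0, 1], [1, 0]]           # made-up Potts smoothness cost
    print(min_sum_chain(unary, potts))
```

On a single chain this procedure is exact; on general graphs, message-passing schemes of the kind the abstract compares only approximate the MAP solution.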
Clustering with shallow trees
We propose a new method for hierarchical clustering based on the optimisation
of a cost function over trees of limited depth, and we derive a
message-passing method that solves it efficiently. The method and
algorithm can be interpreted as a natural interpolation between two well-known
approaches, namely single linkage and the recently presented Affinity
Propagation. Using this general scheme, we analyze three structured
biological/medical datasets (human population based on genetic information, proteins
based on sequences and verbal autopsies) and show that the interpolation
technique provides new insight.
Comment: 11 pages, 7 figures
Models of Interaction as a Grounding for Peer to Peer Knowledge Sharing
Most current attempts to achieve reliable knowledge sharing on a large scale have relied on pre-engineering of content and supply services. This, like traditional knowledge engineering, does not by itself scale to large, open, peer-to-peer systems because the cost of being precise about the absolute semantics of services and their knowledge rises rapidly as more services participate. We describe how to break out of this deadlock by focusing on semantics related to interaction and using this to avoid dependency on a priori semantic agreement, instead making semantic commitments incrementally at run time. Our method is based on interaction models that are mobile in the sense that they may be transferred to other components, which provides a mechanism for service composition and for coalition formation. By shifting the emphasis to interaction (the details of which may be hidden from users) we can obtain knowledge sharing of sufficient quality for sustainable communities of practice without the barrier of complex meta-data provision prior to community formation.
The Parallelism Motifs of Genomic Data Analysis
Genomic data sets are growing dramatically as the cost of sequencing
continues to decline and small sequencing devices become available. Enormous
community databases store and share this data with the research community, but
some of these genomic data analysis problems require large scale computational
platforms to meet both the memory and computational requirements. These
applications differ from scientific simulations that dominate the workload on
high end parallel systems today and place different requirements on programming
support, software libraries, and parallel architectural design. For example,
they involve irregular communication patterns such as asynchronous updates to
shared data structures. We consider several problems in high performance
genomics analysis, including alignment, profiling, clustering, and assembly for
both single genomes and metagenomes. We identify some of the common
computational patterns or motifs that help inform parallelization strategies
and compare our motifs to some of the established lists, arguing that at least
two key patterns, sorting and hashing, are missing.
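As a small illustration of the two patterns named at the end of this abstract, the sketch below counts k-mers (length-k substrings of reads) once with a hash table and once by sorting. K-mer counting is chosen here as an assumed example of a genomics kernel; the function names are hypothetical and nothing below is taken from the paper itself.

```python
from collections import Counter
from itertools import groupby


def count_kmers_hashed(reads, k):
    """Hashing pattern: update a shared hash table once per k-mer."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return dict(counts)


def count_kmers_sorted(reads, k):
    """Sorting pattern: sort all k-mers, then count runs of equal neighbours."""
    kmers = sorted(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    return {kmer: sum(1 for _ in run) for kmer, run in groupby(kmers)}


if __name__ == "__main__":
    reads = ["ACGTAC", "CGTA"]  # made-up reads
    print(count_kmers_hashed(reads, 3))
    print(count_kmers_sorted(reads, 3))
```

Both functions return the same counts. The hashed version performs many small, irregular updates to one shared structure, while the sorted version replaces them with a single bulk sort followed by a sequential pass.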