Structure of conflict graphs in constraint alignment problems and algorithms
We consider the constrained graph alignment problem which has applications in
biological network analysis. Given two input graphs, a pair of vertex mappings
induces an edge conservation if the vertex pairs are adjacent in their
respective graphs. The goal is to provide a one-to-one mapping between the
vertices of the input graphs in order to maximize edge conservation. However,
the allowed mappings are restricted, since each vertex of one input graph may
be mapped to at most a given number of specified vertices in the other graph.
Most of the results in this paper deal with the special case that has attracted
the most attention in the related literature. We formulate the problem as a maximum
independent set problem in a related conflict graph and investigate
structural properties of this graph in terms of forbidden subgraphs. We are
interested, in particular, in excluding certain wheels, fans, cliques or claws
(all terms are defined in the paper), which corresponds to excluding certain
cycles, paths, cliques or independent sets in the neighborhood of each vertex.
Then, we investigate algorithmic consequences of some of these properties,
which illustrates the potential of this approach and opens new directions for
further work. In particular, this approach allows us to reinterpret a known
polynomial case in terms of the conflict graph and to improve known approximation
and fixed-parameter tractability results through efficiently solving the
maximum independent set problem in conflict graphs. Some of our new
approximation results involve approximation ratios that are functions of the
optimal value, in particular its square root; results of this kind cannot be
achieved for maximum independent set in general graphs.
Comment: 22 pages, 6 figures
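The conflict-graph formulation described in this abstract can be illustrated with a small sketch. It is only one plausible reading of the abstract, not the paper's construction: the names `conserved_edge_candidates`, `in_conflict`, `conflict_graph` and `max_independent_set_size`, the `allowed` dictionary and the toy graphs are all assumptions made for illustration, and the independent set is found by brute force rather than by any of the paper's algorithms.

```python
from itertools import combinations

def conserved_edge_candidates(edges1, edges2, allowed):
    """List candidate conserved edges: two mapping pairs (u, x), (v, y) such that
    u-v is an edge of graph 1, x-y is an edge of graph 2, and both pairs are allowed."""
    adj2 = {frozenset(e) for e in edges2}
    candidates = []
    for u, v in edges1:
        for x in allowed.get(u, ()):
            for y in allowed.get(v, ()):
                if x != y and frozenset((x, y)) in adj2:
                    candidates.append(frozenset({(u, x), (v, y)}))
    return candidates

def in_conflict(c1, c2):
    """Two candidates conflict when their mapping pairs cannot coexist in a
    one-to-one mapping: same source with different targets, or vice versa."""
    pairs = c1 | c2
    return any((a == b) != (x == y) for (a, x), (b, y) in combinations(pairs, 2))

def conflict_graph(candidates):
    """Adjacency sets of the conflict graph, indexed by candidate position."""
    adj = {i: set() for i in range(len(candidates))}
    for i, j in combinations(range(len(candidates)), 2):
        if in_conflict(candidates[i], candidates[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj

def max_independent_set_size(adj):
    """Brute-force maximum independent set size (exponential; toy inputs only)."""
    nodes = list(adj)
    best = 0
    for mask in range(1 << len(nodes)):
        chosen = [nodes[k] for k in range(len(nodes)) if mask >> k & 1]
        if all(b not in adj[a] for a, b in combinations(chosen, 2)):
            best = max(best, len(chosen))
    return best

# Hypothetical toy instance: two paths, with vertex "c" allowed two targets.
edges1 = [("a", "b"), ("b", "c")]
edges2 = [("x", "y"), ("y", "z")]
allowed = {"a": ["x"], "b": ["y"], "c": ["z", "x"]}
candidates = conserved_edge_candidates(edges1, edges2, allowed)
adj = conflict_graph(candidates)
print(len(candidates), max_independent_set_size(adj))
```

In this toy instance the independent set corresponds to the mapping a→x, b→y, c→z, which conserves both edges of the first path.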
Hybrid modeling, HMM/NN architectures, and protein applications
We describe a hybrid modeling approach where the parameters of a model are calculated and modulated by another model, typically a neural network (NN), to avoid both overfitting and underfitting. We develop the approach for the case of Hidden Markov Models (HMMs), by deriving a class of hybrid HMM/NN architectures. These architectures can be trained with unified algorithms that blend HMM dynamic programming with NN backpropagation. In the case of complex data, mixtures of HMMs or modulated HMMs must be used. NNs can then be applied both to the parameters of each single HMM, and to the switching or modulation of the models, as a function of input or context. Hybrid HMM/NN architectures provide a flexible NN parameterization for the control of model structure and complexity. At the same time, they can capture distributions that, in practice, are inaccessible to single HMMs. The HMM/NN hybrid approach is tested, in its simplest form, by constructing a model of the immunoglobulin protein family. A hybrid model is trained, and a multiple alignment derived, with fewer than a quarter of the parameters used by previous single HMMs.
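The idea of letting a neural network compute HMM parameters can be sketched as follows. This is a minimal illustration under stated assumptions, not the architecture from the paper: the function names, the single tanh hidden layer, the one-hot state encoding and the all-zero weights are hypothetical, and training (the blend of dynamic programming and backpropagation) is not shown.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def nn_emissions(state_vec, w_hidden, w_out):
    """Tiny NN (assumed shape): tanh hidden layer, softmax output.
    The output is used as the emission distribution of one HMM state."""
    hidden = [math.tanh(sum(w * s for w, s in zip(row, state_vec))) for row in w_hidden]
    return softmax([sum(w * h for w, h in zip(row, hidden)) for row in w_out])

def forward_likelihood(obs, init, trans, emit):
    """Standard HMM forward algorithm: probability of the observation sequence."""
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][o]
                 for s in range(n)]
    return sum(alpha)

# Hypothetical setup: 2 states, 4 symbols, 3 hidden units, all-zero weights.
w_hidden = [[0.0, 0.0] for _ in range(3)]
w_out = [[0.0, 0.0, 0.0] for _ in range(4)]
emit = [nn_emissions(one_hot, w_hidden, w_out) for one_hot in ([1.0, 0.0], [0.0, 1.0])]
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
print(forward_likelihood([0, 3, 1], init, trans, emit))
```

Because the emission tables for all states come from one shared set of weights, the number of free parameters is set by the network size rather than by the number of states times the alphabet size.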
Solving Maximum Clique Problem for Protein Structure Similarity
A basic assumption of molecular biology is that proteins sharing close
three-dimensional (3D) structures are likely to share a common function and in
most cases derive from a common ancestor. Computing the similarity between two
protein structures is therefore a crucial task and has been extensively
investigated. Evaluating the similarity of two proteins can be done by finding
an optimal one-to-one matching between their components, which is equivalent to
identifying a maximum weighted clique in a specific "alignment graph". In this
paper we present a new integer programming formulation for solving such clique
problems. The model has been implemented using the ILOG CPLEX Callable Library.
In addition, we designed a dedicated branch and bound algorithm for solving the
maximum cardinality clique problem. Both approaches have been integrated in
VAST (Vector Alignment Search Tool), a software tool for aligning protein 3D
structures that is widely used at NCBI (National Center for Biotechnology
Information). The original VAST clique solver uses the well-known Bron and
Kerbosch algorithm (BK). Our computational results on real-life protein
alignment instances show that our branch and bound algorithm is up to 116 times
faster than BK for the largest proteins.
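For reference, the baseline named in this abstract can be sketched as the textbook Bron-Kerbosch enumeration with pivoting. This is not VAST's solver and not the authors' branch and bound algorithm; the function names and the toy graph are assumptions, and the graph is a plain adjacency dictionary rather than a real alignment graph.

```python
def bron_kerbosch(adj, r=None, p=None, x=None):
    """Yield every maximal clique of a graph given as {vertex: set(neighbours)}."""
    if r is None:
        r, p, x = set(), set(adj), set()
    if not p and not x:
        yield set(r)  # nothing left to add or exclude: r is maximal
        return
    # Pivot on the vertex with the most neighbours in p to prune branches.
    pivot = max(p | x, key=lambda v: len(adj[v] & p))
    for v in list(p - adj[pivot]):
        yield from bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v])
        p.remove(v)
        x.add(v)

def maximum_clique(adj):
    """Largest clique by cardinality, taken over all maximal cliques."""
    return max(bron_kerbosch(adj), key=len)

# Hypothetical toy graph: a triangle 1-2-3 with a pendant vertex 4.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(sorted(maximum_clique(graph)))
```

Enumerating all maximal cliques is wasteful when only the largest one is needed, which is the usual motivation for a bounded search such as the branch and bound the abstract describes.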
Protein alignment HW/SW optimizations
Biosequence alignment has recently received strong support from both commodity and dedicated hardware platforms. The ever-growing requirements of this application motivate the search for improved implementations that reduce processing time and extend capabilities. We propose an unprecedented hardware improvement to the classic Smith-Waterman (S-W) algorithm based on a twofold approach: i) an on-the-fly gap-open/gap-extension selection that reduces the hardware implementation complexity; ii) a pre-selection filter that uses reduced amino-acid alphabets to screen out non-significant sequences and to shorten the S-W iterations on huge reference databases. We demonstrate the improvements with respect to a classic approach in terms of both algorithm efficiency and hardware performance (FPGA and ASIC post-synthesis analysis).
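The gap-open/gap-extension distinction mentioned in this abstract comes from the affine gap model of Smith-Waterman. The sketch below is the standard software recurrence (often attributed to Gotoh), not the hardware design proposed in the paper; the function name and the scoring values are assumptions chosen for illustration, and a real protein aligner would use a substitution matrix instead of a flat match/mismatch score.

```python
def smith_waterman_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score with affine gaps.
    A gap of length k costs gap_open + (k - 1) * gap_extend."""
    neg = float("-inf")
    prev_h = [0] * (len(b) + 1)    # H: best score ending at the previous row
    prev_f = [neg] * (len(b) + 1)  # F: best score ending in a vertical gap
    best = 0
    for i in range(1, len(a) + 1):
        h = [0] * (len(b) + 1)
        f = [neg] * (len(b) + 1)
        e = neg                    # E: best score ending in a horizontal gap
        for j in range(1, len(b) + 1):
            # Either open a new gap or extend the one already running.
            e = max(h[j - 1] + gap_open, e + gap_extend)
            f[j] = max(prev_h[j] + gap_open, prev_f[j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            h[j] = max(0, prev_h[j - 1] + s, e, f[j])
            best = max(best, h[j])
        prev_h, prev_f = h, f
    return best

print(smith_waterman_affine("AAAAGGGG", "AAAATTGGGG"))
```

Each cell takes the better of opening and extending a gap, which is the choice the abstract's on-the-fly selection refers to; how the hardware makes that choice is not specified in the text above.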