23 research outputs found

    A path following algorithm for the graph matching problem

    We propose a convex-concave programming approach for the labeled weighted graph matching problem. The convex-concave programming formulation is obtained by rewriting the weighted graph matching problem as a least-squares problem on the set of permutation matrices and relaxing it to two different optimization problems: a quadratic convex and a quadratic concave optimization problem on the set of doubly stochastic matrices. The concave relaxation has the same global minimum as the initial graph matching problem, but the search for its global minimum is also a hard combinatorial problem. We therefore construct an approximation of the concave problem solution by following a solution path of a convex-concave problem obtained by linear interpolation of the convex and concave formulations, starting from the convex relaxation. This method makes it easy to integrate information on graph label similarities into the optimization problem, and therefore to perform labeled weighted graph matching. The algorithm is compared with some of the best-performing graph matching methods on four datasets: simulated graphs, QAPLib, retina vessel images, and handwritten Chinese characters. In all cases, the results are competitive with the state of the art.
    Comment: 23 pages, 13 figures, typo correction, new results in sections 4, 5,
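The least-squares formulation mentioned in the abstract can be illustrated on a toy instance. The sketch below is not the path-following algorithm itself: it assumes the cost is the squared Frobenius norm ||AP - PB||^2 over permutation matrices P, which for a permutation `perm` equals the sum of squared differences between `A[i][j]` and `B[perm[i]][perm[j]]`, and it minimizes that cost by exhaustive search, which is only feasible for very small graphs. The function names and the example matrices are hypothetical.

```python
from itertools import permutations


def matching_cost(A, B, perm):
    """Least-squares matching cost for a vertex assignment i -> perm[i].

    Sums (A[i][j] - B[perm[i]][perm[j]])**2 over all vertex pairs, which
    equals ||AP - PB||_F^2 for the permutation matrix P encoding perm.
    """
    n = len(A)
    return sum(
        (A[i][j] - B[perm[i]][perm[j]]) ** 2
        for i in range(n)
        for j in range(n)
    )


def brute_force_match(A, B):
    """Return the permutation with the lowest cost (exhaustive, O(n!))."""
    n = len(A)
    return min(permutations(range(n)), key=lambda p: matching_cost(A, B, p))


# Hypothetical example: a weighted path graph and a relabeled copy of it.
A = [[0, 1, 0],
     [1, 0, 2],
     [0, 2, 0]]
B = [[0, 2, 1],
     [2, 0, 0],
     [1, 0, 0]]

best = brute_force_match(A, B)
print(best, matching_cost(A, B, best))
```

Exhaustive search is what the relaxations described in the abstract are meant to avoid: replacing permutation matrices with doubly stochastic matrices turns the search into a continuous optimization problem.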

    A new protein binding pocket similarity measure based on comparison of clouds of atoms in 3D: application to ligand prediction

    Background: Predicting which molecules can bind to a given binding site of a protein with known 3D structure is important to decipher the protein function, and useful in drug design. A classical assumption in structural biology is that proteins with similar 3D structures have related molecular functions, and therefore may bind similar ligands. However, proteins that do not display any overall sequence or structure similarity may also bind similar ligands if they contain similar binding sites. Quantitatively assessing the similarity between binding sites may therefore be useful to propose new ligands for a given pocket, based on those known for similar pockets.
    Results: We propose a new method to quantify the similarity between binding pockets, and explore its relevance for ligand prediction. We represent each pocket by a cloud of atoms, and assess the similarity between two pockets by aligning their atoms in the 3D space and comparing the resulting configurations with a convolution kernel. Pocket alignment and comparison is possible even when the corresponding proteins share no sequence or overall structure similarities. In order to predict ligands for a given target pocket, we compare it to an ensemble of pockets with known ligands to identify the most similar pockets. We discuss two criteria to evaluate the performance of a binding pocket similarity measure in the context of ligand prediction, namely, area under ROC curve (AUC scores) and classification based scores. We show that the latter is better suited to evaluate the methods with respect to ligand prediction, and demonstrate the relevance of our new binding site similarity compared to existing similarity measures.
    Conclusions: This study demonstrates the relevance of the proposed method to identify ligands binding to known binding pockets. We also provide a new benchmark for future work in this field. The new method and the benchmark are available at http://cbio.ensmp.fr/paris/.
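As a rough illustration of comparing two atom clouds with a convolution kernel, the sketch below sums a Gaussian function over all pairs of atoms from two clouds and normalizes the result. This is an assumption-laden toy, not the published method: the Gaussian form, the `sigma` bandwidth, the normalization, and the function names are all choices made here for illustration, and the alignment step described in the abstract is omitted (the clouds are assumed to be already aligned).

```python
import math


def cloud_kernel(cloud_a, cloud_b, sigma=1.0):
    """Sum a Gaussian of the squared distance over all atom pairs.

    cloud_a and cloud_b are lists of (x, y, z) coordinates.
    sigma is an assumed bandwidth, not a value taken from the source.
    """
    total = 0.0
    for ax, ay, az in cloud_a:
        for bx, by, bz in cloud_b:
            sq_dist = (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total


def cloud_similarity(cloud_a, cloud_b, sigma=1.0):
    """Normalized kernel value in [0, 1]; equals 1 for identical clouds."""
    k_ab = cloud_kernel(cloud_a, cloud_b, sigma)
    k_aa = cloud_kernel(cloud_a, cloud_a, sigma)
    k_bb = cloud_kernel(cloud_b, cloud_b, sigma)
    return k_ab / math.sqrt(k_aa * k_bb)


# Hypothetical coordinates: a small cloud, a slightly perturbed copy,
# and a copy translated far away.
pocket = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]
nearby = [(0.1, 0.0, 0.0), (1.6, 0.0, 0.0), (0.1, 1.5, 0.0)]
distant = [(x + 100.0, y, z) for x, y, z in pocket]

print(cloud_similarity(pocket, nearby))
print(cloud_similarity(pocket, distant))
```

A kernel of this kind needs no one-to-one correspondence between atoms, so clouds of different sizes can be compared directly.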

    Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen

    The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca's large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and results of a DREAM Challenge to evaluate computational strategies for predicting synergistic drug pairs and biomarkers. 160 teams participated, providing comprehensive methodological development and benchmarking. Winning methods incorporate prior knowledge of drug-target interactions. Synergy is predicted with an accuracy matching biological replicates for >60% of combinations. However, 20% of drug combinations are poorly predicted by all methods. Genomic rationales for synergy predictions are identified, including ADAM17 inhibitor antagonism when combined with PIK3CB/D inhibition, in contrast to synergy when combined with other PI3K-pathway inhibitors in PIK3CA mutant cells.

    L'alignement de graphes : applications en bioinformatique et vision par ordinateur

    The graph matching problem is among the most important challenges of graph processing, and plays a central role in various fields of pattern recognition. We propose an approximate method for labeled weighted graph matching, based on a convex-concave programming approach which can be applied to the matching of large graphs. This method makes it easy to integrate information on graph label similarities into the optimization problem, and therefore to perform labeled weighted graph matching. One of the interesting applications of the graph matching problem is the alignment of protein-protein interaction (PPI) networks. This problem is important when investigating evolutionarily conserved pathways or protein complexes across species, and helps in the identification of functional orthologs through the detection of conserved interactions. We reformulate PPI alignment as a graph matching problem, and study how state-of-the-art graph matching algorithms can be used for this purpose. In the classical formulation of graph matching, only one-to-one correspondences are considered, which is not always appropriate. In many applications, it is more interesting to consider many-to-many correspondences between graph vertices. We propose a reformulation of the many-to-many graph matching problem as a discrete optimization problem and we propose an approximate algorithm based on a continuous relaxation. In this thesis, we also present two independent results in statistical machine translation and bioinformatics. We show how the phrase-based statistical machine translation decoding problem can be reformulated as a Traveling Salesman Problem. We also propose a new protein binding pocket similarity measure based on a comparison of 3D atom clouds.

    Many-to-Many Graph Matching: a Continuous Relaxation Approach

    Graphs provide an efficient tool for object representation in various computer vision applications. Once graph-based representations are constructed, an important question is how to compare graphs. This problem is often formulated as a graph matching problem where one seeks a mapping between vertices of two graphs which optimally aligns their structure. In the classical formulation of graph matching, only one-to-one correspondences between vertices are considered. However, in many applications, graphs cannot be matched perfectly and it is more interesting to consider many-to-many correspondences where clusters of vertices in one graph are matched to clusters of vertices in the other graph. In this paper, we formulate the many-to-many graph matching problem as a discrete optimization problem and propose an approximate algorithm based on a continuous relaxation of the combinatorial problem. We compare our method with other existing methods on several benchmark computer vision datasets.

    Phrase-Based Statistical Machine Translation as a Traveling Salesman Problem

    An efficient decoding algorithm is a crucial element of any statistical machine translation (SMT) system. Some researchers have noted certain similarities between SMT decoding and the famous Traveling Salesman Problem (TSP); in particular, Knight (1999) has shown that any TSP instance can be mapped to a sub-case of a word-based SMT model, demonstrating NP-hardness of the decoding task. In this paper, we focus on the reverse mapping, showing that any phrase-based SMT decoding problem can be directly reformulated as a TSP. The transformation is very natural, deepens our understanding of the decoding problem, and allows direct use of any of the powerful existing TSP solvers for SMT decoding. We test our approach on three datasets, and compare a TSP-based decoder to the popular beam-search algorithm. In all cases, our method provides competitive or better performance.