913 research outputs found

    Detection of large exact subgraph isomorphisms with a topology-only graphlet index built using deterministic walks

    We introduce the first algorithm to perform topology-only local graph matching (a.k.a. local network alignment or subgraph isomorphism): BLANT, for Basic Local Alignment of Network Topology. BLANT first creates a limited, high-specificity index of a single graph containing connected k-node induced subgraphs called k-graphlets, for k=6-15. The index is constructed in a deterministic way such that, if significant common network topology exists between two networks, their indexes are likely to overlap. This is the key insight that allows BLANT to discover alignments using only topological information. To align two networks, BLANT queries their respective indexes to form large, high-quality local alignments. BLANT is able to discover highly topologically similar alignments (S3 >= 0.95) of up to 150 node pairs for which up to 50% of node pairs differ from their "assigned" global counterpart. These results compare favorably against the baseline, a state-of-the-art local alignment algorithm which was adapted to be topology-only. Such alignments are 3x larger and differ 30% more (additive) from the global alignment than alignments of similar topological similarity (S3 >= 0.95) discovered by the baseline. We hope that such regions of high local similarity and low global similarity may provide complementary insights to global alignment algorithms.
    Comment: 13 pages, 11 figures, 4 tables
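    The abstract above reports alignment quality as S3 (the symmetric substructure score). As a minimal sketch, assuming the usual definition of S3 as conserved edges divided by the union of edges induced on the aligned nodes in both graphs, the score of a local alignment could be computed as follows. The function name `s3_score` and the edge-list input format are illustrative assumptions and are not taken from BLANT itself.

```python
from itertools import combinations


def s3_score(edges1, edges2, alignment):
    """Symmetric substructure score of an alignment.

    edges1, edges2: iterables of undirected edges (u, v) of the two graphs.
    alignment: dict mapping nodes of graph 1 to nodes of graph 2,
               assumed to be injective (hypothetical input format).
    """
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    conserved = induced1 = induced2 = 0
    # Examine every pair of aligned nodes once.
    for u, v in combinations(list(alignment), 2):
        in1 = frozenset((u, v)) in e1
        in2 = frozenset((alignment[u], alignment[v])) in e2
        induced1 += in1            # edge in the aligned part of graph 1
        induced2 += in2            # edge in the aligned part of graph 2
        conserved += in1 and in2   # edge present on both sides
    denom = induced1 + induced2 - conserved
    return conserved / denom if denom else 0.0


# A triangle aligned to a three-node path: 2 conserved edges out of 3 in the union.
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
path = [("x", "y"), ("y", "z")]
mapping = {"a": "x", "b": "y", "c": "z"}
print(s3_score(triangle, path, mapping))
```

    Because the denominator counts edges from both sides, the score penalizes both missing and extra edges, so a threshold such as S3 >= 0.95 requires the two aligned regions to be close to isomorphic.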

    Revealing divergent evolution, identifying circular permutations and detecting active-sites by protein structure comparison

    BACKGROUND: Protein structure comparison is one of the most important problems in computational biology and plays a key role in protein structure prediction, fold family classification, motif finding, phylogenetic tree reconstruction and protein docking. RESULTS: We propose a novel method to compare the protein structures in an accurate and efficient manner. Such a method can be used to not only reveal divergent evolution, but also identify circular permutations and further detect active-sites. Specifically, we define the structure alignment as a multi-objective optimization problem, i.e., maximizing the number of aligned atoms and minimizing their root mean square distance. By controlling a single distance-related parameter, theoretically we can obtain a variety of optimal alignments corresponding to different optimal matching patterns, i.e., from a large matching portion to a small matching portion. The number of variables in our algorithm increases with the number of atoms of protein pairs in almost a linear manner. In addition to solid theoretical background, numerical experiments demonstrated significant improvement of our approach over the existing methods in terms of quality and efficiency. In particular, we show that divergent evolution, circular permutations and active-sites (or structural motifs) can be identified by our method. The software SAMO is available upon request from the authors, or from and . CONCLUSION: A novel formulation is proposed to accurately align protein structures in the framework of multi-objective optimization, based on a sequence order-independent strategy. A fast and accurate algorithm based on the bipartite matching algorithm is developed by exploiting the special features. Convergence of computation is shown in experiments and is also theoretically proven

    AntNetAlign: Ant colony optimization for network alignment

    The code (and data) in this article has been certified as Reproducible by Code Ocean (https://codeocean.com/). More information on the Reproducibility Badge Initiative is available at https://www.elsevier.com/physical-sciences-andengineering/computer-science/journals
    Network Alignment (NA) is a hard optimization problem with important applications such as, for example, the identification of orthologous relationships between different proteins and of phylogenetic relationships between species. Given two (or more) networks, the goal is to find an alignment between them, that is, a mapping between their respective nodes such that the topological and functional structure is well preserved. Although the problem has received great interest in recent years, there is still a need to unify the different trends that have emerged from diverse research areas. In this paper, we introduce AntNetAlign, an Ant Colony Optimization (ACO) approach for solving the problem. The proposed approach makes use of similarity information extracted from the input networks to guide the construction process. Combined with an improvement measure that depends on the current construction state, it is able to optimize any of the three main topological quality measures. We provide an extensive experimental evaluation using real-world instances that range from Protein–Protein Interaction (PPI) networks to social networks. Results show that our method outperforms other state-of-the-art approaches in two out of three of the tested scores within a reasonable amount of time, especially in the important score. Moreover, it is able to obtain near-optimal results when aligning networks with themselves. Furthermore, in larger instances, our algorithm was still able to compete with the best performing method in this regard.
    Christian Blum and Guillem Rodríguez Corominas, Spain were supported by grants PID2019-104156GB-I00 and TED2021-129319B-I00 funded by MCIN/AEI/10.13039/501100011033. Maria J. Blesa acknowledges support from AEI, Spain under grant PID2020-112581GB-C21 (MOTION) and the Catalan Agency for Management of University and Research Grants (AGAUR), Spain under grant 2017-SGR-786 (ALBCOM).
    Peer Reviewed. Postprint (published version)

    BioSuite: a comprehensive bioinformatics software package (A unique industry-academia collaboration)

    This article does not have an abstract

    Protein multiple sequence alignment by hybrid bio-inspired algorithms

    This article presents an immune-inspired algorithm to tackle the Multiple Sequence Alignment (MSA) problem. MSA is one of the most important tasks in biological sequence analysis. Although this paper focuses on protein alignments, most of the discussion and methodology may also be applied to DNA alignments. The problem of finding the multiple alignment was investigated in the studies by Bonizzoni and Vedova and by Wang and Jiang, and was proved to be an NP-hard (non-deterministic polynomial-time hard) problem. The presented algorithm, called Immunological Multiple Sequence Alignment Algorithm (IMSA), incorporates two new strategies to create the initial population, together with specific ad hoc mutation operators. It uses the 'weighted sum of pairs' as the objective function to evaluate a given candidate alignment. IMSA was tested using the classical BAliBASE benchmarks (versions 1.0, 2.0 and 3.0), and experimental results indicate that it is comparable with state-of-the-art multiple alignment algorithms in terms of quality of alignments, weighted Sum-of-Pairs (SP) and Column Score (CS) values. The main novelty of IMSA is its ability to generate more than a single suboptimal alignment for every MSA instance; this behaviour is due to the stochastic nature of the algorithm and of the populations evolved during the convergence process. This feature will help the decision maker to assess and select a biologically relevant multiple sequence alignment. Finally, the designed algorithm can be used as a local search procedure to properly explore promising alignments of the search space.
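    The 'weighted sum of pairs' objective mentioned above can be illustrated with a small sketch. The match, mismatch and gap values below, the per-pair `weights` dictionary and the function names are simplifying assumptions chosen for illustration; a protein aligner such as IMSA would typically score residue pairs with a substitution matrix and a more elaborate gap model, and the abstract does not specify these details.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one column for one pair of sequences (toy scoring scheme)."""
    if a == "-" and b == "-":
        return 0          # a gap aligned to a gap is ignored
    if a == "-" or b == "-":
        return gap        # a residue aligned to a gap
    return match if a == b else mismatch


def weighted_sum_of_pairs(alignment, weights=None):
    """Sum the pairwise scores over all sequence pairs, scaled by pair weights.

    alignment: list of equal-length gapped strings.
    weights: optional dict {(i, j): weight} with i < j; defaults to 1.0.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all rows of an alignment must have equal length")
    total = 0.0
    for i, j in combinations(range(len(alignment)), 2):
        w = 1.0 if weights is None else weights[(i, j)]
        total += w * sum(pair_score(a, b)
                         for a, b in zip(alignment[i], alignment[j]))
    return total


msa = ["AC-T", "ACGT", "A-GT"]
print(weighted_sum_of_pairs(msa))
print(weighted_sum_of_pairs(msa, {(0, 1): 2.0, (0, 2): 1.0, (1, 2): 0.5}))
```

    The pair weights let closely related sequences contribute less than distant ones, so a cluster of near-duplicates does not dominate the objective.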

    Filtered Distance Matrix For Constructing High-Throughput Multiple Sequence Alignment On Protein Data

    Multiple sequence alignment (MSA) is a significant process in computational biology and bioinformatics. Optimal MSA is an NP-hard problem, while building an optimal alignment using dynamic programming is an NP-complete problem. Although numerous algorithms have been proposed for MSA, producing an efficient MSA with high accuracy remains a huge challenge.

    Parallel Exchange of Randomized SubGraphs for Optimization of Network Alignment: PERSONA

    The aim of network alignment in Protein-Protein Interaction networks is to discover functionally similar regions between the compared organisms. A major difficulty in solving a network alignment problem is the trade-off among multiple similarity objectives when applying an alignment strategy. An alignment that favors certain objectives over others may lose its biological relevance, because the unfavored objectives are relevant as well. One possible solution to this issue is to blend the stronger aspects of various alignment strategies until mature solutions are achieved. This study proposes a parallel approach called PERSONA that allows aligners to share their partial solutions continuously while they progress. All these aligners pursue their particular heuristics as part of a particle swarm that searches for multi-objective solutions of the same alignment problem in a reactive actor environment. The actors take the stronger portion of a solution that they receive from leading or other actors as a subgraph, and send their own stronger subgraphs back after evaluating those partial solutions. Moreover, the individual heuristics of each actor take randomized parameter values at each cycle of parallel execution so that the problem search space can be investigated thoroughly. The results achieved with PERSONA are remarkably optimized and balanced for both topological and node similarity objectives.