50 research outputs found

    Algorithms for the minimum sum coloring problem: a review

    The Minimum Sum Coloring Problem (MSCP) is a variant of the well-known vertex coloring problem which has a number of AI-related applications. Due to its theoretical and practical relevance, MSCP attracts increasing attention. The only existing review on the problem dates back to 2004 and mainly covers the history of MSCP and theoretical developments on specific graphs. In recent years, the field has witnessed significant progress on approximation algorithms and practical solution algorithms. The purpose of this review is to provide a comprehensive inspection of the most recent and representative MSCP algorithms. To be informative, we identify the general framework followed by practical solution algorithms and the key ingredients that make them successful. By classifying the main search strategies and putting forward the critical elements of the reviewed methods, we wish to encourage future development of more powerful methods and motivate new applications.
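
For orientation, here is a minimal sketch of the objective that MSCP minimizes, assuming the usual definition: colors are positive integers, adjacent vertices must receive different colors, and the cost is the sum of all assigned colors. The greedy heuristic and the function names (`greedy_sum_coloring`, `coloring_sum`, `is_proper`) are illustrative only and are not taken from the reviewed algorithms.

```python
from typing import Dict, Hashable, List

Graph = Dict[Hashable, List[Hashable]]  # adjacency lists of an undirected graph


def greedy_sum_coloring(adj: Graph) -> Dict[Hashable, int]:
    """Color vertices in order of decreasing degree with the smallest free color.

    This is a simple baseline heuristic (an assumption for illustration),
    not one of the methods covered by the review.
    """
    order = sorted(adj, key=lambda v: -len(adj[v]))
    color: Dict[Hashable, int] = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color


def coloring_sum(color: Dict[Hashable, int]) -> int:
    """MSCP objective: the sum of the colors over all vertices."""
    return sum(color.values())


def is_proper(adj: Graph, color: Dict[Hashable, int]) -> bool:
    """True if no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u in adj for v in adj[u])


# Star graph: center 0 joined to leaves 1..4.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
greedy = greedy_sum_coloring(star)
print(coloring_sum(greedy))   # 9: center gets color 1, each leaf gets color 2

better = {0: 2, 1: 1, 2: 1, 3: 1, 4: 1}
print(is_proper(star, better), coloring_sum(better))  # True 6
```

The star example shows why the problem differs from plain vertex coloring: both colorings use two colors, but giving the cheap color to the larger color class lowers the sum.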

    Active module identification in intracellular networks using a memetic algorithm with a new binary decoding scheme

    BACKGROUND: Active modules are connected regions of a biological network that show significant changes in expression under particular conditions. Identifying such modules is important because it may reveal the regulatory and signaling mechanisms associated with a given cellular response. RESULTS: In this paper, we propose a novel active module identification algorithm based on a memetic algorithm. We propose a novel encoding/decoding scheme to ensure the connectedness of the identified active modules. Based on this scheme, we also design a local search operator and incorporate it into the memetic algorithm to improve its performance. CONCLUSION: The effectiveness of the proposed algorithm is validated on both small and large protein interaction networks.

    High-Quality Hypergraph Partitioning

    This dissertation focuses on computing high-quality solutions for the NP-hard balanced hypergraph partitioning problem: Given a hypergraph and an integer k, partition its vertex set into k disjoint blocks of bounded size, while minimizing an objective function over the hyperedges. Here, we consider the two most commonly used objectives: the cut-net metric and the connectivity metric. Since the problem is computationally intractable, heuristics are used in practice, the most prominent being the three-phase multi-level paradigm: During coarsening, the hypergraph is successively contracted to obtain a hierarchy of smaller instances. After applying an initial partitioning algorithm to the smallest hypergraph, contraction is undone and, at each level, refinement algorithms try to improve the current solution. With this work, we give a brief overview of the field and present several algorithmic improvements to the multi-level paradigm. Instead of using a logarithmic number of levels like traditional algorithms, we present two coarsening algorithms that create a hierarchy of (nearly) n levels, where n is the number of vertices. This makes consecutive levels as similar as possible and provides many opportunities for refinement algorithms to improve the partition. This approach is made feasible in practice by tailoring all algorithms and data structures to the n-level paradigm, and developing lazy-evaluation techniques, caching mechanisms and early stopping criteria to speed up the partitioning process. Furthermore, we propose a sparsification algorithm based on locality-sensitive hashing that improves the running time for hypergraphs with large hyperedges, and show that incorporating global information about the community structure into the coarsening process improves quality. Moreover, we present a portfolio-based initial partitioning approach, and propose three refinement algorithms. Two are based on the Fiduccia-Mattheyses (FM) heuristic, but perform a highly localized search at each level. While one is designed for two-way partitioning, the other is the first FM-style algorithm that can be efficiently employed in the multi-level setting to directly improve k-way partitions. The third algorithm uses max-flow computations on pairs of blocks to refine k-way partitions. Finally, we present the first memetic multi-level hypergraph partitioning algorithm for an extensive exploration of the global solution space. All contributions are made available through our open-source framework KaHyPar. In a comprehensive experimental study, we compare KaHyPar with hMETIS, PaToH, Mondriaan, Zoltan-AlgD, and HYPE on a wide range of hypergraphs from several application areas. Our results indicate that KaHyPar, already without the memetic component, computes better solutions than all competing algorithms for both the cut-net and the connectivity metric, while being faster than Zoltan-AlgD and as fast as hMETIS. Moreover, KaHyPar compares favorably with the current best graph partitioning system KaFFPa, both in terms of solution quality and running time.
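
The two objectives named in this abstract can be made concrete with a short sketch. It assumes the standard definitions: the cut-net metric sums the weights of hyperedges that span more than one block, and the connectivity metric sums, for each hyperedge, its weight times (number of blocks it touches minus one). The function names and data layout below are illustrative assumptions and are unrelated to KaHyPar's actual API.

```python
from typing import Dict, Hashable, List, Optional, Sequence

Hyperedge = Sequence[Hashable]


def blocks_touched(edge: Hyperedge, block_of: Dict[Hashable, int]) -> int:
    """Number of distinct blocks containing at least one pin of the hyperedge."""
    return len({block_of[v] for v in edge})


def cut_net(edges: List[Hyperedge], block_of: Dict[Hashable, int],
            weights: Optional[List[int]] = None) -> int:
    """Total weight of hyperedges that connect more than one block."""
    w = weights if weights is not None else [1] * len(edges)
    return sum(w[i] for i, e in enumerate(edges)
               if blocks_touched(e, block_of) > 1)


def connectivity(edges: List[Hyperedge], block_of: Dict[Hashable, int],
                 weights: Optional[List[int]] = None) -> int:
    """Sum over hyperedges of weight * (blocks touched - 1)."""
    w = weights if weights is not None else [1] * len(edges)
    return sum(w[i] * (blocks_touched(e, block_of) - 1)
               for i, e in enumerate(edges))


# Six vertices split into k = 3 blocks; unit hyperedge weights.
edges = [[0, 1, 2], [2, 3], [0, 3, 5], [4, 5]]
block_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}

print(cut_net(edges, block_of))       # 3: all but the first hyperedge are cut
print(connectivity(edges, block_of))  # 4: hyperedge [0, 3, 5] touches 3 blocks
```

The two metrics coincide for two-way partitions and diverge only when some hyperedge spans three or more blocks, as the third hyperedge does here.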

    Optimization Methods for Cluster Analysis in Network-based Data Mining

    This dissertation focuses on two optimization problems that arise in network-based data mining, concerning identification of basic community structures (clusters) in graphs: the maximum edge weight clique and maximum induced cluster subgraph problems. We propose a continuous quadratic formulation for the maximum edge weight clique problem, and establish the correspondence between its local optima and maximal cliques in the graph. Subsequently, we present a combinatorial branch-and-bound algorithm for this problem that takes advantage of a polynomial-time solvable nonconvex relaxation of the proposed formulation. We also introduce a linear-time-computable analytic upper bound on the clique number of a graph, as well as a new method of upper-bounding the maximum edge weight clique problem, which leads to another exact algorithm for this problem. For the maximum induced cluster subgraph problem, we present the results of a comprehensive polyhedral analysis. We derive several families of facet-defining valid inequalities for the IUC polytope associated with a graph. We also provide a complete description of this polytope for some special classes of graphs. We establish the computational complexity of the separation problems for most of the considered families of valid inequalities, and explore the effectiveness of employing the corresponding cutting planes in an integer (linear) programming framework for the maximum induced cluster subgraph problem.

    Active modules identification in multilayer intracellular networks

    Network analysis has become a basic tool for gaining insight into the evolution and organization of living organisms in computational systems biology. Since a group of genes may take part in a biological process together rather than act alone, identifying modules in biological networks has been a central challenge in this field over the past decade. Several representative methods have been proposed to search for such modules using different intuitions, but no unified framework exists yet, especially for multilayer networks, which can model gene expression dynamics and species conservation. This thesis provides a comprehensive study of active module identification in multilayer intracellular networks, with the following main contributions:
    - An improvement on a heuristic method for identifying active modules from protein-protein interaction (PPI) networks.
    - A new objective for active modules that incorporates the topological structure and active property of single-layer and multilayer dynamic PPI networks, and a convex optimization algorithm to solve it.
    - A new definition of active modules in single-layer and multilayer gene co-expression networks, and a novel algorithm that achieves state-of-the-art performance.
    - A framework for comparing networks via module differentiation analysis, which can find condition-specific modules as well as conserved modules.

    Scalable Graph Algorithms using Practically Efficient Data Reductions


    Vehicle routing problems and an industrial application for reducing the ecological footprint

    In this thesis, we focused on the development of heuristic approaches for solving vehicle routing problems. We exploited research conducted on interval graphs and dominance properties of saturated tours to deal more efficiently with selective vehicle routing problems. Approaches based on a particle swarm optimization algorithm and a memetic algorithm are proposed. The metaheuristics that we developed are based on effective techniques such as optimal split, genetic crossover operators and local searches. We are also interested in classical vehicle routing problems with time windows. Various pre-processing methods are introduced to obtain lower bounds on the number of vehicles. These methods draw on graph models, scheduling problems and bin packing problems with conflicts. We also showed the usefulness of the developed methods in an industrial context by implementing a portal of mobility services.

    SANA: simulated annealing far outperforms many other search algorithms for biological network alignment

    Summary: Every alignment algorithm consists of two orthogonal components: an objective function M measuring the quality of an alignment, and a search algorithm that explores the space of alignments looking for ones scoring well according to M. We introduce a new search algorithm called SANA (Simulated Annealing Network Aligner) and apply it to protein-protein interaction networks using S3 as the topological measure. Compared against 12 recent algorithms, SANA produces 5-10 times as many correct node pairings as the others when the correct answer is known. We expose an anti-correlation in many existing aligners between their ability to produce good topological vs. functional similarity scores, whereas SANA usually outscores other methods in both measures. If given the perfect objective function encoding the identity mapping, SANA quickly converges to the perfect solution while many other algorithms falter. We observe that when aligning networks with a known mapping and optimizing only S3, SANA creates alignments that are not perfect and yet whose S3 scores match that of the perfect alignment. We call this phenomenon saturation of the topological score. Saturation implies that a measure's correlation with alignment correctness falters before the perfect alignment is reached. This, combined with SANA's ability to produce the perfect alignment if given the perfect objective function, suggests that better objective functions may lead to dramatically better alignments. We conclude that future work should focus on finding better objective functions, and offer SANA as the search algorithm of choice. Availability and implementation: Software available at http://sana.ics.uci.edu. Contact: [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.
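
The split between objective function and search algorithm described in this abstract can be sketched in a few lines. The sketch assumes the commonly cited definition of the S3 (symmetric substructure) score: aligned edges divided by (edges of the first network + edges of the second network induced on the aligned nodes - aligned edges). The annealing loop is a generic textbook version with a hypothetical move set and cooling schedule; it is not SANA's implementation, and all names and parameters are illustrative.

```python
import math
import random
from typing import Dict, Hashable, List, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]


def s3_score(edges1: Sequence[Edge], edges2: Sequence[Edge],
             mapping: Dict[Hashable, Hashable]) -> float:
    """Assumed S3 definition: aligned / (|E1| + |induced E2| - aligned)."""
    e2 = {frozenset(e) for e in edges2}
    aligned = sum(1 for u, v in edges1
                  if frozenset((mapping[u], mapping[v])) in e2)
    image = set(mapping.values())
    induced = sum(1 for e in e2 if e <= image)
    denom = len(edges1) + induced - aligned
    return aligned / denom if denom else 0.0


def anneal(nodes1: Sequence[Hashable], nodes2: Sequence[Hashable],
           edges1: Sequence[Edge], edges2: Sequence[Edge],
           steps: int = 2000, t_start: float = 1.0, t_end: float = 1e-3,
           seed: int = 0) -> Tuple[Dict[Hashable, Hashable], float]:
    """Generic simulated annealing over injective maps nodes1 -> nodes2.

    Assumes len(nodes1) <= len(nodes2). The objective is recomputed from
    scratch at every step, which is simple but slow.
    """
    rng = random.Random(seed)
    nodes1 = list(nodes1)
    pool: List[Hashable] = list(nodes2)
    rng.shuffle(pool)
    mapping = dict(zip(nodes1, pool))       # random initial alignment
    unused = pool[len(nodes1):]             # nodes2 entries not yet used
    score = s3_score(edges1, edges2, mapping)
    best, best_score = dict(mapping), score

    for step in range(steps):
        # Geometric cooling from t_start towards t_end.
        t = t_start * (t_end / t_start) ** (step / steps)
        u = rng.choice(nodes1)
        if unused and rng.random() < 0.5:
            # Move 1: remap u to a currently unused node of the second network.
            i = rng.randrange(len(unused))
            mapping[u], unused[i] = unused[i], mapping[u]
            move = ("change", u, i)
        else:
            # Move 2: swap the images of two nodes of the first network.
            v = rng.choice(nodes1)
            mapping[u], mapping[v] = mapping[v], mapping[u]
            move = ("swap", u, v)

        new_score = s3_score(edges1, edges2, mapping)
        accept = (new_score >= score
                  or rng.random() < math.exp((new_score - score) / t))
        if accept:
            score = new_score
            if score > best_score:
                best, best_score = dict(mapping), score
        elif move[0] == "change":
            _, a, i = move                  # applying the exchange again undoes it
            mapping[a], unused[i] = unused[i], mapping[a]
        else:
            _, a, b = move
            mapping[a], mapping[b] = mapping[b], mapping[a]
    return best, best_score


# A 3-node path aligned into a triangle with a pendant node.
path = [(0, 1), (1, 2)]
target = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(s3_score(path, target, {0: "a", 1: "b", 2: "c"}))  # 2/3: extra induced edge a-c
print(s3_score(path, target, {0: "b", 1: "c", 2: "d"}))  # 1.0
```

Because `anneal` only calls `s3_score`, swapping in a different objective function leaves the search untouched, which is the orthogonality the abstract emphasizes. A hypothetical call would be `anneal([0, 1, 2], ["a", "b", "c", "d"], path, target)`.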