    Scalable Community Detection

    A survey of parameterized algorithms and the complexity of edge modification

    The survey is a comprehensive overview of the developing area of parameterized algorithms for graph modification problems. It describes the state of the art in kernelization, subexponential algorithms, and parameterized complexity of graph modification. The main focus is on edge modification problems, where the task is to change some adjacencies in a graph to satisfy some required properties. To facilitate further research, we list many open problems in the area.
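
    As an illustration of what an edge modification problem looks like, the following is a minimal sketch of Cluster Editing: find the fewest edge insertions and deletions that turn a graph into a disjoint union of cliques. The choice of this particular problem and the function names `is_cluster_graph` and `min_cluster_edits` are assumptions made for the example, not taken from the survey. The search is plain brute force over edit sets and is only meant for tiny graphs; it is not one of the parameterized algorithms the survey covers.

```python
from itertools import combinations


def is_cluster_graph(n, edges):
    """Return True if the graph on vertices 0..n-1 is a disjoint union of cliques.

    A graph is a cluster graph exactly when it has no induced path on three
    vertices, i.e. any two neighbours of a vertex are themselves adjacent.
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for v in range(n):
        for a, b in combinations(sorted(adj[v]), 2):
            if b not in adj[a]:
                return False
    return True


def min_cluster_edits(n, edges):
    """Brute force: try edit sets of growing size k until a cluster graph appears.

    Returns (k, list of flipped vertex pairs). Exponential time; illustration only.
    """
    edge_set = {frozenset(e) for e in edges}
    pairs = [frozenset(p) for p in combinations(range(n), 2)]
    for k in range(len(pairs) + 1):
        for flips in combinations(pairs, k):
            # Flipping a pair deletes it if it is an edge and inserts it otherwise.
            edited = edge_set.symmetric_difference(flips)
            if is_cluster_graph(n, [tuple(e) for e in edited]):
                return k, [tuple(sorted(p)) for p in flips]


# A path on four vertices becomes two disjoint edges after one deletion.
print(min_cluster_edits(4, [(0, 1), (1, 2), (2, 3)]))
```

    The number of edits `k` plays the role of the parameter in the parameterized setting: practical algorithms aim for running times that are exponential only in `k`, not in the size of the graph.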

    Bi-(N-) cluster editing and its biomedical applications

    Extremely fast advances in wet-lab techniques have led to an exponential growth of heterogeneous and unstructured biological data, posing a great challenge to data integration in today's systems biology. The traditional clustering approach, although widely used to divide data into groups sharing common features, is less powerful in the analysis of heterogeneous data from n different sources (n ≥ 2). The co-clustering approach has been widely used for combined analyses of multiple networks to address the challenge of heterogeneity. In this thesis, novel methods for the co-clustering of large-scale heterogeneous data sets are presented in the software package n-CluE: one exact algorithm and two heuristic algorithms based on the model of bi-/n-cluster editing, which model the input as n-partite graphs and solve the clustering problem with various strategies. In the first part of the thesis, the complexity and the fixed-parameter tractability of the extended bicluster editing model with relaxed constraints, namely the ?-bicluster editing model, are investigated, and its NP-hardness is proven. Based on the results of this analysis, three strategies within the n-CluE software package are then established and discussed, together with performance evaluations and systematic comparisons against other algorithms of the same type for solving the bi-/n-cluster editing problem. To demonstrate the practical impact, three real-world analyses using n-CluE are performed: (a) prediction of novel genotype-phenotype associations by clustering data from Genome-Wide Association Studies; (b) comparison between n-CluE and eight other biclustering tools on microarray data sets from the Gene Expression Omnibus (GEO); (c) drug repositioning predictions by co-clustering on drug, gene and disease networks.
The outstanding performance of n-CluE in the real-world applications shows its strength and flexibility in integrating heterogeneous data and extracting biologically relevant information in bioinformatic analyses.

    Enormous recent advances in laboratory techniques have led to an exponentially growing volume of heterogeneous and unstructured data. This poses a major challenge for systems biology research, in which these data are consolidated through data integration and data mining and analyzed in combination. Traditional clustering is a widely used method for grouping ("clustering") entities within large data sets by the similarity of certain attributes. When clustering heterogeneous data from n (n > 2) different sources, however, traditional clustering methods show weaknesses. In such cases, co-clustering methods offer the advantage that they can partition data sets simultaneously. In this dissertation I present new clustering methods, brought together in the n-CluE software. These new methods were developed from bi-/n-cluster editing and solve the underlying clustering problem with various strategies by transforming the input data sets into n-partite graphs. The dissertation is divided into two parts. The first part deals in depth with the complexity analysis of several extended bicluster editing models, the so-called ?-bicluster editing models, and proves their NP-hardness. Based on these theoretical considerations, in the second part I present three different algorithms, one exact algorithm and two heuristics, and demonstrate their performance and robustness in comparison with other algorithmic approaches.
    The strengths of n-CluE are substantiated by three real-world applications: (a) the prediction of novel genotype-phenotype associations through biclustering analysis of data from genome-wide association studies (GWAS); (b) the comparison between n-CluE and eight other software packages based on bicluster analyses of microarray data from the Gene Expression Omnibus (GEO); (c) the prediction of drug repositioning through integrated analysis of drug, gene and disease networks. The results impressively demonstrate the strengths of the n-CluE software. The outcome is a powerful, robust and flexibly extensible implementation of the biclustering theorem for integrating large heterogeneous data sets to extract biologically relevant results in bioinformatics studies.

    MATHEMATICAL PROGRAMMING ALGORITHMS FOR NETWORK OPTIMIZATION PROBLEMS

    In this thesis we consider combinatorial optimization problems that are defined by means of networks. These problems arise when we need to make effective decisions to build or manage network structures, satisfying the design constraints while minimizing the costs. We focus on the following four problems:

    - The Multicast Routing and Wavelength Assignment with Delay Constraint in WDM networks with heterogeneous capabilities (MRWADC) problem: this problem arises in the telecommunications industry and requires defining an efficient way to make multicast transmissions on a WDM optical network. In more formal terms, to solve the MRWADC problem we need to identify, in a given directed graph that models the WDM optical network, a set of arborescences that connect the source of the transmission to all its destinations. These arborescences need to satisfy several quality-of-service constraints and need to take into account the heterogeneity of the electronic devices belonging to the WDM network.

    - The Homogeneous Area Problem (HAP): this problem arises from a particular requirement of an intermediate level of the Italian government called the province. Each province needs to coordinate the common activities of the towns that belong to its territory. To perform its coordination role in practice, the province of Milan created a customer care layer composed of a number of employees whose task is to support the towns of the province in their administrative work. For the sake of efficiency, the employees of this customer care layer have been partitioned into small groups, and each group is assigned to a particular subset of towns that have a large number of activities in common. The HAP requires identifying the set of towns assigned to each group so as to minimize the redundancies generated by towns that, despite having some activities in common, have been assigned to different groups. Since, for both historical and practical reasons, the towns in a particular subset need to be adjacent, the HAP can be effectively modeled as a particular graph partitioning problem that requires the connectivity of the obtained subgraphs and the satisfaction of nonlinear knapsack constraints.

    - The Knapsack Prize Collecting Steiner Tree Problem (KPCSTP): to implement a Column Generation algorithm for the MRWADC problem and for the HAP, we also need to solve the two corresponding pricing problems. These two problems are very similar: both require finding an arborescence, contained in a given directed weighted graph, that minimizes the difference between its cost and the prizes associated with the spanned nodes. The two problems differ in the side constraints that their feasible solutions need to satisfy and in the way in which the cost of an arborescence is defined. The ILP formulations and the resolution methods that we developed to tackle these two problems have many characteristics in common with the ones used to solve other similar problems. To exemplify these similarities and to summarize and extend the techniques that we developed for the MRWADC problem and for the HAP, we also considered the KPCSTP. This problem requires finding a tree that minimizes the difference between the cost of the used arcs and the profits of the spanned nodes. However, not all trees are feasible: the sum of the weights of the nodes spanned by a feasible tree cannot exceed a given weight threshold. In the thesis we propose a computational comparison among several optimization methods for the KPCSTP that have either already been proposed in the literature or been obtained by modifying our ILP formulations for the two previous pricing problems.

    - The Train Design Optimization (TDO) problem: this problem was the topic of the second problem solving competition, sponsored in 2011 by the Railway Application Section (RAS) of the Institute for Operations Research and the Management Sciences (INFORMS). We participated in the contest and won the second prize. After the competition, we continued to work on the TDO problem, and in the thesis we describe the improved method that we obtained at the end of this work. The TDO problem arises in the freight railroad industry. Typically, a freight railroad company receives requests from customers to transport a set of railcars from an origin rail yard to a destination rail yard. To satisfy these requests, the company first aggregates the railcars having the same origin and the same destination into larger blocks, and then defines a trip plan to transport the obtained blocks to their correct destinations. The TDO problem requires identifying a trip plan that efficiently uses the limited resources of the rail company under consideration. More formally, given a railway network, a set of blocks and the segments of the network in which a crew can legally drive a train, the TDO problem requires defining a set of trains and the way in which the given blocks can be transported to their destinations by these trains, satisfying operational constraints while minimizing the transportation costs.
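
    The KPCSTP objective and knapsack constraint described in this abstract can be made concrete with a small evaluator. This is a minimal sketch under stated assumptions: the tree is given as a list of undirected edges, the dictionaries `arc_cost`, `prize` and `weight` and the function name `kpcstp_objective` are hypothetical, and the code only checks and scores a candidate tree. It does not implement the thesis's ILP formulations or Column Generation.

```python
def kpcstp_objective(arcs, arc_cost, prize, weight, capacity):
    """Score a candidate tree for the KPCSTP as described in the abstract.

    arcs      -- list of (u, v) pairs, assumed here to form an undirected tree
    arc_cost  -- dict mapping each (u, v) pair in arcs to its cost
    prize     -- dict mapping each node to its profit
    weight    -- dict mapping each node to its knapsack weight
    capacity  -- weight threshold that the spanned nodes must not exceed

    Returns (total arc cost - total prize), or None if the arcs do not form
    a tree or the spanned nodes exceed the weight threshold.
    """
    nodes = {v for arc in arcs for v in arc}
    # A tree on |nodes| vertices has exactly |nodes| - 1 edges.
    if len(arcs) != len(nodes) - 1:
        return None

    # Union-find cycle check; with the right edge count, acyclic means connected.
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in arcs:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return None
        parent[root_u] = root_v

    # Knapsack side constraint on the spanned nodes.
    if sum(weight[v] for v in nodes) > capacity:
        return None

    return sum(arc_cost[a] for a in arcs) - sum(prize[v] for v in nodes)


cost = {(0, 1): 2, (1, 2): 3}
prize = {0: 1, 1: 4, 2: 2}
weight = {0: 1, 1: 1, 2: 1}
print(kpcstp_objective([(0, 1), (1, 2)], cost, prize, weight, capacity=3))
```

    A negative value means the collected prizes outweigh the arc costs. In a Column Generation setting a pricing problem of this shape is solved repeatedly, but how the prizes are derived there is not specified in the abstract.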

    Solving the List Coloring Problem through a Branch-and-Price algorithm

    In this work, we present a branch-and-price algorithm to solve the weighted version of the List Coloring Problem, based on a vertex cover formulation by stable sets. This problem is interesting for its applications and also for the many other problems that it generalizes, including the well-known Graph Coloring Problem. With the introduction of the concept of indistinguishable colors, some theoretical results are presented which are later incorporated into the algorithm. We propose two branching strategies based on others for the Graph Coloring Problem: the first is an adaptation of the one used by Mehrotra and Trick in their pioneering branch-and-price algorithm, and the other is inspired by the one used by Méndez-Díaz and Zabala in their branch-and-cut algorithm. The rich structure of this problem makes both branching strategies robust. Extensive computational experiments on a wide variety of instances show the effectiveness of this approach and reveal the different behaviors that the algorithm can exhibit according to the structure of each type of instance.
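
    To make the underlying problem concrete, here is a minimal sketch of the unweighted List Coloring feasibility question: every vertex has its own list of allowed colors, and adjacent vertices must receive different colors. The function name `list_coloring` and the input layout are assumptions for this example. The code is a plain backtracking search, not the branch-and-price algorithm of the paper, and it ignores weights.

```python
def list_coloring(adj, lists):
    """Find a proper coloring that respects per-vertex color lists.

    adj   -- dict mapping each vertex to an iterable of its neighbours
    lists -- dict mapping each vertex to its list of allowed colors
             (colors are assumed not to be None)

    Returns a dict vertex -> color, or None if no such coloring exists.
    """
    # Color the most constrained vertices (shortest lists) first.
    order = sorted(adj, key=lambda v: len(lists[v]))
    coloring = {}

    def extend(i):
        if i == len(order):
            return True
        v = order[i]
        for color in lists[v]:
            # A color is usable if no already-colored neighbour has it.
            if all(coloring.get(u) != color for u in adj[v]):
                coloring[v] = color
                if extend(i + 1):
                    return True
                del coloring[v]
        return False

    return dict(coloring) if extend(0) else None


triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(list_coloring(triangle, {0: ["r", "g"], 1: ["r", "g"], 2: ["b"]}))
print(list_coloring(triangle, {0: ["r", "g"], 1: ["r", "g"], 2: ["r", "g"]}))
```

    Giving every vertex the same list of k colors recovers the ordinary graph k-coloring question, which is the sense in which List Coloring generalizes the Graph Coloring Problem.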

    Proceedings of the 8th Cologne-Twente Workshop on Graphs and Combinatorial Optimization

    The Cologne-Twente Workshop (CTW) on Graphs and Combinatorial Optimization started off as a series of workshops organized bi-annually by either Köln University or Twente University. As its importance grew over time, it re-centered its geographical focus by including northern Italy (CTW04 in Menaggio, on Lake Como, and CTW08 in Gargnano, on Lake Garda). This year, CTW (in its eighth edition) will be staged in France for the first time: more precisely in the heart of Paris, at the Conservatoire National d’Arts et Métiers (CNAM), between 2nd and 4th June 2009, by a mixed organizing committee with members from LIX, Ecole Polytechnique and CEDRIC, CNAM.

    Models and algorithms for decomposition problems

    This thesis deals with decomposition both as a solution method and as a problem in itself. A decomposition approach can be very effective for mathematical problems with a specific structure, in which the associated coefficient matrix is sparse and can be diagonalized in blocks. However, this kind of structure may not be evident from the most natural formulation of the problem. The coefficient matrix may therefore be preprocessed by solving a structure detection problem in order to understand whether a decomposition method can be applied successfully. To this end, the thesis deals with the k-Vertex Cut problem, that is, the problem of finding a minimum subset of nodes whose removal disconnects a graph into at least k components; it models relevant applications in matrix decomposition for solving systems of equations by parallel computing. The capacitated k-Vertex Separator problem, instead, asks for a subset of vertices of minimum cardinality whose deletion disconnects a given graph into at most k shores, where the size of each shore must not be larger than a given capacity value. This problem is also of great importance for matrix decomposition algorithms. The thesis also addresses the Chance-Constrained Mathematical Program, which represents a significant example in which decomposition techniques can be applied successfully. This is a class of stochastic optimization problems in which the feasible region depends on the realization of a random variable, and the solution must optimize a given objective function while belonging to the feasible region with a probability that must be above a given value. In this thesis, a decomposition approach for this problem is introduced. The thesis also addresses the Fractional Knapsack Problem with Penalties, a variant of the knapsack problem in which items can be split at the expense of a penalty depending on the fractional quantity.
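
    The k-Vertex Cut definition above can be illustrated with a minimal brute-force sketch: try vertex subsets of increasing size and stop at the first one whose removal leaves at least k connected components. The function names `count_components` and `min_k_vertex_cut` and the adjacency-dict input are assumptions for this example. The enumeration is exponential and only suitable for tiny graphs; it is not the method developed in the thesis.

```python
from itertools import combinations


def count_components(vertices, adj):
    """Count connected components of the subgraph induced by `vertices`."""
    seen = set()
    components = 0
    for start in vertices:
        if start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            for u in adj[v]:
                if u in vertices and u not in seen:
                    seen.add(u)
                    stack.append(u)
    return components


def min_k_vertex_cut(adj, k):
    """Return a smallest vertex set whose removal leaves at least k components.

    adj -- dict mapping each vertex to an iterable of its neighbours
    Returns None if no vertex subset achieves k components.
    """
    nodes = sorted(adj)
    for size in range(len(nodes) + 1):
        for cut in combinations(nodes, size):
            remaining = set(nodes) - set(cut)
            if count_components(remaining, adj) >= k:
                return set(cut)
    return None


# Path 0-1-2-3-4: removing vertices 1 and 3 leaves the three components {0}, {2}, {4}.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(min_k_vertex_cut(path, 3))
```

    In the matrix decomposition reading, the removed vertices correspond to the part that couples the blocks, and the remaining components correspond to blocks that can be processed independently.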

    Combinatorics, Probability and Computing

    One of the exciting phenomena in mathematics in recent years has been the widespread and surprisingly effective use of probabilistic methods in diverse areas. The probabilistic point of view has turned out to b