    Large induced subgraphs via triangulations and CMSO

    We obtain an algorithmic meta-theorem for the following optimization problem. Let \phi be a Counting Monadic Second Order Logic (CMSO) formula and t be an integer. For a given graph G, the task is to maximize |X| subject to the following: there is a set of vertices F of G, containing X, such that the subgraph G[F] induced by F is of treewidth at most t, and the structure (G[F],X) models \phi. Some special cases of this optimization problem are the following generic examples, each of which contains various problems as a special subcase: 1) "Maximum induced subgraph with at most l copies of cycles of length 0 modulo m", where for fixed nonnegative integers m and l, the task is to find a maximum induced subgraph of a given graph with at most l vertex-disjoint cycles of length 0 modulo m. 2) "Minimum \Gamma-deletion", where for a fixed finite set of graphs \Gamma containing a planar graph, the task is to find a maximum induced subgraph of a given graph containing no graph from \Gamma as a minor. 3) "Independent \Pi-packing", where for a fixed finite set of connected graphs \Pi, the task is to find an induced subgraph G[F] of a given graph G with the maximum number of connected components, such that each connected component of G[F] is isomorphic to some graph from \Pi. We give an algorithm solving the optimization problem on an n-vertex graph G in time O(#pmc n^{t+4} f(t,\phi)), where #pmc is the number of all potential maximal cliques in G and f is a function depending on t and \phi only. We also show how a similar running time can be obtained for the weighted version of the problem. Combined with known bounds on the number of potential maximal cliques, this implies that our optimization problem can be solved in time O(1.7347^n) for arbitrary graphs, and in polynomial time for graph classes with a polynomial number of minimal separators.
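
    The optimization problem stated in this abstract can be written compactly as follows. This is only a restatement of the abstract's own definition; the symbols V(G) for the vertex set and tw for treewidth are notational assumptions, not taken from the abstract.

```latex
\[
\begin{aligned}
\text{maximize}\quad   & |X| \\
\text{subject to}\quad & X \subseteq F \subseteq V(G), \\
                       & \operatorname{tw}(G[F]) \le t, \\
                       & (G[F], X) \models \phi .
\end{aligned}
\]
```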

    Pattern matching in compilers

    In this thesis we develop tools for effective and flexible pattern matching. We introduce a new pattern matching system called amethyst. Amethyst is not only a generator of parsers for programming languages, but can also serve as an alternative to tools for matching regular expressions. Our framework also produces dynamic parsers, intended for use in IDEs (accurate syntax highlighting and error detection on the fly). Amethyst offers pattern matching on general data structures. This makes it a useful tool for implementing compiler optimizations such as constant folding, instruction scheduling, and dataflow analysis in general. The parsers produced are essentially top-down parsers. Linear time complexity is obtained by introducing the novel notions of structured grammars and regularized regular expressions. Amethyst uses techniques known from compiler optimizations to produce effective parsers. Comment: master thesis

    Computational Molecular Biology

    Computational Biology is a fairly new subject that arose in response to the computational problems posed by the analysis and the processing of biomolecular sequence and structure data. The field was initiated in the late 60's and early 70's largely by pioneers working in the life sciences. Physicists and mathematicians entered the field in the 70's and 80's, while Computer Science became involved with the new biological problems in the late 1980's. Computational problems have gained further importance in molecular biology through the various genome projects which produce enormous amounts of data. For this bibliography we focus on those areas of computational molecular biology that involve discrete algorithms or discrete optimization. We thus neglect several other areas of computational molecular biology, like most of the literature on the protein folding problem, as well as databases for molecular and genetic data, and genetic mapping algorithms. Due to the availability of review papers and a bibliography this bibliography

    Sequential matching problem

    Abstract in English. We present the sequential matching problem (SMP) as the problem of finding maximal matchings in a sequence of bipartite graphs, with the goal of maximizing the number of edges common to each pair of consecutive matchings. One application of SMP is the problem of assigning workers to jobs in different time shifts with the goal of minimizing the total number of unnecessary switches between jobs. We analyze various algorithmic techniques for this NP-complete problem. We also analyze the Mixed Integer Programming (MIP) formulation, which has a huge number of variables, and its solution by the branch and price method, a column generation scheme combined with branch and bound that implicitly prices nonbasic variables to generate new columns. We then discuss special branching rules, pricing problems, implementation issues, and computational results. Finally, we analyze a simpler version of SMP with only two bipartite graphs, which is still NP-complete, and an algorithm to augment the common edges in the maximum matchings.
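
    To make the two-graph version of the problem concrete, here is a minimal brute-force sketch in Python. It is not the thesis's algorithm (the abstract describes branch and price); it simply enumerates maximum-cardinality matchings of two small bipartite graphs and picks the pair sharing the most edges. The function names and the (worker, job) edge encoding are assumptions made for illustration, and the enumeration is exponential, so it only suits toy instances.

```python
from itertools import combinations


def matchings(edges):
    """Yield every matching (set of pairwise vertex-disjoint edges).

    Edges are (worker, job) tuples; this encoding is an assumption.
    """
    edges = sorted(edges)
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            workers = {w for w, _ in subset}
            jobs = {j for _, j in subset}
            # A subset is a matching iff no worker or job repeats.
            if len(workers) == k and len(jobs) == k:
                yield frozenset(subset)


def maximum_matchings(edges):
    """Return all matchings of maximum cardinality (brute force)."""
    all_matchings = list(matchings(edges))
    best_size = max(len(m) for m in all_matchings)
    return [m for m in all_matchings if len(m) == best_size]


def best_pair(edges1, edges2):
    """Return (M1, M2, common) maximizing |M1 & M2| over maximum matchings."""
    best = None
    for m1 in maximum_matchings(edges1):
        for m2 in maximum_matchings(edges2):
            common = len(m1 & m2)
            if best is None or common > best[2]:
                best = (m1, m2, common)
    return best


# Two shifts, two workers (a, b), two jobs (1, 2).
shift1 = [("a", 1), ("a", 2), ("b", 1), ("b", 2)]
shift2 = [("a", 1), ("b", 1), ("b", 2)]
m1, m2, common = best_pair(shift1, shift2)
```

    In this toy instance the second shift has a single maximum matching, so the first shift's matching is chosen to agree with it and no worker switches jobs.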

    Microwave-Circuit Optimization with Parallel Enhanced Fast Messy Genetic Algorithm (pefmGA)

    Fast messy genetic optimization is found to be suitable for complex microwave circuit design. Computation speed is increased by using several ordinary computers connected to a network. Calculations run in the background, so the computers can be used for other purposes at the same time. Dynamic change of bounds, search space segmentation, and gradient incorporation have significantly improved the convergence rate. The new method found the global minimum in each run, while classic methods failed for some starting points.

    Incremental Processing and Optimization of Update Streams

    In recent years, we have seen an increasing number of applications in networking, sensor networks, cloud computing, and environmental monitoring, which monitor, plan, control, and make decisions over data streams from multiple sources. We are interested in extending traditional stream processing techniques to meet the new challenges of these applications. Generally, in order to support genuine continuous query optimization and processing over data streams, we need to systematically understand how to address incremental optimization and processing of update streams for a rich class of queries commonly used in the applications. Our general thesis is that efficient incremental processing and re-optimization of update streams can be achieved by various incremental view maintenance techniques if we cast the problems as incremental view maintenance problems over data streams. We focus on two challenges in the incremental processing of update streams that existing work on stream query processing does not address: incremental processing of transitive closure queries over data streams, and incremental re-optimization of queries. In addition to addressing these specific challenges, we also develop a working prototype system, Aspen, which serves as an end-to-end stream processing system and has been deployed as the foundation for a case study of our SmartCIS application. We validate our solutions both analytically and empirically on top of our prototype system Aspen, over a variety of benchmark workloads such as the TPC-H and LinearRoad benchmarks.
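
    As a rough illustration of what maintaining a transitive closure view incrementally means, the Python sketch below updates a materialized set of reachable pairs as edge insertions arrive, computing only the newly derived pairs instead of recomputing the closure. This is a textbook-style sketch, not Aspen's implementation; the class and method names are hypothetical, and it handles insertions only (deletions, which update streams also contain, need more machinery).

```python
class IncrementalClosure:
    """Insert-only incremental maintenance of a transitive closure view.

    Hypothetical illustration; not taken from the Aspen system.
    """

    def __init__(self):
        # Materialized view: pairs (a, b) such that a path a -> b exists.
        self.reach = set()

    def insert_edge(self, u, v):
        """Apply one edge insertion and return the delta of new pairs."""
        # Everything that already reaches u, plus u itself.
        sources = {a for (a, b) in self.reach if b == u} | {u}
        # Everything already reachable from v, plus v itself.
        targets = {b for (a, b) in self.reach if a == v} | {v}
        # The new edge connects every source to every target.
        delta = {(a, b) for a in sources for b in targets} - self.reach
        self.reach |= delta
        return delta


view = IncrementalClosure()
d1 = view.insert_edge(1, 2)
d2 = view.insert_edge(3, 4)
d3 = view.insert_edge(2, 3)  # joins the two existing paths
```

    Each insertion touches only the pairs that the new edge makes derivable, which is the basic idea behind treating the query as a view to be maintained rather than re-evaluated.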

    Integration of Alignment and Phylogeny in the Whole-Genome Era

    With the development of new sequencing techniques, whole genomes of many species have become available. This huge amount of data gives rise to new opportunities and challenges. These new sequences provide valuable information on relationships among species, e.g. genome recombination and conservation. One of the principal ways to investigate such information is multiple sequence alignment (MSA). Currently, there is a large amount of MSA data on the internet, such as the UCSC genome database, but how to use this information effectively to solve classical and new problems remains largely unexplored. In this thesis, we explore how to use this information in four problems: sequence orthology search, multiple alignment improvement, short read mapping, and genome rearrangement inference. For the first problem, we developed an EM algorithm that iteratively aligns a query with a multiple alignment database, using information from a phylogeny relating the query species and the species in the multiple alignment. We also infer the query's location in the phylogeny. We showed that by doing alignment and phylogeny inference together, we can improve the accuracy of both. For the second problem, we developed an optimization algorithm to iteratively refine the multiple alignment quality. Experimental results showed that our algorithm is very stable in terms of the resulting alignments, and that our method is more accurate than existing methods, i.e. Mafft, Clustal-O, and Mavid, on test data from three sets of species from the UCSC genome database. For the third problem, we developed a model, PhyMap, to align a read to a multiple alignment allowing mismatches and indels. PhyMap computes local alignments of a query sequence against a fixed multiple-genome alignment of closely related species. PhyMap uses a known phylogenetic tree on the species in the multiple alignment to improve the quality of its computed alignments while also estimating the placement of the query on this tree. Both theoretical computation and experimental results show that our model can differentiate between orthologous and paralogous alignments better than other popular short read mapping tools (BWA, BOWTIE and BLAST). For the fourth problem, we gave a simple genome recombination model which can express insertions, deletions, inversions, translocations and inverted translocations on aligned genome segments. We also developed an MCMC algorithm to infer the order of the query segments. We proved that using any Euclidean metric to measure the distance between two sequence orders in the tree optimization objective function leads to a degenerate solution in which the inferred order is the order of one of the leaf nodes. We also gave a graph-based formulation of the problem which can represent the probability distribution of the order of the query sequences.