3 research outputs found

    Average-Case Analyse parametrisierter und probabilistischer Algorithmen

    In both theoretical computer science and practical work, it is a disappointing outcome when the problem under consideration turns out to be NP-complete: there is almost no hope for an efficient algorithm. However, many approaches have been developed to overcome this barrier:
    - The study of parameterized complexity allows, in many cases, the explosion of the running time to be confined to a given parameter.
    - The behavior of problems is studied not only in the worst case but also in the average case. Alternatively, the input data is slightly perturbed, and the concept of smoothed analysis then gives new insights.
    - Sometimes the use of randomness in the computation can help to circumvent obstacles.
    - An approximation may be nearly as good as an optimal solution.
    Each of these approaches is well studied on its own, but the interactions between them, and the use of multiple approaches together, are a mostly unstudied field of research. In this thesis we study some of these interactions for a number of test problems. We show that the reduction rules given by Gramm et al. for the Clique-Cover problem, with high probability, not only reduce yes-instances but solve them entirely. We also consider the paradigm of bounded search trees, which is widely used for parameterized problems. We find that the expected running time of a simple bounded search tree algorithm is much lower than the worst-case bound for the FPT problems Vertex-Cover and d-Hitting-Set. For certain sets of parameter values, expected FPT running time is also achieved for the W[1]-complete and W[2]-complete problems Clique and Hitting-Set, respectively. Furthermore, we study a simple probabilistic generalization of greedy approximation algorithms. For the Vertex-Cover, Hitting-Set, and Triangle-Vertex-Deletion problems, we find that the probabilistic algorithms we give have a substantially smaller expected approximation ratio than their deterministic equivalents. There is also a trade-off: with more time, one can expect better solutions.
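
To make the bounded search tree paradigm mentioned in this abstract concrete, here is a minimal sketch for Vertex-Cover. It is the textbook branching scheme, not necessarily the exact algorithm analyzed in the thesis; the function name `vertex_cover` and the edge-list input format are assumptions made for illustration. Each uncovered edge must have one of its endpoints in the cover, so the search branches on both choices and decreases the budget `k` by one, giving a tree with at most 2^k leaves in the worst case.

```python
def vertex_cover(edges, k):
    """Return a vertex cover of size <= k as a set, or None if none exists.

    Hypothetical helper for illustration: `edges` is a list of 2-tuples.
    """
    if not edges:
        return set()          # nothing left to cover
    if k == 0:
        return None           # edges remain but the budget is used up
    u, v = edges[0]           # this edge needs u or v in the cover
    for pick in (u, v):
        # Taking `pick` covers every edge incident to it.
        remaining = [e for e in edges if pick not in e]
        sub = vertex_cover(remaining, k - 1)
        if sub is not None:
            return sub | {pick}
    return None


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(vertex_cover(triangle, 1))  # no cover of size 1 exists
    print(vertex_cover(triangle, 2))  # some cover with two vertices
```

The worst-case bound of 2^k branches is reached only when both branches have to be explored at every level. The abstract's claim is that on random instances the expected amount of branching is much smaller.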

    Combinatorial and Probabilistic Approaches to Motif Recognition

    Short substrings of genomic data that are responsible for biological processes, such as gene expression, are referred to as motifs. Motifs with the same function may not match exactly, due to mutation events at a few of the motif positions. Allowing for non-exact occurrences significantly complicates their discovery. Given a number of DNA strings, the motif recognition problem is the task of detecting motif instances in every given sequence without knowledge of the position of the instances or of the pattern shared by these substrings. We describe a novel approach to motif recognition, and provide theoretical and experimental results that demonstrate its efficiency and accuracy. Our algorithm, MCL-WMR, builds an edge-weighted graph model of the given motif recognition problem and uses a graph clustering algorithm to quickly determine important subgraphs that need to be searched further for valid motifs. By considering a weighted graph model, we narrow the search dramatically to smaller problems that can be solved with significantly less computation. The Closest String problem is a subproblem of motif recognition, and it is NP-hard. We give a linear-time algorithm for a restricted version of the Closest String problem, and an efficient polynomial-time heuristic that solves the general problem with high probability. We initiate the study of the smoothed complexity of the Closest String problem, which in turn explains our empirical results demonstrating the effectiveness of our probabilistic heuristic. Central to this analysis is the introduction of a perturbation model for Closest String instances, within which we provide a probabilistic analysis of our algorithm. The smoothed analysis suggests reasons why a well-known fixed-parameter tractable algorithm solves Closest String instances extremely efficiently in practice.
Although the Closest String model is robust to the oversampling of strings in the input, it is severely affected by the existence of outliers. We propose a refined model, the Closest String with Outliers problem, to overcome this limitation. A systematic parameterized complexity analysis accompanies the introduction of this problem, providing a surprising insight into the sensitivity of this problem to slightly different parameterizations. Through the application of probabilistic and combinatorial insights into the Closest String problem, we develop sMCL-WMR, a program that is much faster than its predecessor MCL-WMR. We apply and adapt sMCL-WMR and MCL-WMR to analyze the promoter regions of the canola seed-coat. Our results identify important regions of the canola genome that are responsible for specific biological activities. This knowledge may be used in the long-term aim of developing crop varieties with specific biological characteristics, such as disease resistance.
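
For readers unfamiliar with the Closest String problem, the following is a minimal sketch of one standard bounded search tree for it, parameterized by the distance bound `d`. The abstract does not name the fixed-parameter tractable algorithm it refers to, so this may not be the one meant; the function names `hamming` and `closest_string` are assumptions for illustration. The search starts from the first input string and, whenever some input string is more than `d` mismatches away, branches over `d + 1` of the mismatching positions and moves the candidate one character toward that string. The recursion depth is at most `d`, so the tree has at most (d+1)^d leaves.

```python
def hamming(a, b):
    """Number of positions at which equal-length strings a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_string(strings, d):
    """Return a string within Hamming distance d of every input, or None.

    Hypothetical helper: all inputs are assumed to have the same length.
    """
    def search(candidate, budget):
        for s in strings:
            if hamming(candidate, s) > d:
                if budget == 0:
                    return None  # no modifications left on this branch
                diff = [i for i, (x, y) in enumerate(zip(candidate, s)) if x != y]
                # Branching over d + 1 mismatching positions is enough.
                for i in diff[:d + 1]:
                    nxt = candidate[:i] + s[i] + candidate[i + 1:]
                    found = search(nxt, budget - 1)
                    if found is not None:
                        return found
                return None
        return candidate  # within distance d of every input string

    return search(strings[0], d)


if __name__ == "__main__":
    print(closest_string(["AAAA", "AAAT", "AATA"], 1))
    print(closest_string(["AAAA", "TTTT"], 1))
```

Any string returned has been checked against every input, so a non-`None` result is always a valid center string. Handling outliers, as in the refined model proposed in the abstract, is not covered by this sketch.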