5 research outputs found

    A comprehensive comparison of comparative RNA structure prediction approaches

    BACKGROUND: An increasing number of researchers have released novel RNA structure analysis and prediction algorithms for comparative approaches to structure prediction. Yet independent benchmarking of these algorithms is rarely performed, although it is now common practice for protein-folding, gene-finding and multiple-sequence-alignment algorithms. RESULTS: Here we evaluate a number of RNA folding algorithms using reliable RNA data sets and compare their relative performance. CONCLUSIONS: We conclude that comparative data can enhance structure prediction, but structure prediction algorithms vary widely in both sensitivity and selectivity across different lengths and homologies. Furthermore, we outline some directions for future research.

    Packing multiway cuts in capacitated graphs

    We consider the following "multiway cut packing" problem in undirected graphs: we are given a graph G=(V,E) and k commodities, each corresponding to a set of terminals located at different vertices in the graph; our goal is to produce a collection of cuts {E_1,...,E_k} such that E_i is a multiway cut for commodity i and the maximum load on any edge is minimized. The load on an edge is defined to be the number of cuts in the solution crossing the edge. In the capacitated version of the problem the goal is to minimize the maximum relative load on any edge, that is, the ratio of the edge's load to its capacity. Multiway cut packing arises in the context of graph labeling problems where we are given a partial labeling of a set of items and a neighborhood structure over them, and, informally, the goal is to complete the labeling in the most consistent way. This problem was introduced by Rabani, Schulman, and Swamy (SODA'08), who developed an O(log n/log log n) approximation for it in general graphs, as well as an improved O(log^2 k) approximation in trees. Here n is the number of nodes in the graph. We present the first constant factor approximation for this problem in arbitrary undirected graphs. Our approach is based on the observation that every instance of the problem admits a near-optimal laminar solution (that is, one in which no pair of cuts cross each other). Comment: The conference version of this paper is to appear at SODA 2009. This is the full version.
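The objective defined in this abstract can be illustrated with a small sketch. It only evaluates a candidate solution (it checks that each E_i separates the terminals of commodity i and computes the maximum relative load); it is not the approximation algorithm of the paper. The helper names `norm`, `is_multiway_cut` and `max_relative_load`, and the representation of edges as sorted tuples, are assumptions made for illustration.

```python
from collections import Counter


def norm(u, v):
    """Canonical form of an undirected edge (hypothetical helper)."""
    return tuple(sorted((u, v)))


def is_multiway_cut(edges, cut, terminals):
    """Return True if removing `cut` leaves every terminal in its own component."""
    adj = {}
    for u, v in edges:
        if norm(u, v) in cut:
            continue  # this edge is removed by the cut
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    for t in terminals:
        seen, stack = {t}, [t]
        while stack:
            x = stack.pop()
            for y in adj.get(x, ()):
                if y in seen:
                    continue
                if y in terminals:
                    return False  # two terminals are still connected
                seen.add(y)
                stack.append(y)
    return True


def max_relative_load(cuts, capacity):
    """Maximum over edges of (number of cuts using the edge) / (edge capacity)."""
    load = Counter(edge for cut in cuts for edge in cut)
    return max((load[e] / capacity[e] for e in load), default=0.0)


# Path a-b-c-d with unit capacities and two commodities.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
capacity = {norm(u, v): 1 for u, v in edges}
shared = [{norm("b", "c")}, {norm("b", "c")}]    # both cuts use edge b-c
disjoint = [{norm("c", "d")}, {norm("a", "b")}]  # cuts use different edges
print(max_relative_load(shared, capacity))
print(max_relative_load(disjoint, capacity))
```

In this toy instance both solutions are valid for commodities with terminals {a, d} and {a, c}, but the second spreads the cuts over different edges and so has the smaller maximum load.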

    Approximation algorithms for the fixed-topology phylogenetic number problem

    In the l-phylogeny problem, one wishes to construct an evolutionary tree for a set of species represented by characters, in which each state of each character induces no more than l connected components. We consider the fixed-topology version of this problem for fixed topologies of arbitrary degree. This version of the problem is known to be NP-complete for l greater than or equal to 3, even for degree-3 trees in which no state labels more than l + 1 leaves (and therefore there is a trivial l + 1 phylogeny). We give a 2-approximation algorithm for all l greater than or equal to 3 for arbitrary input topologies, and we give an optimal approximation algorithm that constructs a 4-phylogeny when a 3-phylogeny exists. Dynamic programming techniques, which are typically used in fixed-topology problems, cannot be applied to l-phylogeny problems. Our 2-approximation algorithm is the first application of linear programming to approximation algorithms for phylogeny problems. We extend our results to a related problem in which characters are polymorphic.
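The quantity l in this abstract can be made concrete with a short sketch. It assumes the tree topology is given and that every node, internal nodes included, already carries a state for each character; it only measures the resulting l and does not implement the approximation algorithms described above. The function names and the dictionary-based input format are assumptions made for illustration.

```python
from collections import defaultdict


def components_per_state(tree_edges, state):
    """For one character, count the connected components each state induces.

    The nodes sharing a state induce a forest inside the tree, and a forest
    has (number of nodes) - (number of edges) components.
    """
    counts = defaultdict(int)
    for node, s in state.items():
        counts[s] += 1  # one node carrying state s
    for u, v in tree_edges:
        if state[u] == state[v]:
            counts[state[u]] -= 1  # an edge joining two nodes of equal state
    return dict(counts)


def phylogenetic_number(tree_edges, labelings):
    """Smallest l such that the labeled tree is an l-phylogeny."""
    return max(
        max(components_per_state(tree_edges, state).values())
        for state in labelings
    )


# Fixed topology: leaves a, b hang off x; leaves c, d hang off y; x-y joined.
tree_edges = [("a", "x"), ("b", "x"), ("x", "y"), ("c", "y"), ("d", "y")]
# One binary character; internal nodes x and y are both given state 0.
labeling = {"a": 0, "b": 1, "c": 0, "d": 1, "x": 0, "y": 0}
print(components_per_state(tree_edges, labeling))
print(phylogenetic_number(tree_edges, [labeling]))
```

Here state 0 forms a single connected piece while state 1 appears on two separated leaves, so this labeling is a 2-phylogeny for that character.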

    A more Efficient Approximation Scheme for Tree Alignment

    We present a new polynomial time approximation scheme (PTAS) for tree alignment, which is an important variant of multiple sequence alignment. As in the existing PTASs in the literature, the basic approach of our algorithm is to partition the given tree into overlapping components of a constant size and then apply local optimization on each such component. But the new algorithm uses a clever partitioning strategy and achieves a better efficiency for the same performance ratio. For example, to achieve approximation ratios 1.6 and 1.5, the best existing PTAS has to spend time O(kdn^5) and O(kdn^9), respectively, where n is the length of each leaf sequence and d, k are the depth and number of leaves of the tree, while the new PTAS only has to spend time O(kdn^4) and O(kdn^5). Moreover, the performance of the PTAS is more sensitive to the size of the components, which basically determines the running time, and we obtain an improved approximation ratio for each size. Some experiments of the algorithm on simulated and real data are also given.