381 research outputs found

    Regular expression constrained sequence alignment revisited

    Imposing constraints in the form of a finite automaton or a regular expression is an effective way to incorporate additional a priori knowledge into sequence alignment procedures. With this motivation, the Regular Expression Constrained Sequence Alignment Problem was introduced, together with an O(n^2t^4) time and O(n^2t^2) space algorithm for solving it, where n is the length of the input strings and t is the number of states in the input non-deterministic automaton. A faster O(n^2t^3) time algorithm for the same problem was subsequently proposed. In this article, we further speed up the algorithms for Regular Language Constrained Sequence Alignment by reducing their worst-case time complexity bound to O(n^2t^3/log t). This is done by establishing an optimal bound on the size of Straight-Line Programs solving the maxima computation subproblem of the basic dynamic programming algorithm. We also study another solution based on a Steiner Tree computation. While it does not improve the worst case, our simulations show that both approaches are efficient in practice, especially when the input automata are dense.

    GrapeTree : visualization of core genomic relationships among 100,000 bacterial pathogens

    Current methods struggle to reconstruct and visualise the genomic relationships of ≥100,000 bacterial genomes. GrapeTree facilitates the analysis of allelic profiles from tens of thousands of core genomes within a web browser window. GrapeTree implements a novel minimum spanning tree algorithm to reconstruct genetic relationships despite missing data, together with a static "GrapeTree Layout" algorithm to render interactive visualisations of large trees. GrapeTree is a stand-alone package for investigating Newick trees plus associated metadata and is also integrated into EnteroBase to facilitate cutting-edge navigation of genomic relationships among >160,000 genomes from bacterial pathogens. The GrapeTree package was released under the GPL v3.0 Licence.
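
    As a rough illustration of the general idea of building a minimum spanning tree over allelic profiles with missing data, the sketch below uses plain Kruskal's algorithm with a distance that skips loci where either allele is uncalled. This is not GrapeTree's own algorithm, which the abstract describes as novel; the `MISSING` marker, the function names, and the toy profiles are all assumptions made for the example.

```python
from itertools import combinations

MISSING = None  # hypothetical marker for an uncalled allele


def profile_distance(a, b):
    """Count loci where both alleles are called and differ."""
    return sum(
        1
        for x, y in zip(a, b)
        if x is not MISSING and y is not MISSING and x != y
    )


def minimum_spanning_tree(profiles):
    """Kruskal's algorithm with union-find; returns (i, j, distance) edges."""
    n = len(profiles)
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, cheapest first (quadratic, fine for a toy case).
    edges = sorted(
        (profile_distance(profiles[i], profiles[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    tree = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # edge joins two components, so it cannot form a cycle
            parent[ri] = rj
            tree.append((i, j, d))
    return tree


# Toy allelic profiles: one row per genome, one column per locus.
profiles = [
    [1, 1, 1, 1],
    [1, 1, 2, 1],
    [1, MISSING, 2, 3],
    [4, 4, 4, 4],
]
tree = minimum_spanning_tree(profiles)
total = sum(d for _, _, d in tree)
```

    The quadratic pairwise step would not scale to the genome counts mentioned in the abstract; the sketch only shows how missing alleles can be left out of the distance.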

    Review of Extreme Multilabel Classification

    Extreme multilabel classification, or XML, is an active area of interest in machine learning. Compared to traditional multilabel classification, the number of labels here is extremely large, hence the name. Classical one-versus-all classification does not scale in this setting because of the large number of labels, and the same is true of other classifiers. Embedding the labels as well as the features into a smaller label space is an essential first step. Other issues include the existence of head and tail labels, where tail labels are labels that occur in relatively few of the given samples. The existence of tail labels creates issues during embedding. The area has attracted a wide range of approaches, including bit compression motivated by compressed sensing, tree-based embeddings, deep-learning-based latent space embeddings (including those using attention weights), linear-algebra-based embeddings such as SVD, clustering, and hashing, to name a few. The community has come up with a useful set of metrics to identify correct predictions for head or tail labels. Comment: 46 pages, 13 figures

    Topological inference in graphs and images

    Subject index volumes 1–92

    Algorithmic Approaches to the Steiner Problem in Networks

    The Steiner problem in networks is the problem of connecting a given set of vertices in a weighted graph at minimum cost. It is a classical NP-hard problem and a fundamental problem in network optimization with many practical applications. We attack this problem by various means: relaxations, which loosen the feasibility constraints in order to approximate an optimal solution; heuristics, to find good but not guaranteed optimal solutions; and reductions, to simplify problem instances without destroying an optimal solution. In all cases we study and improve existing methods, present new ones, and evaluate them experimentally. We integrate these building blocks into an exact algorithm that represents the state of the art for solving this problem optimally. Many of the presented methods can also be useful for related problems.
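
    To make the heuristic ingredient concrete, here is a minimal sketch of a textbook shortest-path heuristic for the Steiner problem in networks: grow a tree over the terminals in Prim fashion using shortest-path distances, then expand each chosen connection into its underlying path. It is not taken from the thesis; the function names and the toy graph are assumptions, the graph is assumed connected and undirected, and the final cycle-removal and pruning pass of a fuller implementation is omitted.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances and predecessor links from source."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def steiner_heuristic(graph, terminals):
    """Connect terminals along shortest paths; returns (edges, cost)."""
    terminals = list(terminals)
    paths = {t: dijkstra(graph, t) for t in terminals}
    in_tree = {terminals[0]}
    edges = set()
    while len(in_tree) < len(terminals):
        # Prim step on the terminal distance network: cheapest link
        # from a connected terminal to an unconnected one.
        _, s, t = min(
            (paths[s][0][t], s, t)
            for s in in_tree
            for t in terminals
            if t not in in_tree
        )
        # Expand the link into the actual shortest path from s to t.
        prev = paths[s][1]
        v = t
        while v != s:
            u = prev[v]
            edges.add((min(u, v), max(u, v)))
            v = u
        in_tree.add(t)
    cost = sum(graph[u][v] for u, v in edges)
    return edges, cost


# Toy instance: hub "c" is a non-terminal that is worth routing through.
graph = {
    "a": {"c": 1, "b": 3},
    "b": {"c": 1, "a": 3, "d": 3},
    "c": {"a": 1, "b": 1, "d": 1},
    "d": {"c": 1, "b": 3},
}
edges, cost = steiner_heuristic(graph, ["a", "b", "d"])
```

    On this toy instance the heuristic routes all three terminals through the non-terminal hub, which is cheaper than any tree that uses only direct terminal-to-terminal edges.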