778 research outputs found

    De Bruijn partial words

    In a k^n-complex word over an alphabet Σ of size k, each of the k^n words of length n appears as a subword at least once. Such a word is said to have maximum subword complexity. De Bruijn sequences of order n over Σ are the shortest words of maximum subword complexity and are well known to have length k^n + n - 1. They are efficiently constructed by finding Eulerian cycles in so-called de Bruijn graphs. In this thesis, we investigate partial words, that is, sequences with wildcard or hole symbols, of maximum subword complexity. The subword complexity function of a partial word w over a given alphabet of size k assigns to each positive integer n the number p_w(n) of distinct full words over the alphabet that are compatible with factors of length n of w. For positive integers h, k and n, a de Bruijn partial word of order n with h holes over an alphabet Σ of size k is a partial word w with h holes over Σ of minimal length with the property that p_w(n) = k^n. In some cases, they are efficiently constructed by finding Eulerian paths in modified de Bruijn graphs. We are concerned with the following three questions: (1) What is the length of k-ary de Bruijn partial words of order n with h holes? (2) What is an efficient method for generating such partial words? (3) How many such partial words are there?
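The two notions in this abstract can be illustrated with a minimal Python sketch. It builds a full (hole-free) de Bruijn sequence by walking an Eulerian cycle with Hierholzer's algorithm, and it computes the subword complexity p_w(n) of a partial word. The function names and the use of `?` as the hole symbol are assumptions of this sketch, not notation from the thesis, and the sketch does not attempt the modified-graph construction for partial words.

```python
from itertools import product

HOLE = "?"  # stand-in for the hole symbol (an assumption of this sketch)


def de_bruijn(alphabet, n):
    """Return a linear de Bruijn sequence of order n over `alphabet` (a string).

    The result has length k**n + n - 1 and is read off an Eulerian cycle of
    the de Bruijn graph, found with Hierholzer's algorithm.
    """
    if n == 1:
        return "".join(alphabet)
    # Vertices are the words of length n - 1; every letter labels one outgoing edge.
    out_edges = {"".join(v): list(alphabet) for v in product(alphabet, repeat=n - 1)}
    start = alphabet[0] * (n - 1)
    stack, cycle = [start], []
    while stack:
        v = stack[-1]
        if out_edges[v]:
            a = out_edges[v].pop()
            stack.append((v + a)[1:])  # follow the edge labelled a
        else:
            cycle.append(stack.pop())  # vertex has no unused edges left
    cycle.reverse()  # vertices in traversal order, first == last
    return start + "".join(v[-1] for v in cycle[1:])


def subword_complexity(w, n, alphabet):
    """p_w(n): number of distinct full words compatible with a length-n factor of w."""
    seen = set()
    for i in range(len(w) - n + 1):
        # A hole may be filled by any letter; a defined position only by itself.
        choices = [alphabet if c == HOLE else c for c in w[i:i + n]]
        seen.update("".join(p) for p in product(*choices))
    return len(seen)


if __name__ == "__main__":
    s = de_bruijn("ab", 3)
    print(s, len(s), subword_complexity(s, 3, "ab"))
    # A partial word with one hole can reach p_w(2) = 4 with fewer than 5 symbols.
    print(subword_complexity("?abb", 2, "ab"))
```

For the binary alphabet and n = 2, a full word needs length 2^2 + 2 - 1 = 5, while the partial word `?abb` of length 4 already has p_w(2) = 4, which is the kind of saving the three questions above are about.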

    On Minimal Sturmian Partial Words

    Partial words, which are sequences that may have some undefined positions called holes, can be viewed as sequences over an extended alphabet A_⋄ = A ∪ {⋄}, where ⋄ stands for a hole and matches (or is compatible with) every letter in A. The subword complexity of a partial word w, denoted by p_w(n), is the number of distinct full words (those without holes) over the alphabet that are compatible with factors of length n of w. A function f: N -> N is (k,h)-feasible if for each integer N ≥ 1, there exists a k-ary partial word w with h holes such that p_w(n) = f(n) for all n, 1 […] ≥ 3 holes. Finally, we give upper bounds on the lengths of minimal partial words with respect to f(n) = 2n, which are tight for h = 0, 1 or 2.
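As a rough illustration of feasibility and minimality, the following Python sketch searches by brute force for a shortest binary partial word with h holes whose complexity agrees with a given f. The abstract is truncated in this listing, so the range 1 ≤ n ≤ N used below is an assumption, as are the names `complexity` and `shortest_match` and the use of `?` for the hole. The search is exponential and only meant for very small N and h; it is not the classification method of the paper.

```python
from itertools import product

HOLE = "?"  # stand-in for the hole symbol (an assumption of this sketch)


def complexity(w, n, alphabet):
    """p_w(n): number of distinct full words compatible with a length-n factor of w."""
    seen = set()
    for i in range(len(w) - n + 1):
        choices = [alphabet if c == HOLE else c for c in w[i:i + n]]
        seen.update(product(*choices))
    return len(seen)


def shortest_match(f, N, h, alphabet="ab", max_len=10):
    """Return a shortest partial word with exactly h holes such that
    p_w(n) == f(n) for n = 1..N (assumed range), or None if none has
    length <= max_len."""
    for length in range(max(1, h), max_len + 1):
        for letters in product(alphabet + HOLE, repeat=length):
            w = "".join(letters)
            if w.count(HOLE) != h:
                continue
            if all(complexity(w, n, alphabet) == f(n) for n in range(1, N + 1)):
                return w
    return None


if __name__ == "__main__":
    sturmian = lambda n: n + 1
    print(shortest_match(sturmian, N=2, h=0))  # a full word of length 4
    print(shortest_match(sturmian, N=2, h=1))  # a one-hole word of length 3
```

With f(n) = n + 1 and N = 2, no full word of length 3 works because it has only two factors of length 2, so the search first succeeds at length 4. A single hole brings the minimal length down to 3.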

    Condorcet domains of tiling type

    A Condorcet domain (CD) is a collection of linear orders on a set of candidates satisfying the following property: for any choice of preferences of voters from this collection, a simple majority rule does not yield cycles. We propose a method of constructing "large" CDs by use of rhombus tiling diagrams and explain that this method unifies several constructions of CDs known earlier. Finally, we show that three conjectures on the maximal sizes of those CDs are, in fact, equivalent and provide a counterexample to them. Comment: 16 pages. To appear in Discrete Applied Mathematics
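The defining property can be tested on small examples with the Python sketch below. Instead of enumerating voter profiles, it relies on the classical characterization that a set of linear orders is a Condorcet domain exactly when no triple of candidates is ranked cyclically (as abc, bca and cab) by three of its orders. The function name `is_condorcet_domain` is hypothetical, and the check is unrelated to the rhombus tiling construction of the paper.

```python
from itertools import combinations


def is_condorcet_domain(orders):
    """Check the no-cyclic-triple condition for a collection of linear orders.

    `orders` is an iterable of sequences listing the same candidates from
    most to least preferred.
    """
    orders = [tuple(o) for o in orders]
    candidates = sorted(orders[0])
    for triple in combinations(candidates, 3):
        # Restrict every order to the three chosen candidates.
        restricted = {tuple(c for c in o if c in triple) for o in orders}
        for a, b, c in restricted:
            # abc, bca and cab together produce a majority cycle.
            if (b, c, a) in restricted and (c, a, b) in restricted:
                return False
    return True


if __name__ == "__main__":
    single_peaked = ["abc", "bac", "bca", "cba"]  # single-peaked on the axis a, b, c
    print(is_condorcet_domain(single_peaked))          # True
    print(is_condorcet_domain(["abc", "bca", "cab"]))  # False
```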

    Optimized Sequence Library Design for Efficient In Vitro Interaction Mapping

    Sequence libraries that cover all k-mers enable universal, unbiased measurements of binding to both oligonucleotides and peptides. While the number of k-mers grows exponentially in k, space on all experimental platforms is limited. Here, we shrink k-mer library sizes by using joker characters, which represent all characters in the alphabet simultaneously. We present the JokerCAKE (joker covering all k-mers) algorithm for generating a short sequence such that each k-mer appears at least p times with at most one joker character per k-mer. By running our algorithm on a range of parameters and alphabets, we show that JokerCAKE produces near-optimal sequences. Moreover, through comparison with data from hundreds of DNA-protein binding experiments and with new experimental results for both standard and JokerCAKE libraries, we establish that accurate binding scores can be inferred for high-affinity k-mers using JokerCAKE libraries. JokerCAKE libraries allow researchers to search a significantly larger sequence space using the same number of experimental measurements and at the same cost. We present a new compact sequence design that covers all k-mers utilizing joker characters and develop an efficient algorithm to generate such designs. We show through simulations and experimental validation that these sequence designs are useful for identifying high-affinity binding sites at significantly reduced cost and space. Keywords: sequence libraries; microarray design; de Bruijn graph. Funding: National Institutes of Health (U.S.) (Grant R01GM081871)
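The coverage requirement stated in this abstract can be checked with a short Python sketch. It counts, for each k-mer, the windows of a candidate sequence that match it while containing at most one joker, and then tests whether every k-mer is covered at least p times. This is only a verifier under assumed conventions (the joker is written `N`, and the names `kmer_coverage` and `covers_all` are hypothetical); it is not the JokerCAKE generation algorithm.

```python
from collections import Counter
from itertools import product

JOKER = "N"  # assumed joker character; must not be a letter of the alphabet


def kmer_coverage(seq, k, alphabet, max_jokers=1):
    """Count how many windows of `seq` cover each k-mer.

    Windows containing more than `max_jokers` joker characters are ignored.
    """
    counts = Counter()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window.count(JOKER) > max_jokers:
            continue
        # A joker stands for every letter; other positions match only themselves.
        choices = [alphabet if c == JOKER else c for c in window]
        for kmer in product(*choices):
            counts["".join(kmer)] += 1
    return counts


def covers_all(seq, k, alphabet, p=1, max_jokers=1):
    """True if every k-mer over `alphabet` is covered at least p times."""
    counts = kmer_coverage(seq, k, alphabet, max_jokers)
    return all(counts["".join(km)] >= p for km in product(alphabet, repeat=k))


if __name__ == "__main__":
    print(covers_all("NACC", 2, "AC"))       # True: NA, AC, CC cover AA, CA, AC, CC
    print(covers_all("NACC", 2, "AC", p=2))  # False: each 2-mer is covered only once
```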

    Combinatorics on Words 10th International Conference

    This volume contains the Local Proceedings of the Tenth International Conference on WORDS, which took place at Kiel University, Germany, from the 14th to the 17th of September 2015. WORDS is the main conference series devoted to the mathematical theory of words, and it takes place every two years. The first conference in the series was organised in 1997 in Rouen, France, with the following editions taking place in Rouen, Palermo, Turku, Montreal, Marseille, Salerno, Prague, and Turku. The main objects in the scope of the conference, words, are finite or infinite sequences of symbols over a finite alphabet. They appear as a natural and basic mathematical model in many areas, theoretical or applied. Accordingly, the WORDS conference is open both to theoretical contributions related to combinatorial, algebraic, and algorithmic aspects of words and to contributions presenting applications of the theory of words, for instance, in other fields of computer science, linguistics, biology and bioinformatics, or physics. For the second time in the history of WORDS, after the 2013 edition, a refereed proceedings volume was published in Springer’s Lecture Notes in Computer Science series. In addition, this local proceedings volume was published in the Kiel Computer Science Series of Kiel University. Being a conference at the border between theoretical computer science and mathematics, WORDS tries to capture in its two proceedings volumes the characteristics of the conferences from both these worlds. While the Lecture Notes in Computer Science volume was dedicated to formal contributions, this local proceedings volume allows, in the spirit of mathematics conferences, the publication of several contributions informing on current research and work in progress in areas closely connected to the core topics of WORDS.
All the papers, the ones published in the Lecture Notes in Computer Science proceedings volume or the ones from this volume, were refereed to high standards by the members of the Program Committee. Following the conference, a special issue of the Theoretical Computer Science journal will be edited, containing extended versions of papers from both proceedings volumes. In total, the conference hosted 18 contributed talks. The papers on which 14 of these talks were based were published in the LNCS volume; the other 4 are published in this volume. In addition to the contributed talks, the conference program included six invited talks given by leading experts in the areas covered by the WORDS conference: Jörg Endrullis (Amsterdam), Markus Lohrey (Siegen), Jean Néraud (Rouen), Dominique Perrin (Paris), Michaël Rao (Lyon), Thomas Stoll (Nancy). WORDS 2015 was the tenth conference in the series, so we were extremely happy to welcome, as invited speaker at this anniversary edition, Jean Néraud, one of the initiators of the series and the main organiser of the first two editions of this conference. We thank all the invited speakers and all the authors of submitted papers for their contributions to the success of the conference. We are grateful to the members of the Program Committee for their work that led to the selection of the contributed talks, and, implicitly, of the papers published in this volume. They were assisted in their task by a series of external referees, gratefully acknowledged below. The submission and reviewing process used the Easychair system; we thank Andrej Voronkov for this system, which considerably facilitated the work of the Program Committee and the editors. We gratefully thank Gheorghe Iosif for designing the logo, poster, and banner of WORDS 2015; the logo of the conference can be seen on the front cover of this book.
We also thank the editors of the Kiel Computer Science Series, especially Lasse Kliemann, for their support in editing this volume. Finally, we thank the Organising Committee of WORDS 2015 for ensuring the smooth running of the conference.