    Locating regions in a sequence under density constraints

    Several biological problems require the identification of regions in a sequence where some feature occurs within a target density range: examples include the location of GC-rich regions, identification of CpG islands, and sequence matching. Mathematically, this corresponds to searching a string of 0s and 1s for a substring whose relative proportion of 1s lies between given lower and upper bounds. We consider the algorithmic problem of locating the longest such substring, as well as other related problems (such as finding the shortest substring or a maximal set of disjoint substrings). For locating the longest such substring, we develop an algorithm that runs in O(n) time, improving upon the previous best-known O(n log n) result. For the related problems we develop O(n log log n) algorithms, again improving upon the best-known O(n log n) results. Practical testing verifies that our new algorithms enjoy significantly smaller time and memory footprints, and can process sequences that are orders of magnitude longer as a result.
    Comment: 17 pages, 8 figures; v2: minor revisions, additional explanations; to appear in SIAM Journal on Computing
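
To make the problem statement concrete, here is a minimal brute-force sketch of the "longest substring within a density range" task. It is not the O(n) algorithm from the paper; it is a simple quadratic baseline built on prefix sums, and the function name `longest_density_substring` and its parameters are hypothetical, chosen only for illustration.

```python
def longest_density_substring(bits, lo, hi):
    """Return (start, length) of the longest substring of `bits` whose
    proportion of '1' characters lies in [lo, hi], or None if none exists.

    Brute-force O(n^2) baseline, NOT the paper's O(n) algorithm.
    """
    n = len(bits)
    # prefix[i] = number of '1' characters in bits[:i]
    prefix = [0] * (n + 1)
    for i, b in enumerate(bits):
        prefix[i + 1] = prefix[i] + (1 if b == "1" else 0)

    # Try lengths from longest to shortest; the first hit is the answer.
    for length in range(n, 0, -1):
        for start in range(n - length + 1):
            ones = prefix[start + length] - prefix[start]
            # Compare counts rather than ratios to avoid division.
            if lo * length <= ones <= hi * length:
                return start, length
    return None
```

For example, calling `longest_density_substring("0011010", 0.5, 0.75)` scans the seven-character string and reports the leftmost longest window whose share of 1s falls between 50% and 75%. Ties are broken by the leftmost start position, which is a choice of this sketch rather than something the abstract specifies.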

    CD-independent subsets in meet-distributive lattices

    A subset X of a finite lattice L is CD-independent if the meet of any two incomparable elements of X equals 0. In 2009, Czédli, Hartmann and Schmidt proved that any two maximal CD-independent subsets of a finite distributive lattice have the same number of elements. In this paper, we prove that if L is a finite meet-distributive lattice, then the size of every CD-independent subset of L is at most the number of atoms of L plus the length of L. If, in addition, there is no three-element antichain of meet-irreducible elements, then we give a recursive description of maximal CD-independent subsets. Finally, to give an application of CD-independent subsets, we give a new approach to count islands on a rectangular board.
    Comment: 14 pages, 4 figures
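
As a small illustration of the definition, the sketch below checks CD-independence in one concrete lattice: the divisors of an integer ordered by divisibility, where the meet is the gcd and the bottom element 0 of the lattice is the integer 1. This example lattice and the helper names are assumptions made for illustration and do not come from the paper; divisor lattices are distributive, so the stated bound should apply to them.

```python
from itertools import combinations
from math import gcd


def divisors(n):
    """Elements of the divisor lattice of n (ordered by divisibility)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def comparable(a, b):
    """Two divisors are comparable when one divides the other."""
    return a % b == 0 or b % a == 0


def is_cd_independent(subset):
    """True if every incomparable pair has meet (gcd) equal to the bottom, 1."""
    return all(comparable(a, b) or gcd(a, b) == 1
               for a, b in combinations(subset, 2))


def max_cd_independent_size(n):
    """Largest CD-independent subset size, found by exhaustive search."""
    elems = divisors(n)
    for k in range(len(elems), 0, -1):
        if any(is_cd_independent(c) for c in combinations(elems, k)):
            return k
    return 0
```

For the divisors of 12, the atoms are 2 and 3 and a longest chain is 1 < 2 < 4 < 12, so the number of atoms plus the length is 2 + 3 = 5. The subset {1, 2, 3, 4, 12} is CD-independent, while {4, 6} is not, because 4 and 6 are incomparable and their gcd is 2. The exhaustive search is exponential and only suitable for tiny examples like this one.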