76 research outputs found

    An adaptive hybrid pattern-matching algorithm on indeterminate strings

    We describe a hybrid pattern-matching algorithm that works on both regular and indeterminate strings. This algorithm is inspired by the recently proposed hybrid algorithm FJS and its indeterminate successor. However, as discussed in this paper, because of the special properties of indeterminate strings, it is not straightforward to directly migrate FJS to an indeterminate version. Our new algorithm combines two fast pattern-matching algorithms, ShiftAnd and BMS (the Sunday variant of the Boyer-Moore algorithm), and is highly adaptive to the nature of the text being processed. It avoids using the border array, therefore avoids some of the cases that are awkward for indeterminate strings. Although not always the fastest in individual test cases, our new algorithm is superior in overall performance to its two component algorithms — perhaps a general advantage of hybrid algorithms
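
    For readers unfamiliar with ShiftAnd, one of the two component algorithms named in the abstract above, the following is a minimal sketch of the classic bit-parallel version for regular strings only. It is not the hybrid algorithm from the paper, and the function name `shift_and` is an illustrative assumption.

```python
def shift_and(pattern: str, text: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text.

    Classic Shift-And on regular (fully determined) strings. Bit i of
    `state` is set when pattern[0..i] matches the text ending at the
    current position.
    """
    m = len(pattern)
    if m == 0:
        return []

    # masks[c] has bit i set exactly when pattern[i] == c.
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit signalling a full match
    state = 0
    hits = []
    for j, ch in enumerate(text):
        # Extend every partial match by one letter and start a new one.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(j - m + 1)
    return hits


if __name__ == "__main__":
    print(shift_and("aba", "ababa"))
```

    One way to see why the approach carries over to indeterminate strings is that a text position standing for a set of letters could use the bitwise OR of those letters' masks; that extension is a general observation and is not taken from the paper.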

    Sorting suffixes of two-pattern strings

    Recently, several authors presented linear recursive algorithms for sorting suffixes of a string. All these algorithms employ a similar three-step approach, based on an initial division of the suffixes of x into two sets: in step 1 sort the first set using recursive reduction of the problem, in step 2 determine the order of the suffixes in the second set based on the order of the suffixes in the first set, and in step 3 merge the two sets together. To optimize such an algorithm either for space or time, it may not be sufficient to optimize one of the three steps, since in doing so, one might increase the resources required for the others to an unacceptable extent. Franek, Lu, and Smyth introduced two-pattern strings as a generalization of Sturmian strings. Like Sturmian strings, two-pattern strings are generated by iterated morphisms, but they exhibit a much richer structure. In this paper we show that the suffixes of two-pattern strings can be sorted in linear time using a variant of the three-step approach outlined above. It turns out that, given the order of the suffixes in a two-pattern string, one can almost directly list in linear time all the suffixes of its expansion under a two-pattern morphism

    Combinatorics of unique maximal factorization families (UMFFs)

    Suppose a set W of strings contains exactly one rotation (cyclic shift) of every primitive string on some alphabet Σ. Then W is a circ-UMFF if and only if every word in Σ+ has a unique maximal factorization over W. The classic circ-UMFF is the set of Lyndon words based on lexicographic ordering (1958). Duval (1983) designed a linear sequential Lyndon factorization algorithm; a corresponding PRAM parallel algorithm was described by J. Daykin, Iliopoulos and Smyth (1994). Daykin and Daykin defined new circ-UMFFs based on various methods for totally ordering sets of strings (2003), and further described the structure of all circ-UMFFs (2008). Here we prove new combinatorial results for circ-UMFFs, and in particular for the case of Lyndon words. We introduce Acrobat and Flight Deck circ-UMFFs, and describe some of our results in terms of dictionaries. Applications of circ-UMFFs pertain to structured methods for concatenating and factoring strings over ordered alphabets, and those of Lyndon words are wide ranging and multidisciplinary
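
    The linear sequential Lyndon factorization attributed to Duval (1983) in the abstract above can be sketched as follows. This is the standard textbook formulation under ordinary lexicographic order, not code from the paper, and the function name `lyndon_factorization` is an illustrative assumption.

```python
def lyndon_factorization(s: str) -> list[str]:
    """Factor s into a non-increasing sequence of Lyndon words.

    Standard formulation of Duval's algorithm: linear time, constant
    extra space beyond the output list.
    """
    factors = []
    n = len(s)
    i = 0
    while i < n:
        # s[i:j] is a power of a Lyndon word, possibly followed by a
        # proper prefix of it; k trails j by the length of that word.
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i      # s[i:j+1] is itself a Lyndon word; restart the comparison
            else:
                k += 1     # still repeating the current Lyndon word
            j += 1
        # Emit each complete copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


if __name__ == "__main__":
    print(lyndon_factorization("banana"))
```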

    Faster algorithms for computing maximal multirepeats in multiple sequences

    A repeat in a string is a substring that occurs more than once. A repeat is extendible if every occurrence of the repeat has an identical letter either on the left or on the right; otherwise, it is maximal. A multirepeat is a repeat that occurs at least m_min times (m_min ≥ 2) in each of at least q ≥ 1 strings in a given set of strings. In this paper, we describe a family of efficient algorithms based on suffix arrays to compute maximal multirepeats under various constraints. Our algorithms are faster, more flexible and much more space-efficient than algorithms recently proposed for this problem. The results extend recent work by two of the authors computing all maximal repeats in a single string
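
    The definitions of repeat, extendible and maximal given in the abstract above can be made concrete with a brute-force check on a single string. This sketch is quadratic in the number of substrings and deliberately does not use suffix arrays, so it is not the authors' method; the function name `maximal_repeats` and the use of `None` as a string-boundary marker are illustrative assumptions.

```python
from collections import defaultdict


def maximal_repeats(s: str) -> set[str]:
    """Return every maximal repeat of s by direct use of the definition.

    A repeat is treated as maximal when its occurrences are preceded by
    at least two different letters and followed by at least two
    different letters. A string boundary counts as its own "letter".
    """
    n = len(s)

    # Record the start positions of every substring (brute force).
    occurrences: dict[str, list[int]] = defaultdict(list)
    for i in range(n):
        for j in range(i + 1, n + 1):
            occurrences[s[i:j]].append(i)

    result = set()
    for sub, starts in occurrences.items():
        if len(starts) < 2:
            continue  # not a repeat
        lefts = {s[i - 1] if i > 0 else None for i in starts}
        rights = {s[i + len(sub)] if i + len(sub) < n else None for i in starts}
        # Extendible if all left neighbours agree or all right neighbours agree.
        if len(lefts) > 1 and len(rights) > 1:
            result.add(sub)
    return result


if __name__ == "__main__":
    print(sorted(maximal_repeats("banana")))
```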

    Suffix arrays: what are they good for?

    Recently the theoretical community has displayed a flurry of interest in suffix arrays, and compressed suffix arrays. New, asymptotically optimal algorithms for construction, search, and compression of suffix arrays have been proposed. In this talk we will present our investigations into the practicalities of these latest developments. In particular, we investigate whether suffix arrays can indeed replace inverted files, as suggested in recent literature on suffix arrays

    New complexity results for the k-covers problem

    The k-covers problem (kCP) asks us to compute a minimum cardinality set of strings of given length k > 1 that covers a given string. It was shown in a recent paper, by reduction to 3-SAT, that the k-covers problem is NP-complete. In this paper we introduce a new problem, that we call the k-Bounded Relaxed Vertex Cover Problem (RVCPk), which we show is equivalent to k-Bounded Set Cover (SCPk). We show further that kCP is a special case of RVCPk restricted to certain classes Gx,k of graphs that represent all strings x. Thus a minimum k-cover can be approximated to within a factor k in polynomial time. We discuss approximate solutions of kCP, and we state a number of conjectures and open problems related to kCP and Gx,k

    A Systematic Review of Mosquito Coils and Passive Emanators: Defining Recommendations for Spatial Repellency Testing Methodologies.

    Mosquito coils, vaporizer mats and emanators confer protection against mosquito bites through the spatial action of emanated vapor or airborne pyrethroid particles. These products dominate the pest control market; therefore, it is vital to characterize mosquito responses elicited by the chemical actives and their potential for disease prevention. The aim of this review was to determine effects of mosquito coils and emanators on mosquito responses that reduce human-vector contact and to propose scientific consensus on terminologies and methodologies used for evaluation of product formats that could contain spatial chemical actives, including indoor residual spraying (IRS), long lasting insecticide treated nets (LLINs) and insecticide treated materials (ITMs). PubMed, (National Centre for Biotechnology Information (NCBI), U.S. National Library of Medicine, NIH), MEDLINE, LILAC, Cochrane library, IBECS and Armed Forces Pest Management Board Literature Retrieval System search engines were used to identify studies of pyrethroid based coils and emanators with key-words "Mosquito coils" "Mosquito emanators" and "Spatial repellents". It was concluded that there is need to improve statistical reporting of studies, and reach consensus in the methodologies and terminologies used through standardized testing guidelines. Despite differing evaluation methodologies, data showed that coils and emanators induce mortality, deterrence, repellency as well as reduce the ability of mosquitoes to feed on humans. Available data on efficacy outdoors, dose-response relationships and effective distance of coils and emanators is inadequate for developing a target product profile (TPP), which will be required for such chemicals before optimized implementation can occur for maximum benefits in disease control

    Coupled transcriptome and proteome analysis of human lymphotropic tumor viruses: insights on the detection and discovery of viral genes

    Background: Kaposi's sarcoma-associated herpesvirus (KSHV) and Epstein-Barr virus (EBV) are related human tumor viruses that cause primary effusion lymphomas (PEL) and Burkitt's lymphomas (BL), respectively. Viral genes expressed in naturally-infected cancer cells contribute to disease pathogenesis; knowing which viral genes are expressed is critical in understanding how these viruses cause cancer. To evaluate the expression of viral genes, we used high-resolution separation and mass spectrometry coupled with custom tiling arrays to align the viral proteomes and transcriptomes of three PEL and two BL cell lines under latent and lytic culture conditions. Results: The majority of viral genes were efficiently detected at the transcript and/or protein level on manipulating the viral life cycle. Overall the correlation of expressed viral proteins and transcripts was highly complementary in both validating and providing orthogonal data with latent/lytic viral gene expression. Our approach also identified novel viral genes in both KSHV and EBV, and extends viral genome annotation. Several previously uncharacterized genes were validated at both transcript and protein levels. Conclusions: This systems biology approach coupling proteome and transcriptome measurements provides a comprehensive view of viral gene expression that could not have been attained using each methodology independently. Detection of viral proteins in combination with viral transcripts is a potentially powerful method for establishing virus-disease relationships.

    Health, education, and social care provision after diagnosis of childhood visual disability

    Aim: To investigate the health, education, and social care provision for children newly diagnosed with visual disability. Method: This was a national prospective study, the British Childhood Visual Impairment and Blindness Study 2 (BCVIS2), ascertaining new diagnoses of visual impairment or severe visual impairment and blindness (SVIBL), or equivalent vision. Data collection was performed by managing clinicians up to 1-year follow-up, and included health and developmental needs, and health, education, and social care provision. Results: BCVIS2 identified 784 children newly diagnosed with visual impairment/SVIBL (313 with visual impairment, 471 with SVIBL). Most children had associated systemic disorders (559 [71%], 167 [54%] with visual impairment, and 392 [84%] with SVIBL). Care from multidisciplinary teams was provided for 549 children (70%). Two-thirds (515) had not received an Education, Health, and Care Plan (EHCP). Fewer children with visual impairment had seen a specialist teacher (SVIBL 35%, visual impairment 28%, χ² p < 0.001), or had an EHCP (11% vs 7%, χ² p < 0.01). Interpretation: Families need additional support from managing clinicians to access recommended complex interventions such as the use of multidisciplinary teams and educational support. This need is pressing, as the population of children with visual impairment/SVIBL is expected to grow in size and complexity. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited