8 research outputs found

    Linear-time Computation of Minimal Absent Words Using Suffix Array

    An absent word of a word y of length n is a word that does not occur in y. It is a minimal absent word if all its proper factors occur in y. Minimal absent words have been computed in genomes of organisms from all domains of life; their computation provides a fast alternative for measuring approximation in sequence comparison. There exists an O(n)-time and O(n)-space algorithm for computing all minimal absent words on a fixed-sized alphabet based on the construction of suffix automata (Crochemore et al., 1998). No implementation of this algorithm is publicly available. There also exists an O(n^2)-time and O(n)-space algorithm for the same problem based on the construction of suffix arrays (Pinho et al., 2009). An implementation of this algorithm was also provided by the authors and is currently the fastest available. In this article, we bridge this unpleasant gap by presenting an O(n)-time and O(n)-space algorithm for computing all minimal absent words based on the construction of suffix arrays. Experimental results using real and synthetic data show that the respective implementation outperforms the one by Pinho et al.
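
    The definition above can be made concrete with a small brute-force sketch. It is not the linear-time suffix-array algorithm the abstract describes; it only enumerates factors directly, so it is suitable for short words. The function name `minimal_absent_words` and the optional `alphabet` parameter are assumptions introduced for illustration. It relies on the fact that a word a·u·b of length at least 2 is a minimal absent word exactly when a·u and u·b occur in y but a·u·b does not.

```python
def minimal_absent_words(y, alphabet=None):
    """Brute-force minimal absent words of y (illustrative, not linear time)."""
    if alphabet is None:
        # Assumption: by default the alphabet is the set of letters seen in y.
        alphabet = sorted(set(y))
    n = len(y)
    # Every factor (substring) of y, including the empty word.
    factors = {y[i:j] for i in range(n + 1) for j in range(i, n + 1)}

    maws = set()
    # Length-1 case: a letter of the alphabet that never occurs in y.
    for a in alphabet:
        if a not in factors:
            maws.add(a)
    # Longer case: left = a + u occurs, u + b occurs, but a + u + b does not.
    for left in factors:
        if not left:
            continue
        u = left[1:]
        for b in alphabet:
            if (u + b) in factors and (left + b) not in factors:
                maws.add(left + b)
    return sorted(maws)


if __name__ == "__main__":
    print(minimal_absent_words("abaab"))
```

    For the word abaab over the alphabet {a, b}, the sketch reports aaa, aaba, bab and bb; passing a larger alphabet adds each unused letter as a minimal absent word of length 1.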

    Efficient dynamic range minimum query

    The Range Minimum Query problem consists in efficiently answering a simple question: "what is the minimal element appearing between two specified indices of a given array?". In this paper we present a novel approach that offers a satisfying trade-off between time and space. Moreover, we show how the structure can be easily maintained whenever an insertion, modification or deletion modifies the array.
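
    To illustrate the problem statement, here is a minimal sketch of one simple dynamic structure: the array is cut into small blocks and each block caches its minimum. This is a generic block-decomposition technique chosen for brevity, not the approach proposed in the paper, whose details the abstract does not give. The class name `BlockRMQ`, its method names and the `block_size` parameter are assumptions made for the example.

```python
class BlockRMQ:
    """Range minimum over a list split into blocks with cached minima."""

    def __init__(self, values, block_size=4):
        self.block_size = block_size
        self.blocks = [list(values[i:i + block_size])
                       for i in range(0, len(values), block_size)]
        self.mins = [min(b) for b in self.blocks]

    def __len__(self):
        return sum(len(b) for b in self.blocks)

    def _locate(self, i):
        # Map a global index to (block index, offset inside the block).
        for b, block in enumerate(self.blocks):
            if i < len(block):
                return b, i
            i -= len(block)
        raise IndexError("index out of range")

    def query(self, lo, hi):
        """Minimum of the elements at positions lo..hi (inclusive)."""
        best = None
        start = 0
        for block, block_min in zip(self.blocks, self.mins):
            end = start + len(block) - 1
            if end >= lo and start <= hi:
                if lo <= start and end <= hi:
                    cand = block_min  # block fully covered: use cached minimum
                else:
                    cand = min(block[max(lo, start) - start:min(hi, end) - start + 1])
                best = cand if best is None else min(best, cand)
            start = end + 1
        if best is None:
            raise IndexError("empty range")
        return best

    def update(self, i, value):
        b, off = self._locate(i)
        self.blocks[b][off] = value
        self.mins[b] = min(self.blocks[b])

    def insert(self, i, value):
        if not self.blocks:
            self.blocks, self.mins = [[value]], [value]
            return
        if i == len(self):
            b, off = len(self.blocks) - 1, len(self.blocks[-1])
        else:
            b, off = self._locate(i)
        block = self.blocks[b]
        block.insert(off, value)
        if len(block) > 2 * self.block_size:
            # Split an oversized block so that block scans stay short.
            half = len(block) // 2
            self.blocks[b:b + 1] = [block[:half], block[half:]]
            self.mins[b:b + 1] = [min(block[:half]), min(block[half:])]
        else:
            self.mins[b] = min(block)

    def delete(self, i):
        b, off = self._locate(i)
        del self.blocks[b][off]
        if self.blocks[b]:
            self.mins[b] = min(self.blocks[b])
        else:
            del self.blocks[b]
            del self.mins[b]
```

    A query reads the cached minimum of every fully covered block and scans only the partially covered ones, while an insertion, modification or deletion recomputes the minimum of a single block. Larger blocks mean fewer cached values but longer scans, which is one simple form of the time and space trade-off the abstract mentions.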

    Parallelising the Computation of Minimal Absent Words

    An absent word of a word y of length n is a word that does not occur in y. It is a minimal absent word if all its proper factors occur in y. Minimal absent words have been computed in genomes of organisms from all domains of life; their computation also provides a fast alternative for measuring approximation in sequence comparison. There exists an O(n)-time and O(n)-space algorithm for computing all minimal absent words on a fixed-sized alphabet based on the construction of suffix arrays (Barton et al., 2014). An implementation of this algorithm was also provided by the authors and is currently the fastest available. In this article, we present a new O(n)-time and O(n)-space algorithm for computing all minimal absent words; it has the desirable property that, given the indexing data structure at hand, the computation of minimal absent words can be executed in parallel. Experimental results show that a multiprocessing implementation of this algorithm can accelerate the overall computation by more than a factor of two compared to state-of-the-art approaches. By excluding the indexing data structure construction time, we show that the implementation achieves near-optimal speed-ups.

    Absent words in a sliding window with applications

    An absent word of a word y is a word that does not occur in y. It is then called minimal if all its proper factors occur in y. In fact, minimal absent words (MAWs) provide useful information about y and thus have several applications. In this paper, we propose an algorithm that maintains the set of MAWs of a fixed-length window sliding over y online. Our algorithm represents MAWs through nodes of the suffix tree. Specifically, the suffix tree of the sliding window is maintained using a modified version of Senft's algorithm (Senft, 2005), itself generalizing Ukkonen's online algorithm (Ukkonen, 1995). We then apply this algorithm to the approximate pattern-matching problem under the Length Weighted Index distance (Chairungsee and Crochemore, 2012). This results in an online -time algorithm for finding approximate occurrences of a word x in y, , where σ is the alphabet size.

    Impact of neonatal digestion on the physiology of breast milk bacteria and their immunomodulation capacities

    Exclusive breastfeeding in the first months of life has protective effects on infant health compared with formula feeding, including a lower risk of gastrointestinal and respiratory infections and of metabolic and immune disorders. These observable differences in 'health effect' are likely due to differences in composition between breast milk and infant formula. In particular, breast milk contains a wide variety of bacterial species that are present at low doses. This microbial component contributes to the development of the newborn's gut microbiota after digestion of the milk matrix. It has also been suggested to influence intestinal homeostasis more broadly, acting on gut immunity and the intestinal barrier, and thus to contribute to the health-promoting effects of breast milk [1]. Several studies have investigated the impact of commensal bacteria on gut homeostasis. However, they generally do not include the different phases of digestion, such as the gastric (G) or intestinal (I) phases, which could have an impact on the physiological state and properties of the bacteria. The aim of this study was to evaluate the impact of newborn digestion on the physiology of breast milk bacteria and on their immunomodulatory potential. For this study, six strains representative of the prevalent bacteria in breast milk were selected. The strains were digested in a static in vitro model of digestion, at the full-term infant stage, in a milk matrix. Following the G and I phases, bacterial cultivability was measured and the immunomodulation properties were determined through the quantification of IL-10 and TNF-α released by macrophages (THP-1 line: human monocytic cell line differentiated into macrophages). The impact of the G and I digestion phases on both viability and immunomodulation properties was strain-dependent, highlighting the value of considering these steps when determining the properties of ingested bacteria. This study is part of the PROLIFIC project and was financially supported by the Régions Bretagne and Pays de Loire and the Milkvalley-BBA consortium.

    Distinct Regulation of EZH2 and its Repressive H3K27me3 Mark in Polyomavirus-Positive and -Negative Merkel Cell Carcinoma

    Merkel cell carcinoma (MCC) is an aggressive skin cancer for which Merkel cell polyomavirus integration and expression of the viral oncogenes small T and large T have been identified as major oncogenic determinants. Recently, a component of the PRC2 complex, the histone methyltransferase enhancer of zeste homolog 2 (EZH2), which induces H3K27 trimethylation as a repressive mark, has been proposed as a potential therapeutic target in MCC. Because divergent results have been reported for the levels of EZH2 and trimethylation of lysine 27 on histone 3, we analyzed these factors in a large MCC cohort to identify the molecular determinants of EZH2 activity in MCC and to establish MCC cell lines' sensitivity to EZH2 inhibitors. Immunohistochemical expression of EZH2 was observed in 92% of MCC tumors (156 of 170), with higher expression levels in virus-positive than virus-negative tumors (P = 0.026). For the latter, we showed overexpression of EZHIP, a negative regulator of the PRC2 complex. In vitro, ectopic expression of the large T antigen in fibroblasts led to the induction of EZH2 expression, whereas the knockdown of T antigens in MCC cell lines resulted in decreased EZH2 expression. EZH2 inhibition led to selective cytotoxicity on virus-positive MCC cell lines. This study highlights the distinct mechanisms of EZH2 induction between virus-negative and -positive MCC.