496 research outputs found

    On Quasiperiodic Morphisms

    Weakly and strongly quasiperiodic morphisms are tools introduced to study quasiperiodic words. Formally, they map, respectively, at least one or any non-quasiperiodic word to a quasiperiodic word. Considering them on both finite and infinite words, we get four families of morphisms, and we study the relations between them. We provide algorithms to decide whether a morphism is strongly quasiperiodic on finite words or on infinite words. Comment: 12 pages

    Dictionary Matching with One Gap

    The dictionary matching with gaps problem is to preprocess a dictionary $D$ of $d$ gapped patterns $P_1,\ldots,P_d$ over alphabet $\Sigma$, where each gapped pattern $P_i$ is a sequence of subpatterns separated by bounded sequences of don't cares. Then, given a query text $T$ of length $n$ over alphabet $\Sigma$, the goal is to output all locations in $T$ at which a pattern $P_i\in D$, $1\leq i\leq d$, ends. There is renewed interest in the gapped matching problem stemming from cyber security. In this paper we solve the problem where all patterns in the dictionary have one gap with at least $\alpha$ and at most $\beta$ don't cares, where $\alpha$ and $\beta$ are given parameters. Specifically, we show that the dictionary matching with a single gap problem can be solved with either $O(d\log d + |D|)$ preprocessing time, $O(d\log^{\varepsilon} d + |D|)$ space, and query time $O(n(\beta -\alpha )\log\log d \log ^2 \min \{ d, \log |D| \} + occ)$, or preprocessing time and space $O(d^2 + |D|)$ and query time $O(n(\beta -\alpha ) + occ)$, where $occ$ is the number of patterns found. As far as we know, this is the best solution for this setting of the problem, where many overlaps may exist in the dictionary. Comment: A preliminary version was published at CPM 201
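To make the problem statement concrete, the following is a minimal brute-force sketch of one-gap dictionary matching. It is not the algorithm from the paper and makes no attempt to reach the stated bounds. The function name `match_one_gap`, the representation of a pattern as a `(left, right)` pair of nonempty strings, and the 0-based end positions are all assumptions made for illustration.

```python
def match_one_gap(text, dictionary, alpha, beta):
    """Naively report (end_position, pattern_index) for every one-gap match.

    Each dictionary entry is a hypothetical (left, right) pair of nonempty
    strings; a match is left, then between alpha and beta arbitrary
    characters (don't cares), then right. End positions are 0-based.
    """
    hits = set()
    for index, (left, right) in enumerate(dictionary):
        pos = text.find(left)
        while pos != -1:
            left_end = pos + len(left)
            # Try every allowed gap length after this occurrence of left.
            for gap in range(alpha, beta + 1):
                right_start = left_end + gap
                if text.startswith(right, right_start):
                    hits.add((right_start + len(right) - 1, index))
            pos = text.find(left, pos + 1)  # allow overlapping occurrences
    return sorted(hits)


if __name__ == "__main__":
    print(match_one_gap("abxxcd", [("ab", "cd")], alpha=1, beta=3))
```

The sketch scans the text once per pattern, so its cost grows with the dictionary size, which is exactly what the preprocessing in the abstract is meant to avoid.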

    Searching of gapped repeats and subrepetitions in a word

    A gapped repeat is a factor of the form $uvu$ where $u$ and $v$ are nonempty words. The period of the gapped repeat is defined as $|u|+|v|$. The gapped repeat is maximal if it cannot be extended to the left or to the right by at least one letter while preserving its period. The gapped repeat is called $\alpha$-gapped if its period is not greater than $\alpha |v|$. A $\delta$-subrepetition is a factor whose exponent is less than 2 but not less than $1+\delta$ (the exponent of the factor is the quotient of the length and the minimal period of the factor). The $\delta$-subrepetition is maximal if it cannot be extended to the left or to the right by at least one letter while preserving its minimal period. We reveal a close relation between maximal gapped repeats and maximal subrepetitions. Moreover, we show that in a word of length $n$ the number of maximal $\alpha$-gapped repeats is bounded by $O(\alpha^2n)$ and the number of maximal $\delta$-subrepetitions is bounded by $O(n/\delta^2)$. Using the obtained upper bounds, we propose algorithms for finding all maximal $\alpha$-gapped repeats and all maximal $\delta$-subrepetitions in a word of length $n$. The algorithm for finding all maximal $\alpha$-gapped repeats has $O(\alpha^2n)$ time complexity for the case of constant alphabet size and $O(n\log n + \alpha^2n)$ time complexity for the general case. For finding all maximal $\delta$-subrepetitions we propose two algorithms. The first algorithm has $O(\frac{n\log\log n}{\delta^2})$ time complexity for the case of constant alphabet size and $O(n\log n +\frac{n\log\log n}{\delta^2})$ time complexity for the general case. The second algorithm has $O(n\log n+\frac{n}{\delta^2}\log \frac{1}{\delta})$ expected time complexity.
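The definition of the exponent and of a $\delta$-subrepetition can be checked on small words with a short sketch. This is only an illustration of the definitions, not one of the algorithms proposed in the paper; the helper names `minimal_period`, `exponent` and `is_subrepetition` are hypothetical, and the minimal period is computed with the standard prefix (border) function.

```python
from fractions import Fraction


def minimal_period(word):
    """Smallest p >= 1 such that word[i] == word[i + p] wherever both exist."""
    n = len(word)
    if n == 0:
        raise ValueError("word must be nonempty")
    border = [0] * n  # border[i] = length of longest proper border of word[:i+1]
    k = 0
    for i in range(1, n):
        while k > 0 and word[i] != word[k]:
            k = border[k - 1]
        if word[i] == word[k]:
            k += 1
        border[i] = k
    return n - border[-1]


def exponent(word):
    """Quotient of the length and the minimal period, as an exact fraction."""
    return Fraction(len(word), minimal_period(word))


def is_subrepetition(word, delta):
    """True if the exponent e of word satisfies 1 + delta <= e < 2."""
    return 1 + delta <= exponent(word) < 2


if __name__ == "__main__":
    print(exponent("abcab"), is_subrepetition("abcab", Fraction(1, 2)))
```

Using `Fraction` keeps the comparison with $1+\delta$ exact; maximality (non-extendability to the left or right) is not checked here.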

    DON content in oat grains in Norway related to weather conditions at different growth stages

    High concentrations of the mycotoxin deoxynivalenol (DON), produced by Fusarium graminearum, have occurred frequently in Norwegian oats recently. Early prediction of DON levels is important for farmers, authorities and the cereal industry. In this study, the main weather factors influencing mycotoxin accumulation were identified and two models to predict the risk of DON in oat grains in Norway were developed: (1) as a warning system for farmers to decide if and when to treat with fungicide, and (2) for authorities and industry to use at harvest to identify potential food safety problems. Oat grain samples from farmers’ fields were collected together with weather data (2004–2013). A mathematical model was developed and used to estimate phenology windows of growth stages in oats (tillering, flowering etc.). Weather summarisations were then calculated within these windows, and the Spearman rank correlation coefficient was calculated between DON contamination in oats at harvest and the weather summarisations for each phenological window. DON contamination was most clearly associated with the weather conditions around flowering and close to harvest. Warm, rainy and humid weather during and around flowering increased the risk of DON accumulation in oats, as did dry periods during germination/seedling growth and tillering. Prior to harvest, warm and humid weather conditions followed by cool and dry conditions were associated with a decreased risk of DON accumulation. A prediction model including only pre-flowering weather conditions adequately forecasted the risk of DON contamination in oats, and can aid in decisions about fungicide treatments.

    Fusarium langsethiae and mycotoxin contamination in oat grain differed with growth stage at inoculation

    High levels of mycotoxins are occasionally observed in Norwegian oat grain lots. Mycotoxins of primary concern in Norwegian oats are deoxynivalenol (DON) produced by Fusarium graminearum and HT2- and T2-toxins (HT2 + T2) produced by Fusarium langsethiae. Improved understanding of the epidemiology of Fusarium spp. is important for the development of measures to control mycotoxins. We studied the susceptibility to F. langsethiae after inoculation at early (booting, heading, flowering) or late (flowering, milk, dough) growth stages in three oat varieties in greenhouse experiments. The varieties had previously shown different levels of resistance to F. graminearum: Odal, Vinger (both moderately resistant), and Belinda (susceptible). The levels of F. langsethiae DNA and HT2 + T2 were measured in harvested grain. In addition, we observed differences in aggressiveness (measured as the level of F. langsethiae DNA in grain) between F. langsethiae isolates after inoculation of oats at flowering. Substantial levels of F. langsethiae DNA (mean ≥ 138 pg per μg plant DNA) and HT2 + T2 (≥ 348 μg/kg) were detected in grain harvested from oats that were spray-inoculated at heading or later stages, but not at booting (mean ≤ 10 pg/μg and ≤ 25 μg/kg, respectively), suggesting that oats are susceptible to F. langsethiae from heading onwards. Vinger was the most resistant variety to F. langsethiae/HT2 + T2, whereas Odal and Belinda were relatively susceptible. We observed that late inoculations yielded high levels of other trichothecene A metabolites (mean sum of metabolites of 35–1048 μg/kg) in addition to HT2 + T2 in harvested grain, an indication that infections close to harvest may pose a further risk to food and feed safety.

    Reduced risk of oat grain contamination with Fusarium langsethiae and HT-2 and T-2 toxins with increasing tillage intensity

    Frequent occurrences of high levels of Fusarium mycotoxins have been recorded in Norwegian oat grain. To elucidate the influence of tillage operations on the development of Fusarium and mycotoxins in oat grain, we conducted tillage trials with continuous oats at two locations in southeast Norway. We have previously presented the content of Fusarium DNA detected in straw residues and air samples from these fields. Grain harvested from ploughed plots had lower levels of Fusarium langsethiae DNA and HT-2 and T-2 toxins (HT2 + T2) compared to grain from harrowed plots. Our results indicate that the risk of F. langsethiae and HT2 + T2 contamination of oats is reduced with increasing tillage intensity. No distinct influence of tillage on the DNA concentration of Fusarium graminearum and Fusarium avenaceum in the harvested grain was observed. In contrast to F. graminearum and F. avenaceum, only limited contents of F. langsethiae DNA were observed in straw residues and air samples. Still, considerable concentrations of F. langsethiae DNA and HT2 + T2 were recorded in oat grain harvested from these fields. We speculate that the life cycle of F. langsethiae differs from those of F. graminearum and F. avenaceum with regard to survival, inoculum production and dispersal.

    Computing Covers under Substring Consistent Equivalence Relations

    Covers are a kind of quasiperiodicity in strings. A string $C$ is a cover of another string $T$ if any position of $T$ is inside some occurrence of $C$ in $T$. The shortest and longest cover arrays of $T$ have the lengths of the shortest and longest covers of each prefix of $T$, respectively. The literature has proposed linear-time algorithms computing longest and shortest cover arrays taking border arrays as input. An equivalence relation $\approx$ over strings is called a substring consistent equivalence relation (SCER) iff $X \approx Y$ implies (1) $|X| = |Y|$ and (2) $X[i:j] \approx Y[i:j]$ for all $1 \le i \le j \le |X|$. In this paper, we generalize the notion of covers for SCERs and prove that existing algorithms to compute the shortest cover array and the longest cover array of a string $T$ under the identity relation will work for any SCERs taking the accordingly generalized border arrays. Comment: 16 pages
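The definitions of a cover and of the shortest cover array can be illustrated with a brute-force sketch under the identity relation (plain string equality). This is not the linear-time, border-array-based method the abstract refers to, and it does not handle general SCERs; the function names `is_cover` and `shortest_cover_array` are hypothetical.

```python
def is_cover(c, t):
    """True if every position of t lies inside some occurrence of c in t."""
    if not c or len(c) > len(t):
        return False
    covered_up_to = 0  # positions [0, covered_up_to) are covered so far
    start = t.find(c)
    while start != -1:
        if start > covered_up_to:
            return False  # position covered_up_to is in no occurrence of c
        covered_up_to = start + len(c)
        start = t.find(c, start + 1)  # overlapping occurrences are allowed
    return covered_up_to == len(t)


def shortest_cover_array(t):
    """For each prefix of t, the length of its shortest cover (brute force)."""
    result = []
    for end in range(1, len(t) + 1):
        prefix = t[:end]
        for k in range(1, end + 1):
            # A cover of a string is always one of its prefixes.
            if is_cover(prefix[:k], prefix):
                result.append(k)
                break
    return result


if __name__ == "__main__":
    print(is_cover("aba", "ababa"), shortest_cover_array("ababa"))
```

Every string covers itself, so the inner loop always terminates with some `k`; the sketch takes far more than linear time and is meant only for checking small examples by hand.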

    Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability

    Background: In recent years, quartet-based phylogeny reconstruction methods have received considerable attention in the computational biology community. Traditionally, the accuracy of a phylogeny reconstruction method is measured by simulations on synthetic datasets with known "true" phylogenies, while little theoretical analysis has been done. In this paper, we present a new model-based approach to measuring the accuracy of a quartet-based phylogeny reconstruction method. Under this model, we propose three efficient algorithms to reconstruct the "true" phylogeny with a high success probability. Results: The first algorithm can reconstruct the "true" phylogeny from the input quartet topology set without quartet errors in $O(n^2)$ time by querying at most $(n - 4)\log(n - 1)$ quartet topologies, where $n$ is the number of taxa. When the input quartet topology set contains errors, the second algorithm can reconstruct the "true" phylogeny with a probability of approximately $1 - p$ in $O(n^4 \log n)$ time, where $p$ is the probability of a quartet topology being an error. This probability is improved by the third algorithm to approximately $\frac{1}{1+q^2+\frac{1}{2}q^4+\frac{1}{16}q^5}$, where $q=\frac{p}{1-p}$, with a running time of $O(n^5)$, which is at least 0.984 when $p < 0.05$. Conclusion: The three proposed algorithms are mathematically guaranteed to reconstruct the "true" phylogeny with a high success probability. The experimental results showed that the third algorithm produced phylogenies with a higher probability than its aforementioned theoretical lower bound and outperformed some existing phylogeny reconstruction methods in both speed and accuracy.
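The closed-form expression quoted for the third algorithm can be evaluated numerically with a short sketch. It simply computes the formula as written in the abstract, with $q = p/(1-p)$; the function name `success_probability_bound` is hypothetical and nothing here reproduces the reconstruction algorithms themselves.

```python
def success_probability_bound(p):
    """Evaluate 1 / (1 + q^2 + q^4 / 2 + q^5 / 16) with q = p / (1 - p).

    p is the probability that a single quartet topology is erroneous,
    as in the abstract; the expression is taken from the abstract as-is.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    q = p / (1.0 - p)
    return 1.0 / (1.0 + q**2 + 0.5 * q**4 + q**5 / 16.0)


if __name__ == "__main__":
    for p in (0.01, 0.02, 0.05):
        print(p, success_probability_bound(p))
```

Since $q$ increases with $p$ and every term in the denominator increases with $q$, the value decreases as $p$ grows, so evaluating it at $p = 0.05$ gives the smallest value on the range $p < 0.05$.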