36 research outputs found

    Real-Time Streaming Multi-Pattern Search for Constant Alphabet

    In the streaming multi-pattern search problem, also known as the streaming dictionary matching problem, a set D={P_1, P_2, ..., P_d} of d patterns (strings over an alphabet Sigma), called the dictionary, is given for preprocessing. Then, a text T arrives one character at a time, and the goal is to report, before the next character arrives, the longest pattern in the dictionary that is a current suffix of T. We prove that for a constant-size alphabet, there exists a randomized Monte-Carlo algorithm for the streaming dictionary matching problem that takes constant time per character and uses O(d log m) words of space, where m is the length of the longest pattern in the dictionary. For the case where the alphabet size is not constant, we introduce two new randomized Monte-Carlo algorithms with the following complexities: * O(log log |Sigma|) time per character in the worst case and O(d log m) words of space. * O(1/epsilon) time per character in the worst case and O(d |Sigma|^epsilon log m / epsilon) words of space, for any 0 < epsilon <= 1. These results improve upon the algorithm of [Clifford et al., ESA'15], which uses O(d log m) words of space and takes O(log log (m+d)) time per character.
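
The reporting requirement above can be illustrated with a tiny brute-force sketch (a plain-Python illustration, not the paper's algorithm; the function name is ours, and a naive scan like this takes O(dm) time per character rather than the constant time achieved in the paper):

```python
def stream_dictionary_match(dictionary, text):
    """For each arriving character, report the longest pattern in the
    dictionary that is a suffix of the text seen so far (None if no
    pattern currently matches)."""
    results = []
    current = ""
    for ch in text:
        current += ch  # the text arrives one character at a time
        longest = None
        for p in dictionary:  # naive scan over all d patterns
            if current.endswith(p) and (longest is None or len(p) > len(longest)):
                longest = p
        results.append(longest)
    return results
```

For example, with dictionary {"ab", "b", "abab"} and text "abab", the reports after each character are None, "ab", None, "abab".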

    Streaming Pattern Matching with d Wildcards

    In the pattern matching with d wildcards problem, we are given a text T of length n and a pattern P of length m that contains d wildcard characters, each denoted by the special symbol '?'. A wildcard character matches any other character. The goal is to establish for each m-length substring of T whether it matches P. In the streaming model variant of the pattern matching with d wildcards problem, the text T arrives one character at a time and the goal is to report, before the next character arrives, whether the last m characters match P, while using only o(m) words of space. In this paper we introduce two new algorithms for the d wildcard pattern matching problem in the streaming model. The first is a randomized Monte-Carlo algorithm that is parameterized by a constant 0 <= delta <= 1. This algorithm uses ~O(d^{1-delta}) amortized time per character and ~O(d^{1+delta}) words of space. The second algorithm, which is used as a black box in the first algorithm, is a randomized Monte-Carlo algorithm which uses O(d + log m) worst-case time per character and O(d log m) words of space.
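
The matching condition can be spelled out with a naive offline check (a plain-Python sketch with a name of our choosing, taking O(nm) total time; the paper's contribution is the small-space streaming algorithms, not this scan):

```python
def wildcard_matches(text, pattern, wildcard="?"):
    """For each m-length substring of text, report whether it matches
    pattern, where the wildcard symbol matches any single character."""
    m = len(pattern)
    out = []
    for i in range(len(text) - m + 1):
        window = text[i:i + m]
        out.append(all(pc == wildcard or pc == tc
                       for pc, tc in zip(pattern, window)))
    return out
```

For instance, pattern "a?c" matches the window "abc" of "abcabd" but none of the later windows.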

    Towards Optimal Approximate Streaming Pattern Matching by Matching Multiple Patterns in Multiple Streams

    Recently, there has been a growing focus on solving approximate pattern matching problems in the streaming model. Of particular interest are the pattern matching with k-mismatches (KMM) problem and the pattern matching with w-wildcards (PMWC) problem. Motivated by reductions from these problems in the streaming model to the dictionary matching problem, this paper focuses on designing algorithms for the dictionary matching problem in the multi-stream model, where there are several independent streams of data (as opposed to just one in the streaming model), and the memory complexity of an algorithm is expressed using two quantities: (1) read-only shared memory available to all the streams, and (2) local stream memory that each stream maintains separately. In the dictionary matching problem in the multi-stream model, the goal is to preprocess a dictionary D={P_1, P_2, ..., P_d} of d=|D| patterns (strings with maximum length m over alphabet Sigma) into a data structure stored in shared memory, so that given multiple independent streaming texts (where characters arrive one at a time), the algorithm reports occurrences of patterns from D in each one of the texts as soon as they appear. We design two efficient algorithms for the dictionary matching problem in the multi-stream model. The first algorithm works when all the patterns in D have the same length m and costs O(d log m) words in shared memory, O(log m log d) words in stream memory, and O(log m) time per character. The second algorithm works for general D, but the time cost per character becomes O(log m + log d log log d). We also demonstrate the usefulness of our first algorithm in solving both the KMM problem and the PMWC problem in the streaming model. In particular, we obtain the first almost optimal (up to poly-log factors) algorithm for the PMWC problem in the streaming model.
    We also design a new algorithm for the KMM problem in the streaming model that, up to poly-log factors, has the same bounds as the most recent results that use different techniques. Moreover, for most inputs, our algorithm for KMM is significantly faster on average.

    Locally Consistent Parsing for Text Indexing in Small Space

    We consider two closely related problems of text indexing in sub-linear working space. The first problem is the Sparse Suffix Tree (SST) construction of a set of suffixes B using only O(|B|) words of space. The second problem is the Longest Common Extension (LCE) problem, where for some parameter 1 <= tau <= n, the goal is to construct a data structure that uses O(n/tau) words of space and can compute the longest common prefix length of any pair of suffixes. We show how to use ideas based on the Locally Consistent Parsing technique, introduced by Sahinalp and Vishkin [STOC '94], in some non-trivial ways in order to improve the known results for the above problems. We introduce new Las-Vegas and deterministic algorithms for both problems. We introduce the first Las-Vegas SST construction algorithm that takes O(n) time. This is an improvement over the result of Gawrychowski and Kociumaka [SODA '17], who obtained O(n) time for a Monte-Carlo algorithm and O(n sqrt(log |B|)) time for a Las-Vegas algorithm. In addition, we introduce a randomized Las-Vegas construction for an LCE data structure that can be constructed in linear time and answers queries in O(tau) time. For the deterministic algorithms, we introduce an SST construction algorithm that takes O(n log(n/|B|)) time (for |B| = Omega(log n)). This is the first almost-linear-time, O(n polylog n), deterministic SST construction algorithm, where all previous algorithms take at least Omega(min{n|B|, n^2/|B|}) time. For the LCE problem, we introduce a data structure that answers LCE queries in O(tau sqrt(log* n)) time, with O(n log tau) construction time (for tau = O(n/log n)). This data structure improves both query time and construction time upon the results of Tanimura et al. [CPM '16]. Comment: Extended abstract to appear in SODA 202
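
An LCE query itself is simple to state; the following is a minimal brute-force sketch of one query (a plain-Python illustration with a name of our choosing, taking O(n) time per query, versus the O(tau)-time queries of the data structure constructed in the paper):

```python
def lce(s, i, j):
    """Length of the longest common prefix of the suffixes s[i:] and s[j:]."""
    n = len(s)
    k = 0
    # Compare character by character until a mismatch or the end of s.
    while i + k < n and j + k < n and s[i + k] == s[j + k]:
        k += 1
    return k
```

For example, in "banana" the suffixes starting at positions 1 and 3 ("anana" and "ana") share the common prefix "ana", so the query returns 3.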

    The Streaming k-Mismatch Problem: Tradeoffs Between Space and Total Time

    We revisit the k-mismatch problem in the streaming model on a pattern of length m and a streaming text of length n, both over a size-sigma alphabet. The current state-of-the-art algorithm for the streaming k-mismatch problem, by Clifford et al. [SODA 2019], uses ~O(k) space and ~O(sqrt(k)) worst-case time per character. The space complexity is known to be (unconditionally) optimal, and the worst-case time per character matches a conditional lower bound. However, there is a gap between the total time cost of the algorithm, which is ~O(n sqrt(k)), and the fastest known offline algorithm, which costs ~O(n + min(nk/sqrt(m), sigma n)) time. Moreover, it is not known whether improvements over the ~O(n sqrt(k)) total time are possible when using more than O(k) space. We address these gaps by designing a randomized streaming algorithm for the k-mismatch problem that, given an integer parameter k <= s <= m, uses ~O(s) space and costs ~O(n + min(nk^2/m, nk/sqrt(s), sigma n m/s)) total time. For s = m, the total runtime becomes ~O(n + min(nk/sqrt(m), sigma n)), which matches the time cost of the fastest offline algorithm. Moreover, the worst-case time cost per character is still ~O(sqrt(k)). Comment: Extended abstract to appear in CPM 202
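
For reference, the problem being solved can be stated as a tiny brute-force check (a quadratic-time plain-Python sketch with a name of our choosing; the paper's contribution is the small-space streaming algorithm, not this scan):

```python
def k_mismatch_occurrences(text, pattern, k):
    """Positions i where the Hamming distance between text[i:i+m]
    and pattern is at most k."""
    m = len(pattern)
    hits = []
    for i in range(len(text) - m + 1):
        # Hamming distance of the current m-length window to the pattern.
        d = sum(a != b for a, b in zip(text[i:i + m], pattern))
        if d <= k:
            hits.append(i)
    return hits
```

For example, pattern "abd" occurs with at most 1 mismatch at positions 0 and 3 of "abcabc".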

    Reference ranges for left modified myocardial performance index: Systematic review and meta-analysis

    INTRODUCTION: The modified myocardial performance index (mod-MPI) is a noninvasive Doppler-derived metric used to evaluate fetal cardiac function. However, the reference ranges for mod-MPI in normal fetuses are not clearly defined, which limits the use of this technology in fetuses with potential cardiac compromise. Thus, we aimed to perform a systematic review and meta-analysis of published mod-MPI reference ranges across gestation. METHODS: The published literature was systematically searched, and all published articles in any language that provided values for the left ventricular mod-MPI obtained in low-risk, singleton fetuses were considered eligible for further review. All retrieved titles and abstracts were independently reviewed by two researchers. Mean and standard deviation by gestational week was extracted or calculated from published data. DerSimonian-Laird random-effects models were used to estimate pooled means and 95% confidence intervals (CIs). RESULTS: The search resulted in 618 unique citations, of which 583 did not meet inclusion criteria, leaving 35 abstracts selected for full-text review. Review of the references of these 35 articles identified another 5 studies of interest. Of the 40 articles reviewed, six met inclusion criteria. There was significant heterogeneity seen in the mod-MPI results reported. Mod-MPI increased as pregnancy progressed in all studies. The pooled mean mod-MPI at 11 weeks' gestation was 0.400 (95% CI 0.374-0.426) and increased to 0.585 (95% CI 0.533-0.637) at 41 weeks' gestation. The increase was linear in 5 of 6 studies, while in 1 study, the mod-MPI was stable until 27 weeks' gestation, and then increased throughout the third trimester. Despite all having trends increasing over pregnancy, there was no study in which all the weekly means fell within the pooled 95% CI. CONCLUSION: While mod-MPI does increase over gestation, the true reference ranges for fetuses remain elusive.
    Future efforts to further optimize the calculation of time intervals, possibly via automation, are desperately needed to allow for reproducibility of this potentially very useful tool for assessing fetal cardiac function.

    Improved Circular k-Mismatch Sketches

    The shift distance sh(S_1, S_2) between two strings S_1 and S_2 of the same length is defined as the minimum Hamming distance between S_1 and any rotation (cyclic shift) of S_2. We study the problem of sketching the shift distance, which is the following communication complexity problem: strings S_1 and S_2 of length n are given to two identical players (encoders), who independently compute sketches (summaries) sk(S_1) and sk(S_2), respectively, so that upon receiving the two sketches, a third player (decoder) is able to compute (or approximate) sh(S_1, S_2) with high probability. This paper primarily focuses on the more general k-mismatch version of the problem, where the decoder is allowed to declare a failure if sh(S_1, S_2) > k, where k is a parameter known to all parties. Andoni et al. [STOC '13] introduced exact circular k-mismatch sketches of size ~O(k + D(n)), where D(n) is the number of divisors of n. Andoni et al. also showed that their sketch size is optimal in the class of linear homomorphic sketches. We circumvent this lower bound by designing a (non-linear) exact circular k-mismatch sketch of size ~O(k); this size matches communication-complexity lower bounds. We also design a (1 +/- epsilon)-approximate circular k-mismatch sketch of size ~O(min(epsilon^{-2} sqrt(k), epsilon^{-1.5} sqrt(n))), which improves upon the ~O(epsilon^{-2} sqrt(n))-size sketch of Crouch and McGregor [APPROX '11].
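
A brute-force evaluation of the shift distance clarifies exactly what the sketches summarize (a plain-Python sketch with a name of our choosing, taking O(n^2) time with both strings in hand; the point of the paper's sketches is to avoid exchanging the strings themselves):

```python
def shift_distance(s1, s2):
    """Minimum Hamming distance between s1 and any rotation of s2."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    n = len(s1)
    best = n
    for r in range(n):
        rotation = s2[r:] + s2[:r]  # cyclic shift of s2 by r positions
        d = sum(a != b for a, b in zip(s1, rotation))
        best = min(best, d)
    return best
```

For example, sh("abc", "cab") = 0, since rotating "cab" by one position yields "abc" exactly.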

    Sleep Apnea and Fetal Growth Restriction (SAFER) study: Protocol for a pragmatic randomised clinical trial of positive airway pressure as an antenatal therapy for fetal growth restriction in maternal obstructive sleep apnoea

    INTRODUCTION: Fetal growth restriction (FGR) is a major contributor to fetal and neonatal morbidity and mortality with intrauterine, neonatal and lifelong complications. This study explores maternal obstructive sleep apnoea (OSA) as a potentially modifiable risk factor for FGR. We hypothesise that, in pregnancies complicated by FGR, treating mothers who have OSA using positive airway pressure (PAP) will improve birth weight and neonatal outcomes. METHODS AND ANALYSIS: The Sleep Apnea and Fetal Growth Restriction study is a prospective, block-randomised, single-blinded, multicentre, pragmatic controlled trial. We enrol pregnant women aged 18-50, between 22 and 31 weeks of gestation, with established FGR based on second trimester ultrasound, who do not have other prespecified known causes of FGR (such as congenital anomalies or intrauterine infection). In stage 1, participants are screened by questionnaire for OSA risk. If OSA risk is identified, participants proceed to stage 2, where they undergo home sleep apnoea testing. Participants are determined to have OSA if they have an apnoea-hypopnoea index (AHI) ≥5 (if the oxygen desaturation index (ODI) is also ≥5) or if they have an AHI ≥10 (even if the ODI is <5). These participants proceed to stage 3, where they are randomised to nightly treatment with PAP or no PAP (standard care control), which is maintained until delivery. The primary outcome is unadjusted birth weight; secondary outcomes include fetal growth velocity on ultrasound, enrolment-to-delivery interval, gestational age at delivery, birth weight corrected for gestational age, stillbirth, Apgar score, rate of admission to higher levels of care (neonatal intensive care unit or special care nursery) and length of neonatal stay. These outcomes are compared between PAP and control using intention-to-treat analysis.
    ETHICS AND DISSEMINATION: This study has been approved by the Institutional Review Boards at Washington University in St Louis, Missouri; Hadassah Hebrew University Medical Center, Jerusalem; and the University of Rochester, New York. Recruitment began at Washington University in November 2019 but was paused from March to November 2020 due to COVID-19. Recruitment began at Hadassah Hebrew University in March 2021 and at the University of Rochester in May 2021. Dissemination plans include presentations at scientific conferences and scientific publications. TRIAL REGISTRATION NUMBER: NCT04084990.

    Alternating Electric Fields (Tumor-Treating Fields Therapy) Can Improve Chemotherapy Treatment Efficacy in Non-Small Cell Lung Cancer Both In Vitro and In Vivo

    Non-small cell lung cancer (NSCLC) is one of the leading causes of cancer-related deaths worldwide. Common treatment modalities for NSCLC include surgery, radiotherapy, and chemotherapy; in recent years, the clinical management paradigm has evolved with the advent of targeted therapies. Despite such advances, the impact of systemic therapies for advanced disease remains modest, and as such, the prognosis for patients with NSCLC remains poor. Standard modalities are not without their respective toxicities, and there is a clear need to improve both the efficacy and safety of current management approaches. Tumor-treating fields (TTFields) are low-intensity, intermediate-frequency alternating electric fields that disrupt proper spindle microtubule arrangement, thereby leading to mitotic arrest and ultimately to cell death. We evaluated the effects of combining TTFields with standard chemotherapeutic agents on several NSCLC cell lines, both in vitro and in vivo. Frequency titration curves demonstrated that the inhibitory effects of TTFields were maximal at 150 kHz for all NSCLC cell lines tested, and that the addition of TTFields to chemotherapy resulted in enhanced treatment efficacy across all cell lines. We investigated the response of Lewis lung carcinoma and KLN205 squamous cell carcinoma in mice treated with TTFields in combination with pemetrexed, cisplatin, or paclitaxel and compared these to the efficacy observed in mice exposed only to the single agents. Combining TTFields with these therapeutic agents enhanced treatment efficacy in comparison with the respective single agents and control groups in all animal models. Together, these findings suggest that combining TTFields therapy with chemotherapy may provide an additive efficacy benefit in the management of NSCLC.