36 research outputs found

    Real-Time Streaming Multi-Pattern Search for Constant Alphabet

    In the streaming multi-pattern search problem, also known as the streaming dictionary matching problem, a set D={P_1, P_2, ..., P_d} of d patterns (strings over an alphabet Sigma), called the dictionary, is given for preprocessing. Then, a text T arrives one character at a time, and the goal is to report, before the next character arrives, the longest pattern in the dictionary that is a current suffix of T. We prove that for a constant-size alphabet, there exists a randomized Monte-Carlo algorithm for the streaming dictionary matching problem that takes constant time per character and uses O(d log m) words of space, where m is the length of the longest pattern in the dictionary. For the case where the alphabet size is not constant, we introduce two new randomized Monte-Carlo algorithms with the following complexities: * O(log log |Sigma|) time per character in the worst case and O(d log m) words of space. * O(1/epsilon) time per character in the worst case and O(d |Sigma|^epsilon log m / epsilon) words of space, for any 0 < epsilon <= 1. These results improve upon the algorithm of [Clifford et al., ESA'15], which uses O(d log m) words of space and takes O(log log (m+d)) time per character.
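
The reporting requirement above can be illustrated with a tiny brute-force sketch (a plain-Python illustration, not the paper's algorithm; the function name is ours, and a naive scan like this takes O(dm) time per character rather than the constant time achieved in the paper):

```python
def stream_dictionary_match(dictionary, text):
    """For each arriving character, report the longest pattern in the
    dictionary that is a suffix of the text seen so far (None if no
    pattern currently matches)."""
    results = []
    current = ""
    for ch in text:
        current += ch  # the text arrives one character at a time
        longest = None
        for p in dictionary:  # naive scan over all d patterns
            if current.endswith(p) and (longest is None or len(p) > len(longest)):
                longest = p
        results.append(longest)
    return results
```

For example, with dictionary {"ab", "b", "abab"} and text "abab", the reports after each character are None, "ab", None, "abab".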

    Streaming Pattern Matching with d Wildcards

    In the pattern matching with d wildcards problem, we are given a text T of length n and a pattern P of length m that contains d wildcard characters, each denoted by the special symbol '?'. A wildcard character matches any other character. The goal is to establish for each m-length substring of T whether it matches P. In the streaming model variant of the pattern matching with d wildcards problem, the text T arrives one character at a time and the goal is to report, before the next character arrives, whether the last m characters match P, while using only o(m) words of space. In this paper we introduce two new algorithms for the d wildcard pattern matching problem in the streaming model. The first is a randomized Monte-Carlo algorithm that is parameterized by a constant 0 <= delta <= 1. This algorithm uses ~O(d^{1-delta}) amortized time per character and ~O(d^{1+delta}) words of space. The second algorithm, which is used as a black box in the first algorithm, is a randomized Monte-Carlo algorithm which uses O(d + log m) worst-case time per character and O(d log m) words of space.
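
The matching condition can be spelled out with a naive offline check (a plain-Python sketch with a name of our choosing, taking O(nm) total time; the paper's contribution is the small-space streaming algorithms, not this scan):

```python
def wildcard_matches(text, pattern, wildcard="?"):
    """For each m-length substring of text, report whether it matches
    pattern, where the wildcard symbol matches any single character."""
    m = len(pattern)
    out = []
    for i in range(len(text) - m + 1):
        window = text[i:i + m]
        out.append(all(pc == wildcard or pc == tc
                       for pc, tc in zip(pattern, window)))
    return out
```

For instance, pattern "a?c" matches the window "abc" of "abcabd" but none of the later windows.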

    Towards Optimal Approximate Streaming Pattern Matching by Matching Multiple Patterns in Multiple Streams

    Recently, there has been a growing focus on solving approximate pattern matching problems in the streaming model. Of particular interest are the pattern matching with k-mismatches (KMM) problem and the pattern matching with w-wildcards (PMWC) problem. Motivated by reductions from these problems in the streaming model to the dictionary matching problem, this paper focuses on designing algorithms for the dictionary matching problem in the multi-stream model, where there are several independent streams of data (as opposed to just one in the streaming model), and the memory complexity of an algorithm is expressed using two quantities: (1) read-only shared memory available to all the streams, and (2) local stream memory that each stream maintains separately. In the dictionary matching problem in the multi-stream model, the goal is to preprocess a dictionary D={P_1, P_2, ..., P_d} of d=|D| patterns (strings with maximum length m over alphabet Sigma) into a data structure stored in shared memory, so that given multiple independent streaming texts (where characters arrive one at a time), the algorithm reports occurrences of patterns from D in each one of the texts as soon as they appear. We design two efficient algorithms for the dictionary matching problem in the multi-stream model. The first algorithm works when all the patterns in D have the same length m and costs O(d log m) words in shared memory, O(log m log d) words in stream memory, and O(log m) time per character. The second algorithm works for general D, but the time cost per character becomes O(log m + log d log log d). We also demonstrate the usefulness of our first algorithm in solving both the KMM problem and the PMWC problem in the streaming model. In particular, we obtain the first almost optimal (up to poly-log factors) algorithm for the PMWC problem in the streaming model.
    We also design a new algorithm for the KMM problem in the streaming model that, up to poly-log factors, has the same bounds as the most recent results that use different techniques. Moreover, for most inputs, our algorithm for KMM is significantly faster on average.

    Locally Consistent Parsing for Text Indexing in Small Space

    We consider two closely related problems of text indexing in sub-linear working space. The first problem is the Sparse Suffix Tree (SST) construction of a set of suffixes B using only O(|B|) words of space. The second problem is the Longest Common Extension (LCE) problem, where for some parameter 1 <= tau <= n, the goal is to construct a data structure that uses O(n/tau) words of space and can compute the longest common prefix length of any pair of suffixes. We show how to use ideas based on the Locally Consistent Parsing technique, introduced by Sahinalp and Vishkin [STOC '94], in some non-trivial ways in order to improve the known results for the above problems. We introduce new Las-Vegas and deterministic algorithms for both problems. We introduce the first Las-Vegas SST construction algorithm that takes O(n) time. This is an improvement over the result of Gawrychowski and Kociumaka [SODA '17], who obtained O(n) time for a Monte-Carlo algorithm and O(n sqrt(log |B|)) time for a Las-Vegas algorithm. In addition, we introduce a randomized Las-Vegas construction for an LCE data structure that can be constructed in linear time and answers queries in O(tau) time. For the deterministic algorithms, we introduce an SST construction algorithm that takes O(n log(n/|B|)) time (for |B| = Omega(log n)). This is the first almost-linear-time, O(n polylog n), deterministic SST construction algorithm, where all previous algorithms take at least Omega(min{n|B|, n^2/|B|}) time. For the LCE problem, we introduce a data structure that answers LCE queries in O(tau sqrt(log* n)) time, with O(n log tau) construction time (for tau = O(n/log n)). This data structure improves both query time and construction time upon the results of Tanimura et al. [CPM '16]. Comment: Extended abstract to appear in SODA 202
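
An LCE query itself is simple to state; the following is a minimal brute-force sketch of one query (a plain-Python illustration with a name of our choosing, taking O(n) time per query, versus the O(tau)-time queries of the data structure constructed in the paper):

```python
def lce(s, i, j):
    """Length of the longest common prefix of the suffixes s[i:] and s[j:]."""
    n = len(s)
    k = 0
    # Compare character by character until a mismatch or the end of s.
    while i + k < n and j + k < n and s[i + k] == s[j + k]:
        k += 1
    return k
```

For example, in "banana" the suffixes starting at positions 1 and 3 ("anana" and "ana") share the common prefix "ana", so the query returns 3.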

    The Streaming k-Mismatch Problem: Tradeoffs Between Space and Total Time

    We revisit the k-mismatch problem in the streaming model on a pattern of length m and a streaming text of length n, both over a size-sigma alphabet. The current state-of-the-art algorithm for the streaming k-mismatch problem, by Clifford et al. [SODA 2019], uses ~O(k) space and ~O(sqrt(k)) worst-case time per character. The space complexity is known to be (unconditionally) optimal, and the worst-case time per character matches a conditional lower bound. However, there is a gap between the total time cost of the algorithm, which is ~O(n sqrt(k)), and the fastest known offline algorithm, which costs ~O(n + min(nk/sqrt(m), sigma n)) time. Moreover, it is not known whether improvements over the ~O(n sqrt(k)) total time are possible when using more than O(k) space. We address these gaps by designing a randomized streaming algorithm for the k-mismatch problem that, given an integer parameter k <= s <= m, uses ~O(s) space and costs ~O(n + min(nk^2/m, nk/sqrt(s), sigma n m/s)) total time. For s = m, the total runtime becomes ~O(n + min(nk/sqrt(m), sigma n)), which matches the time cost of the fastest offline algorithm. Moreover, the worst-case time cost per character is still ~O(sqrt(k)). Comment: Extended abstract to appear in CPM 202
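
For reference, the problem being solved can be stated as a tiny brute-force check (a quadratic-time plain-Python sketch with a name of our choosing; the paper's contribution is the small-space streaming algorithm, not this scan):

```python
def k_mismatch_occurrences(text, pattern, k):
    """Positions i where the Hamming distance between text[i:i+m]
    and pattern is at most k."""
    m = len(pattern)
    hits = []
    for i in range(len(text) - m + 1):
        # Hamming distance of the current m-length window to the pattern.
        d = sum(a != b for a, b in zip(text[i:i + m], pattern))
        if d <= k:
            hits.append(i)
    return hits
```

For example, pattern "abd" occurs with at most 1 mismatch at positions 0 and 3 of "abcabc".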

    Reference ranges for left modified myocardial performance index: Systematic review and meta-analysis

    INTRODUCTION: The modified myocardial performance index (mod-MPI) is a noninvasive Doppler-derived metric used to evaluate fetal cardiac function. However, the reference ranges for mod-MPI in normal fetuses are not clearly defined, which limits the use of this technology in fetuses with potential cardiac compromise. Thus, we aimed to perform a systematic review and meta-analysis of published mod-MPI reference ranges across gestation. METHODS: The published literature was systematically searched, and all published articles in any language that provided values for the left ventricular mod-MPI obtained in low-risk, singleton fetuses were considered eligible for further review. All retrieved titles and abstracts were independently reviewed by two researchers. Mean and standard deviation by gestational week was extracted or calculated from published data. DerSimonian-Laird random-effects models were used to estimate pooled means and 95% confidence intervals (CIs). RESULTS: The search resulted in 618 unique citations, of which 583 did not meet inclusion criteria, leaving 35 abstracts selected for full-text review. Review of the references of these 35 articles identified another 5 studies of interest. Of the 40 articles reviewed, six met inclusion criteria. There was significant heterogeneity seen in the mod-MPI results reported. Mod-MPI increased as pregnancy progressed in all studies. The pooled mean mod-MPI at 11 weeks' gestation was 0.400 (95% CI 0.374-0.426) and increased to 0.585 (95% CI 0.533-0.637) at 41 weeks' gestation. The increase was linear in 5 of 6 studies, while in 1 study, the mod-MPI was stable until 27 weeks' gestation, and then increased throughout the third trimester. Despite all having trends increasing over pregnancy, there was no study in which all the weekly means fell within the pooled 95% CI. CONCLUSION: While mod-MPI does increase over gestation, the true reference ranges for fetuses remain elusive.
    Future efforts to further optimize the calculation of time intervals, possibly via automation, are desperately needed to allow for reproducibility of this potentially very useful tool for assessing fetal cardiac function.

    Improved Circular k-Mismatch Sketches

    The shift distance sh(S_1, S_2) between two strings S_1 and S_2 of the same length is defined as the minimum Hamming distance between S_1 and any rotation (cyclic shift) of S_2. We study the problem of sketching the shift distance, which is the following communication complexity problem: strings S_1 and S_2 of length n are given to two identical players (encoders), who independently compute sketches (summaries) sk(S_1) and sk(S_2), respectively, so that upon receiving the two sketches, a third player (decoder) is able to compute (or approximate) sh(S_1, S_2) with high probability. This paper primarily focuses on the more general k-mismatch version of the problem, where the decoder is allowed to declare a failure if sh(S_1, S_2) > k, where k is a parameter known to all parties. Andoni et al. [STOC '13] introduced exact circular k-mismatch sketches of size ~O(k + D(n)), where D(n) is the number of divisors of n. Andoni et al. also showed that their sketch size is optimal in the class of linear homomorphic sketches. We circumvent this lower bound by designing a (non-linear) exact circular k-mismatch sketch of size ~O(k); this size matches communication-complexity lower bounds. We also design a (1 +/- epsilon)-approximate circular k-mismatch sketch of size ~O(min(epsilon^{-2} sqrt(k), epsilon^{-1.5} sqrt(n))), which improves upon the ~O(epsilon^{-2} sqrt(n))-size sketch of Crouch and McGregor [APPROX '11].
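
A brute-force evaluation of the shift distance clarifies exactly what the sketches summarize (a plain-Python sketch with a name of our choosing, taking O(n^2) time with both strings in hand; the point of the paper's sketches is to avoid exchanging the strings themselves):

```python
def shift_distance(s1, s2):
    """Minimum Hamming distance between s1 and any rotation of s2."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    n = len(s1)
    best = n
    for r in range(n):
        rotation = s2[r:] + s2[:r]  # cyclic shift of s2 by r positions
        d = sum(a != b for a, b in zip(s1, rotation))
        best = min(best, d)
    return best
```

For example, sh("abc", "cab") = 0, since rotating "cab" by one position yields "abc" exactly.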

    Sleep Apnea and Fetal Growth Restriction (SAFER) study: Protocol for a pragmatic randomised clinical trial of positive airway pressure as an antenatal therapy for fetal growth restriction in maternal obstructive sleep apnoea

    INTRODUCTION: Fetal growth restriction (FGR) is a major contributor to fetal and neonatal morbidity and mortality with intrauterine, neonatal and lifelong complications. This study explores maternal obstructive sleep apnoea (OSA) as a potentially modifiable risk factor for FGR. We hypothesise that, in pregnancies complicated by FGR, treating mothers who have OSA using positive airway pressure (PAP) will improve birth weight and neonatal outcomes. METHODS AND ANALYSIS: The Sleep Apnea and Fetal Growth Restriction study is a prospective, block-randomised, single-blinded, multicentre, pragmatic controlled trial. We enrol pregnant women aged 18-50, between 22 and 31 weeks of gestation, with established FGR based on second trimester ultrasound, who do not have other prespecified known causes of FGR (such as congenital anomalies or intrauterine infection). In stage 1, participants are screened by questionnaire for OSA risk. If OSA risk is identified, participants proceed to stage 2, where they undergo home sleep apnoea testing. Participants are determined to have OSA if they have an apnoea-hypopnoea index (AHI) ≥5 (if the oxygen desaturation index (ODI) is also ≥5) or if they have an AHI ≥10 (even if the ODI is <5). These participants proceed to stage 3, where they are randomised to nightly treatment with PAP or no PAP (standard care control), which is maintained until delivery. The primary outcome is unadjusted birth weight; secondary outcomes include fetal growth velocity on ultrasound, enrolment-to-delivery interval, gestational age at delivery, birth weight corrected for gestational age, stillbirth, Apgar score, rate of admission to higher levels of care (neonatal intensive care unit or special care nursery) and length of neonatal stay. These outcomes are compared between PAP and control using intention-to-treat analysis.
    ETHICS AND DISSEMINATION: This study has been approved by the Institutional Review Boards at Washington University in St Louis, Missouri; Hadassah Hebrew University Medical Center, Jerusalem; and the University of Rochester, New York. Recruitment began at Washington University in November 2019 but was paused from March to November 2020 due to COVID-19. Recruitment began at Hadassah Hebrew University in March 2021 and at the University of Rochester in May 2021. Dissemination plans include presentations at scientific conferences and scientific publications. TRIAL REGISTRATION NUMBER: NCT04084990.

    Alternating Electric Fields (Tumor-Treating Fields Therapy) Can Improve Chemotherapy Treatment Efficacy in Non-Small Cell Lung Cancer Both In Vitro and In Vivo

    Non-small cell lung cancer (NSCLC) is one of the leading causes of cancer-related deaths worldwide. Common treatment modalities for NSCLC include surgery, radiotherapy, and chemotherapy; in recent years, the clinical management paradigm has evolved with the advent of targeted therapies. Despite such advances, the impact of systemic therapies for advanced disease remains modest, and as such, the prognosis for patients with NSCLC remains poor. Standard modalities are not without their respective toxicities, and there is a clear need to improve both the efficacy and safety of current management approaches. Tumor-treating fields (TTFields) are low-intensity, intermediate-frequency alternating electric fields that disrupt proper spindle microtubule arrangement, thereby leading to mitotic arrest and ultimately to cell death. We evaluated the effects of combining TTFields with standard chemotherapeutic agents on several NSCLC cell lines, both in vitro and in vivo. Frequency titration curves demonstrated that the inhibitory effects of TTFields were maximal at 150 kHz for all NSCLC cell lines tested, and that the addition of TTFields to chemotherapy resulted in enhanced treatment efficacy across all cell lines. We investigated the response of Lewis lung carcinoma and KLN205 squamous cell carcinoma in mice treated with TTFields in combination with pemetrexed, cisplatin, or paclitaxel and compared these to the efficacy observed in mice exposed only to the single agents. Combining TTFields with these therapeutic agents enhanced treatment efficacy in comparison with the respective single agents and control groups in all animal models. Together, these findings suggest that combining TTFields therapy with chemotherapy may provide an additive efficacy benefit in the management of NSCLC.