Online Pattern Matching for String Edit Distance with Moves
Edit distance with moves (EDM) is a string-to-string distance measure that
includes substring moves in addition to ordinary editing operations to turn one
string into the other. Although computing the optimal EDM is intractable, it has many
applications, especially in error detection. Edit sensitive parsing (ESP) is an
efficient parsing algorithm that guarantees an upper bound of parsing
discrepancies between different appearances of the same substrings in a string.
ESP can be used for computing an approximate EDM as the L1 distance between
characteristic vectors built from node labels in parsing trees. However, ESP is
not applicable to streaming text data, where the whole text is unknown in
advance. We present an online ESP (OESP) that enables online pattern
matching for EDM. OESP builds a parse tree for a streaming text and computes
the L1 distance between characteristic vectors in an online manner. For the
space-efficient computation of EDM, OESP directly encodes the parse tree into a
succinct representation by leveraging the idea behind recent results on
dynamic succinct trees. We experimentally test OESP on the ability to compute
EDM in an online manner on benchmark datasets, and we show OESP's efficiency. Comment: This paper has been accepted to the 21st edition of the International
Symposium on String Processing and Information Retrieval (SPIRE2014
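The L1 comparison of characteristic vectors mentioned in the abstract can be illustrated with a minimal sketch. It assumes the node labels of two parse trees are already available as plain lists (the label names below are hypothetical), and it does not implement ESP or OESP parsing itself.

```python
from collections import Counter


def l1_distance(labels_a, labels_b):
    """L1 distance between the label-count vectors of two parse trees.

    Each argument is a list of node labels (hypothetical input format);
    the characteristic vector is the count of each distinct label.
    """
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # A Counter returns 0 for labels that do not occur, so the union of keys is safe.
    return sum(abs(count_a[k] - count_b[k]) for k in count_a.keys() | count_b.keys())


# Example: the two label lists differ in one "b", one "X1" and one "X2".
print(l1_distance(["a", "b", "X1", "X1"], ["a", "X1", "X2"]))
```

Identical label multisets give distance 0, so the value grows only with labels that one tree has more often than the other.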
Damage detection and quantification in composite beam structure using strain energy and vibration data
DOI: 10.1088/1742-6596/842/1/012027. Journal of Physics: Conference Series, vol. 842, no. 1
Fast Searching in Packed Strings
Given a pattern and a text, the (exact) string matching problem is to find all
positions of substrings in the text matching the pattern. The classical Knuth-Morris-Pratt
algorithm [SIAM J. Comput., 1977] solves the string matching problem in linear
time, which is optimal if we can only read one character at a time. However,
most strings are stored in a computer in a packed representation with several
characters in a single word, giving us the opportunity to read multiple
characters simultaneously. In this paper we study the worst-case complexity of
string matching on strings given in packed representation. Let $m \leq n$ be
the lengths of the pattern and the text, respectively, and let $\sigma$ denote the size of the
alphabet. On a standard unit-cost word-RAM with logarithmic word size we
present an algorithm using time $O\left(\frac{n}{\log_\sigma n} + m + \mathrm{occ}\right)$.
Here $\mathrm{occ}$ is the number of occurrences of the pattern in the text. For $m = o(n)$ this improves the $O(n)$ bound of the Knuth-Morris-Pratt algorithm.
Furthermore, if $m = O(n/\log_\sigma n)$ our algorithm is optimal since any
algorithm must spend at least $\Omega\left(\frac{(n+m)\log \sigma}{\log n} + \mathrm{occ}\right) = \Omega\left(\frac{n}{\log_\sigma n} + \mathrm{occ}\right)$ time to
read the input and report all occurrences. The result is obtained by a novel
automaton construction based on the Knuth-Morris-Pratt algorithm combined with
a new compact representation of subautomata allowing an optimal
tabulation-based simulation. Comment: To appear in Journal of Discrete Algorithms. Special Issue on CPM
200
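For reference, the classical Knuth-Morris-Pratt baseline that the abstract compares against can be sketched as follows. This is the textbook one-character-at-a-time algorithm, not the packed-string algorithm of the paper; the function names are illustrative.

```python
def failure_function(pattern):
    """fail[i] = length of the longest proper border of pattern[:i + 1]."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(pattern, text):
    """Return the start positions of all occurrences of pattern in text."""
    if not pattern:
        return []
    fail = failure_function(pattern)
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping occurrences
    return matches


print(kmp_search("aba", "ababa"))
```

Each text character is read once, which is the linear-time behavior the packed representation aims to improve on by reading several characters per word.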
Efficient LZ78 factorization of grammar compressed text
We present an efficient algorithm for computing the LZ78 factorization of a
text, where the text is represented as a straight line program (SLP), which is
a context free grammar in the Chomsky normal form that generates a single
string. Given an SLP of size representing a text of length , our
algorithm computes the LZ78 factorization of in time
and space, where is the number of resulting LZ78 factors.
We also show how to improve the algorithm so that the term in the
time and space complexities becomes either , where is the length of the
longest LZ78 factor, or where is a quantity
which depends on the amount of redundancy that the SLP captures with respect to
substrings of of a certain length. Since where
is the alphabet size, the latter is asymptotically at least as fast as
a linear time algorithm which runs on the uncompressed string when is
constant, and can be more efficient when the text is compressible, i.e. when
and are small. Comment: SPIRE 201
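As background, an LZ78 factorization of an uncompressed string can be sketched with a plain dictionary. This is the textbook procedure on the plain text, not the SLP-based algorithm of the paper; each factor is encoded here as a pair (index of an earlier phrase, next character), which is an illustrative choice.

```python
def lz78_factorize(text):
    """Return LZ78 factors as (phrase_index, next_char) pairs.

    Index 0 denotes the empty phrase; phrase k is the k-th factor produced.
    """
    dictionary = {"": 0}
    factors = []
    current = ""
    for ch in text:
        if current + ch in dictionary:
            current += ch  # keep extending the longest known phrase
        else:
            factors.append((dictionary[current], ch))
            dictionary[current + ch] = len(dictionary)
            current = ""
    if current:
        # The text ended inside a known phrase: emit it with no extra character.
        factors.append((dictionary[current], ""))
    return factors


def lz78_decode(factors):
    """Rebuild the text from the factors produced by lz78_factorize."""
    phrases = [""]
    for index, ch in factors:
        phrases.append(phrases[index] + ch)
    return "".join(phrases[1:])


print(lz78_factorize("abab"))
```

Decoding the factors with `lz78_decode` reproduces the input, which is a quick way to check a factorization.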
Fingerprints in Compressed Strings
The Karp-Rabin fingerprint of a string is a type of hash value that, due to its strong properties, has been used in many string algorithms. In this paper we show how to construct a data structure for a string S of size N compressed by a context-free grammar of size n that answers fingerprint queries. That is, given indices i and j, the answer to a query is the fingerprint of the substring S[i,j]. We present the first O(n) space data structures that answer fingerprint queries without decompressing any characters. For Straight Line Programs (SLP) we get O(log N) query time, and for Linear SLPs (an SLP derivative that captures LZ78 compression and its variations) we get O(log log N) query time. Hence, our data structures have the same time and space complexity as for random access in SLPs. We utilize the fingerprint data structures to solve the longest common extension problem in query time O(log N log l) and O(log l log log l + log log N) for SLPs and Linear SLPs, respectively. Here, l denotes the length of the LCE.
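The fingerprint and LCE queries described above can be illustrated on an uncompressed string. The sketch below stores prefix fingerprints in O(N) space and is not the grammar-compressed structure of the paper; the modulus, the fixed base, and the class name are assumptions made for illustration (in practice the base is chosen at random).

```python
MOD = (1 << 61) - 1  # a Mersenne prime, assumed here as the modulus
BASE = 256           # assumed fixed base; normally drawn at random


class Fingerprints:
    """Karp-Rabin prefix fingerprints of an uncompressed string."""

    def __init__(self, s):
        self.n = len(s)
        self.prefix = [0] * (self.n + 1)
        self.power = [1] * (self.n + 1)
        for i, ch in enumerate(s):
            self.prefix[i + 1] = (self.prefix[i] * BASE + ord(ch)) % MOD
            self.power[i + 1] = (self.power[i] * BASE) % MOD

    def fingerprint(self, i, j):
        """Fingerprint of s[i..j], with 0-based inclusive indices."""
        return (self.prefix[j + 1] - self.prefix[i] * self.power[j + 1 - i]) % MOD

    def lce(self, i, j):
        """Longest common extension of the suffixes starting at i and j.

        Binary search over the length; correct unless two different
        substrings collide under the fingerprint.
        """
        lo, hi = 0, self.n - max(i, j)
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if self.fingerprint(i, i + mid - 1) == self.fingerprint(j, j + mid - 1):
                lo = mid
            else:
                hi = mid - 1
        return lo


fp = Fingerprints("abracadabra")
print(fp.lce(0, 7))
```

Here both "abra" occurrences share a fingerprint, and the binary search in `lce` uses O(log N) fingerprint comparisons per query.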
The Nexus of Political Violence and Economic Deprivation: Pakistani Migrants Disrupt the Refugee / Migrant Dichotomy
There have been discussions about how the labels “forced migrants,” related to political violence, and “voluntary migrants,” associated with economic factors, cannot be understood in categorical ways. However, there has been less focus on the specificities of the asylum-migrant nexus from the perspective of migrants. This essay discusses how such factors intersect as understood by Pakistani migrants residing in Germany. Through enacting a critical view of Pakistan, the migrants demonstrate how aspects of corruption, economic deprivation, and political violence come to intersect so that it becomes impossible to classify asylum seekers in binary/dichotomous ways.
Cryptosporidium Priming Is More Effective than Vaccine for Protection against Cryptosporidiosis in a Murine Protein Malnutrition Model
Cryptosporidium is a major cause of severe diarrhea, especially in malnourished children. Using a murine model of C. parvum oocyst challenge that recapitulates clinical features of severe cryptosporidiosis during malnutrition, we interrogated the effect of protein malnutrition (PM) on primary and secondary responses to C. parvum challenge, and tested the differential ability of mucosal priming strategies to overcome the PM-induced susceptibility. We determined that while PM fundamentally alters systemic and mucosal primary immune responses to Cryptosporidium, priming with C. parvum (10^6 oocysts) provides robust protective immunity against re-challenge despite ongoing PM. C. parvum priming restores mucosal Th1-type effectors (CD3+CD8+CD103+ T-cells) and cytokines (IFNγ and IL12p40) that otherwise decrease with ongoing PM. Vaccination strategies with Cryptosporidium antigens expressed in the S. Typhi vector 908htr, however, do not enhance Th1-type responses to C. parvum challenge during PM, even though vaccination strongly boosts immunity in challenged fully nourished hosts. Remote non-specific exposures to the attenuated S. Typhi vector alone or the TLR9 agonist CpG ODN-1668 can partially attenuate C. parvum severity during PM, but neither as effectively as viable C. parvum priming. We conclude that although PM interferes with basal and vaccine-boosted immune responses to C. parvum, sustained reductions in disease severity are possible through mucosal activators of host defenses, and specifically C. parvum priming can elicit impressively robust Th1-type protective immunity despite ongoing protein malnutrition. These findings add insight into potential correlates of Cryptosporidium immunity and future vaccine strategies in malnourished children.
How generalist are these forest specialists? What Sweden's avian indicators indicate
Monitoring of forest biodiversity and habitats is an important part of forest conservation, but due to the impossible task of monitoring all species, indicator species are frequently used. However, reliance on an incorrect indicator of valuable habitat can reduce the efficiency of conservation efforts. Birds are often used as indicators as they are charismatic, relatively easy to survey, and because we often have knowledge of their habitat and resource requirements. In the Swedish government's environmental quality goals, there are a number of bird species identified as being associated with 'older' and 'high natural value' forests. Here we evaluate the occurrence of four of these indicator species using data from 91 production forest stands and 10 forest reserves in southern Sweden. The bird species assessed are willow tit Poecile montanus, coal tit Periparus ater, European crested tit Lophophanes cristatus and Eurasian treecreeper Certhia familiaris. For the production stands assessed, these indicator species exhibited no significant preferences regarding forest composition and structure, indicating a wider range of habitat associations than expected. These species frequently showed territorial behavior in forest stands <60 and even 40 years of age, much younger than the 120-year threshold for 'older forest' as defined by governmental environmental goals. As almost 80% of the production stands >= 10 years old included at least one of the four indicator species, this raises questions regarding the suitability of these species as indicators of forests of high conservation value in southern Sweden. Notably, besides the four species assessed here, none of the additional indicator taxa identified by the government were recorded in the 10 reserves. This outcome may reflect the difficulties involved in finding bird indicator species indicative of high natural values in this region.
Our results highlight the importance of coupling bird surveys with quantified assessments of proximate vegetation cover.