Online Detection of Repetitions with Backtracking
In this paper we present two algorithms for the following problem: given a
string and a rational , detect in the online fashion the earliest
occurrence of a repetition of exponent in the string.
1. The first algorithm supports the backtrack operation removing the last
letter of the input string. This solution runs in time and
space, where is the maximal length of a string generated during the
execution of a given sequence of read and backtrack operations.
2. The second algorithm works in time and space,
where is the length of the input string and is the number of
distinct letters. This algorithm is relatively simple and requires much less
memory than the previously known solution with the same working time and space.
Comment: 12 pages, 5 figures, accepted to CPM 201
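To make the problem statement concrete, here is a minimal brute-force sketch of online repetition detection. It is not either of the paper's algorithms. It assumes that the exponent of a string is its length divided by a period, and that the threshold (whose symbol is missing in the abstract above) is a rational `e` greater than 1 with the test "exponent at least `e`". The function name `earliest_repetition` is hypothetical.

```python
from fractions import Fraction


def earliest_repetition(s, e):
    """Return (end, start, period) for the shortest prefix s[:end] that ends
    with a substring s[start:end] of exponent >= e, or None if there is none.

    Brute force, for illustration only. Assumes e > 1.
    """
    e = Fraction(e)
    for end in range(1, len(s) + 1):        # read letters one by one
        for p in range(1, end):             # candidate period
            # Longest suffix of s[:end] that has period p: start from the
            # last p letters and extend to the left while letters agree.
            start = end - p
            while start > 0 and s[start - 1] == s[start - 1 + p]:
                start -= 1
            if Fraction(end - start, p) >= e:
                return end, start, p
    return None
```

For example, `earliest_repetition("abcabca", 2)` reports the square `abcabc` as soon as its last letter is read. A backtrack operation in this naive setting amounts to dropping the last letter and re-running the check, which is exactly the cost the paper's first algorithm is designed to avoid.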
Computing Runs on a General Alphabet
We describe a RAM algorithm computing all runs (maximal repetitions) of a
given string of length over a general ordered alphabet in
time and linear space. Our algorithm outperforms all
known solutions working in time provided , where is the alphabet size. We conjecture that there
exists a linear time RAM algorithm finding all runs. Comment: 4 pages, 2 figures
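As an illustration of what a run is, the following naive sketch enumerates all runs of a short string. It is not the algorithm of the paper and makes no attempt at efficiency. It assumes the usual definition: a run is a maximal substring whose length is at least twice its smallest period. The function name `naive_runs` is hypothetical.

```python
def naive_runs(s):
    """Return all runs of s as sorted (start, end, period) triples, where
    s[start:end] is the run and period is its smallest period.

    Naive sketch for illustration only.
    """
    n = len(s)
    seen = {}  # (start, end) -> smallest period found for that interval
    for p in range(1, n // 2 + 1):          # periods in increasing order
        i = 0
        while i + p < n:
            if s[i] != s[i + p]:
                i += 1
                continue
            # Extend the block of positions where s[j] == s[j + p].
            j = i
            while j + p < n and s[j] == s[j + p]:
                j += 1
            start, end = i, j + p           # maximal segment with period p
            # Keep it if it holds at least two periods; an interval already
            # recorded with a smaller period is not recorded again.
            if end - start >= 2 * p and (start, end) not in seen:
                seen[(start, end)] = p
            i = j + 1
    return sorted((st, en, p) for (st, en), p in seen.items())
```

On `"mississippi"` this yields the run `ississi` with period 3 and the three squares `ss`, `ss`, and `pp` with period 1.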
Limited-memory warping LCSS for real-time low-power pattern recognition in wireless nodes
We present and evaluate a microcontroller-optimized limited-memory implementation of a Warping Longest Common Subsequence algorithm (WarpingLCSS). It makes it possible to spot patterns within noisy sensor data in real time on resource-constrained sensor nodes. It allows variability in the sensed system dynamics through warping; it uses only integer operations; it can be applied to various sensor modalities; and it is suitable for embedded training to recognize new patterns. We illustrate the method on 3 applications from wearable sensing and activity recognition using 3 sensor modalities: spotting the QRS complex in ECG, recognizing gestures in everyday life, and analyzing beach volleyball. We implemented the system on a low-power 8-bit AVR wireless node and a 32-bit ARM Cortex M4 microcontroller. Up to 67 or 140 10-second gestures can be recognized simultaneously in real time from a 10 Hz motion sensor on the AVR and M4 using 8 mW and 10 mW respectively. A single gesture spotter uses as few as 135 μW on the AVR. The method allows low data rate distributed in-network recognition, and we show a 100-fold data rate reduction in a complex activity recognition scenario. The versatility and low complexity of the method make it well suited as a generic pattern recognition method that could be implemented as part of sensor front-ends.
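The abstract does not give the WarpingLCSS recurrence, so the sketch below shows only the classic longest common subsequence length computed with two rolling rows and integer arithmetic. It is a stand-in that illustrates the limited-memory idea, not the authors' method, and the function name `lcs_length` is hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b.

    Classic dynamic programme kept to two rows, so the working memory is
    proportional to the shorter input. Integer operations only.
    """
    if len(b) > len(a):
        a, b = b, a                      # make b the shorter sequence
    prev = [0] * (len(b) + 1)            # row for the previous symbol of a
    for x in a:
        curr = [0]                       # row for the current symbol of a
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[len(b)]
```

With a stored template as `b` and a quantized sensor stream as `a`, only the rows for the template need to stay in memory, which is the property that matters on a small microcontroller.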
Improving Developers' Understanding of Regex Denial of Service Tools through Anti-Patterns and Fix Strategies
Regular expressions are used for diverse purposes, including input validation and firewalls. Unfortunately, they can also lead to a security vulnerability called ReDoS (Regular Expression Denial of Service), caused by a super-linear worst-case execution time during regex matching. Due to the severity and prevalence of ReDoS, past work proposed automatic tools to detect and fix regexes. Although these tools were evaluated in automatic experiments, their usability has not yet been studied. Our insight is that the usability of existing tools to detect and fix regexes will improve if we complement them with anti-patterns and fix strategies of vulnerable regexes.
We developed novel anti-patterns for vulnerable regexes, and a collection of fix strategies to fix them. We derived our anti-patterns and fix strategies from a novel theory of regex infinite ambiguity—a necessary condition for regexes vulnerable to ReDoS. We proved the soundness and completeness of our theory. We evaluated the effectiveness of our anti-patterns, both in an automatic experiment and when applied manually. Then, we evaluated how much our anti-patterns and fix strategies improve developers’ understanding of the outcome of detection and fixing tools. Our evaluation found that our anti-patterns were effective over a large dataset of regexes (N=209,188): 100% precision and 99% recall, improving the state of the art 50% precision and 87% recall. Our anti-patterns were also more effective than the state of the art when applied manually (N=20): 100% developers applied them effectively vs. 50% for the state of the art. Finally, our anti-patterns and fix strategies increased developers’ understanding using automatic tools (N=9): from median “Very weakly” to median “Strongly” when detecting vulnerabilities, and from median “Very weakly” to median “Very strongly” when fixing them
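For readers unfamiliar with the vulnerability, the sketch below uses a commonly cited ambiguous shape, a quantifier nested inside another quantifier, together with a flattened rewrite. Neither pattern is taken from the paper, and the rewrite is not claimed to be one of its fix strategies. The helper `same_language_up_to` is hypothetical and only checks, by brute force on short inputs, that the two patterns accept the same strings.

```python
import re
from itertools import product

# Illustrative patterns chosen for this sketch, not taken from the paper.
NESTED = re.compile(r"(a+)+b")   # a run of n a's can be split in 2**(n-1) ways
FLAT = re.compile(r"a+b")        # same accepted strings, one way to match


def same_language_up_to(max_len, alphabet="ab"):
    """Check that both patterns accept exactly the same strings over
    `alphabet` of length at most max_len. Keep max_len small: the nested
    pattern backtracks exponentially on inputs that fail to match."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            text = "".join(chars)
            if bool(NESTED.fullmatch(text)) != bool(FLAT.fullmatch(text)):
                return False
    return True
```

The check confirms that the rewrite preserves behaviour on short inputs while removing the ambiguity that a backtracking matcher would otherwise explore on a long run of `a` with no trailing `b`.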
Optimising Unicode Regular Expression Evaluation with Previews
The jsre regular expression library was designed to provide fast matching of complex expressions over large input streams using user-selectable character encodings. An established design approach was used: a simulated non-deterministic automaton (NFA) implemented as a virtual machine, avoiding exponential cost functions in either space or time. A deterministic automaton (DFA) was chosen as a general dispatching mechanism for Unicode character classes and this also provided the opportunity to use compact DFAs in various optimization strategies. The result was the development of a regular expression Preview which provides a summary of all the matches possible from a given point in a regular expression in a form that can be implemented as a compact DFA and can be used to further improve the performance of the standard NFA simulation algorithm. This paper formally defines a preview and describes and evaluates several optimizations using this construct. They provide significant speed improvements accrued from fast scanning of anchor positions, avoiding retesting of repeated strings in unanchored searches, and efficient searching of multiple alternate expressions which in the case of keyword searching has a time complexity which is logarithmic in the number of words to be searched
Faster Longest Common Extension Queries in Strings over General Alphabets
Longest common extension queries (often called longest common prefix queries)
constitute a fundamental building block in multiple string algorithms, for
example computing runs and approximate pattern matching. We show that a
sequence of LCE queries for a string of size over a general ordered
alphabet can be realized in time making only
symbol comparisons. Consequently, all runs in a string over a general
ordered alphabet can be computed in time making
symbol comparisons. Our results improve upon a solution by Kosolobov
(Information Processing Letters, 2016), who gave an algorithm with running time and conjectured that time is possible. We
make significant progress towards resolving this conjecture. Our techniques
extend to the case of general unordered alphabets, when the time increases to
. The main tools are difference covers and the
disjoint-sets data structure. Comment: Accepted to CPM 201
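For reference, a longest common extension query can be answered naively by direct symbol comparison, as in the sketch below. This is the baseline the paper improves on, not its technique: no difference covers or disjoint-sets structure are involved. The function name `lce` is hypothetical.

```python
def lce(s, i, j):
    """Length of the longest common prefix of the suffixes s[i:] and s[j:].

    Naive version: compares symbols one by one, so a single query can take
    time linear in the length of s.
    """
    n = len(s)
    k = 0
    while i + k < n and j + k < n and s[i + k] == s[j + k]:
        k += 1
    return k
```

The link to runs is direct: the longest substring starting at position `i` that has period `p` has length `p + lce(s, i, i + p)`, so for `"mississippi"` with `i = 1` and `p = 3` the query returns 4 and the periodic segment `ississi` has length 7.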
A Feasibility Study on the Application of the ScriptGenE Framework as an Anomaly Detection System in Industrial Control Systems
Recent events such as Stuxnet and the Shamoon attack on Aramco have brought to light how vulnerable industrial control systems (ICSs) are to cyber attacks. Modern society relies heavily on critical infrastructure, including the electric power grid, water treatment facilities, and nuclear energy plants. Malicious attempts to disrupt, destroy and disable such systems can have devastating effects on a population's way of life, possibly leading to loss of life. The need to implement security controls in the ICS environment is more vital than ever. ICSs were not originally designed with network security in mind. Today, intrusion detection systems are employed to detect attacks that penetrate the ICS network. This research proposes the use of a novel algorithm known as the ScriptGenE framework as an anomaly-based intrusion detection system. The anomaly detection system (ADS) is implemented between an engineering workstation and programmable logic controller to monitor traffic and alert the operator to anomalous behavior. The ADS achieves true positive rates of 0.9011 and 1.00 with false positive rates of 0 and 0.054. This research demonstrates the viability of using the ScriptGenE framework as an anomaly detection system in a simulated ICS environment.
Coarse-graining in retrodictive quantum state tomography
Quantum state tomography often operates in the highly idealised scenario of
assuming perfect measurements. The errors implied by such an approach are
entwined with other imperfections relating to the information processing
protocol or application of interest. We consider the problem of retrodicting
the quantum state of a system, existing prior to the application of random but
known phase errors, allowing those errors to be separated and removed. The
continuously random nature of the errors implies that there is only one click
per measurement outcome -- a feature having a drastically adverse effect on
data-processing times. We provide a thorough analysis of coarse-graining under
various reconstruction algorithms, finding dramatic increases in speed for only
modest sacrifices in fidelity
Time-Independent Planning for Multiple Moving Agents
Typical Multi-agent Path Finding (MAPF) solvers assume that agents move
synchronously, thus neglecting the reality gap in timing assumptions, e.g.,
delays caused by an imperfect execution of asynchronous moves. So far, two
policies enforce a robust execution of MAPF plans taken as input: either by
forcing agents to synchronize or by executing plans while preserving temporal
dependencies. This paper proposes an alternative approach, called
time-independent planning, which is both online and distributed. We represent
reality as a transition system that changes configurations according to atomic
actions of agents, and use it to generate a time-independent schedule.
Empirical results in a simulated environment with stochastic delays of agents'
moves support the validity of our proposal. Comment: 10 pages, 5 figures, to be presented at AAAI-21, Feb 2021, Virtual Conference