430 research outputs found
Hardness of longest common subsequence for sequences with bounded run-lengths
International audienceThe longest common subsequence (LCS) problem is a classic and well-studied problem in computer science with extensive applications in diverse areas ranging from spelling error corrections to molecular biology. This paper focuses on LCS for fixed alphabet size and fixed run-lengths (i.e., maximum number of consecutive occurrences of the same symbol). We show that LCS is NP-complete even when restricted to (i) alphabets of size 3 and run-length at most 1, and (ii) alphabets of size 2 and run-length at most 2 (both results are tight). For the latter case, we show that the problem is approximable within ratio 3/5
Faster STR-IC-LCS Computation via RLE
The constrained LCS problem asks one to find a longest common subsequence of two input strings A and B with some constraints. The STR-IC-LCS problem is a variant of the constrained LCS problem, where the solution must include a given constraint string C as a substring. Given two strings A and B of respective lengths M and N, and a constraint string C of length at most min{M, N}, the best known algorithm for the STR-IC-LCS problem, proposed by Deorowicz (Inf. Process. Lett., 11:423-426, 2012), runs in O(MN) time. In this work, we present an O(mN + nM)-time solution to the STR-IC-LCS problem, where m and n denote the sizes of the run-length encodings of A and B, respectively. Since m <= M and n <= N always hold, our algorithm is always as fast as Deorowicz\u27s algorithm, and is faster when input strings are compressible via RLE
On space efficiency of algorithms working on structural decompositions of graphs
Dynamic programming on path and tree decompositions of graphs is a technique
that is ubiquitous in the field of parameterized and exponential-time
algorithms. However, one of its drawbacks is that the space usage is
exponential in the decomposition's width. Following the work of Allender et al.
[Theory of Computing, '14], we investigate whether this space complexity
explosion is unavoidable. Using the idea of reparameterization of Cai and
Juedes [J. Comput. Syst. Sci., '03], we prove that the question is closely
related to a conjecture that the Longest Common Subsequence problem
parameterized by the number of input strings does not admit an algorithm that
simultaneously uses XP time and FPT space. Moreover, we complete the complexity
landscape sketched for pathwidth and treewidth by Allender et al. by
considering the parameter tree-depth. We prove that computations on tree-depth
decompositions correspond to a model of non-deterministic machines that work in
polynomial time and logarithmic space, with access to an auxiliary stack of
maximum height equal to the decomposition's depth. Together with the results of
Allender et al., this describes a hierarchy of complexity classes for
polynomial-time non-deterministic machines with different restrictions on the
access to working space, which mirrors the classic relations between treewidth,
pathwidth, and tree-depth.Comment: An extended abstract appeared in the proceedings of STACS'16. The new
version is augmented with a space-efficient algorithm for Dominating Set
using the Chinese remainder theore
Multivariate Fine-Grained Complexity of Longest Common Subsequence
We revisit the classic combinatorial pattern matching problem of finding a
longest common subsequence (LCS). For strings and of length , a
textbook algorithm solves LCS in time , but although much effort has
been spent, no -time algorithm is known. Recent work
indeed shows that such an algorithm would refute the Strong Exponential Time
Hypothesis (SETH) [Abboud, Backurs, Vassilevska Williams + Bringmann,
K\"unnemann FOCS'15].
Despite the quadratic-time barrier, for over 40 years an enduring scientific
interest continued to produce fast algorithms for LCS and its variations.
Particular attention was put into identifying and exploiting input parameters
that yield strongly subquadratic time algorithms for special cases of interest,
e.g., differential file comparison. This line of research was successfully
pursued until 1990, at which time significant improvements came to a halt. In
this paper, using the lens of fine-grained complexity, our goal is to (1)
justify the lack of further improvements and (2) determine whether some special
cases of LCS admit faster algorithms than currently known.
To this end, we provide a systematic study of the multivariate complexity of
LCS, taking into account all parameters previously discussed in the literature:
the input size , the length of the shorter string
, the length of an LCS of and , the numbers of
deletions and , the alphabet size, as well as
the numbers of matching pairs and dominant pairs . For any class of
instances defined by fixing each parameter individually to a polynomial in
terms of the input size, we prove a SETH-based lower bound matching one of
three known algorithms. Specifically, we determine the optimal running time for
LCS under SETH as .
[...]Comment: Presented at SODA'18. Full Version. 66 page
Efficient and Reliable Task Scheduling, Network Reprogramming, and Data Storage for Wireless Sensor Networks
Wireless sensor networks (WSNs) typically consist of a large number of resource-constrained nodes. The limited computational resources afforded by these nodes present unique development challenges. In this dissertation, we consider three such challenges. The first challenge focuses on minimizing energy usage in WSNs through intelligent duty cycling. Limited energy resources dictate the design of many embedded applications, causing such systems to be composed of small, modular tasks, scheduled periodically. In this model, each embedded device wakes, executes a task-set, and returns to sleep. These systems spend most of their time in a state of deep sleep to minimize power consumption. We refer to these systems as almost-always-sleeping (AAS) systems. We describe a series of task schedulers for AAS systems designed to maximize sleep time. We consider four scheduler designs, model their performance, and present detailed performance analysis results under varying load conditions. The second challenge focuses on a fast and reliable network reprogramming solution for WSNs based on incremental code updates. We first present VSPIN, a framework for developing incremental code update mechanisms to support efficient reprogramming of WSNs. VSPIN provides a modular testing platform on the host system to plug-in and evaluate various incremental code update algorithms. The framework supports Avrdude, among the most popular Linux-based programming tools for AVR microcontrollers. Using VSPIN, we next present an incremental code update strategy to efficiently reprogram wireless sensor nodes. We adapt a linear space and quadratic time algorithm (Hirschberg\u27s Algorithm) for computing maximal common subsequences to build an edit map specifying an edit sequence required to transform the code running in a sensor network to a new code image. We then present a heuristic-based optimization strategy for efficient edit script encoding to reduce the edit map size. Finally, we present experimental results exploring the reduction in data size that it enables. The approach achieves reductions of 99.987% for simple changes, and between 86.95% and 94.58% for more complex changes, compared to full image transmissions - leading to significantly lower energy costs for wireless sensor network reprogramming. The third challenge focuses on enabling fast and reliable data storage in wireless sensor systems. A file storage system that is fast, lightweight, and reliable across device failures is important to safeguard the data that these devices record. A fast and efficient file system enables sensed data to be sampled and stored quickly and batched for later transmission. A reliable file system allows seamless operation without disruptions due to hardware, software, or other unforeseen failures. While flash technology provides persistent storage by itself, it has limitations that prevent it from being used in mission-critical deployment scenarios. Hybrid memory models which utilize newer non-volatile memory technologies, such as ferroelectric RAM (FRAM), can mitigate the physical disadvantages of flash. In this vein, we present the design and implementation of LoggerFS, a fast, lightweight, and reliable file system for wireless sensor networks, which uses a hybrid memory design consisting of RAM, FRAM, and flash. LoggerFS is engineered to provide fast data storage, have a small memory footprint, and provide data reliability across system failures. LoggerFS adapts a log-structured file system approach, augmented with data persistence and reliability guarantees. A caching mechanism allows for flash wear-leveling and fast data buffering. We present a performance evaluation of LoggerFS using a prototypical in-situ sensing platform and demonstrate between 50% and 800% improvements for various workloads using the FRAM write-back cache over the implementation without the cache
Multivariate Fine-Grained Complexity of Longest Common Subsequence
We revisit the classic combinatorial pattern matching problem of finding a longest common subsequence (LCS). For strings and of length , a textbook algorithm solves LCS in time , but although much effort has been spent, no -time algorithm is known. Recent work indeed shows that such an algorithm would refute the Strong Exponential Time Hypothesis (SETH) [Abboud, Backurs, Vassilevska Williams + Bringmann, K\"unnemann FOCS'15]. Despite the quadratic-time barrier, for over 40 years an enduring scientific interest continued to produce fast algorithms for LCS and its variations. Particular attention was put into identifying and exploiting input parameters that yield strongly subquadratic time algorithms for special cases of interest, e.g., differential file comparison. This line of research was successfully pursued until 1990, at which time significant improvements came to a halt. In this paper, using the lens of fine-grained complexity, our goal is to (1) justify the lack of further improvements and (2) determine whether some special cases of LCS admit faster algorithms than currently known. To this end, we provide a systematic study of the multivariate complexity of LCS, taking into account all parameters previously discussed in the literature: the input size , the length of the shorter string , the length of an LCS of and , the numbers of deletions and , the alphabet size, as well as the numbers of matching pairs and dominant pairs . For any class of instances defined by fixing each parameter individually to a polynomial in terms of the input size, we prove a SETH-based lower bound matching one of three known algorithms. Specifically, we determine the optimal running time for LCS under SETH as . [...
- …