
    Variants of Constrained Longest Common Subsequence

    In this work, we consider a variant of the classical Longest Common Subsequence problem called Doubly-Constrained Longest Common Subsequence (DC-LCS). Given two strings s1 and s2 over an alphabet A, a set C_s of strings, and a function Co from A to N, the DC-LCS problem consists in finding the longest subsequence s of s1 and s2 such that s is a supersequence of all the strings in C_s and such that the number of occurrences in s of each symbol a in A is upper-bounded by Co(a). The DC-LCS problem provides a clear mathematical formulation of a sequence comparison problem in Computational Biology and generalizes two other constrained variants of the LCS problem: the Constrained LCS and the Repetition-Free LCS. We present two results for the DC-LCS problem. First, we describe a fixed-parameter algorithm where the parameter is the length of the solution. Second, we prove a parameterized hardness result for the Constrained LCS problem when the parameters are the number of constraint strings and the size of the alphabet A. This hardness result also implies the parameterized hardness of the DC-LCS problem (with the same parameters) and its NP-hardness when the size of the alphabet is constant.
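The three DC-LCS conditions above can be made concrete with a brute-force feasibility check. This is a minimal sketch of the problem definition, not the paper's fixed-parameter algorithm; the helper names (`is_subsequence`, `is_feasible`) are illustrative, not from the paper.

```python
from collections import Counter

def is_subsequence(small, big):
    """True if `small` is a subsequence of `big`."""
    it = iter(big)
    return all(ch in it for ch in small)  # `in` advances the iterator

def is_feasible(s, s1, s2, constraints, co):
    """Check the three DC-LCS conditions for a candidate s:
    (1) s is a common subsequence of s1 and s2,
    (2) s is a supersequence of every string in `constraints`,
    (3) each symbol a occurs in s at most co[a] times."""
    if not (is_subsequence(s, s1) and is_subsequence(s, s2)):
        return False
    if not all(is_subsequence(c, s) for c in constraints):
        return False
    counts = Counter(s)
    return all(counts[a] <= co.get(a, 0) for a in counts)
```

An optimal DC-LCS is then the longest string passing `is_feasible`; enumerating candidates this way is exponential, which is what motivates the fixed-parameter algorithm in the paper.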

    Learning Mazes with Aliasing States: An LCS Algorithm with Associative Perception

    Learning classifier systems (LCSs) belong to a class of algorithms based on the principle of self-organization and have frequently been applied to the task of solving mazes, an important type of reinforcement learning (RL) problem. Maze problems represent a simplified virtual model of real environments that can be used for developing core algorithms of many real-world applications related to the problem of navigation. However, the best achievements of LCSs in maze problems are still mostly bounded to non-aliasing environments, while LCS complexity seems to obstruct a proper analysis of the reasons for failure. We construct a new LCS agent that has a simpler and more transparent performance mechanism, but that can still solve mazes better than existing algorithms. We use the structure of a predictive LCS model, strip out the evolutionary mechanism, simplify the reinforcement learning procedure, and equip the agent with the ability of associative perception, adopted from psychology. To improve our understanding of the nature and structure of maze environments, we analyze mazes used in research over the last two decades, introduce a set of maze complexity characteristics, and develop a set of new maze environments. We then run our new LCS with associative perception through the old and new aliasing mazes, which represent partially observable Markov decision problems (POMDPs), and demonstrate that it performs at least as well as, and in some cases better than, other published systems.

    Stable and Efficient Networks with Farsighted Players: the Largest Consistent Set

    In this paper we study strategic formation of bilateral networks with farsighted players in the classic framework of Jackson and Wolinsky (1996). We use the largest consistent set (LCS) (Chwe (1994)) as the solution concept for stability. We show that there exists a value function such that for every component balanced and anonymous allocation rule, the corresponding LCS does not contain any strongly efficient network. Using Pareto efficiency, a weaker concept of efficiency, we get a more positive result. Even then, however, at least one environment of networks (with a component balanced and anonymous allocation rule) exists for which the largest consistent set does not contain any Pareto efficient network. These results confirm that the well-known problem of the incompatibility between the set of stable networks and the set of efficient networks persists even in the environment with farsighted players. Next we study some possibilities of resolving this incompatibility.
    Keywords: networks, farsighted, largest consistent set

    Fluctuations of the Longest Common Subsequence for Sequences of Independent Blocks

    The problem of the fluctuation of the Longest Common Subsequence (LCS) of two i.i.d. sequences of length n > 0 has been open for decades. There exist contradicting conjectures on the topic. Chvatal and Sankoff conjectured in 1975 that the order should asymptotically be n^{2/3}, while Waterman conjectured in 1994 that it should asymptotically be n. A contiguous substring consisting only of one type of symbol is called a block. In the present work, we determine the order of the fluctuation of the LCS for a special model of sequences consisting of i.i.d. blocks whose lengths are uniformly distributed on the set {l-1, l, l+1}, with l a given positive integer. We show that the fluctuation in this model is asymptotically of order n, which confirms Waterman's conjecture. To achieve this goal, we develop a new method which allows us to reformulate the problem of the order of the variance as a (relatively) low-dimensional optimization problem.
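The block model above can be simulated directly. The sketch below (an illustration of the setup only, not the paper's proof technique) builds binary sequences from i.i.d. alternating blocks with lengths uniform on {l-1, l, l+1} and estimates the empirical variance of the LCS length by Monte Carlo; the function names are illustrative.

```python
import random

def block_sequence(n, l):
    """Alternating 0/1 blocks, lengths uniform on {l-1, l, l+1}, truncated to n."""
    seq, symbol = [], 0
    while len(seq) < n:
        seq.extend([symbol] * random.choice([l - 1, l, l + 1]))
        symbol ^= 1
    return seq[:n]

def lcs_length(a, b):
    """Classic O(|a||b|)-time, O(|b|)-space dynamic program for the LCS length."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def lcs_variance(n, l, trials=200, seed=0):
    """Empirical variance of LCS(X, Y) over i.i.d. pairs of block sequences."""
    random.seed(seed)
    samples = [lcs_length(block_sequence(n, l), block_sequence(n, l))
               for _ in range(trials)]
    mean = sum(samples) / trials
    return sum((s - mean) ** 2 for s in samples) / trials
```

Under the paper's result, `lcs_variance(n, l)` should grow linearly in n for fixed l, although a simulation of this size only hints at the asymptotics.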

    The Longest Common Subsequence via Generalized Suffix Trees

    Given two strings S1 and S2, finding the longest common subsequence (LCS) is a classical problem in computer science. Many algorithms have been proposed to find the longest common subsequence between two strings. The most common and widely used method is the dynamic programming approach, which runs in quadratic time and takes quadratic space. Other algorithms have been introduced later to solve the LCS problem in less time and space. In this work, we present a new algorithm to find the longest common subsequence using the generalized suffix tree and a directed acyclic graph. The generalized suffix tree (GST) is the combined suffix tree for a set of strings {S1, S2, ..., Sn}. Both the suffix tree and the generalized suffix tree can be calculated in linear time and linear space. One application of the generalized suffix tree is to find the longest common substring between two strings, but finding the longest common subsequence is not straightforward using the generalized suffix tree. Here we describe how we can use the GST to find the common substrings between two strings and introduce a new approach to calculate the longest common subsequence (LCS) from the common substrings. This method takes a different view of the LCS problem, shedding more light on novel applications of the LCS. We also show how this method can motivate the development of new compression techniques for genome resequencing data.
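For reference, the quadratic dynamic program mentioned above can be extended with backtracking so it returns an actual longest common subsequence rather than just its length. This is the standard baseline, not the generalized-suffix-tree algorithm the work introduces.

```python
def lcs(s1, s2):
    """Return one longest common subsequence of s1 and s2 via the
    classic O(mn)-time, O(mn)-space dynamic program plus backtracking."""
    m, n = len(s1), len(s2)
    # dp[i][j] = length of an LCS of s1[:i] and s2[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # Walk back from dp[m][n] to recover one LCS.
    out, i, j = [], m, n
    while i and j:
        if s1[i - 1] == s2[j - 1]:
            out.append(s1[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```

The quadratic space of the table is exactly the cost that motivates the alternative approaches surveyed in the abstract.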

    Sens-BERT: Enabling Transferability and Re-calibration of Calibration Models for Low-cost Sensors under Reference Measurements Scarcity

    Low-cost sensor (LCS) measurements are noisy, which limits their large-scale adoption in air quality monitoring. Calibration is generally used to obtain good estimates of air quality measurements from LCSs. To do this, LCSs are typically co-located with reference stations for some duration. A calibration model is then developed to map the LCS measurements to the reference station measurements. Existing works implement the calibration of an LCS as an optimization problem in which a model is trained with the data obtained from real-time deployments; later, the trained model is employed to estimate the air quality measurements at that location. However, this approach is sensor-specific and location-specific and needs frequent re-calibration. The re-calibration also needs massive data, like the initial calibration, which is a cumbersome process in practical scenarios. To overcome these limitations, in this work we propose Sens-BERT, a BERT-inspired learning approach to calibrate LCSs, which achieves the calibration in two phases: self-supervised pre-training and supervised fine-tuning. In the pre-training phase, we train Sens-BERT with only LCS data (without reference station observations) to learn the data's distributional features and produce corresponding embeddings. We then use the Sens-BERT embeddings to learn a calibration model in the fine-tuning phase. Our proposed approach has many advantages over previous works. Since Sens-BERT learns the behaviour of the LCS, it is transferable to any sensor of the same sensing principle without explicit training on that sensor. It requires only LCS measurements in pre-training to learn the characteristics of the LCS, thus enabling calibration even with a tiny amount of paired data in fine-tuning. We have exhaustively tested our approach with the Community Air Sensor Network (CAIRSENSE) data set, an open repository for LCS data.
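The co-location calibration step described above can be illustrated, in a deliberately simplified form, as fitting a mapping from low-cost-sensor readings to reference-station values on paired data. The ordinary least-squares line below is only a stand-in for the general idea; Sens-BERT replaces it with learned embeddings and a fine-tuned model, and all function names here are hypothetical.

```python
def fit_calibration(lcs_readings, ref_readings):
    """Fit ref ~ a * lcs + b by ordinary least squares; return (a, b)."""
    n = len(lcs_readings)
    mx = sum(lcs_readings) / n
    my = sum(ref_readings) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(lcs_readings, ref_readings))
    sxx = sum((x - mx) ** 2 for x in lcs_readings)
    a = sxy / sxx
    return a, my - a * mx

def calibrate(reading, a, b):
    """Apply the fitted model to a raw low-cost-sensor reading."""
    return a * reading + b
```

The location- and sensor-specificity criticized in the abstract is visible here: the fitted `(a, b)` is only valid for the co-located sensor and site, which is what motivates a transferable pre-trained representation.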