Quickest Online Selection of an Increasing Subsequence of Specified Size

Author: Arlotto, Alessandro; Mossel, Elchanan; Steele, J. Michael
Publication venue: Wiley
Publication date: 09/08/2015

Given a sequence of independent random variables with a common continuous distribution, we consider the online decision problem where one seeks to minimize the expected value of the time that is needed to complete the selection of a monotone increasing subsequence of a prespecified length n. This problem is dual to some online decision problems that have been considered earlier, and this dual problem has some notable advantages. In particular, the recursions and equations of optimality lead with relative ease to asymptotic formulas for the mean and variance of the minimal selection time. Comment: 17 pages

arXiv.org e-Print Archive

ScholarlyCommons@Penn

What Makes the Arc-Preserving Subsequence Problem Hard?

Author: Blin, Guillaume; Fertin, Guillaume; Rizzi, Romeo; Vialette, Stéphane
Publication venue: Springer Science and Business Media LLC
Publication date: 01/05/2005

Given two arc-annotated sequences (S, P) and (T, Q) representing RNA structures, the Arc-Preserving Subsequence (APS) problem asks whether (T, Q) can be obtained from (S, P) by deleting some of its bases (together with their incident arcs, if any). In previous studies [3, 6], this problem has been naturally divided into subproblems reflecting the intrinsic complexity of arc structures. We show that APS(Crossing, Plain) is NP-complete, thereby answering an open problem [6]. Furthermore, to get more insight into where the actual border of APS hardness lies, we refine the classical APS subproblems in much the same way as in [11] and give a complete categorization of the complexity of the various restrictions of the APS problem.

Novel Techniques For Model-Code Synchronization

Author: Angyal, László; Charaf, Hassan; Lengyel, László
Publication venue: European Association of Software Science and Technology
Publication date: 01/01/2008

Current software development practice requires efficient model-based iterative solutions. The high costs of maintenance and evolution during the life cycle of the software can be reduced by using tool-aided iterative development. This paper presents how model-based iterative software development can be supported through efficient model-code change propagation. The presented approach facilitates bi-directional synchronization between the modified source code and the refined initial models. The synchronization technique is based on three-way abstract syntax tree (AST) differencing and merging. The AST-based solution enables syntactically correct merge operations. OMG's Model-Driven Architecture describes a proposal for platform-specific model creation and source code generation. We extend this vision with the synchronization feature to assist iterative development. Furthermore, a case study is also provided.

CiteSeerX

Electronic Communications of the EASST (European Association of Software Science and Technology)

Analysis of the Relationships among Longest Common Subsequences, Shortest Common Supersequences and Patterns and its application on Pattern Discovery in Biological Sequences

Author: Leong, Hon Wai; Ng, Hoong Kee; Ning, Kang
Publication venue: Inderscience Publishers
Publication date: 13/03/2009

For a set of multiple sequences, their patterns, Longest Common Subsequences (LCS) and Shortest Common Supersequences (SCS) represent different aspects of the sequences' profile, and they can all be used for biological sequence comparison and analysis. Revealing the relationship between the patterns and the LCS and SCS might provide a deeper view of the patterns of biological sequences, in turn leading to a better understanding of them. However, there has been no careful examination of the relationship between patterns, LCS and SCS. In this paper, we analyze their relationship and give some lemmas. Based on these relations, a set of algorithms called the PALS (PAtterns by Lcs and Scs) algorithms is proposed to discover patterns in a set of biological sequences. These algorithms first generate heuristic results for the LCS and SCS of the sequences, and then derive patterns from these results. Experiments show that the PALS algorithms perform well (both in efficiency and in accuracy) on a variety of sequences. The PALS approach also provides a way to transform between the heuristic results of SCS and LCS. Comment: Extended version of paper presented in IEEE BIBE 2006 submitted to journal for review

arXiv.org e-Print Archive

Crossref

ScholarBank@NUS

A genetic algorithm with expansion and exploration operators for the maximum satisfiability problem

Author: Gorbenko, A.; Popov, V.
Publication date: 01/01/2013

There are many problems that standard genetic algorithms fail to solve. Refinements of standard genetic algorithms that can be used to solve hard problems have attracted considerable interest. In this paper, we consider genetic algorithms with expansion and exploration operators for the maximum satisfiability problem.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

A hybrid algorithm for the longest common transposition-invariant subsequence problem

Author: Deorowicz, Sebastian; Grabowski, Szymon
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 26/01/2012

The longest common transposition-invariant subsequence (LCTS) problem is a variation of the classic LCS problem oriented toward music information retrieval. There are basically only two known efficient approaches to calculating the length of the LCTS, one based on sparse dynamic programming and the other on bit-parallelism. In this work, we propose a hybrid algorithm that picks the better of the two algorithms for individual subproblems. Experiments on music (MIDI), with 32-bit and 64-bit implementations, show that the proposed algorithm outperforms the faster of the two component algorithms by a factor of 1.4–2.0, depending on sequence lengths. Similar, if not better, improvements can be observed for random data with a Gaussian distribution. For uniformly random data, the hybrid algorithm is also the winner if the alphabet is neither too small (at least 32 symbols) nor too large (up to 128 symbols). Part of the success of our scheme is attributed to a quite robust component selection heuristic.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)
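
For readers unfamiliar with the LCTS problem named in the abstract above, the following is a minimal sketch of its definition only, assuming sequences of integers such as MIDI pitch numbers. It is a naive baseline that runs a textbook LCS dynamic program once per candidate transposition; it is not the hybrid algorithm of the paper, nor the sparse dynamic programming or bit-parallel methods the abstract mentions. The function names `lcs_length` and `lcts_length` are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (textbook DP)."""
    prev = [0] * (len(b) + 1)          # DP row for the previous element of a
    for x in a:
        curr = [0]                     # column 0: empty prefix of b
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[len(b)]


def lcts_length(a, b):
    """Naive LCTS length: best LCS over all transpositions t of a.

    Assumption: a and b are sequences of integers (e.g. MIDI pitches).
    Only shifts t = y - x with x in a, y in b can produce a match,
    so it is enough to try those.
    """
    if not a or not b:
        return 0
    shifts = {y - x for x in a for y in b}
    return max(lcs_length([x + t for x in a], b) for t in shifts)


if __name__ == "__main__":
    melody = [60, 62, 64, 65]
    transposed = [67, 69, 71, 72]      # the same melody shifted up by 7
    print(lcts_length(melody, transposed))
```

With the pitch range bounded, the number of candidate shifts is bounded too, so this baseline costs roughly the number of shifts times one quadratic LCS computation, which is the cost the efficient approaches aim to beat.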

Minimally supervised induction of morphology through bitexts

Author: Moon, Taesun
Publication date: 01/12/2008

A knowledge of morphology can be useful for many natural language processing systems. Thus, much effort has been expended in developing accurate computational tools for morphology that lemmatize, segment and generate new forms. The most powerful and accurate of these have been manually encoded, such endeavors being without exception expensive and time-consuming. Consequently, there have been many attempts to reduce this cost through the development of unsupervised or minimally supervised algorithms and learning methods for the acquisition of morphology. These efforts have yet to produce a tool that approaches the performance of manually encoded systems. Here, I present a strategy for dealing with morphological clustering and segmentation in a minimally supervised manner, but one that will be more linguistically informed than previous unsupervised approaches. That is, this study will attempt to induce clusters of words from an unannotated text that are inflectional variants of each other. Then a set of inflectional suffixes by part-of-speech will be induced from these clusters. This level of detail is made possible by a method known as alignment and transfer (AT), among other names, an approach that uses aligned bitexts to transfer linguistic resources developed for one language (the source language) to another language (the target). This approach has a further advantage in that it allows a reduction in the amount of training data without a significant degradation in performance, making it useful in applications targeted at data collected from endangered languages. In the current study, however, I use English as the source and German as the target for ease of evaluation and for certain typological properties of German. The two main tasks, clustering and segmentation, are approached sequentially, with the clustering informing the segmentation to allow for greater accuracy in morphological analysis.
While the performance of these methods does not exceed that of the current roster of unsupervised or minimally supervised approaches to morphology acquisition, this study attempts to integrate more learning methods than previous studies. Furthermore, it attempts to learn inflectional morphology as opposed to derivational morphology, which is a crucial distinction in linguistics. Linguistics

Texas ScholarWorks

Beyond Word N-Grams

Author: Pereira, Fernando C. N.; Singer, Yoram; Tishby, Naftali
Publication date: 01/01/1995

We describe, analyze, and evaluate experimentally a new probabilistic model for word-sequence prediction in natural language based on prediction suffix trees (PSTs). By using efficient data structures, we extend the notion of PST to unbounded vocabularies. We also show how to use a Bayesian approach based on recursive priors over all possible PSTs to efficiently maintain tree mixtures. These mixtures have provably and practically better performance than almost any single model. We evaluate the model on several corpora. The low perplexity achieved by relatively small PST mixture models suggests that they may be an advantageous alternative, both theoretically and practically, to the widely used n-gram models. Comment: 15 pages, one PostScript figure, uses psfig.sty and fullname.sty. Revised version of a paper in the Proceedings of the Third Workshop on Very Large Corpora, MIT, 199

arXiv.org e-Print Archive

CiteSeerX

On the least exponential growth admitting uncountably many closed permutation classes

Author: Klazar, Martin
Publication date: 31/07/2003

We show that the least exponential growth of counting functions which admits uncountably many closed permutation classes lies between 2^n and (2.33529...)^n. Comment: 13 pages

arXiv.org e-Print Archive

Elsevier - Publisher Connector