247 research outputs found

On Bijective Variants of the Burrows-Wheeler Transform

Author: Kufleitner Manfred
Publication date: 01/01/2009

The sort transform (ST) is a modification of the Burrows-Wheeler transform (BWT). Both transformations map an arbitrary word of length n to a pair consisting of a word of length n and an index between 1 and n. The BWT sorts all rotation conjugates of the input word, whereas the ST of order k only uses the first k letters for sorting all such conjugates. If two conjugates start with the same prefix of length k, then the indices of the rotations are used for tie-breaking. Both transforms output the sequence of the last letters of the sorted list and the index of the input within the sorted list. In this paper, we discuss a bijective variant of the BWT (due to Scott), proving its correctness and relations to other results due to Gessel and Reutenauer (1993) and Crochemore, Desarmenien, and Perrin (2005). Further, we present a novel bijective variant of the ST. Comment: 15 pages, presented at the Prague Stringology Conference 2009 (PSC 2009)
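The two transforms described in the abstract above can be sketched in a few lines. This is only an illustration of the plain BWT and the ST of order k as the abstract words them, not of the bijective variants the paper studies. The function names `bwt` and `sort_transform` are hypothetical, the input is assumed to be a non-empty string, and breaking BWT ties between equal rotations by rotation index is an assumption made here so that the output is deterministic.

```python
def _transform(word, key):
    """Sort rotation start positions by `key`, then read off last letters."""
    n = len(word)
    # Ties are broken by the rotation index i, as the abstract describes for the ST.
    order = sorted(range(n), key=lambda i: (key(word[i:] + word[:i]), i))
    # The rotation starting at position i ends with the letter word[i - 1].
    last_letters = "".join(word[i - 1] for i in order)
    # 1-based position of the unrotated input within the sorted list.
    return last_letters, order.index(0) + 1


def bwt(word):
    """Plain BWT: compare full rotations."""
    return _transform(word, key=lambda rotation: rotation)


def sort_transform(word, k):
    """ST of order k: compare only the first k letters of each rotation."""
    return _transform(word, key=lambda rotation: rotation[:k])


print(bwt("banana"))                # ('nnbaaa', 4)
print(sort_transform("banana", 1))  # ('bnnaaa', 4)
print(sort_transform("banana", 2))  # ('nbnaaa', 4)
```

For k at least the word length, the prefix comparison sees the whole rotation, so `sort_transform` agrees with `bwt` in this sketch.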

arXiv.org e-Print Archive

CiteSeerX

Permutation patterns in genome rearrangement problems

Author: Cerbai Giulio
Ferrari Luca
Publication date: 01/01/2018

In the context of the genome rearrangement problem, we analyze two well-known models, namely the block transposition and the prefix block transposition models, by exploiting the connection with the notion of permutation pattern. More specifically, for any $k$, we provide a characterization of the set of permutations having distance $\leq k$ from the identity (which is known to be a permutation class) in terms of what we call generating permutations, and we describe some properties of its basis, which allow us to compute such a basis for small values of $k$. Comment: 8 pages. In: L. Ferrari, M. Vamvakari (eds.): Proceedings of the GASCom 2018 Workshop, Athens, Greece, 18--20 June 2018, published at http://ceur-ws.or
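For small n and k, the set of permutations at distance at most k from the identity can be enumerated by brute force. The sketch below is not the characterization from the paper; it assumes the usual definitions, where a block transposition swaps two adjacent blocks of a permutation and a prefix block transposition requires the first block to start at the beginning. The names `block_transpositions` and `ball` are hypothetical.

```python
def block_transpositions(perm, prefix_only=False):
    """Yield every permutation obtained by swapping two adjacent blocks.

    The blocks are perm[i:j] and perm[j:l]; with prefix_only=True the first
    block must start at position 0 (prefix block transposition).
    """
    n = len(perm)
    starts = [0] if prefix_only else range(n)
    for i in starts:
        for j in range(i + 1, n):
            for l in range(j + 1, n + 1):
                yield perm[:i] + perm[j:l] + perm[i:j] + perm[l:]


def ball(n, k, prefix_only=False):
    """Permutations of 1..n within distance k of the identity (breadth-first search)."""
    identity = tuple(range(1, n + 1))
    seen = {identity}
    frontier = {identity}
    for _ in range(k):
        next_frontier = set()
        for perm in frontier:
            for neighbour in block_transpositions(perm, prefix_only):
                if neighbour not in seen:
                    seen.add(neighbour)
                    next_frontier.add(neighbour)
        frontier = next_frontier
    return seen


print(sorted(ball(3, 1)))
# [(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2)]
```

In this small case, (3, 2, 1) is the only permutation of length 3 that needs two block transpositions. The search visits up to n! permutations, so it is only practical for small n.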

arXiv.org e-Print Archive

Florence Research

Algebraic aspects of increasing subsequences

Author: Baik Jinho
Rains Eric M.
Publication date: 01/01/1999

We present a number of results relating partial Cauchy-Littlewood sums, integrals over the compact classical groups, and increasing subsequences of permutations. These include: integral formulae for the distribution of the longest increasing subsequence of a random involution with constrained number of fixed points; new formulae for partial Cauchy-Littlewood sums, as well as new proofs of old formulae; relations of these expressions to orthogonal polynomials on the unit circle; and explicit bases for invariant spaces of the classical groups, together with appropriate generalizations of the straightening algorithm. Comment: LaTeX+amsmath+eepic; 52 pages. Expanded introduction, new references, other minor changes

arXiv.org e-Print Archive

CiteSeerX

Caltech Authors

Bruhat Order in the Full Symmetric $\mathfrak{sl}_n$ Toda Lattice on Partial Flag Space

Author: Chernyakov Yury B.
Sharygin Georgy I.
Sorin Alexander S.
Publication venue: 'SIGMA (Symmetry, Integrability and Geometry: Methods and Applications)'
Publication date: 01/01/2016

In our previous paper [Comm. Math. Phys. 330 (2014), 367-399] we described the asymptotic behaviour of trajectories of the full symmetric $\mathfrak{sl}_n$ Toda lattice in the case of distinct eigenvalues of the Lax matrix. It turned out that it is completely determined by the Bruhat order on the permutation group. In the present paper we extend this result to the case when some eigenvalues of the Lax matrix coincide. In that case the trajectories are described in terms of the projection to a partial flag space where the induced dynamical system verifies the same properties as before: we show that when $t\to\pm\infty$ the trajectories of the induced dynamical system converge to a finite set of points in the partial flag space indexed by the Schubert cells so that any two points of this set are connected by a trajectory if and only if the corresponding cells are adjacent. This relation can be explained in terms of the Bruhat order on multiset permutations.

arXiv.org e-Print Archive

Наукова електронна бібліотека періодичних видань НАН України (Vernadsky National Library of Ukraine)

Set-to-Sequence Methods in Machine Learning: A Review

Author: Derczynski Leon
Jurewicz Mateusz
Publication venue: 'AI Access Foundation'
Publication date: 17/03/2021

Machine learning on sets towards sequential output is an important and ubiquitous task, with applications ranging from language modelling and meta-learning to multi-agent strategy games and power grid optimization. Combining elements of representation learning and structured prediction, its two primary challenges include obtaining a meaningful, permutation invariant set representation and subsequently utilizing this representation to output a complex target permutation. This paper provides a comprehensive introduction to the field as well as an overview of important machine learning methods tackling both of these key challenges, with a detailed qualitative comparison of selected model architectures. Comment: 46 pages of text, with 10 pages of references. Contains 2 tables and 4 figures

arXiv.org e-Print Archive

The IT University of Copenhagen's Repository

A Lower Bound Technique for Communication in BSP

Author: Bilardi Gianfranco
Scquizzato Michele
Silvestri Francesco
Publication date: 25/11/2017

Communication is a major factor determining the performance of algorithms on current computing systems; it is therefore valuable to provide tight lower bounds on the communication complexity of computations. This paper presents a lower bound technique for the communication complexity in the bulk-synchronous parallel (BSP) model of a given class of DAG computations. The derived bound is expressed in terms of the switching potential of a DAG, that is, the number of permutations that the DAG can realize when viewed as a switching network. The proposed technique yields tight lower bounds for the fast Fourier transform (FFT), and for any sorting and permutation network. A stronger bound is also derived for the periodic balanced sorting network, by applying this technique to suitable subnetworks. Finally, we demonstrate that the switching potential captures communication requirements even in computational models different from BSP, such as the I/O model and the LPRAM.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Padova

Discovery of Unconventional Patterns for Sequence Analysis: Theory and Algorithms

Author: Battaglia Giovanni
Publication venue: 'Pisa University Press'
Publication date: 19/12/2011

The biology community is collecting a large amount of raw data, such as the genome sequences of organisms, microarray data, interaction data such as gene-protein interactions, protein-protein interactions, etc. This amount is rapidly increasing and the process of understanding the data is lagging behind the process of acquiring it. An inevitable first step towards making sense of the data is to study their regularities focusing on the non-random structures appearing surprisingly often in the input sequences: patterns. In this thesis we discuss three incarnations of the pattern discovery task, exploring three types of patterns that can model different regularities of the input dataset. While mask patterns have been designed to model short repeated biological sequences, showing a high conservation of their content at some specific positions, permutation patterns have been designed to detect repeated patterns whose parts maintain their physical adjacency but not their ordering in all the pattern occurrences. Transposons, instead, model mobile sequences in the input dataset, which can be discovered by comparing different copies of the same input string, detecting large insertions and deletions in their alignment.

Electronic Thesis and Dissertation Archive - Università di Pisa

Pattern avoidance in forests of binary shrubs

Author: Bevan David
Levin Derek
Nugent Peter
Pantone Jay
Pudwell Lara
Riehl Manda
Tlachac ML
Publication date: 01/07/2016

We investigate pattern avoidance in permutations satisfying some additional restrictions. These are naturally considered in terms of avoiding patterns in linear extensions of certain forest-like partially ordered sets, which we call binary shrub forests. In this context, we enumerate forests avoiding patterns of length three. In four of the five non-equivalent cases, we present explicit enumerations by exhibiting bijections with certain lattice paths bounded above by the line y = lx, for some l in Q+, one of these being the celebrated Duchon’s club paths with l = 2/3. In the remaining case, we use the machinery of analytic combinatorics to determine the minimal polynomial of its generating function, and deduce its growth rate.

arXiv.org e-Print Archive

University of Strathclyde Institutional Repository

Episciences.org

Directory of Open Access Journals

Open Research Online (The Open University)

Valparaiso University

r-indexing the eBWT

Author: Boucher Christina
Cenzato Davide
Lipták Zsuzsanna
Rossi Massimiliano
Sciortino Marinella
Publication venue: Academic Press Incorporated
Publication date: 01/01/2024

The extended Burrows-Wheeler Transform (eBWT) [Mantaci et al. TCS 2007] is a variant of the BWT, introduced for collections of strings. In this paper, we present the extended r-index, an analogous data structure to the r-index [Gagie et al. JACM 2020]. It occupies O(r) words, with r the number of runs of the eBWT, and offers the same functionalities as the r-index. We also show how to efficiently support finding maximal exact matches (MEMs). We implemented the extended r-index and tested it on circular bacterial genomes and plasmids, comparing it to five state-of-the-art compressed text indexes. While our data structure maintains similar time and memory requirements for answering pattern matching queries as the original r-index, it is the only index in the literature that can naturally be used for both circular and linear input collections. This is an extended version of [Boucher et al., r-indexing the eBWT, SPIRE 2021].

Catalogo dei prodotti della ricerca

Archivio istituzionale della ricerca - Università di Palermo