9 research outputs found

Circular pattern matching with k mismatches

Author: A Amir
C Barton
C Hazay
CS Iliopoulos
GM Landau
Karl Bringmann
Kimmo Fredriksson
KR Abrahamson
LAK Ayad
M Crochemore
M Ružić
MAR Azim
ML Fredman
P Bille
P Gawrychowski
R Grossi
Raphael Clifford
T Hirvola
T Kociumaka
V Palazón-González
WI Chang
Z Galil
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/07/2019

The k-mismatch problem consists in computing the Hamming distance between a pattern P of length m and every length-m substring of a text T of length n, if this distance is no more than k. In many real-world applications, any cyclic shift of P is a relevant pattern, and thus one is interested in computing the minimal distance of every length-m substring of T and any cyclic shift of P. This is the circular pattern m

arXiv.org e-Print Archive

Crossref

VU Research Portal

CWI's Institutional Repository

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Circular sequence comparison: algorithms and applications

Author: Ahmad Retha
Costas S. Iliopoulos
Fatima Vayani
Nadia Pisanti
Robert Mercas
Roberto Grossi
Solon P. Pissis
Publication date: 01/01/2016

Background: Sequence comparison is a fundamental step in many important tasks in bioinformatics; from phylogenetic reconstruction to the reconstruction of genomes. Traditional algorithms for measuring approximation in sequence comparison are based on the notions of distance or similarity, and are generally computed through sequence alignment techniques. As circular molecular structure is a common phenomenon in nature, a caveat of the adaptation of alignment techniques for circular sequence comparison is that they are computationally expensive, requiring from super-quadratic to cubic time in the length of the sequences. Results: In this paper, we introduce a new distance measure based on q-grams, and show how it can be applied effectively and computed efficiently for circular sequence comparison. Experimental results, using real DNA, RNA, and protein sequences as well as synthetic data, demonstrate orders-of-magnitude superiority of our approach in terms of efficiency, while maintaining an accuracy very competitive to the state of the art

Loughborough University Institutional Repository
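
The abstract above does not spell out its q-gram measure, so the following is only a minimal sketch of one common q-gram distance (the sum of absolute differences between q-gram counts), applied to sequences read circularly. The function names `circular_qgram_profile` and `qgram_distance` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def circular_qgram_profile(s: str, q: int) -> Counter:
    """Count the q-grams of s when s is read as a circular string."""
    if not 0 < q <= len(s):
        raise ValueError("q must be between 1 and len(s)")
    # Appending the first q-1 characters lets q-grams wrap around the end.
    extended = s + s[:q - 1]
    return Counter(extended[i:i + q] for i in range(len(s)))


def qgram_distance(x: str, y: str, q: int) -> int:
    """Sum of absolute count differences over all q-grams of x and y.

    Assumption: this is a generic q-gram distance, not necessarily
    the measure defined in the paper.
    """
    px = circular_qgram_profile(x, q)
    py = circular_qgram_profile(y, q)
    return sum(abs(px[g] - py[g]) for g in px.keys() | py.keys())
```

Because the profile wraps around, two rotations of the same sequence have identical profiles and therefore distance 0, which is the property that makes such a measure attractive for circular sequences.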

Network analysis of circular permutations in multidomain proteins reveals functional linkages for uncharacterized proteins.

Author: Adjeroh Donald
Jiang Bing-Hua
Jiang Yue
Lin Jie
Publication venue: Jefferson Digital Commons
Publication date: 01/01/2014

Various studies have implicated different multidomain proteins in cancer. However, there has been little or no detailed study on the role of circular multidomain proteins in the general problem of cancer or on specific cancer types. This work represents an initial attempt at investigating the potential for predicting linkages between known cancer-associated proteins with uncharacterized or hypothetical multidomain proteins, based primarily on circular permutation (CP) relationships. First, we propose an efficient algorithm for rapid identification of both exact and approximate CPs in multidomain proteins. Using the circular relations identified, we construct networks between multidomain proteins, based on which we perform functional annotation of multidomain proteins. We then extend the method to construct subnetworks for selected cancer subtypes, and perform prediction of potential linkages between uncharacterized multidomain proteins and the selected cancer types. We include practical results showing the performance of the proposed methods

Crossref

Directory of Open Access Journals

PubMed Central

Jefferson Digital Commons

Circular pattern matching with k mismatches

Author: Charalampopoulos P. (Panagiotis)
Kociumaka T. (Tomasz)
Pissis S. (Solon)
Radoszewski J. (Jakub)
Rytter W. (Wojciech)
Straszyński J. (Juliusz)
Waleń T. (Tomasz)
Zuba W. (Wiktor)
Publication venue: 'Elsevier BV'
Publication date: 01/02/2021

We consider the circular pattern matching with k mismatches (k-CPM) problem in which one is to compute the minimal Hamming distance of every length-m substring of T and any cyclic rotation of P, if this distance is no more than k. It is a variation of the well-studied k-mismatch problem. A multitude of papers has been devoted

VU Research Portal

CWI's Institutional Repository
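
The k-CPM problem stated in the abstract above can be illustrated with a brute-force baseline. This is a sketch of the problem definition only, running in roughly O(n·m²) time; it is not the algorithm from the paper, and the names `hamming` and `naive_k_cpm` are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def naive_k_cpm(text: str, pattern: str, k: int) -> dict:
    """Map each start position i of a length-m substring of text to the
    minimal Hamming distance between that substring and any cyclic
    rotation of pattern, keeping only positions with distance <= k."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    rotations = [pattern[r:] + pattern[:r] for r in range(m)]
    result = {}
    for i in range(len(text) - m + 1):
        window = text[i:i + m]
        best = min(hamming(window, rot) for rot in rotations)
        if best <= k:
            result[i] = best
    return result
```

For example, with text "abcabd", pattern "bca" and k = 1, every window is within distance 1 of some rotation, and only the last window ("abd") needs a mismatch.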

Efficient sequence comparison via combining alignment and alignment-free techniques: algorithms and bioinformatics research

Author: Ayad Lorraine Abdelmasih Khalil
Publication date: 01/06/2019

King's Research Portal

Algorithms for the analysis of molecular sequences

Author: Vayani Fatima
Publication date: 01/12/2019

King's Research Portal

Practical algorithms for biological sequence analysis: methods and applications

Author: Retha Ahmad
Publication date: 01/06/2019

King's Research Portal

Fast algorithms for approximate circular string matching

Author: Barton C.
Iliopoulos Costas
Pissis S.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Background: Circular string matching is a problem which naturally arises in many biological contexts. It consists in finding all occurrences of the rotations of a pattern of length m in a text of length n. There exist optimal average-case algorithms for exact circular string matching. Approximate circular string matching is a rather undeveloped area. Results: In this article, we present a suboptimal average-case algorithm for exact circular string matching requiring time O(n). Based on our solution for the exact case, we present two fast average-case algorithms for approximate circular string matching with k-mismatches, under the Hamming distance model, requiring time O(n) for moderate values of k, that is k = O(m / log m). We show how the same results can be easily obtained under the edit distance model. The presented algorithms are also implemented as library functions. Experimental results demonstrate that the functions provided in this library accelerate the computations by more than three orders of magnitude compared to a naïve approach. Conclusions: We present two fast average-case algorithms for approximate circular string matching with k-mismatches; and show that they also perform very well in practice. The importance of our contribution is underlined by the fact that the provided functions may be seamlessly integrated into any biological pipeline. The source code of the library is freely available at http://www.inf.kcl.ac.uk/research/projects/asmf/

Crossref

Springer - Publisher Connector

PubMed Central

King's Research Portal

espace@Curtin
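
For the exact circular string matching problem described in the last abstract, a simple baseline follows from the fact that a length-m string is a rotation of the pattern exactly when it occurs in the pattern concatenated with itself. The sketch below uses that observation; it is a naive illustration, not the average-case algorithm from the article, and the name `circular_occurrences` is hypothetical.

```python
def circular_occurrences(text: str, pattern: str) -> list:
    """Start positions in text where some rotation of pattern occurs."""
    m = len(pattern)
    if m == 0:
        return []
    # Every length-m substring of pattern + pattern is a rotation of pattern.
    doubled = pattern + pattern
    return [i for i in range(len(text) - m + 1)
            if text[i:i + m] in doubled]
```

Each window is checked with a substring search, so the worst-case cost is well above the O(n) average-case bound quoted in the abstract; the sketch is meant only to make the problem statement concrete.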