508 research outputs found

New Algorithms for Position Heaps

Author: A. Ehrenfeucht
E.G. Coffman Jr.
H. Bannai
J. Westbrook
M. Salson
O. Berkman
Y. Nakashima
Publication date: 01/01/2013

We present several results about position heaps, a relatively new alternative to suffix trees and suffix arrays. First, we show that, if we limit the maximum length of patterns to be sought, then we can also limit the height of the heap and reduce the worst-case cost of insertions and deletions. Second, we show how to build a position heap in linear time independent of the size of the alphabet. Third, we show how to augment a position heap such that it supports access to the corresponding suffix array, and vice versa. Fourth, we introduce a variant of a position heap that can be simulated efficiently by a compressed suffix array with a linear number of extra bits.

arXiv.org e-Print Archive

Crossref

Torsion divisors of plane curves and Zariski pairs

Author: Bannai S.
Bartolo E. Artal
Shirane T.
Tokunaga H.
Publication date: 26/05/2020

In this paper we study the embedded topology of reducible plane curves having a smooth irreducible component. In previous studies, the relation between the topology and certain torsion classes in the Picard group of degree zero of the smooth component was implicitly considered. We formulate this relation clearly and give a criterion for distinguishing the embedded topology in terms of torsion classes. Furthermore, we give a method of systematically constructing examples of curves where our criterion is applicable, and give new examples of Zariski tuples. Comment: 19 pages

arXiv.org e-Print Archive

Finding all maximal perfect haplotype blocks in linear time

Author: Alanko J.
Bannai H.
Cazaux B.
Peterlongo Peter
Stoye J.
Publication date: 01/01/2019

Recent large-scale community sequencing efforts allow the identification, at an unprecedented level of detail, of genomic regions that show signatures of natural selection. Traditional methods for identifying such regions from individuals' haplotype data, however, require excessive computing times and therefore are not applicable to current datasets. In 2019, Cunha et al. (Advances in bioinformatics and computational biology: 11th Brazilian symposium on bioinformatics, BSB 2018, Niteroi, Brazil, October 30 - November 1, 2018, Proceedings, 2018. 10.1007/978-3-030-01722-4_3) suggested the maximal perfect haplotype block as a very simple combinatorial pattern, forming the basis of a new method to perform rapid genome-wide selection scans. The algorithm they presented for identifying these blocks, however, had a worst-case running time quadratic in the genome length. It was posed as an open problem whether an optimal, linear-time algorithm exists. In this paper we give two algorithms that achieve this time bound: one conceptually very simple, using suffix trees, and a second using the positional Burrows-Wheeler Transform that is also very efficient in practice. Peer reviewed

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Dagstuhl Research Online Publication Server

Publications at Bielefeld University

Helsingin yliopiston digitaalinen arkisto

HAL-Rennes 1

Normal subgroups of triply transitive permutation groups of degree divisible by 3

Author: A. Wagner
C. Hering
C. Jordan
E. Bannai
H. Bender
H. Wielandt
J. Saxl
J.G. Thompson
Johannes Siemons
M. Suzuki
N. Ito
P. Martineau
P.M. Neumann
Publication venue: Springer Science and Business Media LLC
Publication date: 25/03/2005

Crossref

University of East Anglia digital repository

Role of cystine transport in intracellular glutathione level and cisplatin resistance in human ovarian cancer cell lines

Author: A Johnsson
A Meister
AK Godwin
B Rosenberg
BC Behrens
E Reed
F Tietze
H Hamada
H Sato
H Timmer-Bosscha
H Wang
H Yoshikawa
HN Christensen
IA Cotgreave
J Li
JI Toohey
K Kuriyama-Matsumura
KS Yao
M Palacin
M Tamba
MK Patterson Jr
P Mistry
PA Andrews
PG Richman
Q Li
RP Perez
S Bannai
S Goto
S Okuno
S Sohda
SW Johnson
T Iida
T Kondo
VM Wasenius
Publication venue: Nature Publishing Group

Crossref

PubMed Central

Maintaining the size of LZ77 on semi-dynamic strings

Author: Bannai H.
Charalampopoulos Panagiotis
Radoszewski J.
Publication venue: Dagstuhl Publishing
Publication date: 18/06/2024

We consider the problem of maintaining the size of the LZ77 factorization of a string S of length at most n under the following operations: (a) appending a given letter to S and (b) deleting the first letter of S. Our main result is an algorithm for this problem with amortized update time Õ(√n). As a corollary, we obtain an Õ(n√n)-time algorithm for computing the most LZ77-compressible rotation of a length-n string; a naive approach for this problem would compute the LZ77 factorization of each possible rotation and would thus take quadratic time in the worst case. We also show an Ω(√n) lower bound for the additive sensitivity of LZ77 with respect to the rotation operation. Our algorithm employs dynamic trees to maintain the longest-previous-factor array information and depends on periodicity-based arguments that bound the number of the required updates and enable their efficient computation.
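
The naive baseline mentioned in this abstract (factorize every rotation and keep the smallest result) can be sketched as follows. This is only an illustration, not the paper's algorithm. It assumes the self-referential LZ77 variant, in which each factor is either a fresh letter or the longest prefix of the remaining text that also starts at an earlier position. The function names `lz77_size` and `most_compressible_rotation` are hypothetical. The brute-force matching used here is slower than the linear-time factorization that the quadratic bound presupposes.

```python
def lz77_size(s: str) -> int:
    """Count the factors of a greedy, self-referential LZ77 factorization of s."""
    i, count, n = 0, 0, len(s)
    while i < n:
        longest = 0
        # Try every earlier start j < i; a match may run past position i.
        for j in range(i):
            k = 0
            while i + k < n and s[j + k] == s[i + k]:
                k += 1
            longest = max(longest, k)
        # A letter with no earlier occurrence forms a factor of length 1.
        i += max(longest, 1)
        count += 1
    return count


def most_compressible_rotation(s: str) -> tuple[int, int]:
    """Return (rotation offset, factor count) minimizing the LZ77 size.

    Ties are broken in favour of the smallest offset.
    """
    if not s:
        return (0, 0)
    best = min(range(len(s)), key=lambda r: lz77_size(s[r:] + s[:r]))
    return best, lz77_size(s[best:] + s[:best])
```

Calling `most_compressible_rotation(s)` on a short string is enough to compare rotations by hand; for long inputs the brute-force cost makes this sketch impractical, which is the gap the abstract's Õ(n√n) result addresses.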

Birkbeck Institutional Research Online

NcPred for accurate nuclear protein prediction using n-mer statistics with various classification algorithms

Author: A. Ganesh
A. Garg
A. Pierleoni
A. Reinhardt
B. Alberts
B. Chan
B. Mathews
D. Xie
E. Marcotte
G. Hutchinson
H. Bannai
M. Hall
M. Kumar
O. Emanuelson
W. Jassem
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

Prediction of nuclear proteins is one of the major challenges in genome annotation. A method, NcPred, is described for predicting nuclear proteins with higher accuracy by exploiting n-mer statistics with different classification algorithms, namely Alternating Decision (AD) Tree, Best First (BF) Tree, Random Tree and Adaptive (Ada) Boost. On the BaCello dataset [1], NcPred improves accuracy by about 20% with Random Tree and sensitivity by about 10% with Ada Boost for animal proteins compared to existing techniques. It also increases the accuracy of fungal protein prediction by 20% and recall by 4% with AD Tree. For human proteins, accuracy is improved by about 25% and sensitivity by about 10% with BF Tree. Performance analysis of NcPred demonstrates its suitability relative to contemporary in-silico nuclear protein classification methods.

Northumbria University Research Portal

Crossref

Practical Evaluation of Lempel-Ziv-78 and Lempel-Ziv-Welch Tries

Author: A Poyias
D Arroyuelo
D Lemire
G Marsaglia
GH Gonnet
H Bannai
H Luan
J Fischer
J Jansson
J Kärkkäinen
J Ziv
JA Feldman
JG Cleary
K Chung
L Carter
P Tchebychev
RM Karp
RM Robinson
TA Welch
Y Nakashima
Publication date: 09/06/2017

We present the first thorough practical study of the Lempel-Ziv-78 and the Lempel-Ziv-Welch computation based on trie data structures. With a careful selection of trie representations we can beat well-tuned popular trie data structures like Judy, m-Bonsai or Cedar.
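
For readers unfamiliar with trie-based Lempel-Ziv-78 computation, a minimal sketch follows. It uses a plain dictionary per trie node, which is an assumption made for brevity and not one of the trie representations evaluated in the paper. The names `lz78_factorize` and `lz78_decode` are hypothetical. Each factor is a pair (index of an earlier factor, next letter), with index 0 standing for the empty phrase.

```python
def lz78_factorize(text: str) -> list[tuple[int, str]]:
    """Greedy LZ78 factorization using a trie stored as a list of dicts."""
    # children[v] maps a letter to a child node id; node 0 is the root.
    # Node ids coincide with 1-based factor indices.
    children: list[dict[str, int]] = [{}]
    factors: list[tuple[int, str]] = []
    node = 0
    for ch in text:
        nxt = children[node].get(ch)
        if nxt is not None:
            node = nxt  # the current phrase extends an existing one
        else:
            # New phrase: add a trie node and emit (parent factor, letter).
            children[node][ch] = len(children)
            children.append({})
            factors.append((node, ch))
            node = 0
    if node != 0:
        # The text ended inside a known phrase; emit it without a new letter.
        factors.append((node, ""))
    return factors


def lz78_decode(factors: list[tuple[int, str]]) -> str:
    """Rebuild the text from (factor index, letter) pairs."""
    phrases = [""]
    for ref, ch in factors:
        phrases.append(phrases[ref] + ch)
    return "".join(phrases[1:])
```

The cost of each step is dominated by the child lookup `children[node].get(ch)`, which is the operation whose representation the abstract says was selected carefully.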

arXiv.org e-Print Archive

Crossref