
    Weighted ancestors in suffix trees

    The classical, ubiquitous predecessor problem is to construct a data structure for a set of integers that supports fast predecessor queries. Its generalization to weighted trees, a.k.a. the weighted ancestor problem, has been extensively explored and successfully reduced to the predecessor problem. It is known that any solution to either problem, with input from a polynomially bounded universe, that uses O(n polylog(n)) space requires \Omega(\log \log n) query time. Perhaps the most important and frequent application of the weighted ancestor problem is to suffix trees. It has been a long-standing open question whether the weighted ancestor problem has better bounds for suffix trees. We answer this question positively: we show that a suffix tree built for a text w[1..n] can be preprocessed using O(n) extra space, so that queries can be answered in O(1) time. Thus we improve the running times of several applications. Our improvement is based on a number of data structure tools and a periodicity-based insight into the combinatorial structure of a suffix tree. Comment: 27 pages, LNCS format. A condensed version will appear in ESA 201
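
    For intuition, the sketch below implements the classical baseline for weighted ancestor queries: binary lifting over a tree whose node weights strictly increase along root-to-leaf paths (e.g. string depths in a suffix tree). It answers queries in O(log n) time with O(n log n) space, so it is not the O(n)-space, O(1)-time structure of the paper; all names and conventions here are illustrative assumptions.

    class WeightedAncestors:
        # Baseline weighted-ancestor structure via binary lifting.
        # parent[v] is the parent of node v (parent[root] = -1);
        # weight[v] is strictly increasing along every root-to-leaf path,
        # with weight[root] = 0 (e.g. string depth in a suffix tree).
        def __init__(self, parent, weight):
            n = len(parent)
            self.weight = weight
            levels = max(1, n.bit_length())
            self.up = [parent[:]]                 # up[k][v] = 2^k-th ancestor of v, or -1
            for k in range(1, levels):
                prev = self.up[k - 1]
                self.up.append([prev[prev[v]] if prev[v] != -1 else -1 for v in range(n)])

        def query(self, v, d):
            # Shallowest ancestor u of v (possibly v itself) with weight[u] >= d,
            # or None if even v is too shallow.
            if self.weight[v] < d:
                return None
            for k in range(len(self.up) - 1, -1, -1):
                u = self.up[k][v]
                if u != -1 and self.weight[u] >= d:
                    v = u                         # keep jumping while the target weight is met
            return v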

    Clustering Time Series from Mixture Polynomial Models with Discretised Data

    Clustering time series is an active research area with applications in many fields. One common feature of time series is the likely presence of outliers. These uncharacteristic data can significantly affect the quality of the clusters formed. This paper evaluates a method of overcoming the detrimental effects of outliers. We describe some of the alternative approaches to clustering time series, then specify a particular class of model for experimentation with k-means clustering and a correlation-based distance metric. For data derived from this class of model we demonstrate that discretising the data into a binary series of above and below the median improves the clustering when the data has outliers. More specifically, we show that, firstly, discretisation does not significantly affect the accuracy of the clusters when there are no outliers and, secondly, it significantly increases the accuracy in the presence of outliers, even when the probability of an outlier is very low.
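
    A minimal Python sketch of the discretisation step described above. The paper's exact distance and k-means implementation are not given here; the z-normalisation trick that makes Euclidean k-means behave like a correlation-based distance is our stand-in and should be read as an assumption (scikit-learn is assumed available).

    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_binary_median(series, k, random_state=0):
        # series: array of shape (n_series, n_timepoints)
        series = np.asarray(series, dtype=float)

        # Step 1: discretise each series to binary -- 1 where the value lies
        # above that series' own median, 0 otherwise.
        medians = np.median(series, axis=1, keepdims=True)
        binary = (series > medians).astype(float)

        # Step 2: z-normalise each row; for rows with zero mean and unit
        # variance, squared Euclidean distance equals 2m(1 - r), with r the
        # Pearson correlation and m the length, so plain k-means on these
        # rows orders pairs exactly as a correlation distance would.
        z = binary - binary.mean(axis=1, keepdims=True)
        z /= z.std(axis=1, keepdims=True) + 1e-12

        return KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(z)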

    Testing Generalised Freeness of Words

    Pseudo-repetitions are a natural generalisation of the classical notion of repetitions in sequences: they are the repeated concatenation of a word and its encoding under a certain morphism or antimorphism (anti-/morphism, for short). We approach the problem of deciding efficiently, for a word w and a literal anti-/morphism f, whether w contains an instance of a given pattern involving a variable x and its image under f, i.e., f(x). Our results generalise both the problem of finding fixed repetitive structures (e.g., squares, cubes) inside a word and the problem of finding palindromic structures inside a word. For instance, we can detect efficiently a factor of the form x x^R x x x^R, or any other pattern of this type. We also address the problem of testing efficiently, in the same setting, whether the word w contains an arbitrary pseudo-repetition of a given exponent.
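
    To make the pattern type concrete, here is a brute-force Python check for the example factor x x^R x x x^R mentioned above, where x^R is the reversal of x. It only illustrates the problem; it is not the efficient algorithm of the paper.

    def find_x_xR_x_x_xR(w):
        # Return (start, |x|) of the first factor of the form x x^R x x x^R,
        # or None.  Brute force over the length of x and the start position.
        n = len(w)
        for L in range(1, n // 5 + 1):          # the whole factor has length 5*L
            for i in range(n - 5 * L + 1):
                x = w[i:i + L]
                xr = x[::-1]
                if (w[i + L:i + 2 * L] == xr and
                        w[i + 2 * L:i + 3 * L] == x and
                        w[i + 3 * L:i + 4 * L] == x and
                        w[i + 4 * L:i + 5 * L] == xr):
                    return i, L
        return None

    # e.g. find_x_xR_x_x_xR("zzabbaababbazz") == (2, 2), i.e. x = "ab" starting at index 2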

    Analysing Astronomy Algorithms for GPUs and Beyond

    Astronomy depends on ever-increasing computing power. Processor clock-rates have plateaued, and increased performance is now appearing in the form of additional processor cores on a single chip. This poses significant challenges to the astronomy software community. Graphics Processing Units (GPUs), now capable of general-purpose computation, exemplify both the difficult learning-curve and the significant speedups exhibited by massively-parallel hardware architectures. We present a generalised approach to tackling this paradigm shift, based on the analysis of algorithms. We describe a small collection of foundation algorithms relevant to astronomy and explain how they may be used to ease the transition to massively-parallel computing architectures. We demonstrate the effectiveness of our approach by applying it to four well-known astronomy problems: Hogbom CLEAN, inverse ray-shooting for gravitational lensing, pulsar dedispersion and volume rendering. Algorithms with well-defined memory access patterns and high arithmetic intensity stand to receive the greatest performance boost from massively-parallel architectures, while those that involve a significant amount of decision-making may struggle to take advantage of the available processing power. Comment: 10 pages, 3 figures, accepted for publication in MNRA
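
    As an illustration of the kind of foundation algorithm discussed above, here is a minimal NumPy sketch of incoherent pulsar dedispersion: every frequency channel is shifted by a dispersion delay and the channels are summed. The function and variable names are ours, and the wrap-around at the array edges is a simplification; the point is that the per-channel, regular-access structure is what makes such kernels map well onto massively-parallel hardware.

    import numpy as np

    def dedisperse(dynspec, freqs_mhz, dm, tsamp):
        # dynspec   : 2-D array, shape (n_chan, n_samp), intensity per channel and sample
        # freqs_mhz : centre frequency of each channel in MHz
        # dm        : trial dispersion measure in pc cm^-3
        # tsamp     : sampling time in seconds
        dynspec = np.asarray(dynspec, dtype=float)
        freqs_mhz = np.asarray(freqs_mhz, dtype=float)

        k_dm = 4.148808e3                        # cold-plasma dispersion constant, MHz^2 pc^-1 cm^3 s
        f_ref = freqs_mhz.max()                  # dedisperse relative to the highest frequency
        delays = k_dm * dm * (freqs_mhz**-2 - f_ref**-2)
        shifts = np.round(delays / tsamp).astype(int)

        out = np.zeros(dynspec.shape[1])
        for ch, s in enumerate(shifts):
            out += np.roll(dynspec[ch], -s)      # align channel ch to the reference (edges wrap)
        return out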

    Detecting One-variable Patterns

    Given a pattern p = s_1 x_1 s_2 x_2 ... s_{r-1} x_{r-1} s_r such that x_1, x_2, ..., x_{r-1} ∈ {x, x^R}, where x is a variable and x^R its reversal, and s_1, s_2, ..., s_r are strings that contain no variables, we describe an algorithm that constructs in O(rn) time a compact representation of all P instances of p in an input string of length n over a polynomially bounded integer alphabet, so that one can report those instances in O(P) time. Comment: 16 pages (+13 pages of Appendix), 4 figures, accepted to SPIRE 201
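
    A brute-force Python reference for this matching problem, with illustrative names and conventions of our own. It runs in roughly O(n^2 r) time, nothing like the O(rn) algorithm of the paper, but it pins down what an instance of such a pattern is; it assumes a non-empty variable x and at least one variable slot.

    def one_variable_matches(text, seps, slots):
        # seps  = [s_1, ..., s_r]       constant strings of the pattern
        # slots = [t_1, ..., t_{r-1}]   each entry 'x' or 'xR' (reversal of x)
        n, k = len(text), len(slots)
        base = sum(len(s) for s in seps)
        hits = []
        for xlen in range(1, (n - base) // k + 1):
            total = base + k * xlen
            for i in range(n - total + 1):
                pos, x, ok = i, None, True
                for j, s in enumerate(seps):
                    if text[pos:pos + len(s)] != s:
                        ok = False
                        break
                    pos += len(s)
                    if j < k:
                        seg = text[pos:pos + xlen]
                        val = seg if slots[j] == 'x' else seg[::-1]   # recover x from the segment
                        if x is None:
                            x = val
                        elif val != x:
                            ok = False
                            break
                        pos += xlen
                if ok:
                    hits.append((i, x))
        return hits

    # e.g. one_variable_matches("zabXYbaz", ["z", "XY", "z"], ["x", "xR"]) == [(0, "ab")]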

    A Linear-Time n^0.4-Approximation for Longest Common Subsequence

    We consider the classic problem of computing the Longest Common Subsequence (LCS) of two strings of length n. While a simple quadratic algorithm has been known for the problem for more than 40 years, no faster algorithm has been found despite an extensive effort. The lack of progress on the problem has recently been explained by Abboud, Backurs, and Vassilevska Williams [FOCS'15] and Bringmann and Künnemann [FOCS'15], who proved that there is no subquadratic algorithm unless the Strong Exponential Time Hypothesis fails. This has led the community to look for subquadratic approximation algorithms for the problem. Yet, unlike the edit distance problem, for which a constant-factor approximation in almost-linear time is known, very little progress has been made on LCS, making it a notoriously difficult problem also in the realm of approximation. For the general setting, only a naive O(n^{ε/2})-approximation algorithm with running time Õ(n^{2-ε}) has been known, for any constant 0 < ε ≤ 1. Recently, a breakthrough result by Hajiaghayi, Seddighin, Seddighin, and Sun [SODA'19] provided a linear-time algorithm that yields an O(n^{0.497956})-approximation in expectation, improving upon the naive O(√n)-approximation for the first time. In this paper, we provide an algorithm that in time O(n^{2-ε}) computes an Õ(n^{2ε/5})-approximation with high probability, for any 0 < ε ≤ 1. Our result (1) gives an Õ(n^{0.4})-approximation in linear time, improving upon the bound of Hajiaghayi, Seddighin, Seddighin, and Sun, (2) provides an algorithm whose approximation scales with any subquadratic running time O(n^{2-ε}), improving upon the naive bound of O(n^{ε/2}) for any ε, and (3) instead of only in expectation, succeeds with high probability.
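
    For reference, the "simple quadratic algorithm" mentioned above is the textbook dynamic program; a minimal Python sketch (space-saving two-row variant) follows.

    def lcs_length(a: str, b: str) -> int:
        # Textbook O(|a|*|b|) dynamic program.  prev[j] / curr[j] hold the LCS
        # length of a[:i-1] / a[:i] against b[:j].
        prev = [0] * (len(b) + 1)
        for i in range(1, len(a) + 1):
            curr = [0] * (len(b) + 1)
            for j in range(1, len(b) + 1):
                if a[i - 1] == b[j - 1]:
                    curr[j] = prev[j - 1] + 1
                else:
                    curr[j] = max(prev[j], curr[j - 1])
            prev = curr
        return prev[len(b)]

    The hardness results cited in the abstract say this quadratic behaviour cannot be beaten by any polynomial factor unless SETH fails, which is why the paper aims to approximate the LCS length in subquadratic time instead.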

    Multifractal analysis of complex networks

    Complex networks have recently attracted much attention in diverse areas of science and technology. Many networks such as the WWW and biological networks are known to display spatial heterogeneity which can be characterized by their fractal dimensions. Multifractal analysis is a useful way to systematically describe the spatial heterogeneity of both theoretical and experimental fractal patterns. In this paper, we introduce a new box-covering algorithm for multifractal analysis of complex networks. This algorithm is used to calculate the generalized fractal dimensions D_q of some theoretical networks, namely scale-free networks, small-world networks and random networks, and one kind of real networks, namely protein-protein interaction networks of different species. Our numerical results indicate the existence of multifractality in scale-free networks and protein-protein interaction networks, while the multifractal behavior is not clear-cut for small-world networks and random networks. The possible variation of D_q due to changes in the parameters of the theoretical network models is also discussed. Comment: 18 pages, 7 figures, 4 table
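
    For context, the generalized dimensions D_q obtained from a box-covering analysis are usually defined as follows; the abstract does not reproduce the formula, so this standard textbook normalisation is stated here as an assumption. With boxes of size ε and p_i(ε) the fraction of the network's nodes falling into box i,

    D_q = \lim_{\varepsilon \to 0} \frac{1}{q-1} \, \frac{\ln \sum_i p_i(\varepsilon)^q}{\ln \varepsilon} \quad (q \neq 1),
    \qquad
    D_1 = \lim_{\varepsilon \to 0} \frac{\sum_i p_i(\varepsilon) \ln p_i(\varepsilon)}{\ln \varepsilon}.

    A single (mono)fractal corresponds to D_q being constant in q; a nontrivial dependence of D_q on q is the multifractality the paper tests for.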

    Ancilla-based quantum simulation

    We consider simulating the BCS Hamiltonian, a model of low-temperature superconductivity, on a quantum computer. In particular we consider conducting the simulation on the qubus quantum computer, which uses a continuous-variable ancilla to generate interactions between qubits. We demonstrate an O(N^3) improvement over previous work conducted on an NMR computer [PRL 89 057904 (2002) & PRL 97 050504 (2006)] for the nearest-neighbour and completely general cases. We then go on to show methods to minimise the number of operations needed per time step using the qubus in three cases: a completely general case, a case of exponentially decaying interactions, and the case of fixed-range interactions. We make these results controlled on an ancilla qubit so that we can apply the phase estimation algorithm, and hence show that when N ≥ 5, our qubus simulation requires significantly fewer operations than a similar simulation conducted on an NMR computer. Comment: 20 pages, 10 figures: V2 added section on phase estimation and performing controlled unitaries, V3 corrected minor typo
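
    For orientation, the qubit encoding of the (reduced) BCS pairing Hamiltonian typically simulated in this line of work, including the cited NMR proposals, has roughly the form below; the abstract does not state it, so the exact form and normalisation here are taken from the general literature and should be read as an assumption:

    H_{BCS} = \sum_{m=1}^{N} \frac{\epsilon_m}{2} \, \sigma_z^{(m)}
            + \sum_{l < m} \frac{V_{lm}}{2} \left( \sigma_x^{(l)} \sigma_x^{(m)} + \sigma_y^{(l)} \sigma_y^{(m)} \right),

    where qubit m represents a pair mode with energy \epsilon_m and V_{lm} is the pairing interaction. The \sigma_x\sigma_x + \sigma_y\sigma_y couplings between pairs of qubits are the interactions that, per the abstract, the continuous-variable ancilla of the qubus is used to generate.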