32 research outputs found

Even faster sorting of (not only) integers

Author: C Hoare
D Knuth
D Musser
D Shell
J Shen
J Williams
M Codish
M Kokot
PM McIlroy
S Deorowicz
T Cormen
Publication date: 02/03/2017

In this paper we introduce RADULS2, the fastest parallel sorter based on the radix algorithm. It is optimized to process huge amounts of data on modern multicore CPUs. The main novelties are an extremely optimized algorithm for handling tiny arrays (up to about a hundred records), which can appear billions of times as subproblems, and improved processing of larger subarrays with better use of non-temporal memory stores.
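As a rough illustration of the idea in this abstract (a radix sort that hands tiny subarrays to a cheaper routine), here is a minimal sequential sketch. It is not the RADULS2 implementation: the function names, the `small` threshold of 32, and the use of insertion sort for tiny ranges are assumptions made for illustration, and it uses neither threads nor non-temporal stores.

```python
def insertion_sort(a, lo, hi):
    """Sort a[lo:hi] in place; cheap for tiny ranges."""
    for i in range(lo + 1, hi):
        x = a[i]
        j = i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x


def msd_radix_sort(a, key_bytes=8, small=32):
    """Sort non-negative integers below 256**key_bytes in place (MSD radix sort).

    `small` is a hypothetical cut-off: ranges at or below it go to insertion sort.
    """
    def sort_range(lo, hi, byte):
        if hi - lo <= 1 or byte < 0:
            return  # nothing to do, or all key bytes consumed (keys are equal)
        if hi - lo <= small:
            insertion_sort(a, lo, hi)
            return
        shift = 8 * byte
        # Histogram of the current byte.
        counts = [0] * 256
        for i in range(lo, hi):
            counts[(a[i] >> shift) & 0xFF] += 1
        # Prefix sums give the start of each bucket.
        starts = [0] * 256
        total = lo
        for d in range(256):
            starts[d] = total
            total += counts[d]
        # Scatter from a copy of the range into the buckets.
        pos = starts[:]
        for x in a[lo:hi]:
            d = (x >> shift) & 0xFF
            a[pos[d]] = x
            pos[d] += 1
        # Recurse on each bucket using the next less significant byte.
        for d in range(256):
            sort_range(starts[d], starts[d] + counts[d], byte - 1)

    sort_range(0, len(a), key_bytes - 1)
    return a
```

Calling `msd_radix_sort(values)` sorts a list of 64-bit non-negative integers in place and also returns it. The recursion produces many small buckets at deeper levels, which is where a specialized small-array routine pays off.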

arXiv.org e-Print Archive

Crossref

Parallel String Sample Sort

Author: J. Kärkkäinen
J. Wassenberg
K. Mehlhorn
P. Sanders
P.M. McIlroy
R. Sinha
T. Hagerup
W. Ng
Publication date: 01/01/2013

arXiv.org e-Print Archive

CiteSeerX

Crossref

KITopen

Engineering Parallel String Sorting

Author: Bingmann Timo
Eberle Andreas
Sanders Peter
Publication date: 09/03/2014

We discuss how string sorting algorithms can be parallelized on modern multi-core shared memory machines. As a synthesis of the best sequential string sorting algorithms and successful parallel sorting algorithms for atomic objects, we first propose string sample sort. The algorithm makes effective use of the memory hierarchy, uses additional word level parallelism, and largely avoids branch mispredictions. Then we focus on NUMA architectures, and develop parallel multiway LCP-merge and -mergesort to reduce the number of random memory accesses to remote nodes. Additionally, we parallelize variants of multikey quicksort and radix sort that are also useful in certain situations. Comprehensive experiments on five current multi-core platforms are then reported and discussed. The experiments show that our implementations scale very well on real-world inputs and modern machines. Comment: 46 pages, extension of "Parallel String Sample Sort" arXiv:1305.115

arXiv.org e-Print Archive

CiteSeerX

Crossref

KITopen

On An Improved Parallel Construction Of Suffix Arrays For Low Bandwidth Pc-Cluster.

Author: Abdul Rashid Nur'Aini
Abdullah Rosni
Kok Jun Lee
Md. Ali Norhashidah
Publication date: 01/10/2003

An algorithm is presented for the parallel construction of suffix arrays for texts with larger alphabet sizes on distributed-memory architectures.

Repository@USM

Medium-Space Algorithms for Inverse BWT

Author: Kärkkäinen Juha
Puglisi Simon J.
Publication venue: Springer
Publication date: 01/01/2010

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Impelementasi Struktur Data Patricia Tree pada Autocomplete Seacrh Box

Author: Zusni Adisya
Publication venue: Universitas Telkom
Publication date: 01/01/2011

ABSTRACT: Autocomplete in a search box involves very large amounts of data. Searching the database for a phrase or word is therefore problematic: when every phrase must be traversed to obtain the results, and each lookup involves communication between server and client, the server's performance suffers. A dedicated data retrieval method is needed so that the process is lightweight and fast. One option is to use a Patricia tree data structure. A Patricia tree is suitable because the search is performed on the initial part of the desired phrase. A search therefore does not need to traverse the whole Patricia tree, only the part of the structure whose initial characters match. Nodes in the Patricia tree can also be given weights, so that suggestions can be prioritized by weight. In testing, the Patricia tree returned search results faster than the prefix tree used as the comparison data structure. Building the tree with weights also gave better results in search accuracy. Keywords: patricia tree, trie, autocomplete, search box
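The prefix lookup with weighted suggestions described in this abstract can be sketched as a small compressed (Patricia-style) trie. This is an illustrative sketch only, not the thesis's implementation: the names `Node`, `insert`, and `suggest`, the per-phrase integer weight, and the tie-breaking rule (alphabetical) are all assumptions.

```python
class Node:
    def __init__(self):
        self.children = {}   # first character of edge label -> (label, Node)
        self.weight = None   # set only when a stored phrase ends at this node


def common_prefix_len(a, b):
    n = min(len(a), len(b))
    i = 0
    while i < n and a[i] == b[i]:
        i += 1
    return i


def insert(root, phrase, weight):
    """Insert a phrase with a weight, splitting an edge where labels diverge."""
    node, rest = root, phrase
    while rest:
        entry = node.children.get(rest[0])
        if entry is None:
            leaf = Node()
            leaf.weight = weight
            node.children[rest[0]] = (rest, leaf)
            return
        label, child = entry
        k = common_prefix_len(label, rest)
        if k < len(label):
            # Split the edge: keep the shared part, push the remainder down.
            mid = Node()
            mid.children[label[k]] = (label[k:], child)
            node.children[rest[0]] = (label[:k], mid)
            child = mid
        node, rest = child, rest[k:]
    node.weight = weight


def suggest(root, prefix, limit=5):
    """Return up to `limit` stored phrases starting with `prefix`, heaviest first."""
    node, matched, rest = root, "", prefix
    while rest:
        entry = node.children.get(rest[0])
        if entry is None:
            return []
        label, child = entry
        k = common_prefix_len(label, rest)
        if k < len(rest) and k < len(label):
            return []  # mismatch in the middle of an edge
        matched += label
        node, rest = child, rest[k:]
    # Only the subtree below the matched node is visited.
    found, stack = [], [(node, matched)]
    while stack:
        n, text = stack.pop()
        if n.weight is not None:
            found.append((n.weight, text))
        for label, child in n.children.values():
            stack.append((child, text + label))
    found.sort(key=lambda p: (-p[0], p[1]))
    return [text for _, text in found[:limit]]
```

For example, after inserting "search box" (weight 3), "search engine" (weight 5), "sea" (weight 1), and "sort" (weight 2), `suggest(root, "sea")` returns `["search engine", "search box", "sea"]`, and the "sort" branch is never visited.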

Open Library

Efficient large-scale protein sequence comparison and gene matching to identify orthologs and co-orthologs

Author: Altschul
Altschul
Arun S. Konagurthu
Arunachalam
Bandyopadhyay
Bansal
Calabrese
Dehal
Dice
Edgar
Edgar
Flicek
Fukuhara
Geoffrey I. Webb
Gordân
Haas
Hachiya
James C. Whisstock
Jiangning Song
Jun
Khalid Mahmood
Koohy
Koonin
Kriventseva
Kuhn
Kärkkäinen
Li
Mahmood
Needleman
Papadimitriou
Pearson
Pruess
Remm
Sakarya
Sankoff
Santini
Sjolander
Smith
Smith
Sonnhammer
Sorensen
Swidan
Vandepoele
Vinga
Vingron
Widmann
Woolfe
Xu
Yu
Zhi
Publication venue: Oxford University Press
Publication date: 01/01/2012

Broadly, computational approaches for ortholog assignment follow a three-step process: (i) identify all putative homologs between the genomes, (ii) identify gene anchors and (iii) link anchors to identify best gene matches given their order and context. In this article, we engineer two methods to improve two important aspects of this pipeline [specifically steps (ii) and (iii)]. First, computing sequence similarity data [step (i)] is a computationally intensive task for large sequence sets, creating a bottleneck in the ortholog assignment pipeline. We have designed a fast and highly scalable sort-join method (afree) based on k-mer counts to rapidly compare all pairs of sequences in a large protein sequence set to identify putative homologs. Second, the availability of complex genomes containing large gene families with a prevalence of complex evolutionary events, such as duplications, has made the task of assigning orthologs and co-orthologs difficult. Here, we have developed an iterative graph matching strategy where at each iteration the best gene assignments are identified, resulting in a set of orthologs and co-orthologs. We find that the afree algorithm is faster than existing methods and maintains high accuracy in identifying similar genes. The iterative graph matching strategy also showed high accuracy in identifying complex gene relationships. Standalone afree is available from http://vbc.med.monash.edu.au/~kmahmood/afree. EGM2, the complete ortholog assignment pipeline (including afree and the iterative graph matching method), is available from http://vbc.med.monash.edu.au/~kmahmood/EGM2
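The general shape of a sort-join over k-mers can be sketched as follows. This is not afree's algorithm or scoring: the function name `shared_kmer_counts`, the choice of k = 3, and the use of a plain count of distinct shared k-mers as the similarity signal are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations, groupby


def shared_kmer_counts(seqs, k=3):
    """Count distinct k-mers shared by each pair of sequences.

    seqs maps a sequence id to its residue string. Returns a Counter keyed by
    (id_a, id_b) with id_a < id_b.
    """
    # One (k-mer, id) record per distinct k-mer of each sequence.
    records = sorted(
        {(s[i:i + k], sid) for sid, s in seqs.items() for i in range(len(s) - k + 1)}
    )
    pair_counts = Counter()
    # After sorting, equal k-mers are adjacent, so the join is a single pass.
    for _, group in groupby(records, key=lambda r: r[0]):
        ids = [sid for _, sid in group]
        for a, b in combinations(ids, 2):
            pair_counts[(a, b)] += 1
    return pair_counts
```

Pairs that never share a k-mer do not appear in the result at all, which is what lets this kind of join avoid an explicit all-against-all comparison; pairs with high counts would then be candidates for a more careful similarity check.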

Crossref

PubMed Central

Monash University Research Portal

University of Melbourne Institutional Repository