11 research outputs found

PERGA: A Paired-End Read Guided De Novo Assembler for Extending Contigs Using SVM and Look Ahead Approach

Author: Chin FYL
Leung HCM
Liu B
Quan G
Wang Y
Yiu SM
Zhu X
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

Because high-throughput sequencing (HTS) technologies produce short reads, de novo assembly, which plays a significant role in many applications, remains a great challenge. Most state-of-the-art approaches are based on the de Bruijn graph strategy or the overlap-layout strategy. However, these approaches, which depend on k-mers or read overlaps, do not fully utilize the information in paired-end and single-end reads when resolving branches. Because they treat all single-end reads whose overlap length exceeds a fixed threshold equally, they fail to favor the more confident reads with long overlaps and mix them with reads that have relatively short overlaps. Moreover, these approaches have not been specially designed to handle tandem repeats (repeats that occur adjacently in the genome), and they usually break contigs near tandem repeats. We present PERGA (Paired-End Reads Guided Assembler), a novel sequence-reads-guided de novo assembly approach that adopts a greedy-like prediction strategy for assembling reads into contigs and scaffolds, using paired-end reads and different read overlap sizes ranging from Omax to Omin to resolve gaps and branches. By constructing a decision model with a machine learning approach based on branch features, PERGA can determine the correct extension in 99.7% of cases. When the correct extension cannot be determined, PERGA tries to extend the contig by all feasible extensions and determines the correct extension using a look-ahead approach. Many branches that are difficult to resolve are due to tandem repeats that are close together in the genome. PERGA detects the different copies of such repeats to resolve the branches, making the extension much longer and more accurate. We evaluated PERGA on both real and simulated Illumina datasets, ranging from small bacterial genomes to a large human chromosome, and it constructed longer and more accurate contigs and scaffolds than other state-of-the-art assemblers.
PERGA can be freely downloaded at https://github.com/hitbio/PERGA.
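
The idea of preferring long overlaps over short ones can be illustrated with a toy sketch. This is not PERGA's implementation: the SVM decision model, paired-end guidance, and look-ahead step are not reproduced, and the names `extend_once`, `assemble`, `o_max`, `o_min`, and `max_len` are hypothetical. The sketch assumes error-free reads and only considers reads whose prefix matches the contig suffix exactly.

```python
from collections import Counter


def extend_once(contig, reads, o_max, o_min):
    """Return the next base supported by reads, or None if no clear choice.

    Overlap sizes are tried from o_max down to o_min, so reads with
    longer (more confident) overlaps are consulted first.
    """
    if o_min < 1:
        raise ValueError("o_min must be at least 1")
    for overlap in range(min(o_max, len(contig)), o_min - 1, -1):
        suffix = contig[-overlap:]
        # Each read whose prefix equals the contig suffix votes for
        # the base that follows the overlap.
        votes = Counter(
            read[overlap]
            for read in reads
            if len(read) > overlap and read.startswith(suffix)
        )
        if votes:
            (base, count), *rest = votes.most_common(2)
            if rest and rest[0][1] == count:
                return None  # tie: an ambiguous branch this toy cannot resolve
            return base
    return None


def assemble(seed, reads, o_max, o_min, max_len=10_000):
    """Greedily extend seed one base at a time until no extension is found."""
    contig = seed
    while len(contig) < max_len:  # guard against looping forever in repeats
        base = extend_once(contig, reads, o_max, o_min)
        if base is None:
            break
        contig += base
    return contig
```

For example, calling `assemble` with the seed `"ACGTA"` and the five-base reads covering `"ACGTACCTGA"` walks the contig forward one base per step until no read overlaps its end. A tie between candidate bases stops the extension, which is the point where a real assembler would need additional evidence.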

Directory of Open Access Journals

PubMed Central

HKU Scholars Hub

FigShare

A hybrid method for the exact planted (l, d) motif finding problem and its parallelization

Author: A Brazma
A Price
AM Carvalho
C Huang
C Lawrence
C Lawrence
CJ McInerny
D Gusfield
D Sharma
DJ Galas
E Eskin
E Wingender
FYL Chin
GZ Hertz
H Dinh
Hazem M Bahig
HM Bahig
I Rigoutsos
J Blanchette
J Buhler
J Davila
J Davila
J Van Helden
J Zhu
JM Cherry
L Marsan
M Blanchette
M Gelfand
M Tompa
MF Sagot
MM Abbas
Mohamed Abouelhoda
Mostafa M Abbas
MS Waterman
N Pisanti
P Pevzner
PA Evans
R Staden
S Natesan
S Rajasekaran
S Sinha
T Bailey
Y Fraenkel
Publication venue: 'Springer Science and Business Media LLC'

Crossref

A 1-local 13/9-competitive algorithm for multicoloring hexagonal graphs

Author: Chin FYL
Zhang Y
Zhu H
Publication venue: Germany
Publication date: 01/01/2007

In the frequency allocation problem, we are given a mobile telephone network, whose geographical coverage area is divided into cells, wherein phone calls are serviced by assigning frequencies to them so that no two calls emanating from the same or neighboring cells are assigned the same frequency. The problem is to use the frequencies efficiently, i.e., minimize the span of frequencies used. The frequency allocation problem can be regarded as a multicoloring problem on a weighted hexagonal graph. In this paper, we give a 1-local 4/3-competitive distributed algorithm for multicoloring a triangle-free hexagonal graph, which is a special case. Based on this result, we then propose a 1-local 13/9-competitive algorithm for multicoloring the (general-case) hexagonal graph, thereby improving the previous 1-local 3/2-competitive algorithm. © Springer-Verlag Berlin Heidelberg 2007.

HKU Scholars Hub

A Polynomial Time Solution for Labeling a Rectilinear Map

Author: Chin FYL
Poon CK
Zhu BH
Publication venue: ACM Press.
Publication date: 01/01/1997

HKU Scholars Hub

Online OVSF code assignment with resource augmentation

Author: Chin FYL
Zhang Y
Zhu H
Publication venue: Germany
Publication date: 01/01/2007

Orthogonal Variable Spreading Factor (OVSF) code assignment is a fundamental problem in Wideband Code-Division Multiple-Access (W-CDMA) systems, which play an important role in third-generation mobile communications. In the OVSF problem, codes must be assigned to incoming code requests, with different data rate requirements, in such a way that they are mutually orthogonal with respect to an OVSF code tree. An OVSF code tree is a complete binary tree in which each node represents a code associated with the combined bandwidths of its two children. To be mutually orthogonal, each leaf-to-root path must contain at most one assigned code. In this paper, we focus on the online version of the OVSF code assignment problem, in the often-studied context of the single cell as well as in the more general context of the whole multi-cell cellular network (for which there are no known results). With 1/8 and 11/8 extra bandwidth resources, respectively, we give a 5-competitive algorithm for the single-cell and the multi-cell contexts; the competitive ratio is thus a constant rather than a function of the height of the OVSF tree, which improves upon past results. © Springer-Verlag Berlin Heidelberg 2007.
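
The orthogonality condition stated in this abstract (each leaf-to-root path contains at most one assigned code) can be checked with a short sketch. It does not reproduce the paper's online algorithm. It assumes the code tree is stored in heap numbering (root 1, children of node i are 2i and 2i+1), and the function names `ancestors`, `is_orthogonal`, and `can_assign` are hypothetical.

```python
def ancestors(node):
    """Yield the proper ancestors of a node in heap numbering (root = 1)."""
    while node > 1:
        node //= 2
        yield node


def is_orthogonal(assigned):
    """Return True if no assigned code is an ancestor of another one.

    This is equivalent to every leaf-to-root path containing at most
    one assigned code.
    """
    assigned = set(assigned)
    return not any(
        anc in assigned for node in assigned for anc in ancestors(node)
    )


def can_assign(assigned, node):
    """Return True if adding `node` keeps the assignment orthogonal."""
    assigned = set(assigned)
    if node in assigned:
        return False
    return is_orthogonal(assigned | {node})
```

With node 2 assigned, node 3 is still available because it lies in the other subtree, while node 5 is blocked because node 2 is its parent.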

HKU Scholars Hub

Online OVSF code assignment with resource augmentation

Author: Zhang Y
Zhu H
Chin FYL
Publication venue: Germany
Publication date: 01/01/1998

Digital Repository @ Iowa State University (ISU)

Crossref

HKU Scholars Hub

Approximate and dynamic rank aggregation

Author: Chin FYL
Deng X
Fang Q
Zhu S
Publication venue: 'Elsevier BV'
Publication date: 01/01/2004

Rank aggregation, originally an important issue in social choice theory, has become increasingly important in information retrieval applications over the Internet, such as meta-search and recommendation systems. In this work, we consider an aggregation function using a weighted version of the normalized Kendall-τ distance. We propose a polynomial time approximation scheme, as well as a practical heuristic algorithm with an approximation ratio of two, for the NP-hard problem. In addition, we discuss issues and models for the dynamic rank aggregation problem. © 2004 Elsevier B.V. All rights reserved.
9th International Computing and Combinatorics Conference, Big Sky, MT, 25-28 July 2003. In Theoretical Computer Science, 2004, v. 325 n. 3, p. 409-42
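
For readers unfamiliar with the distance named in this abstract, the following sketch computes the standard unweighted normalized Kendall-τ distance between two full rankings of the same items. The weighted variant used in the paper is not specified in this listing and is not reproduced here. The function name `normalized_kendall_tau` is hypothetical.

```python
from itertools import combinations


def normalized_kendall_tau(rank_a, rank_b):
    """Fraction of item pairs that the two rankings order differently.

    Both arguments are lists of the same distinct items, best first.
    The result is 0.0 for identical rankings and 1.0 for reversed ones.
    """
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    n = len(rank_a)
    if len(pos_a) != n or len(rank_b) != n or pos_a.keys() != pos_b.keys():
        raise ValueError("rankings must be permutations of the same items")
    if n < 2:
        return 0.0
    # A pair is discordant when its relative order differs between rankings.
    discordant = sum(
        1
        for x, y in combinations(rank_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
    return discordant / (n * (n - 1) / 2)
```

An aggregation function in this setting would typically pick a ranking that minimizes the sum of such distances to the input rankings.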

Elsevier - Publisher Connector

HKU Scholars Hub

Greedy online frequency allocation in cellular networks

Author: Chan JWT
Chin FYL
Ye D
Zhang Y
Zhu H
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007

The online frequency allocation problem for cellular networks has been well studied in recent years. Given a mobile telephone network, whose geographical coverage area is divided into cells, phone calls are served by assigning frequencies to them, and no two calls emanating from the same or neighboring cells are assigned the same frequency. Assuming an online setting in which the calls arrive one by one, the problem is to minimize the span of the frequencies used. In this paper, we study the greedy approach for the online frequency allocation problem, which assigns the minimal available frequency to a new call so that the call does not interfere with calls of the same cell or neighboring cells. If the calls have infinite duration, the competitive ratio of the greedy algorithm has a tight upper bound of 17/7, which closes the gap of [17/7, 2.5) in [I. Caragiannis, C. Kaklamanis, E. Papaioannou, Efficient on-line frequency allocation and call control in cellular networks, Theory Comput. Syst. 35 (5) (2002) 521-543. A preliminary version of the paper appeared in SPAA 2000]. If the calls have finite duration, i.e., each call may be terminated at some time, the competitive ratio of the greedy algorithm has a tight upper bound of 3. © 2006 Elsevier B.V. All rights reserved.
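
The greedy rule described in this abstract (give each new call the smallest frequency not in use in its own cell or a neighboring cell) can be sketched as follows. This is a minimal illustration, not the paper's analysis: the class name `GreedyAllocator`, its methods, and the adjacency dictionary are hypothetical, frequencies are assumed to be positive integers, and the hexagonal layout is not modeled beyond the neighbor lists supplied by the caller.

```python
from collections import defaultdict


class GreedyAllocator:
    """Assign each arriving call the lowest non-interfering frequency."""

    def __init__(self, neighbors):
        # neighbors: dict mapping a cell to an iterable of adjacent cells
        self.neighbors = neighbors
        self.in_use = defaultdict(set)  # cell -> frequencies of active calls

    def assign(self, cell):
        """Serve a new call in `cell` and return its frequency."""
        blocked = set(self.in_use[cell])
        for nb in self.neighbors.get(cell, ()):
            blocked |= self.in_use[nb]
        freq = 1
        while freq in blocked:
            freq += 1
        self.in_use[cell].add(freq)
        return freq

    def release(self, cell, freq):
        """End a call (the finite-duration case), freeing its frequency."""
        self.in_use[cell].discard(freq)
```

The `release` method covers the finite-duration setting mentioned in the abstract; in the infinite-duration setting it is simply never called.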

HKU Scholars Hub

Maximizing Throughput in Energy-Harvesting Sensor Nodes

Author: A Borodin
A Kesselman
A Zhu
C Moser
FYL Chin
FYL Chin
H Wang
H Wang
J Lei
M Englert
MH Goldwasser
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

We consider an online throughput maximization problem in sensor nodes that can harvest energy. The sensor nodes generate and forward packets, which cost energy; they can also harvest energy from the environment, but the amount of energy that can be harvested is not known in advance. We give a number of algorithms and lower bounds for the case of a single node. We consider both the general case and some types of ‘non-idling’ adversaries where we can get better bounds. We also consider the case of networks with multiple nodes and demonstrate that some very simple scenarios already admit no competitive algorithms.

Crossref

Leicester Research Archive

Comparison-Based Buffer Management in QoS Switches

Author: A Kesselman
A Kesselman
A Zhu
D Sleator
FYL Chin
FYL Chin
Kamal Al-Bawani
M Bienkowski
M Chrobak
M Englert
M Englert
Matthias Englert
Matthias Westermann
MH Goldwasser
N Reingold
TH Cormen
V Paxson
WA Aiello
Ł Jeż
Publication venue: 'Springer Science and Business Media LLC'

Crossref