MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph

Author: Lam Tak-Wah
Li Dinghua
Liu Chi-Man
Luo Ruibang
Sadakane Kunihiko
Publication date: 23/12/2014

MEGAHIT is an NGS de novo assembler for assembling large and complex metagenomics data in a time- and cost-efficient manner. It finished assembling a soil metagenomics dataset of 252 Gbps in 44.1 hours and 99.6 hours on a single computing node with and without a GPU, respectively. MEGAHIT assembles the data as a whole, i.e., it avoids pre-processing steps such as partitioning and normalization, which might compromise the integrity of the results. MEGAHIT generates an assembly three times larger than the previous assembly, with a longer contig N50 and a longer average contig length. 55.8% of the reads were aligned to the assembly, four times the proportion aligned to the previous assembly. The source code of MEGAHIT is freely available at https://github.com/voutcn/megahit under the GPLv3 license.

Comment: 2 pages, 2 tables, 1 figure, submitted to Oxford Bioinformatics as an Application Note

arXiv.org e-Print Archive

HKU Scholars Hub

Potential Uses of Wild Germplasms of Grain Legumes for Crop Improvement

Author: Ailin Liu
Hon-Ming Lam
Leo Kan
Man-Wah Li
Muñoz Nacira Belen
Publication venue: 'MDPI AG'
Publication date: 01/01/2017

Challenged by population increase, climatic change, and soil deterioration, crop improvement is always a priority in securing food supplies. Although the production of grain legumes is in general lower than that of cereals, the nutritional value of grain legumes makes them important components of food security. Nevertheless, limited by severe genetic bottlenecks during domestication and human selection, grain legumes, like other crops, have suffered from a loss of the genetic diversity that is essential for providing genetic materials for crop improvement programs. As illustrated by whole-genome sequencing, wild relatives of crops adapted to various environments have been shown to maintain high genetic diversity. In this review, we focus on nine important grain legumes (soybean, peanut, pea, chickpea, common bean, lentil, cowpea, lupin, and pigeonpea) to discuss the potential uses of their wild relatives as genetic resources for crop breeding and improvement, and summarize the various genetic/genomic approaches adopted for these purposes.

Instituto de Fisiología y Recursos Genéticos Vegetales

Affiliations:
Muñoz, Nacira Belen: Chinese University of Hong Kong, Centre for Soybean Research of the Partner State Key Laboratory of Agrobiotechnology and School of Life Sciences, China; Instituto Nacional de Tecnología Agropecuaria (INTA), Instituto de Fisiología y Recursos Genéticos Vegetales, Argentina; Universidad Nacional de Córdoba, Facultad de Ciencias Exactas Físicas y Naturales, Cátedra de Fisiología Vegetal, Argentina
Ailin, Liu: Chinese University of Hong Kong, Centre for Soybean Research of the Partner State Key Laboratory of Agrobiotechnology and School of Life Sciences, China
Leo, Kan: Chinese University of Hong Kong, Centre for Soybean Research of the Partner State Key Laboratory of Agrobiotechnology and School of Life Sciences, China
Man-Wah, Li: Chinese University of Hong Kong, Centre for Soybean Research of the Partner State Key Laboratory of Agrobiotechnology and School of Life Sciences, China
Hon-Ming, Lam: Chinese University of Hong Kong, Centre for Soybean Research of the Partner State Key Laboratory of Agrobiotechnology and School of Life Sciences, China

Repositorio Institucional – Biblioteca Digital

BASE: a practical de novo assembler for large genomes using long NGS reads

Author: Binghang Liu
Chi-Man Liu
Dinghua Li
Hing-Fung Ting
Ruibang Luo
Siu-Ming Yiu
Tak-Wah Lam
Yingrui Li
Publication venue: Springer Nature
Publication date: 01/01/2016

© 2016 The Author(s). Background: De novo genome assembly using NGS data remains a computation-intensive task, especially for large genomes. In practice, efficiency is often a primary concern and favors using a more efficient assembler like SOAPdenovo2. Yet SOAPdenovo2, which is based on the de Bruijn graph, fails to take full advantage of longer NGS reads (say, 150 bp to 250 bp from Illumina HiSeq and MiSeq). Assemblers that are based on string graphs (e.g., SGA), though less popular and also very slow, are more favorable for longer reads. Methods: This paper presents a new de novo assembler called BASE. It enhances the classic seed-extension approach by indexing the reads efficiently to generate adaptive seeds that have a high probability of appearing uniquely in the genome. Such seeds form the basis for BASE to build extension trees and then to use reverse validation to remove branches based on read coverage and paired-end information, resulting in high-quality consensus sequences of reads sharing the seeds. Such consensus sequences are then extended to contigs. Results: Experiments on two bacterial and four human datasets show the advantage of BASE in both contig quality and speed in dealing with longer reads. In the experiment on bacteria, two datasets with read lengths of 100 bp and 250 bp were used. Especially for the 250 bp dataset, BASE gives much better quality than SOAPdenovo2 and SGA and is similar to SPAdes. Regarding speed, BASE is consistently a few times faster than SPAdes and SGA, but still slower than SOAPdenovo2. BASE and SOAPdenovo2 are further compared using human datasets with read lengths of 100 bp, 150 bp and 250 bp. BASE shows a higher N50 for all datasets, and the improvement becomes more significant when the read length reaches 250 bp. Besides, BASE is more memory-efficient than SOAPdenovo2 when the sequencing data contain errors. Conclusions: BASE is a practically efficient tool for constructing contigs, with significant improvement in quality for long NGS reads. It is relatively easy to extend BASE to include scaffolding.

Springer - Publisher Connector

PubMed Central

HKU Scholars Hub

Molecular Responses to Osmotic Stresses in Soybean

Author: Ching Chan
Chun-Chiu Cheng
Fuk-Ling Wong
Hon-Ming Lam
Man-Wah Li
Tsui-Hung Phang
Publication venue: 'IntechOpen'
Publication date: 11/04/2011

IntechOpen

Rice Hypersensitive Induced Reaction Protein 1 (OsHIR1) associates with plasma membrane and triggers hypersensitive cell death

Author: Cheung Ming-Yan
Fu Yaping
Lam Hon-Ming
Li Man-Wah
Sun Sai-Ming
Sun Zongxiu
Zhou Liang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: In plants, HIR (Hypersensitive Induced Reaction) proteins, members of the PID (Proliferation, Ion and Death) superfamily, have been shown to play a part in the development of spontaneous hypersensitive response lesions in leaves, in reaction to pathogen attacks. The levels of HIR proteins were shown to correlate with localized host cell deaths and defense responses in maize and barley. However, not much was known about the HIR proteins in rice. Since rice is an important cereal crop consumed by more than 50% of the populations in Asia and Africa, it is crucial to understand the mechanisms of disease responses in this plant. We previously identified the rice HIR1 (OsHIR1) as an interacting partner of the OsLRR1 (rice Leucine-Rich Repeat protein 1). Here we show that OsHIR1 triggers hypersensitive cell death and its localization to the plasma membrane is enhanced by OsLRR1. Results: Through electron microscopy studies using wild type rice plants, OsHIR1 was found to mainly localize to the plasma membrane, with a minor portion localized to the tonoplast. Moreover, the plasma membrane localization of OsHIR1 was enhanced in transgenic rice plants overexpressing its interacting protein partner, OsLRR1. Co-localization of OsHIR1 and OsLRR1 to the plasma membrane was confirmed by double-labeling electron microscopy. Pathogen inoculation studies using transgenic Arabidopsis thaliana expressing either OsHIR1 or OsLRR1 showed that both transgenic lines exhibited increased resistance toward the bacterial pathogen Pseudomonas syringae pv. tomato DC3000. However, OsHIR1 transgenic plants produced more extensive spontaneous hypersensitive response lesions and contained lower titers of the invading pathogen, when compared to OsLRR1 transgenic plants. Conclusions: The OsHIR1 protein is mainly localized to the plasma membrane, and its subcellular localization in that compartment is enhanced by OsLRR1. The expression of OsHIR1 may sensitize the plant so that it is more prone to HR and hence can react more promptly to limit the invading pathogens' spread from the infection sites.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

SOAP3-dp: Fast, Accurate and Sensitive GPU-based Short Read Aligner

Author: Chang Yu
Chi-Man Liu
David W Cheung
Edward Wu
Haoxiang Lin
Hing-Fung Ting
Jianqiao Zhu
Lap-Kei Lee
Ruibang Luo
Ruiqiang Li
Shaoliang Peng
Siu-Ming Yiu
Tak-Wah Lam
Thomas Wong
Wenjuan Zhu
Xiaoqian Zhu
Yingrui Li
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

To tackle the exponentially increasing throughput of Next-Generation Sequencing (NGS), most of the existing short-read aligners can be configured to favor speed at the expense of accuracy and sensitivity. SOAP3-dp, by leveraging the computational power of both CPU and GPU with optimized algorithms, delivers high speed and sensitivity simultaneously. Compared with widely adopted aligners including BWA, Bowtie2, SeqAlto, GEM and GPU-based aligners including BarraCUDA and CUSHAW, SOAP3-dp is two to tens of times faster, while maintaining the highest sensitivity and lowest false discovery rate (FDR) on Illumina reads with different lengths. Unlike its predecessor SOAP3, which does not allow gapped alignment, SOAP3-dp by default tolerates alignment similarity as low as 60 percent. Real data evaluation using the human genome demonstrates SOAP3-dp's power to enable more authentic variants and longer indels to be discovered. Fosmid sequencing shows a 9.1 percent FDR on newly discovered deletions. SOAP3-dp natively supports the BAM file format and provides the same scoring scheme as BWA, which enables it to be integrated into existing analysis pipelines. SOAP3-dp has been deployed on Amazon-EC2, NIH-Biowulf and Tianhe-1A.

Comment: 21 pages, 6 figures, submitted to PLoS ONE, additional files available at "https://www.dropbox.com/sh/bhclhxpoiubh371/O5CO_CkXQE". Comments most welcome

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

HKU Scholars Hub

FigShare

Drought Stress and Tolerance in Soybean

Author: Au-Yeung Wan-Kin
Ku Yee-Shan
Lam Hon-Ming
Li Man-Wah
Liu Xueyi
Wen Chao-Qing
Yung Yuk-Lin
Publication venue: 'IntechOpen'
Publication date: 02/01/2013

IntechOpen

Crossref

Laboratory of Public Governance Indicators: a proposal to measure the effectiveness of spending on municipal public security

Author: Chun Jason Xue
Dai Kui Wang
Lit Man Poon
Pui Ngan Lau
Wei Shen
Wing Tai Cheung
Yi Min Liang
Yun Wah Lam
Zhou Fang Li
Publication venue: Proex/Unila
Publication date: 01/01/2013

Proceedings of the 35º Seminário de Extensão Universitária da Região Sul (35th University Extension Seminar of the Southern Region) - Thematic area: Education

Pressure for greater transparency and accountability has been the driving force behind many changes in the public sector. However, there appears to be difficulty in putting these concepts into practice in the area of public security. This work presents some initiatives of the Laboratório de Indicadores de Governança Pública (Laboratory of Public Governance Indicators) at CESFI-UDESC in creating indicators of the effectiveness of public security spending by the municipalities of the State of Santa Catarina. The work presents what has been done so far and the challenges in measuring public policy actions for this area

Directory of Open Access Journals

PubMed Central

Repositório Institucional da UNILA

HKU Scholars Hub

FigShare

Bleeding-Related Hospital Admissions and 30-Day Re-Admissions in Patients with Nonvalvular Atrial Fibrillation Treated with Dabigatran versus Warfarin

Author: Chan Esther W
Lau Wallis C Y
Leung Wai K
Li Xue
Lip Gregory Y H
Man Kenneth K C
Siu Chung-Wah
Wong Ian C K
Publication venue: 'Wiley'
Publication date: 01/01/2017

VBN

MICA: A fast short-read aligner that takes full advantage of Many Integrated Core Architecture (MIC)

Author: Chan Sze-Hang
Cheung Jeanno
He Guangzhu
Lam Tak-Wah
Law Wai-Chun
Li Ruiqiang
Li Yingrui
Liu Chi-Man
Luo Ruibang
Peng Shaoliang
Wang Heng
Wang Jun
Wu Edward
Yu Chang
Zhou Dazong
Zhu Xiaoqian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Background: Short-read aligners have recently gained a lot of speed by exploiting the massive parallelism of GPUs. An emerging alternative to the GPU is Intel MIC; Tianhe-2, the supercomputer currently at the top of the TOP500 list, is built with 48,000 MIC boards to offer ~55 PFLOPS. The CPU-like architecture of MIC allows CPU-based software to be parallelized easily; however, the performance is often inferior to that of GPU counterparts, as an MIC card contains only ~60 cores (while a GPU card typically has over a thousand cores). Results: To better utilize MIC-enabled computers for NGS data analysis, we developed a new short-read aligner, MICA, that is optimized in view of MIC's limitations and the extra parallelism inside each MIC core. By utilizing the 512-bit vector units in the MIC and implementing a new seeding strategy, experiments on aligning 150 bp paired-end reads show that MICA using one MIC card is 4.9 times faster than BWA-MEM (using 6 cores of a top-end CPU), and slightly faster than SOAP3-dp (using a GPU). Furthermore, MICA's simplicity allows very efficient scale-up when multiple MIC cards are used in a node (3 cards give a 14.1-fold speedup over BWA-MEM). Summary: MICA can be readily used by MIC-enabled supercomputers for production purposes. We have tested MICA on Tianhe-2 with 90 WGS samples (17.47 tera-bases), which can be aligned in an hour using 400 nodes. MICA has impressive performance even though MIC is only in its initial stage of development. Availability and implementation: MICA's source code is freely available at http://sourceforge.net/projects/mica-aligner under GPL v3. Supplementary information: Supplementary information is available as "Additional File 1". Datasets are available at www.bio8.cs.hku.hk/dataset/mica.

Crossref

PubMed Central

HKU Scholars Hub

University of Queensland eSpace