Evaluation of next-generation sequencing software in mapping and assembly

Author: BinBin Wang
Rui Jiang
Suying Bao
WingKeung Kwan
Xu Ma
You-Qiang Song
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Next-generation high-throughput DNA sequencing technologies have steadily advanced sequence-based genomic research and novel biological applications, with the promise of sequencing DNA at unprecedented speed. These new non-Sanger-based technologies offer several advantages over traditional sequencing methods: higher sequencing speed, lower per-run cost and higher accuracy. However, reads from next-generation sequencing (NGS) platforms, such as 454/Roche, ABI/SOLiD and Illumina/Solexa, are usually short, which restricts the application of NGS platforms in genome assembly and annotation. We present an overview of the challenges these novel technologies face and illustrate various bioinformatics approaches to mapping and assembly. We then compare the performance of several programs in these two fields and provide advice on selecting suitable tools for specific biological applications.

Crossref

HKU Scholars Hub

Deciphering the mechanisms of genetic disorders by high throughput genomic data

Author: Bao Suying
鲍素莹
Publication venue: 'The University of Hong Kong Libraries'
Publication date: 01/01/2013

A new generation of non-Sanger-based sequencing technologies, so-called “next-generation” sequencing (NGS), has been changing the landscape of genetics at unprecedented speed. In particular, our capacity to decipher the genotypes underlying phenotypes, such as diseases, has never been greater. However, before NGS can be fully applied in medical genetics, researchers have to bridge the widening gap between the generation of massively parallel sequencing output and the capacity to analyze the resulting data. In addition, even when an effective NGS analysis yields a list of candidate genes with potential causal variants, pinpointing disease genes from that long list remains a challenge. The issue becomes especially difficult when the molecular basis of the disease is not fully elucidated. New NGS users are often bewildered by the plethora of options in mapping, assembly, variant calling and filtering programs and may not know how to compare these tools and choose the “right” ones. To get an overview of various bioinformatics approaches to mapping and assembly, a series of performance evaluations was conducted using both real and simulated NGS short reads. For NGS variant detection, the performance of the two most widely used toolkits, SAMtools and GATK, was assessed. Based on the results of this systematic evaluation, an NGS data processing and analysis pipeline was constructed. The pipeline proved successful, identifying a mutation (a frameshift deletion in Hnrnpa1, p.Leu181Valfs*6) related to congenital heart defect (CHD) in procollagen type IIA deficient mice. To prioritize risk genes for diseases, especially those with limited prior knowledge, a network-based gene prioritization model was constructed. It consists of two parts: network analysis of known disease genes (seed-based network strategy) and network analysis of differential expression (DE-based network strategy).
Case studies of various complex diseases and traits demonstrated that the DE-based network strategy can greatly outperform traditional gene expression analysis in predicting disease-causing genes. A series of simulations indicated that the DE-based strategy is especially valuable for diseases with limited prior knowledge, and that the model’s performance can be further improved by integrating it with the seed-based network strategy. Moreover, a successful application of the network-based gene prioritization model in an influenza host genetic study further demonstrated the model’s capacity to identify promising candidates and to mine new risk genes and pathways without bias toward current knowledge. In conclusion, an efficient NGS analysis framework for medical genetics has been constructed, spanning quality control and variant detection through result analysis and gene prioritization. The novelty of this framework lies in its attempt to prioritize risk genes for poorly characterized diseases by network analysis of known disease genes and differential expression data. The successful applications in detecting genetic factors associated with CHD and influenza host resistance demonstrated the efficacy of this framework, which may stimulate more applications of high-throughput genomic data in dissecting the genetic components of human disorders in the near future.

Biochemistry; Doctoral; Doctor of Philosophy

HKU Scholars Hub

Retraction Note: Evaluation of next-generation sequencing software in mapping and assembly

Author: BinBin Wang
Rui Jiang
SuYing Bao
WingKeung Kwan
Xu Ma
You-Qiang Song
Publication venue: 'Springer Science and Business Media LLC'

Crossref