The costs of traumatic brain injury due to motorcycle accidents in Hanoi, Vietnam

Author: Doran Christopher M
Hill Peter S
Hoang Hanh TM
Nguyen Phuong K
Pham Tran L
Vo Thuy TN
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Road traffic accidents are the leading cause of fatal and non-fatal injuries in Vietnam. The purpose of this study is to estimate the costs, in the first year post-injury, of non-fatal traumatic brain injury (TBI) in motorcycle users not wearing helmets in Hanoi, Vietnam. The costs are calculated from the perspective of the injured patients and their families, and include quantification of direct, indirect and intangible costs, using years lost due to disability as a proxy. Methods: The study was a retrospective cross-sectional study. Data on treatment and rehabilitation costs, employment and support were obtained from patients and their families using a structured questionnaire and the European Quality of Life instrument (EQ6D). Results: Thirty-five patients and their families were interviewed. On average, patients with severe, moderate and minor TBI incurred direct costs of USD 2,365, USD 1,390 and USD 849, with time lost from normal activities averaging 54 weeks, 26 weeks and 17 weeks and years lived with disability (YLD) of 0.46, 0.25 and 0.15 years, respectively. Conclusion: All three component costs of TBI were high; the direct cost accounted for the largest proportion, with costs rising with the severity of TBI. The results suggest that the burden of TBI can be catastrophic for families because of high direct costs, significant time off work for patients and caregivers, and impact on health-related quality of life. Further research is warranted to explore the actual social and economic benefits of mandatory helmet use.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Queensland eSpace

Effective selection of informative SNPs and classification on the HapMap genotype data

Author: A Gusev
B Halldórsson
B Wu
E Halperin
I Guyon
I Levner
J Devore
J Jaeger
J Park
L Wang
Lipo Wang
LP Wang
LP Wang
M Stephens
NA Rosenberg
NA Rosenberg
Nina Zhou
R Tibshirani
S Wright
TM Phuong
V Bafna
V Vapnik
WM Trochim
Y Su
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Since single nucleotide polymorphisms (SNPs) are genetic variations which determine the difference between any two unrelated individuals, SNPs can be used to identify the correct source population of an individual. For efficient population identification with the HapMap genotype data, as few informative SNPs as possible are required from the original 4 million SNPs. Recently, Park et al. (2006) adopted the nearest shrunken centroid method to classify the three populations, i.e., Utah residents with ancestry from Northern and Western Europe (CEU), Yoruba in Ibadan, Nigeria in West Africa (YRI), and Han Chinese in Beijing together with Japanese in Tokyo (CHB+JPT), from which 100,736 SNPs were obtained and the top 82 SNPs could completely classify the three populations. Results: In this paper, we propose to first rank each feature (SNP) using a ranking measure, i.e., a modified t-test or F-statistics. Then from the ranking list, we form different feature subsets by sequentially choosing different numbers of features (e.g., 1, 2, 3, ..., 100) with top ranking values, train and test them with a classifier, e.g., the support vector machine (SVM), thereby finding the subset which has the highest classification accuracy. Compared to the classification method of Park et al., we obtain a better result, i.e., good classification of the 3 populations using on average 64 SNPs. Conclusion: Experimental results show that both the modified t-test and the F-statistics method are very effective in ranking SNPs by their classification capability. Combined with the SVM classifier, a desirable feature subset (with minimum size and maximum informativeness) can be quickly found in a greedy manner after ranking all SNPs. Our method is able to identify a very small number of important SNPs that can determine the populations of individuals.
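The rank-then-select procedure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses a plain one-way F-statistic rather than their modified measures, and it swaps the SVM for a nearest-centroid classifier scored by leave-one-out accuracy so that the example needs only the standard library. The function names and the toy genotype matrix are hypothetical.

```python
from statistics import mean

def f_statistic(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values()) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g) / (n - k)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within

def rank_features(X, y):
    """Return feature indices sorted from most to least discriminative."""
    scores = [f_statistic([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in for the SVM)."""
    correct = 0
    for i, row in enumerate(X):
        centroids = {}
        for label in set(y):
            train = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            centroids[label] = [mean(t[j] for t in train) for j in features]
        predicted = min(
            centroids,
            key=lambda c: sum((row[j] - m) ** 2 for j, m in zip(features, centroids[c])),
        )
        correct += predicted == y[i]
    return correct / len(X)

def best_subset(X, y, max_k):
    """Try the top-1, top-2, ..., top-max_k features; keep the smallest best subset."""
    ranking = rank_features(X, y)
    best, best_acc = [], -1.0
    for k in range(1, max_k + 1):
        acc = loo_accuracy(X, y, ranking[:k])
        if acc > best_acc:  # strict comparison keeps the smaller subset on ties
            best, best_acc = ranking[:k], acc
    return best, best_acc

# Hypothetical toy data: 6 individuals, 3 SNPs coded 0/1/2, 3 populations.
X = [[0, 2, 1], [0, 1, 0], [1, 0, 1], [1, 0, 2], [2, 1, 1], [2, 1, 0]]
y = ["A", "A", "B", "B", "C", "C"]
print(rank_features(X, y))
print(best_subset(X, y, max_k=3))
```

In this toy data the first SNP separates the three populations on its own, so the search stops improving after a single feature. On real genotype data the accuracy curve over k would be noisier and would normally be estimated with held-out samples.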

Crossref

Directory of Open Access Journals

PubMed Central

DR-NTU (Digital Repository of NTU)

Optimizing substitution matrix choice and gap parameters for sequence alignment

Author: CB Do
CB Do
CN Dewey
D Gusfield
DT Jones
E Kim
G Blackshields
GA Price
GH Gonnet
I Van Walle
J Flannick
J Kececioglu
J Pei
JD Thompson
JD Thompson
JG Henikoff
K Katoh
M Box
MA Larkin
MO Dayhoff
MP Styczynski
MS Waterman
O Chapelle
RC Edgar
RC Edgar
Robert C Edgar
S Henikoff
T Lassmann
T Muller
T Muller
TM Phuong
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: While substitution matrices can readily be computed from reference alignments, it is challenging to compute optimal or approximately optimal gap penalties. It is also not well understood which substitution matrices are the most effective when alignment accuracy is the goal rather than homolog recognition. Here a new parameter optimization procedure, POP, is described and applied to the problems of optimizing gap penalties and selecting substitution matrices for pair-wise global protein alignments. Results: POP is compared to a recent method due to Kim and Kececioglu and found to achieve from 0.2% to 1.3% higher accuracies on pair-wise benchmarks extracted from BALIBASE. The VTML matrix series is shown to be the most accurate on several global pair-wise alignment benchmarks, with VTML200 giving best or close to the best performance in all tests. BLOSUM matrices are found to be slightly inferior, even with the marginal improvements in the bug-fixed RBLOSUM series. The PAM series is significantly worse, giving accuracies typically 2% less than VTML. Integer rounding is found to cause slight degradations in accuracy. No evidence is found that selecting a matrix based on sequence divergence improves accuracy, suggesting that the use of this heuristic in CLUSTALW may be ineffective. Using VTML200 is found to improve the accuracy of CLUSTALW by 8% on BALIBASE and 5% on PREFAB. Conclusion: The hypothesis that more accurate alignments of distantly related sequences may be achieved using low-identity matrices is shown to be false for commonly used matrix types. Source code and test data are freely available from the author's web site at http://www.drive5.com/pop.
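To make concrete what gap penalties control in pair-wise global alignment, the following Python sketch scores a global alignment with affine gaps (a Gotoh-style dynamic program). It is not POP and does not use the matrices named in the abstract: the match/mismatch scores stand in for a real substitution matrix, and the function name and default parameter values are assumptions chosen for illustration only.

```python
NEG_INF = float("-inf")

def global_affine_score(a, b, gap_open=3, gap_extend=1, match=2, mismatch=-1):
    """Best global alignment score with affine gap costs.

    A gap of length L costs gap_open + (L - 1) * gap_extend.
    match/mismatch are a toy stand-in for a substitution matrix.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] aligned to a gap; Y: b[j-1] aligned to a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + (i - 1) * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + (j - 1) * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] - gap_open,
                          X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open)
            Y[i][j] = max(M[i][j - 1] - gap_open,
                          Y[i][j - 1] - gap_extend,
                          X[i][j - 1] - gap_open)
    return max(M[n][m], X[n][m], Y[n][m])

# The same pair of sequences scores differently under different gap parameters.
print(global_affine_score("AAAA", "AA", gap_open=3, gap_extend=1))
print(global_affine_score("AAAA", "AA", gap_open=3, gap_extend=3))
```

Because the optimal alignment, and not just its score, can change with the gap parameters, choosing them well matters for alignment accuracy, which is the problem the abstract addresses.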

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A random forest approach to the detection of epistatic interactions in case-control studies

Author: A Bureau
A Collins
AG Heidema
AM Glazier
BA McKinney
CT Tsai
E Lander
HC Fung
J Hoh
J Marchini
J Millstein
J Simon-Sanchez
JH Moore
JK Pritchard
L Breiman
L Kruglyak
L Tiret
MD Ritchie
MP Martin
MR Nelson
N Chatterjee
NJ Risch
R Culverhouse
R Diaz-Uriarte
R Jiang
R Jiang
RJ Klein
RO Duda
Rui Jiang
SM Williams
TM Phuong
Wanwan Tang
Wenhui Fu
X Chen
Xuebing Wu
Y Ye
Y Zhang
YM Cho
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The key roles of epistatic interactions between multiple genetic variants in the pathogenesis of complex diseases notwithstanding, the detection of such interactions remains a great challenge in genome-wide association studies. Although some existing multi-locus approaches have shown success on small-scale case-control data, the "combinatorial explosion" curse prohibits their application to genome-wide analysis. It is therefore indispensable to develop new methods that are able to reduce the search space for epistatic interactions from an astronomical number of all possible combinations of genetic variants to a manageable set of candidates. Results: We studied case-control data from the viewpoint of binary classification. More precisely, we treated single nucleotide polymorphism (SNP) markers as categorical features and adopted the random forest to discriminate cases against controls. On the basis of the gini importance given by the random forest, we designed a sliding window sequential forward feature selection (SWSFS) algorithm to select a small set of candidate SNPs that could minimize the classification error and then statistically tested up to three-way interactions of the candidates. We compared this approach with three existing methods on three simulated disease models and showed that our approach is comparable to, sometimes more powerful than, the other methods. We applied our approach to a genome-wide case-control dataset for Age-related Macular Degeneration (AMD) and successfully identified two SNPs that were reported to be associated with this disease. Conclusion: Besides existing pure statistical approaches, we demonstrated the feasibility of incorporating machine learning methods into genome-wide case-control studies. The gini importance offers yet another measure for the associations between SNPs and complex diseases, thereby complementing existing statistical measures to facilitate the identification of epistatic interactions and the understanding of epistasis in the pathogenesis of complex diseases.
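As a rough illustration of the quantity behind gini importance, the Python sketch below computes the Gini impurity decrease obtained by splitting case-control labels on one categorical SNP. This is a deliberate simplification: a random forest accumulates such decreases over many randomized trees and splits, and the SWSFS selection step is not reproduced here. The function names and toy genotypes are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(genotypes, labels):
    """Impurity decrease from one multiway split on a categorical SNP."""
    groups = {}
    for g, y in zip(genotypes, labels):
        groups.setdefault(g, []).append(y)
    n = len(labels)
    weighted_child = sum(len(ys) / n * gini(ys) for ys in groups.values())
    return gini(labels) - weighted_child

# Hypothetical toy data: genotypes coded 0/1/2 for four individuals.
labels = ["case", "case", "control", "control"]
informative_snp = [0, 0, 2, 2]    # genotype tracks disease status
uninformative_snp = [0, 2, 0, 2]  # genotype unrelated to disease status
print(gini_decrease(informative_snp, labels))
print(gini_decrease(uninformative_snp, labels))
```

SNPs with a larger decrease separate cases from controls more cleanly, which is why such a score can be used to shortlist candidates before interactions are tested statistically.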

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Identification of microRNA-mRNA modules using microarray data

Author: A Cimmino
A Krek
A Subramanian
B Efron
B John
BM Bolstad
BP Lewis
C Welch
CJ Guo
D Bonci
David DF Ma
de Broek IV
Emily E. Bosco
F Wang
GA Calin
GA Calin
GK Smyth
H Zhang
I Satzger
J Lu
JG Joung
JG Joung
L Breiman
L He
L Kaufman
L Xia
M Megraw
Mark Lutherborrow
MR Segal
NC Gutierrez
Q Liu
R Edgar
RA Irizarry
RW Chen
S Bandyopadhyay
S Griffiths-Jones
SR Yoon
T Barrett
T Hastie
T Xu
T. Yoshida
TM Phuong
V. Jayaswal
Vivek Jayaswal
William Ritchie
XX Peng
Y Benjamini
Yee H Yang
Yuanyuan Xiao
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: MicroRNAs (miRNAs) are post-transcriptional regulators of mRNA expression and are involved in numerous cellular processes. Consequently, miRNAs are an important component of gene regulatory networks and an improved understanding of miRNAs will further our knowledge of these networks. There is a many-to-many relationship between miRNAs and mRNAs because a single miRNA targets multiple mRNAs and a single mRNA is targeted by multiple miRNAs. However, most of the current methods for the identification of regulatory miRNAs and their target mRNAs ignore this biological observation and focus on miRNA-mRNA pairs. Results: We propose a two-step method for the identification of many-to-many relationships between miRNAs and mRNAs. In the first step, we obtain miRNA and mRNA clusters using a combination of miRNA-target mRNA prediction algorithms and microarray expression data. In the second step, we determine the associations between miRNA clusters and mRNA clusters based on changes in miRNA and mRNA expression profiles. We consider the miRNA-mRNA clusters with statistically significant associations to be potentially regulatory and, therefore, of biological interest. Conclusions: Our method reduces the interactions between several hundred miRNAs and several thousand mRNAs to a few miRNA-mRNA groups, thereby facilitating a more meaningful biological analysis and a more targeted experimental validation.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Identification of Yeast Transcriptional Regulation Networks Using Multivariate Random Forests

Author: A Boorsma
A Bureau
A Gasch
A Kundaje
A Tanay
A Thalamuthu
B Futcher
C Koch
CJ McInerny
CT Harbison
D Das
D Das
E Segal
Eric P. Xing
H Althoefer
HM Bussemaker
HW Mewes
J Bähler
J Ernst
JD Hughes
JR Quinlan
JR Warner
JS Chang
KJ Archer
KL Lunetta
L Breiman
L Breiman
L Breiman
L Kaufman
M Kato
M Segal
Mark R. Segal
MB Eisen
N Zhang
P Sudarsanam
PT Spellman
R Diaz-Uriarte
R Tibshirani
RAM de Bruin
RJ Cho
S Chu
S Dudoit
S Keles
S Tavazoie
SA Burchett
SA Raithatha
TM Phuong
U Schlecht
Y Benjamini
Y Pilpel
Yuanyuan Xiao
Publication venue: Public Library of Science
Publication date: 01/06/2009

The recent availability of whole-genome scale data sets that investigate complementary and diverse aspects of transcriptional regulation has spawned an increased need for new and effective computational approaches to analyze and integrate these large scale assays. Here, we propose a novel algorithm, based on random forest methodology, to relate gene expression (as derived from expression microarrays) to sequence features residing in gene promoters (as derived from DNA motif data) and transcription factor binding to gene promoters (as derived from tiling microarrays). We extend the random forest approach to model a multivariate response as represented, for example, by time-course gene expression measures. An analysis of the multivariate random forest output reveals complex regulatory networks, which consist of cohesive, condition-dependent regulatory cliques. Each regulatory clique features homogeneous gene expression profiles and common motifs or synergistic motif groups. We apply our method to several yeast physiological processes: cell cycle, sporulation, and various stress conditions. Our technique displays excellent performance with regard to identifying known regulatory motifs, including high order interactions. In addition, we present evidence of the existence of an alternative MCB-binding pathway, which we confirm using data from two independent cell cycle studies and two other physiological processes. Finally, we have uncovered elaborate transcription regulation refinement mechanisms involving PAC and mRRPE motifs that govern essential rRNA processing. These include intriguing instances of differing motif dosages and differing combinatorial motif control that promote regulatory specificity in rRNA metabolism under differing physiological processes.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

progressiveMauve: Multiple Genome Alignment with Gene Gain, Loss and Rearrangement

Multiple genome alignment remains a challenging problem. Effects of recombination including rearrangement, segmental duplication, gain, and loss can create a mosaic pattern of homology even among closely related organisms. We describe a new method to align two or more genomes that have undergone rearrangements due to recombination and substantial amounts of segmental gain and loss (flux). We demonstrate that the new method can accurately align regions conserved in some, but not all, of the genomes, an important case not handled by our previous work. The method uses a novel alignment objective score called a sum-of-pairs breakpoint score, which facilitates accurate detection of rearrangement breakpoints when genomes have unequal gene content. We also apply a probabilistic alignment filtering method to remove erroneous alignments of unrelated sequences, which are commonly observed in other genome alignment methods. We describe new metrics for quantifying genome alignment accuracy which measure the quality of rearrangement breakpoint predictions and indel predictions. The new genome alignment algorithm demonstrates high accuracy in situations where genomes have undergone biologically feasible amounts of genome rearrangement, segmental gain and loss. We apply the new algorithm to a set of 23 genomes from the genera Escherichia, Shigella, and Salmonella. Analysis of whole-genome multiple alignments allows us to extend the previously defined concepts of core- and pan-genomes to include not only annotated genes, but also non-coding regions with potential regulatory roles. The 23 enterobacteria have an estimated core-genome of 2.46 Mbp conserved among all taxa and a pan-genome of 15.2 Mbp. We document substantial population-level variability among these organisms driven by segmental gain and loss. Interestingly, much variability lies in intergenic regions, suggesting that the Enterobacteriaceae may exhibit regulatory divergence. The multiple genome alignments generated by our software provide a platform for comparative genomic and population genomic studies. Free, open-source software implementing the described genome alignment approach is available from http://gel.ahabs.wisc.edu/mauve.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

OPUS - University of Technology Sydney

PubMed Central

Arboviral Etiologies of Acute Febrile Illnesses in Western South America, 2000–2007

Author: A Balmaseda
AC Brault
AC Morrison
Alberto Gianella
AM Powers
Amy C. Morrison
Ana Maria Morales
AR Bharti
BL Innis
BM Forshey
Brett M. Forshey
BW Johnson
C Domingo
C Ramal
Carolina Guevara
Claudio Rocha
César Madrid
DI Ortiz
DJ Gubler
DM Watts
DM Watts
Eduardo Gotuzzo
Efrain Vallejo
ER Caceda
F Rivas
G Chowell
G Kuno
GC Smith
HL Phuong
I Bosch
IM Rocco
IP Greene
James G. Olson
Jorge Vargas
JP Kondig
JS Mackenzie
JT Roehrig
Juan Perez
JW LeDuc
JW LeDuc
KA Tsetsarkin
KA Tsetsarkin
Kevin L. Russell
LI Spinsanti
Luis Beingolea
M Anishchenko
M Sihuincha
MA Johnson
MA Morales
Manuel Cespedes
MF Saeed
MG Bruce
MJ Turell
Monica Negrete
MS Oberste
MZ Ansari
Nicolas Aguayo
Nora Reyes
Patrick J. Blair
PV Aguilar
PV Aguilar
PV Aguilar
RB Tesh
RS Azevedo
RS Lanciotti
SB Halstead
SB Halstead
SC Cabezas
SC Weaver
SC Weaver
SC Weaver
SC Weaver
Scott B. Halstead
SE Robertson
SR Manock
T Kochel
Tadeusz J. Kochel
TM Yuill
V. Alberto Laguna-Torres
Victor Suarez
X de Lamballerie
Y Makino
Y Montoya
Publication venue: Public Library of Science
Publication date: 10/08/2010

Over recent decades, the variety and quantity of diseases caused by viruses transmitted to humans by mosquitoes and other arthropods (also known as arboviruses) have increased around the world. One difficulty in studying these diseases is that the symptoms are often nondescript, with patients reporting such symptoms as low-grade fever and headache. Our goal in this study was to use laboratory tests to determine the causes of such nondescript illnesses at sites in four countries in South America, focusing on arboviruses. We established a surveillance network in 13 locations in Ecuador, Peru, Bolivia, and Paraguay, where patient samples were collected and then sent to a central laboratory for testing. Between May 2000 and December 2007, blood serum samples were collected from more than 20,000 participants with fever, and recent arbovirus infection was detected in nearly one third of them. The most common viruses were dengue viruses (genus Flavivirus). We also detected infection by viruses from other genera, including Alphavirus and Orthobunyavirus. These data are important for understanding how such viruses might emerge as significant human pathogens.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central