99 research outputs found
Meta-analysis of haplotype-association studies: comparison of methods and empirical evaluation of the literature
BACKGROUND: Meta-analysis is a popular methodology in several fields of medical research, including genetic association studies. However, the methods used for meta-analysis of association studies that report haplotypes have not been studied in detail. In this work, methods for performing meta-analysis of haplotype association studies are summarized, compared and presented in a unified framework along with an empirical evaluation of the literature. RESULTS: We present multivariate methods that use summary-based data as well as methods that use binary and count data in a generalized linear mixed model framework (logistic regression, multinomial regression and Poisson regression). The methods presented here avoid the inflation of the type I error rate that can result from the traditional approach of comparing a haplotype against the remaining ones, and they can be fitted using standard software. Moreover, formal global tests are presented for assessing the statistical significance of the overall association. Although the methods presented here assume that the haplotypes are directly observed, they can be easily extended to allow for uncertainty in the haplotypes by weighting them by their probability. CONCLUSIONS: An empirical evaluation of the published literature, and a comparison against meta-analyses that use single nucleotide polymorphisms, suggests that meta-analyses of haplotypes include approximately half as many studies and produce significant results twice as often. We show that this excess of statistically significant results stems from the sub-optimal method of analysis used and that, in approximately half of the cases, the statistical significance is refuted if the data are properly re-analyzed. Illustrative examples of code are given in Stata, and it is anticipated that the methods developed in this work will be widely applied in the meta-analysis of haplotype association studies.
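The distinction the abstract draws between comparing a haplotype "against the remaining ones" and a joint analysis can be illustrated with a small Python sketch. The haplotype names and counts are hypothetical, and the choice of a common reference haplotype is an assumption made for illustration. It is not the Stata code the abstract refers to.

```python
# Hypothetical (cases, controls) chromosome counts for three haplotypes in one study.
counts = {"H1": (120, 150), "H2": (60, 50), "H3": (20, 30)}


def odds_ratio(case_a, control_a, case_b, control_b):
    """Odds ratio of group A versus group B from a 2x2 table of counts."""
    return (case_a * control_b) / (control_a * case_b)


def versus_rest(hap):
    """Traditional contrast: one haplotype against all the others pooled."""
    case_h, control_h = counts[hap]
    case_rest = sum(c for h, (c, _) in counts.items() if h != hap)
    control_rest = sum(k for h, (_, k) in counts.items() if h != hap)
    return odds_ratio(case_h, control_h, case_rest, control_rest)


def versus_reference(hap, reference="H1"):
    """Contrast against a single reference haplotype (assumed here to be H1)."""
    case_h, control_h = counts[hap]
    case_r, control_r = counts[reference]
    return odds_ratio(case_h, control_h, case_r, control_r)


for hap in ("H2", "H3"):
    print(hap, round(versus_rest(hap), 3), round(versus_reference(hap), 3))
```

With K haplotypes there are only K - 1 independent contrasts. Testing each haplotype against the pooled remainder therefore produces K dependent tests, whereas the reference-based contrasts can be estimated and tested jointly.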
Algorithms for incorporating prior topological information in HMMs: application to transmembrane proteins
BACKGROUND: Hidden Markov Models (HMMs) have been extensively used in computational molecular biology for modelling protein and nucleic acid sequences. In many applications, such as transmembrane protein topology prediction, incorporating a limited amount of topological information arising from biochemical experiments has proved a very useful strategy that remarkably increases the performance of even the top-scoring methods. However, no clear and formal explanation of algorithms that retain the probabilistic interpretation of the models has been presented so far in the literature. RESULTS: We present a simple method that allows the incorporation of prior topological information concerning the sequences at hand, while the HMMs retain their full probabilistic interpretation in terms of conditional probabilities. We present modifications to the standard Forward and Backward algorithms of HMMs, and we show explicitly how reliable predictions may arise from these modifications, using all the algorithms currently available for decoding HMMs. A similar procedure may be used during training, aiming at optimizing the labels of the HMM's classes, especially in cases such as transmembrane proteins where the labels of the membrane-spanning segments are inherently misplaced. We present an application of this approach by developing a method to predict the transmembrane regions of alpha-helical membrane proteins, trained on crystallographically solved data. We show that this method compares well against established algorithms presented in the literature, and that it is extremely useful in practical applications. CONCLUSION: The algorithms presented here are easily implemented in any kind of Hidden Markov Model, whereas the prediction method (HMM-TM) is freely available for academic users at , offering the most advanced decoding options currently available
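One way to read the modified Forward algorithm is as a standard Forward recursion in which states that contradict a known label are given zero probability at that position. The Python sketch below follows that reading. The two-state model, its parameters and the `allowed` argument are hypothetical illustrations, not the HMM-TM model or the authors' implementation.

```python
def forward(obs, states, start_p, trans_p, emit_p, allowed=None):
    """Forward algorithm with optional per-position label constraints.

    allowed[t] is None (no prior information) or the set of states that are
    permitted at position t. Returns the joint probability of the observations
    and of the hidden path agreeing with every constraint.
    """
    def ok(t, s):
        return allowed is None or allowed[t] is None or s in allowed[t]

    # Initialisation: states that contradict the prior label get probability 0.
    alpha = {s: (start_p[s] * emit_p[s][obs[0]] if ok(0, s) else 0.0) for s in states}
    # Recursion: the usual sum over predecessors, masked in the same way.
    for t in range(1, len(obs)):
        alpha = {
            s: (emit_p[s][obs[t]] * sum(alpha[r] * trans_p[r][s] for r in states)
                if ok(t, s) else 0.0)
            for s in states
        }
    return sum(alpha.values())


# Toy model (assumed): M = membrane, L = loop; h = hydrophobic, p = polar residue.
states = ("M", "L")
start_p = {"M": 0.5, "L": 0.5}
trans_p = {"M": {"M": 0.8, "L": 0.2}, "L": {"M": 0.3, "L": 0.7}}
emit_p = {"M": {"h": 0.9, "p": 0.1}, "L": {"h": 0.2, "p": 0.8}}
obs = "hhpp"

p_all = forward(obs, states, start_p, trans_p, emit_p)
# Prior knowledge: the second residue is known to lie in the membrane.
p_con = forward(obs, states, start_p, trans_p, emit_p, [None, {"M"}, None, None])
posterior = p_con / p_all  # P(second residue is in state M | observations)
print(p_all, p_con, posterior)
```

Because the constrained quantity is still a joint probability, dividing it by the unconstrained likelihood gives a proper conditional probability. This is the sense in which the model keeps its probabilistic interpretation.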
Evaluation of methods for predicting the topology of β-barrel outer membrane proteins and a consensus prediction method
BACKGROUND: Prediction of the transmembrane strands and topology of β-barrel outer membrane proteins is of interest in current bioinformatics research. Several methods have been applied so far for this task, utilizing different algorithmic techniques, and a number of freely available predictors exist. The methods can be broadly divided into those based on Hidden Markov Models (HMMs), on Neural Networks (NNs) and on Support Vector Machines (SVMs). In this work, we compare the different available methods for topology prediction of β-barrel outer membrane proteins. We evaluate their performance on a non-redundant dataset of 20 β-barrel outer membrane proteins of Gram-negative bacteria, with structures known at atomic resolution. We also describe, for the first time, an effective way to combine the individual predictors, at will, into a single consensus prediction method. RESULTS: We assess the statistical significance of the performance of each prediction scheme and conclude that the Hidden Markov Model based methods, HMM-B2TMR, ProfTMB and PRED-TMBB, are currently the best predictors, according to the per-residue accuracy, the segments overlap measure (SOV) or the total number of proteins with correctly predicted topologies in the test set. Furthermore, we show that the available predictors perform better when only transmembrane β-barrel domains are used for prediction, rather than the precursor full-length sequences, even though the HMM-based predictors are not influenced significantly. The consensus prediction method performs significantly better than each individual available predictor, since it increases the accuracy by up to 4% regarding SOV and by up to 15% in correctly predicted topologies. CONCLUSIONS: The consensus prediction method described in this work optimizes the predicted topology with a dynamic programming algorithm and is implemented in a web-based application freely available to non-commercial users at
OMPdb: a database of β-barrel outer membrane proteins from Gram-negative bacteria
We describe here OMPdb, which is currently the most complete and comprehensive collection of integral β-barrel outer membrane proteins from Gram-negative bacteria. The database currently contains 69,354 proteins, which are classified into 85 families, based mainly on structural and functional criteria. Although OMPdb follows the annotation scheme of Pfam, many of the families included in the database were not previously described or annotated in other publicly available databases. There are also cross-references to other databases, references to the literature and annotation for sequence features, like transmembrane segments and signal peptides. Furthermore, via the web interface, the user can not only browse the available data, but also submit advanced text searches, run BLAST queries against the database protein sequences and run domain searches against the collection of profile Hidden Markov Models that represent each family's domain organization. The database is freely accessible for academic users at http://bioinformatics.biol.uoa.gr/OMPdb and we expect it to be useful for genome-wide analyses, comparative genomics as well as for providing training and test sets for predictive algorithms regarding transmembrane β-barrels
A database for G proteins and their interaction with GPCRs
BACKGROUND: G protein-coupled receptors (GPCRs) transduce signals from the extracellular space into the cell through their interaction with G proteins, which act as switches forming hetero-trimers composed of different subunits (α, β, γ). The α subunit of the G protein is responsible for the recognition of a given GPCR. Whereas specialised resources for GPCRs and other groups of receptors are already available, there is currently no publicly available database focusing on G proteins and containing information about their coupling specificity with their respective receptors. DESCRIPTION: gpDB is a publicly accessible G proteins/GPCRs relational database. Including species homologs, the database contains detailed information for 418 G protein monomers (272 Gα, 87 Gβ and 59 Gγ) and 2782 GPCR sequences belonging to families with known coupling to G proteins. The GPCRs and the G proteins are classified according to a hierarchy of different classes, families and sub-families, based on extensive literature searches. The main innovation, besides the classification of both G proteins and GPCRs, is the relational model of the database, describing the known coupling specificity of the GPCRs to their respective α subunit of G proteins, a unique feature not available in any other database. There is full sequence information with cross-references to publicly available databases and references to the literature concerning the coupling specificity and the dimerization of GPCRs, and the user may submit advanced queries for text search. Furthermore, we provide a pattern search tool, an interface for running BLAST against the database and interconnectivity with PRED-TMR, PRED-GPCR and TMRPres2D. CONCLUSIONS: The database will be very useful, for both experimentalists and bioinformaticians, for the study of G protein/GPCR interactions and for future development of predictive algorithms. It is available for academics, via a web browser at the URL
A Hidden Markov Model method, capable of predicting and discriminating β-barrel outer membrane proteins
BACKGROUND: Integral membrane proteins constitute about 20–30% of all proteins in the fully sequenced genomes. They come in two structural classes, the α-helical and the β-barrel membrane proteins, demonstrating different physicochemical characteristics, structure and localization. While transmembrane segment prediction for the α-helical integral membrane proteins appears to be an easy task nowadays, the same is much more difficult for the β-barrel membrane proteins. We developed a method, based on a Hidden Markov Model, capable of predicting the transmembrane β-strands of the outer membrane proteins of Gram-negative bacteria, and discriminating those from water-soluble proteins in large datasets. The model is trained in a discriminative manner, aiming at maximizing the probability of correct predictions rather than the likelihood of the sequences. RESULTS: The training has been performed on a non-redundant database of 14 outer membrane proteins with structures known at atomic resolution; it has been tested with a jackknife procedure, yielding a per-residue accuracy of 84.2% and a correlation coefficient of 0.72, whereas for the self-consistency test the per-residue accuracy was 88.1% and the correlation coefficient 0.824. The total number of correctly predicted topologies is 10 out of 14 in the self-consistency test, and 9 out of 14 in the jackknife. Furthermore, the model is capable of discriminating outer membrane from water-soluble proteins in large-scale applications, with a success rate of 88.8% and 89.2% for the correct classification of outer membrane and water-soluble proteins respectively, the highest rates obtained in the literature. That test has been performed independently on a set of known outer membrane proteins with low sequence identity with each other and also with the proteins of the training set. CONCLUSION: Based on the above, we developed a strategy that enabled us to screen the entire proteome of E. coli for outer membrane proteins. 
The results were satisfactory, thus the method presented here appears to be suitable for screening entire proteomes for the discovery of novel outer membrane proteins. A web interface available for non-commercial users is located at: , and it is the only freely available HMM-based predictor for β-barrel outer membrane protein topology
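The two per-residue measures quoted above can be computed from an observed and a predicted label string. The Python sketch below assumes a two-state reading (transmembrane versus everything else) and assumes that the correlation coefficient is Matthews's. The abstract confirms neither, and the label strings are hypothetical.

```python
from math import sqrt


def per_residue_scores(observed, predicted, positive="M"):
    """Return (accuracy, Matthews correlation) for one labelled sequence.

    Residues labelled `positive` count as transmembrane; every other label
    is treated as non-transmembrane (a two-state simplification).
    """
    if len(observed) != len(predicted):
        raise ValueError("label strings must have equal length")
    tp = fp = tn = fn = 0
    for o, p in zip(observed, predicted):
        if o == positive and p == positive:
            tp += 1
        elif o == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    accuracy = (tp + tn) / len(observed)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical labels: M = membrane strand, O = outside loop, I = inside loop.
observed = "MMMMOOOOII"
predicted = "MMMOOOOMII"
accuracy, mcc = per_residue_scores(observed, predicted)
print(accuracy, mcc)
```

In a jackknife evaluation these scores would be computed for each protein using a model trained without it. Whether they are then pooled over residues or averaged over proteins is a choice the abstract does not specify.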
Multiple outcome meta-analysis of gene-expression data in inflammatory bowel disease
We performed a multivariate meta-analysis of microarray data in Crohn's disease (CD) and Ulcerative colitis (UC), which are the main forms of inflammatory bowel disease (IBD). They share similar symptoms but differ in the location and extent of inflammation and in complications. We identified 249 differentially expressed genes (DEGs) in CD and 38 in UC at a false discovery rate of 1%. 20 of the DEGs were common to both diseases. A multivariate test identified 260 DEGs associated with IBD, 53 of which were not found in any of the disorders. We identified important molecular pathways implicated in the pathogenesis of IBD, such as the JAK/STAT and interferon-gamma signaling pathways, genes involved in cell adhesion, apoptosis and carcinogenesis. Among others, BCAT1 and GZMB are interesting novel DEGs that deserve further investigation in experimental models. The method could also be useful to other cases of meta-analysis of gene expression data
Multivariate meta-analysis of the association of G-protein beta 3 gene (GNB3) haplotypes with cardiovascular phenotypes
The objective of the present study was to review previous investigations on the association of haplotypes in the G-protein β3 subunit (GNB3) gene with representative cardiovascular risk factors/phenotypes: hypertension, overweight, and variation in the systolic and diastolic blood pressures (SBP and DBP, respectively) as well as body mass index (BMI). A comprehensive literature search was undertaken in Pubmed, Web of Science, EMBASE, Biological Abstracts, LILACS and Google Scholar to identify potentially relevant articles published up to April 2011. Six genetic association studies encompassing 16,068 participants were identified. Individual participant data were obtained for all studies. The three most investigated GNB3 polymorphisms (G-350A, C825T and C1429T) were considered. Expectation–maximization and generalized linear models were employed to estimate haplotypic effects from data with uncertain phase while adjusting for covariates. Study-specific results were combined through a random-effects multivariate meta-analysis. After careful adjustment for relevant confounding factors, our analysis failed to support a role for GNB3 haplotypes in any of the investigated phenotypes. Sensitivity analyses excluding studies violating Hardy–Weinberg expectations, considering gender-specific effects or more extreme phenotypes (e.g. obesity only), as well as a fixed-effects "pooled" analysis, also did not disclose a significant influence of GNB3 haplotypes on cardiovascular phenotypes. We conclude that the previous cumulative evidence does not support the proposal that haplotypes formed by common GNB3 polymorphisms might contribute either to the development of hypertension and obesity, or to the variation in the SBP, DBP and BMI. This work was supported by the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP, Brazil, to T.V.P.). This work was also supported in part by the Global Center of Excellence Program (No. F03, to M.D.) funded by the Japan Society for the Promotion of Science, Japan (to Y.S.) and Grants-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science, and Technology of Japan (Numbers 18209023, 18018021, and 19659149 to Y.Y.)
Bivariate genome-wide association meta-analysis of pediatric musculoskeletal traits reveals pleiotropic effects at the SREBF1/TOM1L2 locus
Bone mineral density is known to be a heritable, polygenic trait whereas genetic variants contributing to lean mass variation remain largely unknown. We estimated the shared SNP heritability and performed a bivariate GWAS meta-analysis of total-body lean mass (TB-LM) and total-body less head bone mineral density (TBLH-BMD) regions in 10,414 children. The estimated SNP heritability is 43% for TBLH-BMD, and 39% for TB-LM, with a shared genetic component of 43%. We identify variants with pleiotropic effects in eight loci, including seven established bone mineral density loci: _WNT4, GALNT3, MEPE, CPED1/WNT16, TNFSF11, RIN3, and PPP6R3/LRP5_. Variants in the _TOM1L2/SREBF1_ locus exert opposing effects on TB-LM and TBLH-BMD, and have a stronger association with the former trait. We show that _SREBF1_ is expressed in murine and human osteoblasts, as well as in human muscle tissue. This is the first bivariate GWAS meta-analysis to demonstrate genetic factors with pleiotropic effects on bone mineral density and lean mass
An Ecological Study of the Determinants of Differences in 2009 Pandemic Influenza Mortality Rates between Countries in Europe
Pandemic A (H1N1) 2009 mortality rates varied widely from one country to another. Our aim was to identify potential socioeconomic determinants of pandemic mortality and explain between-country variation. Based on data from a total of 30 European countries, we applied random-effects Poisson regression models to study the relationship between pandemic mortality rates (May 2009 to May 2010) and a set of representative environmental, health care-associated, economic and demographic country-level parameters. The study was completed by June 2010. Most regression approaches indicated a consistent, statistically significant inverse association between pandemic influenza-related mortality and per capita government expenditure on health. The findings were similar in univariable [coefficient: -0.00028, 95% Confidence Interval (CI): -0.00046, -0.00010, p = 0.002] and multivariable analyses (including all covariates, coefficient: -0.00107, 95% CI: -0.00196, -0.00018, p = 0.018). The estimate narrowly missed statistical significance when the multivariable model included only significant covariates from the univariable step (coefficient: -0.00046, 95% CI: -0.00095, 0.00003, p = 0.063). Our findings imply a significant inverse association between public spending on health and pandemic influenza mortality. In an attempt to interpret the estimated coefficient (-0.00028) for the per capita government expenditure on health, we observed that a rise of 100 international dollars was associated with a reduction in the pandemic influenza mortality rate by approximately 2.8%. However, further work needs to be done to unravel the mechanisms by which reduced government spending on health may have affected the 2009 pandemic influenza mortality
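The 2.8% figure follows from the coefficient if the Poisson model uses the usual log link, which is assumed here because the abstract does not state the link function. Under that assumption a coefficient b and a covariate increase of d multiply the expected rate by exp(b·d). A small Python sketch of the arithmetic, with a hypothetical function name:

```python
from math import exp


def percent_change(coefficient, delta):
    """Percent change in the expected rate for a `delta`-unit covariate increase,
    assuming a log-link (Poisson) model."""
    return (exp(coefficient * delta) - 1.0) * 100.0


# Univariable coefficient from the abstract, for a rise of 100 international dollars.
change = percent_change(-0.00028, 100)
print(round(change, 1))
```

For a coefficient this small, exp(b·d) - 1 is close to b·d itself, so the exact calculation and the linear approximation both give a reduction of about 2.8%.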