30 research outputs found
Exploiting HIV-1 protease and reverse transcriptase cross-resistance information for improved drug resistance prediction by means of multi-label classification
Gini impurity PIs. (PDF 8 kb)
Fetal-Adult Cardiac Transcriptome Analysis in Rats with Contrasting Left Ventricular Mass Reveals New Candidates for Cardiac Hypertrophy
Reactivation of fetal gene expression patterns has been implicated in common
cardiac diseases in adult life including left ventricular (LV) hypertrophy
(LVH) in arterial hypertension. Thus, increased wall stress and neurohumoral
activation are thought to induce the re-expression of fetal genes after birth
in LVH. We therefore aimed to identify novel potential candidates
for LVH by analyzing fetal-adult cardiac gene expression in a genetic rat
model of hypertension, i.e. the stroke-prone spontaneously hypertensive rat
(SHRSP). To this end we performed genome-wide transcriptome analysis in SHRSP
to identify differences in expression patterns between day 20 of fetal
development (E20) and adult animals in week 14 in comparison to a normotensive
rat strain with contrasting low LV mass, i.e. Fischer (F344). 15232 probes
were detected as expressed in LV tissue obtained from rats at E20 and week 14
(p < 0.05) and subsequently screened for differential expression. We
identified 24 genes with SHRSP-specific up-regulation and 21 genes with
down-regulation as compared to F344. Further bioinformatic analysis identified
Efcab6 as a new candidate for LVH: it was differentially expressed during
development only in the hypertensive SHRSP rat (logFC = 2.41, p < 0.001) and
was expressed at significantly higher levels in adult SHRSP rats than in adult
F344 (+76%) and adult normotensive Wistar-Kyoto rats (+82%). Thus, it
represents an interesting new target for further functional analyses and the
elucidation of mechanisms leading to LVH. Here we report a new approach to
identify candidate genes for cardiac hypertrophy by combining the analysis of
gene expression differences between strains with a contrasting cardiac
phenotype with a comparison of fetal-adult cardiac expression patterns.
A systematic SNP selection approach to identify mechanisms underlying disease aetiology: Linking height to post-menopausal breast and colorectal cancer risk
Data from GWAS suggest that SNPs associated with complex diseases or traits tend to co-segregate in regions of low recombination, harbouring functionally linked gene clusters. This phenomenon allows for selecting a limited number of SNPs from GWAS repositories for large-scale studies investigating shared mechanisms between diseases. For example, we were interested in shared mechanisms between adult-attained height and post-menopausal breast cancer (BC) and colorectal cancer (CRC) risk, because height is a risk factor for these cancers, though likely not a causal factor. Using SNPs from public GWAS repositories at p-values < 1 × 10^-5 and a genomic sliding window of 1 megabase pair, we identified SNP clusters including at least one SNP associated with height and one SNP associated with either post-menopausal BC or CRC risk (or both). SNPs were annotated to genes using HapMap and GRAIL and analysed for significantly overrepresented pathways using ConsensusPathDB. Twelve clusters including 56 SNPs annotated to 26 genes were prioritised because these included at least one height- and one BC risk- or CRC risk-associated SNP annotated to the same gene. Annotated genes were involved in Indian hedgehog signalling (p-value = 7.78 × 10^-7) and several cancer site-specific pathways. This systematic approach identified a limited number of clustered SNPs, which pinpoint potential shared mechanisms linking the complex phenotypes height, post-menopausal BC and CRC.
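The SNP selection step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Snp` record, the trait labels, the example SNPs and the greedy, non-overlapping windowing rule (a cluster spans at most 1 Mb from its first SNP) are all assumptions made for illustration. Only the p-value threshold, the window size and the "height plus BC or CRC" criterion come from the text.

```python
from collections import namedtuple

# Hypothetical record layout; the abstract does not specify a data format.
Snp = namedtuple("Snp", "rsid chrom pos trait pvalue")

P_THRESHOLD = 1e-5        # p-value cut-off stated in the abstract
WINDOW_BP = 1_000_000     # 1 megabase pair window stated in the abstract
CANCER_TRAITS = {"BC", "CRC"}


def find_shared_clusters(snps, window_bp=WINDOW_BP, p_threshold=P_THRESHOLD):
    """Return clusters holding at least one height SNP and one BC or CRC SNP.

    Assumption: clusters are built greedily along each chromosome, and a SNP
    joins the current cluster if it lies within window_bp of the first SNP.
    """
    kept = sorted(
        (s for s in snps if s.pvalue < p_threshold),
        key=lambda s: (s.chrom, s.pos),
    )

    clusters, current = [], []
    for snp in kept:
        # Close the cluster on a chromosome change or when the window is exceeded.
        if current and (
            snp.chrom != current[0].chrom
            or snp.pos - current[0].pos > window_bp
        ):
            clusters.append(current)
            current = []
        current.append(snp)
    if current:
        clusters.append(current)

    shared = []
    for cluster in clusters:
        traits = {s.trait for s in cluster}
        if "height" in traits and traits & CANCER_TRAITS:
            shared.append(cluster)
    return shared


# Made-up example data, for illustration only.
example_snps = [
    Snp("rsA", "5", 100_000, "height", 1e-8),
    Snp("rsB", "5", 600_000, "CRC", 1e-6),
    Snp("rsC", "5", 5_000_000, "height", 1e-9),
    Snp("rsD", "7", 200_000, "BC", 1e-7),
    Snp("rsE", "7", 300_000, "height", 1e-3),  # fails the p-value filter
]

shared = find_shared_clusters(example_snps)
print([[s.rsid for s in cluster] for cluster in shared])  # [['rsA', 'rsB']]
```

A true sliding window would evaluate overlapping windows; the greedy grouping above is a simplification that keeps the sketch short.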
Improved Bevirimat resistance prediction by combination of structural and sequence-based classifiers
BACKGROUND: Maturation inhibitors such as Bevirimat are a new class of antiretroviral drugs that hamper the cleavage of HIV-1 proteins into their functionally active forms. They bind to these preproteins and inhibit their cleavage by the HIV-1 protease, resulting in non-functional virus particles. Nevertheless, there exist mutations in this region leading to resistance against Bevirimat. Highly specific and accurate tools to predict resistance to maturation inhibitors can help to identify patients who might benefit from these new drugs. RESULTS: We tested several methods to improve Bevirimat resistance prediction in HIV-1. Combining structural and sequence-based information in classifier ensembles led to accurate and reliable predictions. Moreover, we were able to computationally identify the most crucial regions for Bevirimat resistance, which are in line with experimental results from other studies. CONCLUSIONS: Our analysis demonstrated the use of machine learning techniques to predict HIV-1 resistance against maturation inhibitors such as Bevirimat. New maturation inhibitors are already under development and might enlarge the arsenal of antiretroviral drugs in the future. Thus, accurate prediction tools are very useful to enable a personalized therapy.
Evolutionary Dynamics of Co-Segregating Gene Clusters Associated with Complex Diseases
BACKGROUND: The distribution of human disease-associated mutations is not random across the human genome. Although natural selection continually removes disease-associated mutations, an enrichment of these variants can be observed in regions of low recombination. There are a number of mechanisms by which such a clustering could occur, including genetic perturbations or demographic effects within different populations. Recent genome-wide association studies (GWAS) suggest that single nucleotide polymorphisms (SNPs) associated with complex disease traits are not randomly distributed throughout the genome, but tend to cluster in regions of low recombination. PRINCIPAL FINDINGS: Here we investigated whether deleterious mutations have accumulated in regions of low recombination due to the impact of recent positive selection and genetic hitchhiking. Using publicly available data on common complex diseases and population demography, we observed an enrichment of hitchhiked disease associations in conserved gene clusters subject to selection pressure. Evolutionary analysis revealed that these conserved gene clusters arose by multiple concerted rearrangement events across the vertebrate lineage. We observed distinct clustering of disease-associated SNPs in evolutionarily rearranged regions of low recombination and high gene density, which harbor genes involved in immunity, that is, the interleukin cluster on 5q31 or RhoA on 3p21. CONCLUSIONS: Our results suggest that multiple lineage-specific rearrangements led to a physical clustering of functionally related and linked genes exhibiting an enrichment of susceptibility loci for complex traits. This implies that, besides recent evolutionary adaptations, other evolutionary dynamics have played a role in the formation of linked gene clusters associated with complex disease traits.
eccCL: parallelized GPU implementation of Ensemble Classifier Chains
BACKGROUND: Multi-label classification has recently gained great attention in diverse fields of research, e.g., in biomedical applications such as protein function prediction or drug resistance testing in HIV. In this context, the concept of Classifier Chains has been shown to improve prediction accuracy, especially when applied as Ensemble Classifier Chains. However, these techniques lack computational efficiency when applied to large amounts of data, e.g., derived from next-generation sequencing experiments. By adapting algorithms for the use of graphics processing units, computational efficiency can be greatly improved due to parallelization of computations. RESULTS: Here, we provide a parallelized and optimized graphics processing unit implementation (eccCL) of Classifier Chains and Ensemble Classifier Chains. In addition to the OpenCL implementation, we provide an R package with an easy-to-use R interface for parallelized graphics processing unit usage. CONCLUSION: eccCL is a handy implementation of Classifier Chains on GPUs, which is able to process more than 25,000 instances per second, and thus can be used efficiently in high-throughput experiments. The software is available at http://www.heiderlab.de
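For readers unfamiliar with the technique named in this abstract, the following pure-Python sketch shows the general idea of a classifier chain and a simple ensemble of chains. It is a CPU-only illustration and does not reflect the eccCL code or its R interface. All class names are hypothetical, and the 1-nearest-neighbour base learner is an assumption chosen to keep the example self-contained. Ensemble variants described in the literature often also train each chain on a resampled subset of the data; that step is omitted here.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length tuples differ."""
    return sum(x != y for x, y in zip(a, b))


class NearestNeighbour:
    """Hypothetical base learner: 1-nearest neighbour under Hamming distance."""

    def fit(self, rows, targets):
        self.rows = [tuple(r) for r in rows]
        self.targets = list(targets)
        return self

    def predict_one(self, row):
        best = min(range(len(self.rows)), key=lambda i: hamming(self.rows[i], row))
        return self.targets[best]


class ClassifierChain:
    """One binary model per label; each model also sees the earlier labels."""

    def __init__(self, order):
        self.order = list(order)  # permutation of label indices 0..n-1
        self.models = []

    def fit(self, X, Y):
        self.models = []
        for step, label in enumerate(self.order):
            previous = self.order[:step]
            # Training uses the true values of the labels earlier in the chain.
            augmented = [tuple(x) + tuple(y[p] for p in previous) for x, y in zip(X, Y)]
            self.models.append(NearestNeighbour().fit(augmented, [y[label] for y in Y]))
        return self

    def predict_one(self, x):
        predicted = {}
        chain_input = tuple(x)
        for label, model in zip(self.order, self.models):
            predicted[label] = model.predict_one(chain_input)
            # Prediction feeds each predicted label to the next model.
            chain_input = chain_input + (predicted[label],)
        return [predicted[i] for i in range(len(self.order))]


class EnsembleClassifierChains:
    """Several chains with random label orders, combined by majority vote."""

    def __init__(self, n_labels, n_chains=5, seed=0):
        rng = random.Random(seed)
        self.chains = []
        for _ in range(n_chains):
            order = list(range(n_labels))
            rng.shuffle(order)
            self.chains.append(ClassifierChain(order))

    def fit(self, X, Y):
        for chain in self.chains:
            chain.fit(X, Y)
        return self

    def predict_one(self, x):
        votes = [chain.predict_one(x) for chain in self.chains]
        n_chains = len(self.chains)
        return [
            1 if 2 * sum(v[j] for v in votes) >= n_chains else 0
            for j in range(len(votes[0]))
        ]


# Made-up toy data: two binary features, two binary labels.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [(0, 0), (0, 1), (1, 1), (1, 0)]

single = ClassifierChain([0, 1]).fit(X, Y)
ensemble = EnsembleClassifierChains(n_labels=2).fit(X, Y)
print(single.predict_one((1, 0)), ensemble.predict_one((0, 1)))
```

Each chain is independent of the others, which is why this family of methods lends itself to the kind of parallelization the abstract describes.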
Enrichment of disease variants in regions of low recombination.
(A) Boxplots displaying local recombination rates for sliding windows of 500 kb harboring a different number of disease variants (0, 1–15, >15). (B) Ratio of windows showing an enrichment of disease variants (>15 disease variants) compared to windows without such a clustering (<15 disease variants) for different bins of local recombination rates.
Plots of Crohn’s disease risk locus at chromosome 5q31.
(A) Map of the 5q31 risk locus containing –log(P) values of SNPs (CD SNPs), LD blocks defined by proxy SNPs with r² > 0.8, as well as positions of SNPs considered iHS signals (light colour) or strong iHS signals (darker colour) for the three HapMap populations (blue: CEU, yellow: ASN, brown: YRI). (B) Reference allele frequencies of SNPs showing allele frequency differences in the 95th percentile between at least two of three populations according to 1000 Genomes data. (C) Percentages of SNPs associated with Crohn’s disease, which are iHS signals (left) or show allele frequency difference in the 95th percentile between populations (right).
Clustering of iHS signals in regions enriched with disease variants.
Boxplots highlighting (A) the distribution of mean iHS signals in regions enriched with disease variants (>15) compared to regions with a moderate number of disease associations (1–15) and (B) the ratio of strong iHS signals (|iHS| > 2) for these regions.