Financial Time series: motif discovery and analysis using VALMOD
Motif discovery and analysis in time series data-sets have a wide range of applications, from genomics to finance. Consequently, the development and critical evaluation of these algorithms is required, with the focus not just on detection but also on the evaluation and interpretation of overall significance. Our focus here is the specific algorithm VALMOD, but algorithms in wide use for motif discovery are summarised and briefly compared, as are typical evaluation methods and their strengths. Additionally, taxonomy diagrams for motif discovery and evaluation techniques are constructed to illustrate the relationships between different approaches as well as their inter-dependencies. Finally, evaluation measures based upon results obtained from VALMOD analysis of a GBP-USD foreign exchange (F/X) rate data-set are presented by way of illustration.
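To make the notion of a time series motif concrete, the following is a minimal brute-force sketch in Python. It is not VALMOD itself: it simply finds the closest pair of non-overlapping subsequences for each candidate length under z-normalised Euclidean distance. The function names, and the choice of dividing by the square root of the length so that different lengths can be compared, are assumptions made for illustration.

```python
import math

def znorm(window):
    """Z-normalise a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    if var == 0:
        return [0.0] * n
    std = math.sqrt(var)
    return [(x - mean) / std for x in window]

def top_motif(series, m):
    """Return (distance, i, j) for the closest pair of non-overlapping
    length-m subsequences under z-normalised Euclidean distance."""
    windows = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best = (math.inf, -1, -1)
    for i in range(len(windows)):
        # Start at i + m to skip trivial (overlapping) matches.
        for j in range(i + m, len(windows)):
            d = math.dist(windows[i], windows[j])
            if d < best[0]:
                best = (d, i, j)
    return best

def motifs_over_lengths(series, lengths):
    """Best motif pair per length, ranked by distance / sqrt(length).

    The sqrt(length) scaling is an assumption used here to make
    distances at different lengths comparable.
    """
    results = []
    for m in lengths:
        d, i, j = top_motif(series, m)
        results.append((d / math.sqrt(m), m, i, j))
    return sorted(results)
```

This brute-force search is quadratic in the series length for every candidate length, which is the cost that dedicated variable-length algorithms aim to reduce.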
Gene set of nuclear-encoded mitochondrial regulators is enriched for common inherited variation in obesity
There are hints of altered mitochondrial function in obesity. Nuclear-encoded genes are relevant for mitochondrial function (3 gene sets of known relevant pathways: (1) 16 nuclear regulators of mitochondrial genes, (2) 91 genes for oxidative phosphorylation and (3) 966 nuclear-encoded mitochondrial genes). Gene set enrichment analysis (GSEA) showed no association with type 2 diabetes mellitus in these gene sets. Here we performed a GSEA of the same gene sets for obesity. Genome-wide association study (GWAS) data from a case-control approach on 453 extremely obese children and adolescents and 435 lean adult controls were used for GSEA. For independent confirmation, we analyzed 705 obesity GWAS trios (extremely obese child and both biological parents) and a population-based GWAS sample (KORA F4, n = 1,743). A meta-analysis was performed on all three samples. In each sample, the distribution of significance levels of the respective gene set was compared with that of all genes using the leading-edge-fraction-comparison test (cut-offs between the 50th and 95th percentile of the set of all gene-wise corrected p-values) as implemented in the MAGENTA software. In the case-control sample, significant enrichment of associations with obesity was observed above the 50th percentile for the set of the 16 nuclear regulators of mitochondrial genes (p(GSEA,50) = 0.0103). This finding was not confirmed in the trios (p(GSEA,50) = 0.5991), but was confirmed in KORA (p(GSEA,50) = 0.0398). The meta-analysis again indicated a trend for enrichment (p(MAGENTA,50) = 0.1052, p(MAGENTA,75) = 0.0251). The GSEA revealed that weak association signals for obesity might be enriched in the gene set of 16 nuclear regulators of mitochondrial genes.
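The idea behind a percentile cut-off enrichment test of this kind can be sketched in a few lines of Python. This is a simplified permutation test written for illustration, not the MAGENTA implementation: it omits the gene-wise corrections applied to the p-values, and the function name `enrichment_p` and its parameters are hypothetical.

```python
import random

def enrichment_p(gene_p, gene_set, percentile=50, n_perm=2000, seed=0):
    """Permutation p-value for the number of gene_set members that pass
    a significance cut-off taken from the distribution of all genes.

    gene_p: dict mapping gene -> gene-wise p-value (smaller = more significant).
    percentile: cut-off on the significance ranking of all genes; 50 keeps
    the most significant half, 95 the most significant 5%.
    """
    rng = random.Random(seed)
    genes = list(gene_p)
    ranked = sorted(gene_p.values())
    # p-value threshold such that (100 - percentile)% of all genes pass it.
    k = max(1, round(len(ranked) * (100 - percentile) / 100))
    cutoff = ranked[k - 1]
    members = [g for g in gene_set if g in gene_p]

    def count_hits(subset):
        return sum(gene_p[g] <= cutoff for g in subset)

    observed = count_hits(members)
    # Compare against random gene sets of the same size.
    at_least = sum(
        count_hits(rng.sample(genes, len(members))) >= observed
        for _ in range(n_perm)
    )
    return (at_least + 1) / (n_perm + 1)
```

Calling the function with `percentile=50` and `percentile=75` would correspond loosely to the two cut-offs reported for the meta-analysis, under the simplifications stated above.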
Evidence for the role of EPHX2 gene variants in anorexia nervosa.
Anorexia nervosa (AN) and related eating disorders are complex, multifactorial neuropsychiatric conditions with likely rare and common genetic and environmental determinants. To identify genetic variants associated with AN, we pursued a series of sequencing and genotyping studies focusing on the coding regions and upstream sequence of 152 candidate genes in a total of 1205 AN cases and 1948 controls. We identified individual variant associations in the Estrogen Receptor-β (ESR2) gene, as well as a set of rare and common variants in the Epoxide Hydrolase 2 (EPHX2) gene, in an initial sequencing study of 261 early-onset severe AN cases and 73 controls (P=0.0004). The association of EPHX2 variants was further delineated in: (1) a pooling-based replication study involving an additional 500 AN patients and 500 controls (replication set P=0.00000016); (2) single-locus studies in a cohort of 386 previously genotyped broadly defined AN cases and 295 female population controls from the Bogalusa Heart Study (BHS) and a cohort of 58 individuals with self-reported eating disturbances and 851 controls (combined smallest single locus P<0.01). As EPHX2 is known to influence cholesterol metabolism, and AN is often associated with elevated cholesterol levels, we also investigated the association of EPHX2 variants and longitudinal body mass index (BMI) and cholesterol in BHS female and male subjects (N=229) and found evidence for a modifying effect of a subset of variants on the relationship between cholesterol and BMI (P<0.01). These findings suggest a novel association of gene variants within EPHX2 to susceptibility to AN and provide a foundation for future study of this important yet poorly understood condition.
Modular reorganization of the global network of gene regulatory interactions during perinatal human brain development.
BACKGROUND
During early development of the nervous system, gene expression patterns are known to vary widely depending on the specific developmental trajectories of different structures. Observable changes in gene expression profiles throughout development are determined by an underlying network of precise regulatory interactions between individual genes. Elucidating the organizing principles that shape this gene regulatory network is one of the central goals of developmental biology. Whether the developmental programme is the result of a dynamic driven by a fixed architecture of regulatory interactions, or alternatively, the result of waves of regulatory reorganization is not known.
RESULTS
Here we contrast these two alternative models by examining existing expression data derived from the developing human brain in prenatal and postnatal stages. We reveal a sharp change in gene expression profiles at birth across brain areas. This sharp division between foetal and postnatal profiles is not the result of pronounced changes in level of expression of existing gene networks. Instead we demonstrate that the perinatal transition is marked by the widespread regulatory rearrangement within and across existing gene clusters, leading to the emergence of new functional groups. This rearrangement is itself organized into discrete blocks of genes, each targeted by a distinct set of transcriptional regulators and associated to specific biological functions.
CONCLUSIONS
Our results provide evidence of an acute modular reorganization of the regulatory architecture of the brain transcriptome occurring at birth, reflecting the reassembly of new functional associations required for the normal transition from prenatal to postnatal brain development.
Non-homologous end-joining pathway associated with occurrence of myocardial infarction: gene set analysis of genome-wide association study data
Purpose: DNA repair deficiencies have been postulated to play a role in the development and progression of cardiovascular disease (CVD). The hypothesis is that DNA damage accumulating with age may induce cell death, which promotes formation of unstable plaques. Defects in DNA repair mechanisms may therefore increase the risk of CVD events. We examined whether the joint effect of common genetic variants in 5 DNA repair pathways may influence the risk of CVD events.
Methods: The PLINK set-based test was used to examine the association of the DNA repair pathways with myocardial infarction (MI) in GWAS data of 866 subjects of the GENetic DEterminants of Restenosis (GENDER) study and 5,244 subjects of the PROspective Study of Pravastatin in the Elderly at Risk (PROSPER) study. We included the main DNA repair pathways (base excision repair, nucleotide excision repair, mismatch repair, homologous recombination and non-homologous end-joining (NHEJ)) in the analysis.
Results: The NHEJ pathway was associated with the occurrence of MI in both GENDER (P = 0.0083) and PROSPER (P = 0.014). This association was mainly driven by genetic variation in the MRE11A gene (P_GENDER = 0.0001 and P_PROSPER = 0.002). The homologous recombination pathway was associated with MI in GENDER only (P = 0.011); for the other pathways no associations were observed.
Conclusion: This is the first study analyzing the joint effect of common genetic variation in DNA repair pathways and the risk of CVD events, demonstrating an association between the NHEJ pathway and MI in 2 different cohorts.
wKinMut: An integrated tool for the analysis and interpretation of mutations in human protein kinases
BACKGROUND: Protein kinases are involved in relevant physiological functions, and a large number of mutations in this superfamily have been reported in the literature to affect protein function and stability. Unfortunately, the exploration of the phenotypic consequences of each individual mutation remains a considerable challenge. RESULTS: The wKinMut web-server offers direct prediction of the potential pathogenicity of mutations from a number of methods, including our recently developed prediction method, which combines information from a range of diverse sources: physicochemical properties and functional annotations from FireDB and Swissprot; kinase-specific characteristics such as membership of specific kinase groups, annotation with disease-associated GO terms or the occurrence of the mutation in PFAM domains; and the relevance of the residues in determining kinase subfamily specificity from S3Det. This predictor yields interesting results that compare favourably with other methods in the field when applied to protein kinases. Together with the predictions, wKinMut offers a number of integrated services for the analysis of mutations. These include: the classification of the kinase, information about associations of the kinase with other proteins extracted from iHop, the mapping of the mutations onto PDB structures, pathogenicity records from a number of databases and the classification of mutations in large-scale cancer studies. Importantly, wKinMut is connected with the SNP2L system, which extracts mentions of mutations directly from the literature and therefore increases the possibilities of finding interesting functional information associated with the studied mutations.
CONCLUSIONS: wKinMut facilitates the exploration of the information available about individual mutations by integrating prediction approaches with the automatic extraction of information from the literature (text mining) and several state-of-the-art databases. wKinMut has been used during the last year for the analysis of the consequences of mutations in the context of a number of cancer genome projects, including the recent analysis of Chronic Lymphocytic Leukemia cases and is publicly available at http://wkinmut.bioinfo.cnio.es
Co-expression network of neural-differentiation genes shows specific pattern in schizophrenia
Background: Schizophrenia is a neurodevelopmental disorder with genetic and environmental factors contributing to its pathogenesis, although the mechanism is unknown due to the difficulties in accessing diseased tissue during human neurodevelopment. The aim of this study was to find neuronal differentiation genes disrupted in schizophrenia and to evaluate those genes in post-mortem brain tissues from schizophrenia cases and controls.
Methods: We analyzed differentially expressed genes (DEG), copy number variation (CNV) and differential methylation in human induced pluripotent stem cells (hiPSC) derived from fibroblasts from one control and one schizophrenia patient and further differentiated into neurons (NPC). Expression of the DEG was analyzed using microarrays from a post-mortem brain tissue (frontal cortex) cohort of 29 schizophrenia cases and 30 controls. A Weighted Gene Co-expression Network Analysis (WGCNA) using the DEG was used to detect clusters of co-expressed genes that were non-conserved between adult case and control brain samples.
Results: We identified methylation alterations potentially involved with neuronal differentiation in schizophrenia, which displayed an over-representation of genes related to chromatin remodeling complex (adjP = 0.04). We found 228 DEG associated with neuronal differentiation. These genes were involved with metabolic processes, signal transduction, nervous system development, regulation of neurogenesis and neuronal differentiation. Between adult brain samples from cases and controls there were 233 DEG, with only four genes overlapping with the 228 DEG, probably because we compared single cells to bulk tissue and, more importantly, because the cells were at different stages of development. The comparison of the co-expression network of the 228 genes in adult brain samples between cases and controls revealed a less conserved module enriched for genes associated with oxidative stress and negative regulation of cell differentiation.
Conclusion: This study supports the relevance of using cellular approaches to dissect molecular aspects of neurogenesis with impact in the schizophrenic brain. We showed that, although generated by different approaches, both sets of DEG associated with schizophrenia were involved with neocortical development. The results add to the hypothesis that critical metabolic changes may be occurring during early neurodevelopment, influencing faulty development of the brain and potentially contributing to further vulnerability to the illness.
We thank the patients, doctors and nurses involved with sample collection and the Stanley Medical Research Institute. This research was supported by Conselho Nacional de Desenvolvimento Cientifico e Tecnologico (CNPq #17/2008) and Fundação Carlos Chagas Filho de Amparo a Pesquisa do Estado do Rio de Janeiro (FAPERJ). MM (CNPq 304429/2014-7), ACT (FAPESP 2014/00041-1), LL (CAPES 10682/13-9), HV (CAPES) and BP (PPSUS 137270) were supported by their fellowships.
ProKinO: An Ontology for Integrative Analysis of Protein Kinases in Cancer
Protein kinases are a large and diverse family of enzymes that are genomically altered in many human cancers. Targeted cancer genome sequencing efforts have unveiled the mutational profiles of protein kinase genes from many different cancer types. While mutational data on protein kinases is currently catalogued in various databases, integration of mutation data with other forms of data on protein kinases such as sequence, structure, function and pathway is necessary to identify and characterize key cancer causing mutations. Integrative analysis of protein kinase data, however, is a challenge because of the disparate nature of protein kinase data sources and data formats., where the mutations are spread over 82 distinct kinases. We also provide examples of how ontology-based data analysis can be used to generate testable hypotheses regarding cancer mutations.
A Classifier-based approach to identify genetic similarities between diseases
Motivation: Genome-wide association studies are commonly used to identify possible associations between genetic variations and diseases. These studies mainly focus on identifying individual single nucleotide polymorphisms (SNPs) potentially linked with one disease of interest. In this work, we introduce a novel methodology that identifies similarities between diseases using information from a large number of SNPs. We separate the diseases for which we have individual genotype data into one reference disease and several query diseases. We train a classifier that distinguishes between individuals that have the reference disease and a set of control individuals. This classifier is then used to classify the individuals that have the query diseases. We can then rank query diseases according to the average classification of the individuals in each disease set, and identify which of the query diseases are more similar to the reference disease. We repeat these classification and comparison steps so that each disease is used once as the reference disease.
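The train, classify and rank procedure described in this abstract can be sketched as follows. The abstract does not say which classifier is used, so a nearest-centroid rule stands in for it here purely as an assumption; the genotype coding (minor-allele counts 0/1/2) and all function names are likewise hypothetical.

```python
def train_centroid_classifier(cases, controls):
    """Fit a nearest-centroid rule on genotype vectors (assumed coding:
    minor-allele counts 0/1/2 per SNP).

    Returns a function mapping a genotype vector to 1 (case-like)
    or 0 (control-like). This is a stand-in classifier, not the one
    used in the original work.
    """
    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    c_case, c_ctrl = centroid(cases), centroid(controls)

    def sq_dist(x, c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return lambda x: 1 if sq_dist(x, c_case) < sq_dist(x, c_ctrl) else 0

def rank_query_diseases(reference_cases, controls, query_sets):
    """Rank query diseases by the mean predicted label of their individuals.

    query_sets: dict mapping disease name -> list of genotype vectors.
    A higher mean means more individuals look like reference-disease cases.
    """
    classify = train_centroid_classifier(reference_cases, controls)
    scores = {
        name: sum(classify(x) for x in rows) / len(rows)
        for name, rows in query_sets.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Repeating `rank_query_diseases` with each disease in turn as the reference would give one ranking per disease, in line with the final step the abstract describes.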
- …
