Widespread sex differences in gene expression and splicing in the adult human brain
There is strong evidence to show that men and women differ in terms of neurodevelopment, neurochemistry and susceptibility to neurodegenerative and neuropsychiatric disease. The molecular basis of these differences remains unclear. Progress in this field has been hampered by the lack of genome-wide information on sex differences in gene expression and in particular splicing in the human brain. Here we address this issue by using post-mortem adult human brain and spinal cord samples originating from 137 neuropathologically confirmed control individuals to study whole-genome gene expression and splicing in 12 CNS regions. We show that sex differences in gene expression and splicing are widespread in adult human brain, being detectable in all major brain regions and involving 2.5% of all expressed genes. We give examples of genes where sex-biased expression is both disease-relevant and likely to have functional consequences, and provide evidence suggesting that sex biases in expression may reflect sex-biased gene regulatory structures
UNCLES: Method for the identification of genes differentially consistently co-expressed in a specific subset of datasets
Background: Collective analysis of the increasingly emerging gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Results: Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth-knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods, and of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets as it produces focused clusters that conform well to known biological facts.
Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. Conclusions: The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies.
Funding: The National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0310-1004)
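The consensus idea mentioned in the UNCLES abstract above (combining clustering results from several datasets, then binarising the consensus) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `consensus_partition` and `binarise`, the assumption that cluster labels are already aligned across datasets, and the difference-threshold rule with parameter `delta` are all illustrative assumptions.

```python
def consensus_partition(partitions):
    """Average membership matrices (clusters x genes) element-wise.

    Assumption: every matrix has the same shape and cluster k means the
    same cluster in every dataset (labels already aligned).
    """
    n_sets = len(partitions)
    n_clusters = len(partitions[0])
    n_genes = len(partitions[0][0])
    return [
        [sum(p[k][g] for p in partitions) / n_sets for g in range(n_genes)]
        for k in range(n_clusters)
    ]


def binarise(copam, delta):
    """Assign a gene to its top cluster only if that cluster's consensus
    membership exceeds the runner-up's by at least `delta` (hypothetical
    tuning parameter); otherwise leave the gene unassigned.

    Requires at least two clusters.
    """
    n_clusters = len(copam)
    n_genes = len(copam[0])
    binary = [[0] * n_genes for _ in range(n_clusters)]
    for g in range(n_genes):
        ranked = sorted(((copam[k][g], k) for k in range(n_clusters)), reverse=True)
        (top, k_top), (second, _) = ranked[0], ranked[1]
        if top - second >= delta:
            binary[k_top][g] = 1
    return binary


# Toy example: two datasets, two clusters, three genes.
dataset_a = [[1, 0, 1], [0, 1, 0]]
dataset_b = [[1, 0, 0], [0, 1, 1]]
copam = consensus_partition([dataset_a, dataset_b])
clusters = binarise(copam, delta=0.5)
```

In the toy example the third gene is placed in different clusters by the two datasets, so its consensus membership is split evenly and it stays unassigned. Raising `delta` makes the resulting clusters tighter; lowering it makes them more inclusive.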
Cross-Platform Microarray Data Normalisation for Regulatory Network Inference
Background
Inferring Gene Regulatory Networks (GRNs) from time course microarray data suffers from the dimensionality problem created by the short length of available time series compared to the large number of genes in the network. To overcome this, data integration from diverse sources is mandatory. Microarray data from different sources and platforms are publicly available, but integration is not straightforward, due to platform and experimental differences.
Methods
We analyse here different normalisation approaches for microarray data integration, in the context of reverse engineering of GRN quantitative models. We introduce two preprocessing approaches based on existing normalisation techniques and provide a comprehensive comparison of normalised datasets.
Conclusions
Results identify a method based on a combination of Loess normalisation and iterative K-means as best for time series normalisation for this problem.
Exaggerated CpH methylation in the autism-affected brain
BACKGROUND: The etiology of autism, a complex, heritable, neurodevelopmental disorder, remains largely unexplained. Given the unexplained risk and recent evidence supporting a role for epigenetic mechanisms in the development of autism, we explored the role of CpG and CpH (H = A, C, or T) methylation within the autism-affected cortical brain tissue. METHODS: Reduced representation bisulfite sequencing (RRBS) was completed, and analysis was carried out in 63 post-mortem cortical brain samples (Brodmann area 19) from 29 autism-affected and 34 control individuals. Analyses to identify single sites that were differentially methylated and to identify any global methylation alterations at either CpG or CpH sites throughout the genome were carried out. RESULTS: We report that while no individual site or region of methylation was significantly associated with autism after multi-test correction, methylated CpH dinucleotides were markedly enriched in autism-affected brains (~2-fold enrichment at p < 0.05 cutoff, p = 0.002). CONCLUSIONS: These results further implicate epigenetic alterations in pathobiological mechanisms that underlie autism. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13229-017-0119-y) contains supplementary material, which is available to authorized users
Detection of regulator genes and eQTLs in gene networks
Genetic differences between individuals associated with quantitative phenotypic
traits, including disease states, are usually found in non-coding genomic
regions. These genetic variants are often also associated with differences in
expression levels of nearby genes (they are "expression quantitative trait
loci" or eQTLs for short) and presumably play a gene regulatory role, affecting
the status of molecular networks of interacting genes, proteins and
metabolites. Computational systems biology approaches to reconstruct causal
gene networks from large-scale omics data have therefore become essential to
understand the structure of networks controlled by eQTLs together with other
regulatory genes, and to generate detailed hypotheses about the molecular
mechanisms that lead from genotype to phenotype. Here we review the main
analytical methods and software to identify eQTLs and their associated genes,
to reconstruct co-expression networks and modules, to reconstruct causal
Bayesian gene and module networks, and to validate predicted networks in
silico.
Comment: minor revision with typos corrected; review article; 24 pages, 2 figures
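As a rough illustration of what identifying an eQTL involves, the sketch below tests one variant against one gene by correlating genotype dosage (0, 1 or 2 copies of the alternative allele) with expression across individuals. It is a generic single-pair association test, not a method taken from the review; the function names and the toy data are assumptions made for this example, and real analyses also handle covariates and multiple-testing correction.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("zero variance: correlation is undefined")
    return sxy / math.sqrt(sxx * syy)


def eqtl_t_statistic(genotypes, expression):
    """t statistic (n - 2 degrees of freedom) for the slope of a simple
    linear regression of expression on genotype dosage.

    Undefined when the correlation is exactly +1 or -1.
    """
    n = len(genotypes)
    r = pearson_r(genotypes, expression)
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical toy data: ten individuals, one variant, one nearby gene.
genotypes = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
expression = [1.0, 1.2, 2.1, 1.9, 3.2, 2.8, 0.9, 2.0, 3.1, 2.2]
t_stat = eqtl_t_statistic(genotypes, expression)
```

A large absolute `t_stat` suggests that expression changes with allele count. In a genome-wide scan this test would be repeated for many variant-gene pairs, which is why correction for multiple testing matters.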
Genetic determinants of co-accessible chromatin regions in activated T cells across humans.
Over 90% of genetic variants associated with complex human traits map to non-coding regions, but little is understood about how they modulate gene regulation in health and disease. One possible mechanism is that genetic variants affect the activity of one or more cis-regulatory elements leading to gene expression variation in specific cell types. To identify such cases, we analyzed ATAC-seq and RNA-seq profiles from stimulated primary CD4+ T cells in up to 105 healthy donors. We found that regions of accessible chromatin (ATAC-peaks) are co-accessible at kilobase and megabase resolution, consistent with the three-dimensional chromatin organization measured by in situ Hi-C in T cells. Fifteen percent of genetic variants located within ATAC-peaks affected the accessibility of the corresponding peak (local-ATAC-QTLs). Local-ATAC-QTLs have the largest effects on co-accessible peaks, are associated with gene expression and are enriched for autoimmune disease variants. Our results provide insights into how natural genetic variants modulate cis-regulatory elements, in isolation or in concert, to influence gene expression
Regularized gene selection in cancer microarray meta-analysis
Background: In cancer studies, it is common that multiple microarray experiments are conducted to measure the same clinical outcome and expressions of the same set of genes. An important goal of such experiments is to identify a subset of genes that can potentially serve as predictive markers for cancer development and progression. Analyses of individual experiments may lead to unreliable gene selection results because of the small sample sizes. Meta analysis can be used to pool multiple experiments, increase statistical power, and achieve more reliable gene selection. The meta analysis of cancer microarray data is challenging because of the high dimensionality of gene expressions and the differences in experimental settings amongst different experiments. Results: We propose a Meta Threshold Gradient Descent Regularization (MTGDR) approach for gene selection in the meta analysis of cancer microarray data. The MTGDR has many advantages over existing approaches. It allows different experiments to have different experimental settings. It can account for the joint effects of multiple genes on cancer, and it can select the same set of cancer-associated genes across multiple experiments. Simulation studies and analyses of multiple pancreatic and liver cancer experiments demonstrate the superior performance of the MTGDR. Conclusion: The MTGDR provides an effective way of analyzing multiple cancer microarray studies and selecting reliable cancer-associated genes.