Mutations and Binding Sites of Human Transcription Factors
Mutations in any genome may lead to phenotype characteristics that determine the ability of an individual to adapt to environmental challenges. In studies of human biology, among the most interesting are phenotype characteristics that determine responses to drug treatments, responses to infections, or predisposition to specific inherited diseases. Most of the research in this field has focused on the effects of mutations on the final gene products, peptides, and their alterations. Considerably less attention has been given to mutations that may affect the regulatory mechanism(s) of gene expression, although these may also affect the phenotype characteristics. In this study we present a pilot analysis of mutations observed in the regulatory regions of 24,667 human RefSeq genes. Our study reveals that, of the eight mutation types studied, “insertions” is the only one that alters predicted transcription factor binding sites (TFBSs) in a statistically significant manner. We also find that 25 families of TFBSs have been altered by mutations in a statistically significant manner in the promoter regions we considered. Moreover, we find that the related transcription factors are, for example, prominent in processes related to intracellular signaling; cell fate; morphogenesis of organs and epithelium; development of urogenital system, epithelium, and tube; and neuron fate commitment. Our study highlights the significance of studying mutations within the regulatory regions of genes and opens the way for further detailed investigations on this topic, particularly on the affected downstream pathways
Core Microbial Functional Activities in Ocean Environments Revealed by Global Metagenomic Profiling Analyses
Metagenomics-based functional profiling analysis is an effective means of gaining deeper insight into the composition of marine microbial populations and developing a better understanding of the interplay between the functional genome content of microbial communities and abiotic factors. Here we present a comprehensive analysis of 24 datasets covering surface and depth-related environments at 11 sites around the world's oceans. The complete dataset comprises approximately 12 million sequences, totaling 5,358 Mb. Based on profiling patterns of Clusters of Orthologous Groups (COGs) of proteins, a core set of reference photic and aphotic depth-related COGs, and a collection of COGs that are associated with extreme oxygen limitation were defined. Their inferred functions were utilized as indicators to characterize the distribution of light- and oxygen-related biological activities in marine environments. The results reveal that, while light level in the water column is a major determinant of phenotypic adaptation in marine microorganisms, oxygen concentration in the aphotic zone has a significant impact only in extremely hypoxic waters. Phylogenetic profiling of the reference photic/aphotic gene sets revealed a greater variety of source organisms in the aphotic zone, although the majority of individual photic and aphotic depth-related COGs are assigned to the same taxa across the different sites. This increase in phylogenetic and functional diversity of the core aphotic-related COGs most probably reflects selection for the utilization of a broad range of alternate energy sources in the absence of light. This work was supported by King Abdullah University for Science and Technology Global Collaborative Partners (GCR) program. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript
Enforced Expression of the Transcriptional Coactivator OBF1 Impairs B Cell Differentiation at the Earliest Stage of Development
OBF1, also known as Bob.1 or OCA-B, is a B lymphocyte-specific transcription factor which coactivates Oct1 and Oct2 on B cell-specific promoters. So far, the function of OBF1 has been mainly identified in late stage B cell populations. The central defect of OBF1-deficient mice is a severely reduced immune response to T cell-dependent antigens and a lack of germinal center formation in the spleen. Relatively little is known about a potential function of OBF1 in developing B cells. Here we have generated transgenic mice overexpressing OBF1 in B cells under the control of the immunoglobulin heavy chain promoter and enhancer. Surprisingly, these mice have greatly reduced numbers of follicular B cells in the periphery and have a compromised immune response. Furthermore, B cell differentiation is impaired at an early stage in the bone marrow: a first block is observed during B cell commitment and a second differentiation block is seen at the large preB2 cell stage. The cells that succeed in escaping the block and differentiating into mature B cells have post-translationally downregulated the expression of the transgene, indicating that expression of OBF1 beyond the normal level early in B cell development is deleterious. Transcriptome analysis identified genes deregulated in these mice, and Id2 and Id3, two known negative regulators of B cell differentiation, were found to be upregulated in the EPLM and preB cells of the transgenic mice. Furthermore, the Id2 and Id3 promoters contain octamer-like sites, to which OBF1 can bind. These results provide evidence that tight regulation of OBF1 expression in early B cells is essential to allow efficient B lymphocyte differentiation
Impact of safety-related dose reductions or discontinuations on sustained virologic response in HCV-infected patients: Results from the GUARD-C Cohort
BACKGROUND:
Despite the introduction of direct-acting antiviral agents for chronic hepatitis C virus (HCV) infection, peginterferon alfa/ribavirin remains relevant in many resource-constrained settings. The non-randomized GUARD-C cohort investigated baseline predictors of safety-related dose reductions or discontinuations (sr-RD) and their impact on sustained virologic response (SVR) in patients receiving peginterferon alfa/ribavirin in routine practice.
METHODS:
A total of 3181 HCV-mono-infected treatment-naive patients were assigned to 24 or 48 weeks of peginterferon alfa/ribavirin by their physician. Patients were categorized by time-to-first sr-RD (Week 4/12). Detailed analyses of the impact of sr-RD on SVR24 (HCV RNA <50 IU/mL) were conducted in 951 Caucasian, noncirrhotic genotype (G)1 patients assigned to peginterferon alfa-2a/ribavirin for 48 weeks. The probability of SVR24 was identified by a baseline scoring system (range: 0-9 points) on which scores of 5 to 9 and <5 represent high and low probability of SVR24, respectively.
RESULTS:
SVR24 rates were 46.1% (754/1634), 77.1% (279/362), 68.0% (514/756), and 51.3% (203/396), respectively, in G1, 2, 3, and 4 patients. Overall, 16.9% and 21.8% of patients experienced ≥1 sr-RD for peginterferon alfa and ribavirin, respectively. Among Caucasian noncirrhotic G1 patients: female sex, lower body mass index, pre-existing cardiovascular/pulmonary disease, and low hematological indices were prognostic factors of sr-RD; SVR24 was lower in patients with ≥1 vs. no sr-RD by Week 4 (37.9% vs. 54.4%; P = 0.0046) and Week 12 (41.7% vs. 55.3%; P = 0.0016); sr-RD by Week 4/12 significantly reduced SVR24 in patients with scores <5 but not ≥5.
CONCLUSIONS:
sr-RD to peginterferon alfa-2a/ribavirin significantly impacts SVR24 rates in treatment-naive G1 noncirrhotic Caucasian patients. Baseline characteristics can help select patients with a high probability of SVR24 and a low probability of sr-RD with peginterferon alfa-2a/ribavirin
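The per-genotype SVR24 percentages in RESULTS follow directly from the responder counts given in parentheses there. As a minimal sketch (the variable and function names are illustrative and not part of the study), the arithmetic can be reproduced as follows:

```python
# Responders and group sizes per HCV genotype, copied from the RESULTS text.
svr24_counts = {
    "G1": (754, 1634),
    "G2": (279, 362),
    "G3": (514, 756),
    "G4": (203, 396),
}


def svr24_rate(responders, total):
    """Return the SVR24 rate as a percentage rounded to one decimal place."""
    return round(100.0 * responders / total, 1)


# Hypothetical helper output: genotype -> SVR24 rate in percent.
rates = {genotype: svr24_rate(r, n) for genotype, (r, n) in svr24_counts.items()}
print(rates)
```

The computed values match the reported 46.1%, 77.1%, 68.0%, and 51.3%.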
Simplified Method to Predict Mutual Interactions of Human Transcription Factors Based on Their Primary Structure
Background: Physical interactions between transcription factors (TFs) are necessary for forming regulatory protein complexes and thus play a crucial role in gene regulation. Currently, knowledge about the mechanisms of these TF interactions is incomplete and the number of known TF interactions is limited. Computational prediction of such interactions can help identify potential new TF interactions as well as contribute to a better understanding of the complex machinery involved in gene regulation. Methodology: We propose here such a method for the prediction of TF interactions. The method uses only the primary sequence information of the interacting TFs, resulting in a much greater simplicity of the prediction algorithm. Through an advanced feature selection process, we determined a subset of 97 model features that constitute the optimized model in the subset we considered. The model, based on quadratic discriminant analysis, achieves a prediction accuracy of 85.39% on a blind set of interactions. This result is achieved despite the selection for the negative data set of only those TFs from the same type of proteins, i.e. TFs that function in the same cellular compartment (nucleus) and in the same type of molecular process (transcription initiation). Such selection poses significant challenges for developing models with high specificity, but at the same time better reflects real-world problems. Conclusions: The performance of our predictor compares well to those of much more complex approaches for predicting TF and general protein-protein interactions, particularly when taking the reduced complexity of model utilisation into account
The genetic architecture of the human cerebral cortex
The cerebral cortex underlies our complex cognitive capabilities, yet little is known about the specific genetic loci that influence human cortical structure. To identify genetic variants that affect cortical structure, we conducted a genome-wide association meta-analysis of brain magnetic resonance imaging data from 51,665 individuals. We analyzed the surface area and average thickness of the whole cortex and 34 regions with known functional specializations. We identified 199 significant loci and found significant enrichment for loci influencing total surface area within regulatory elements that are active during prenatal cortical development, supporting the radial unit hypothesis. Loci that affect regional surface area cluster near genes in Wnt signaling pathways, which influence progenitor expansion and areal identity. Variation in cortical structure is genetically correlated with cognitive function, Parkinson's disease, insomnia, depression, neuroticism, and attention deficit hyperactivity disorder
The Constrained Maximal Expression Level Owing to Haploidy Shapes Gene Content on the Mammalian X Chromosome.
X chromosomes are unusual in many regards, not least of which is their nonrandom gene content. The causes of this bias are commonly discussed in the context of sexual antagonism and the avoidance of activity in the male germline. Here, we examine the notion that, at least in some taxa, functionally biased gene content may more profoundly be shaped by limits imposed on gene expression owing to haploid expression of the X chromosome. Notably, if the X, as in primates, is transcribed at rates comparable to the ancestral rate (per promoter) prior to the X chromosome formation, then the X is not a tolerable environment for genes with very high maximal net levels of expression, owing to transcriptional traffic jams. We test this hypothesis using data from the Encyclopedia of DNA Elements (ENCODE) and the Functional Annotation of the Mammalian Genome (FANTOM5) project. As predicted, the maximal expression of human X-linked genes is much lower than that of genes on autosomes: on average, maximal expression is three times lower on the X chromosome than on autosomes. Similarly, autosome-to-X retroposition events are associated with lower maximal expression of retrogenes on the X than seen for X-to-autosome retrogenes on autosomes. Also as expected, X-linked genes have a lesser degree of increase in gene expression than autosomal ones (compared to the human/chimpanzee common ancestor) if highly expressed, but not if lowly expressed. The traffic jam model also explains the known lower breadth of expression for genes on the X (and the Z of birds), as genes with broad expression are, on average, those with high maximal expression. As then further predicted, highly expressed tissue-specific genes are also rare on the X and broadly expressed genes on the X tend to be lowly expressed, both indicating that the trend is shaped by the maximal expression level, not the breadth of expression per se.
Importantly, a limit to the maximal expression level explains the biased tissue-of-expression profiles of X-linked genes. Tissues whose tissue-specific genes are very highly expressed (e.g., secretory tissues, tissues abundant in structural proteins) are also tissues in which gene expression is relatively rare on the X chromosome. These trends cannot be fully accounted for in terms of alternative models of biased expression. In conclusion, the notion that it is hard for genes on the Therian X to be highly expressed, owing to transcriptional traffic jams, provides a simple yet robustly supported rationale for many peculiar features of the X's gene content, gene expression, and evolution
A novel method for improved accuracy of transcription factor binding site prediction
Identifying transcription factor (TF) binding sites (TFBSs) is important in the computational inference of gene regulation. Widely used computational methods of TFBS prediction based on position weight matrices (PWMs) usually have high false positive rates. Moreover, computational studies of transcription regulation in eukaryotes frequently require numerous PWM models of TFBSs due to the large number of TFs involved. To overcome these problems we developed DRAF, a novel method for TFBS prediction that requires only 14 prediction models for 232 human TFs, while at the same time significantly improving prediction accuracy. DRAF models use more features than PWM models, as they combine information from TFBS sequences and physicochemical properties of TF DNA-binding domains into machine learning models. Evaluation of DRAF on 98 human ChIP-seq datasets shows on average 1.54-, 1.96- and 5.19-fold reduction of false positives at the same sensitivities compared to models from HOCOMOCO, TRANSFAC and DeepBind, respectively. This observation suggests that one can efficiently replace the PWM models for TFBS prediction by a small number of DRAF models that significantly improve prediction accuracy. The DRAF method is implemented in a web tool and in a stand-alone software freely available at http://cbrc.kaust.edu.sa/DRAF
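For readers unfamiliar with the PWM baseline that the abstract compares against, the sketch below shows a generic log-odds PWM scan. It is not the DRAF method and uses no HOCOMOCO, TRANSFAC, or DeepBind model. The count matrix, the uniform background, the pseudocount, and the threshold are all hypothetical values chosen for illustration, and the input is assumed to contain only uppercase A, C, G, T.

```python
import math

# Hypothetical toy count matrix for a 4-bp motif (one dict per motif position).
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
BACKGROUND = 0.25   # assumed uniform base frequency
PSEUDOCOUNT = 1.0   # assumed smoothing to avoid log(0)


def log_odds_matrix(counts):
    """Convert per-position base counts into log2-odds weights."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * PSEUDOCOUNT
        pwm.append({
            base: math.log2(((column[base] + PSEUDOCOUNT) / total) / BACKGROUND)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


PWM = log_odds_matrix(COUNTS)
hits = scan("TTACGTAACGAT", PWM, threshold=4.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

Lowering the threshold admits more windows as predicted sites, which illustrates the trade-off between sensitivity and false positives that the abstract refers to.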
Simplified method for predicting a functional class of proteins in transcription factor complexes.
BACKGROUND: Initiation of transcription is essential for most cellular responses to environmental conditions and for cell and tissue specificity. This process is regulated through numerous proteins, their ligands and mutual interactions, as well as interactions with DNA. Key among these regulatory proteins are transcription factors (TFs) and transcription co-factors (TcoFs). TcoFs are important since they modulate the transcription initiation process through interaction with TFs. In eukaryotes, transcription requires that TFs form different protein complexes with various nuclear proteins. To better understand transcription regulation, it is important to know the functional class of proteins interacting with TFs during transcription initiation. Such information is not fully available, since not all proteins that act as TFs or TcoFs are yet annotated as such, due to generally partial functional annotation of proteins. In this study we have developed a method to predict, using only sequence composition of the interacting proteins, the functional class of human TF binding partners to be (i) TF, (ii) TcoF, or (iii) other nuclear protein. This allows for complementing the annotation of the currently known pool of nuclear proteins. Since only the knowledge of protein sequences is required in addition to protein interaction, the method should be easily applicable to many species. RESULTS: Based on experimentally validated interactions between human TFs and different TFs, TcoFs and other nuclear proteins, our two classification systems (implemented as a web-based application) achieve high accuracies in distinguishing TFs and TcoFs from other nuclear proteins, and TFs from TcoFs, respectively.
CONCLUSION: As demonstrated, given the fact that two proteins are capable of forming direct physical interactions and using only information about their sequence composition, we have developed a completely new method for predicting a functional class of TF interacting protein partners with high precision and accuracy
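The abstract does not specify which sequence-composition features the method uses. As a minimal sketch under that caveat, one common choice is the relative frequency of each of the 20 standard amino acids, concatenated for the two interacting proteins. The function names and example sequences below are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def composition(sequence):
    """Return the relative frequency of each standard amino acid in a sequence."""
    sequence = sequence.upper()
    counts = [sequence.count(residue) for residue in AMINO_ACIDS]
    total = sum(counts)  # non-standard symbols are ignored
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [count / total for count in counts]


def pair_features(tf_sequence, partner_sequence):
    """Concatenate the composition vectors of a TF and its binding partner."""
    return composition(tf_sequence) + composition(partner_sequence)


# Hypothetical toy sequences, far shorter than real proteins.
features = pair_features("MKKA", "GGSA")
print(len(features))
```

A vector like this could be passed to any standard classifier. The classifier actually used by the authors is not stated in the abstract.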